8 Types of Data Analysis


Data analysis is an aspect of data science and data analytics concerned with examining data for a variety of purposes. The data analysis process involves inspecting, cleaning, transforming and modeling data to draw useful insights from it.

What Are the Different Types of Data Analysis?

  • Descriptive analysis
  • Diagnostic analysis
  • Exploratory analysis
  • Inferential analysis
  • Predictive analysis
  • Causal analysis
  • Mechanistic analysis
  • Prescriptive analysis

With its multiple facets, methodologies and techniques, data analysis is used in a variety of fields, including business, science and social science, among others. As businesses thrive under the influence of technological advancements in data analytics, data analysis plays a huge role in decision-making, providing a better, faster and more effective system that minimizes risks and reduces human biases.

That said, there are different kinds of data analysis, each geared toward a different goal. We’ll examine each one below.

Two Camps of Data Analysis

Data analysis can be divided into two camps, according to the book  R for Data Science :

  • Hypothesis Generation — This involves looking deeply at the data and combining it with your domain knowledge to generate hypotheses about why the data behaves the way it does.
  • Hypothesis Confirmation — This involves using a precise mathematical model to generate falsifiable predictions with statistical sophistication to confirm your prior hypotheses.

Types of Data Analysis

Data analysis can be separated and organized into types, arranged in an increasing order of complexity.

1. Descriptive Analysis

The goal of descriptive analysis is to describe or summarize a set of data. Here’s what you need to know:

  • Descriptive analysis is the very first analysis performed in the data analysis process.
  • It generates simple summaries about samples and measurements.
  • It involves common, descriptive statistics like measures of central tendency, variability, frequency and position.

Descriptive Analysis Example

Take the  Covid-19 statistics page on Google, for example. The line graph is a pure summary of the cases/deaths, a presentation and description of the population of a particular country infected by the virus.

Descriptive analysis is the first step in analysis where you summarize and describe the data you have using descriptive statistics, and the result is a simple presentation of your data.
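
To make this concrete, here is a minimal sketch of a descriptive summary in Python with pandas. The daily case counts are invented for illustration, not taken from the actual Covid-19 figures:

```python
# Hypothetical daily new-case counts -- illustrative numbers only.
import pandas as pd

cases = pd.Series([120, 135, 150, 148, 160, 171, 165])

print(cases.describe())                # count, mean, std, min, quartiles, max
print("median:", cases.median())       # measure of central tendency
print("mode:", cases.mode().tolist())  # most frequent value(s)
```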

More on Data Analysis: Data Analyst vs. Data Scientist: Similarities and Differences Explained

2. Diagnostic Analysis 

Diagnostic analysis seeks to answer the question “Why did this happen?” by taking a more in-depth look at data to uncover subtle patterns. Here’s what you need to know:

  • Diagnostic analysis typically comes after descriptive analysis, taking initial findings and investigating why certain patterns in data happen. 
  • Diagnostic analysis may involve analyzing other related data sources, including past data, to reveal more insights into current data trends.  
  • Diagnostic analysis is ideal for further exploring patterns in data to explain anomalies.  

Diagnostic Analysis Example

A footwear store wants to review its website traffic levels over the previous 12 months. Upon compiling and assessing the data, the company’s marketing team finds that June experienced above-average levels of traffic while July and August witnessed slightly lower levels of traffic. 

To find out why this difference occurred, the marketing team takes a deeper look. Team members break down the data to focus on specific categories of footwear. They discover that in June, pages featuring sandals and other beach-related footwear received a high number of views, while those numbers dropped in July and August.

Marketers may also review other factors like seasonal changes and company sales events to see if other variables could have contributed to this trend.   
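
A sketch of that drill-down in pandas might look like the following. The traffic numbers and categories are invented purely for illustration:

```python
# Invented monthly page views by footwear category -- illustrative only.
import pandas as pd

traffic = pd.DataFrame({
    "month":    ["Jun", "Jun", "Jul", "Jul", "Aug", "Aug"],
    "category": ["sandals", "boots", "sandals", "boots", "sandals", "boots"],
    "views":    [9800, 2100, 4300, 2200, 3900, 2400],
})

# Pivot to compare categories across months and see which one drives the swing.
print(traffic.pivot_table(index="category", columns="month", values="views"))
```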

3. Exploratory Analysis (EDA)

Exploratory analysis involves examining or exploring data and finding relationships between variables that were previously unknown. Here’s what you need to know:

  • EDA helps you discover relationships between measures in your data. On their own, these relationships are not evidence of causation, as the phrase “correlation doesn’t imply causation” reminds us.
  • It’s useful for discovering new connections and forming hypotheses. It drives design planning and data collection.

Exploratory Analysis Example

Climate change is an increasingly important topic as the global temperature has gradually risen over the years. One example of an exploratory data analysis on climate change involves taking the rise in temperature from 1950 to 2020 alongside the growth of human activities and industrialization and looking for relationships in the data. For example, you might examine how the number of factories, cars on the road and airplane flights correlates with the rise in temperature.

Exploratory analysis explores data to find relationships between measures without identifying the cause. It’s most useful when formulating hypotheses.
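
As a hedged sketch of that exploration (all figures below are invented), you might compute pairwise correlations between the variables:

```python
# Invented yearly figures: temperature anomaly vs. industrialization proxies.
import pandas as pd

climate = pd.DataFrame({
    "temp_anomaly_c": [0.10, 0.22, 0.35, 0.61, 0.89],
    "factories_k":    [50, 90, 160, 300, 420],
    "flights_m":      [5, 12, 22, 38, 45],
})

# A strong correlation is a lead for a hypothesis -- not proof of causation.
print(climate.corr())
```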

4. Inferential Analysis

Inferential analysis involves using a small sample of data to infer information about a larger population of data.

The goal of statistical modeling is to use a small amount of information to extrapolate and generalize to a larger group. Here’s what you need to know:

  • Inferential analysis involves using estimated data that is representative of a population and attaching a measure of uncertainty, such as a standard error, to your estimation.
  • The accuracy of inference depends heavily on your sampling scheme: if the sample isn’t representative of the population, the generalization will be inaccurate. (The central limit theorem describes how sample means distribute around the population mean; it underpins many inferential methods, but it can’t rescue a biased sample.)

Inferential Analysis Example

The idea of drawing an inference about a population at large from a smaller sample is intuitive. Many statistics you see in the media and on the internet are inferential: predictions about an event based on a small sample. For example, a psychological study on the benefits of sleep might involve 500 people in total. At follow-up, participants who slept seven to nine hours reported better overall attention spans and well-being, while those who slept less or more than that range reported reduced attention spans and energy. The 500 people in the study are a tiny portion of the 7 billion people in the world, so the study’s conclusion is an inference about the larger population.

Inferential analysis extrapolates and generalizes the information of the larger group with a smaller sample to generate analysis and predictions.
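
A minimal sketch of the underlying mechanics, with simulated sleep durations standing in for the real study data, is to compute a 95% confidence interval for the population mean:

```python
# Simulated sample of 500 nightly sleep durations -- not the real study data.
import numpy as np
from scipy import stats

sample = np.random.default_rng(0).normal(loc=7.5, scale=1.2, size=500)

mean = sample.mean()
sem = stats.sem(sample)  # standard error of the mean
low, high = stats.t.interval(0.95, df=len(sample) - 1, loc=mean, scale=sem)
print(f"sample mean = {mean:.2f} h, 95% CI = ({low:.2f}, {high:.2f})")
```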

5. Predictive Analysis

Predictive analysis involves using historical or current data to find patterns and make predictions about the future. Here’s what you need to know:

  • The accuracy of the predictions depends on the input variables.
  • Accuracy also depends on the types of models. A linear model might work well in some cases, and in other cases it might not.
  • Using a variable to predict another one doesn’t denote a causal relationship.

Predictive Analysis Example

The 2020 US election was a popular topic, and many prediction models were built to predict the winning candidate. FiveThirtyEight did this to forecast the 2016 and 2020 elections. Prediction analysis for an election requires input variables such as historical polling data, trends and current polling data in order to return a good prediction. Something as large as an election wouldn’t rely on just a linear model; it calls for a complex model tuned to best serve its purpose.

Predictive analysis takes data from the past and present to make predictions about the future.
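
As a toy illustration (real election models are far more elaborate), here is a simple linear trend fit that projects one period ahead:

```python
# Fit a linear trend to past values and project the next period -- toy data.
import numpy as np
from sklearn.linear_model import LinearRegression

X = np.arange(10).reshape(-1, 1)  # time index of past periods
y = 2.5 * X.ravel() + np.random.default_rng(1).normal(0, 1.0, 10)

model = LinearRegression().fit(X, y)
print("next-period prediction:", model.predict([[10]])[0])
```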

More on Data: Explaining the Empirical Rule for Normal Distribution

6. Causal Analysis

Causal analysis looks at cause-and-effect relationships between variables and is focused on finding the cause of a correlation. Here’s what you need to know:

  • To find the cause, you have to question whether the observed correlations driving your conclusion are valid. Just looking at the surface data won’t help you discover the hidden mechanisms underlying the correlations.
  • Causal analysis is applied in randomized studies focused on identifying causation.
  • Causal analysis is the gold standard in data analysis and scientific studies where the cause of a phenomenon must be extracted and singled out, like separating wheat from chaff.
  • Good data is hard to find and requires expensive research and studies. These studies are analyzed in aggregate (multiple groups), and the observed relationships are just average effects (mean) of the whole population. This means the results might not apply to everyone.

Causal Analysis Example  

Say you want to test out whether a new drug improves human strength and focus. To do that, you perform randomized control trials for the drug to test its effect. You compare the sample of candidates for your new drug against the candidates receiving a mock control drug through a few tests focused on strength and overall focus and attention. This will allow you to observe how the drug affects the outcome.

Causal analysis is about finding out the causal relationship between variables, and examining how a change in one variable affects another.
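
A hedged sketch of how such a randomized trial might be analyzed, using simulated scores and a two-sample t-test:

```python
# Simulated strength scores for randomized drug vs. placebo groups.
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
treatment = rng.normal(loc=75, scale=8, size=100)  # drug group
control = rng.normal(loc=70, scale=8, size=100)    # placebo group

# Randomization is what licenses a causal reading of the difference.
t_stat, p_value = stats.ttest_ind(treatment, control)
print(f"t = {t_stat:.2f}, p = {p_value:.4g}")
```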

7. Mechanistic Analysis

Mechanistic analysis is used to understand exact changes in variables that lead to other changes in other variables. Here’s what you need to know:

  • It’s applied in the physical and engineering sciences, in situations that require high precision and leave little room for error, where the only noise in the data is measurement error.
  • It’s designed to understand a biological or behavioral process, the pathophysiology of a disease or the mechanism of action of an intervention. 

Mechanistic Analysis Example

Much graduate-level research on complex topics would qualify, but to put it in simple terms, let’s say an experiment is done to simulate safe and effective nuclear fusion to power the world. A mechanistic analysis of the study would entail a precise balance of controlling and manipulating variables, with highly accurate measures of both the variables and the desired outcomes. It’s this intricate and meticulous approach to big questions that allows for scientific breakthroughs and the advancement of society.

Mechanistic analysis is in some ways a predictive analysis, but modified to tackle studies that require high precision and meticulous methodologies in the physical or engineering sciences.
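
As a deliberately simple stand-in for this kind of deterministic modeling, consider an exponential decay equation, where the change in one variable is exactly determined by another:

```python
# Toy mechanistic model: exponential decay, dy/dt = -k * y.
import numpy as np
from scipy.integrate import solve_ivp

def decay(t, y, k=0.3):
    return -k * y  # the rate of change is exactly determined by y

sol = solve_ivp(decay, t_span=(0, 10), y0=[100.0],
                t_eval=np.linspace(0, 10, 5))
print(sol.t)
print(sol.y[0])  # deviations from theory would be measurement error
```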

8. Prescriptive Analysis 

Prescriptive analysis compiles insights from other previous data analyses and determines actions that teams or companies can take to prepare for predicted trends. Here’s what you need to know: 

  • Prescriptive analysis may come right after predictive analysis, but it may involve combining many different data analyses. 
  • Companies need advanced technology and plenty of resources to conduct prescriptive analysis. AI systems that process data and adjust automated tasks are an example of the technology required to perform prescriptive analysis.  

Prescriptive Analysis Example

Prescriptive analysis is pervasive in everyday life, driving the curated content users consume on social media. On platforms like TikTok and Instagram, algorithms can apply prescriptive analysis to review past content a user has engaged with and the kinds of behaviors they exhibited with specific posts. Based on these factors, an algorithm seeks out similar content that is likely to elicit the same response and recommends it on a user’s personal feed. 

When to Use the Different Types of Data Analysis 

  • Descriptive analysis summarizes the data at hand and presents your data in a comprehensible way.
  • Diagnostic analysis takes a more detailed look at data to reveal why certain patterns occur, making it a good method for explaining anomalies. 
  • Exploratory data analysis helps you discover correlations and relationships between variables in your data.
  • Inferential analysis is for generalizing the larger population with a smaller sample size of data.
  • Predictive analysis helps you make predictions about the future with data.
  • Causal analysis emphasizes finding the cause of a correlation between variables.
  • Mechanistic analysis is for measuring the exact changes in variables that lead to other changes in other variables.
  • Prescriptive analysis combines insights from different data analyses to develop a course of action teams and companies can take to capitalize on predicted outcomes. 

A few important tips to remember about data analysis include:

  • Correlation doesn’t imply causation.
  • EDA helps discover new connections and form hypotheses.
  • Accuracy of inference depends on the sampling scheme.
  • A good prediction depends on the right input variables.
  • A simple linear model with enough data usually does the trick.
  • Using a variable to predict another doesn’t denote causal relationships.
  • Good data is hard to find, and to produce it requires expensive research.
  • Results from studies are analyzed in aggregate, so they reflect average effects and might not apply to everyone.


Data Analysis: Types, Methods & Techniques (a Complete List)


While the term sounds intimidating, “data analysis” is nothing more than making sense of information in a table. It consists of filtering, sorting, grouping, and manipulating data tables with basic algebra and statistics.

In fact, you don’t need experience to understand the basics. You have already worked with data extensively in your life, and “analysis” is nothing more than a fancy word for good sense and basic logic.

Over time, people have intuitively categorized the best logical practices for treating data. These categories are what we call today types, methods, and techniques.

This article provides a comprehensive list of types, methods, and techniques, and explains the difference between them.

For a practical intro to data analysis (including types, methods, & techniques), check out our Intro to Data Analysis eBook for free.

Descriptive, Diagnostic, Predictive, & Prescriptive Analysis

If you Google “types of data analysis,” the first few results will explore descriptive, diagnostic, predictive, and prescriptive analysis. Why? Because these names are easy to understand and are used a lot in “the real world.”

Descriptive analysis is an informational method, diagnostic analysis explains “why” a phenomenon occurs, predictive analysis seeks to forecast the result of an action, and prescriptive analysis identifies solutions to a specific problem.

That said, these are only four branches of a larger analytical tree.

Good data analysts know how to position these four types within other analytical methods and tactics, allowing them to leverage strengths and weaknesses in each to uproot the most valuable insights.

Let’s explore the full analytical tree to understand how to appropriately assess and apply these four traditional types.

Tree diagram of Data Analysis Types, Methods, and Techniques

The tree diagram visualizes the structure and hierarchy of data analysis types, methods, and techniques.

Note: basic descriptive statistics such as mean, median, and mode, as well as standard deviation, are not shown because most people are already familiar with them. In the diagram, they would fall under the “descriptive” analysis type.

Tree Diagram Explained

The highest-level classification of data analysis is quantitative vs. qualitative. Quantitative implies numbers while qualitative implies information other than numbers.

Quantitative data analysis then splits into mathematical analysis and artificial intelligence (AI) analysis. Mathematical types then branch into descriptive, diagnostic, predictive, and prescriptive.

Methods falling under mathematical analysis include clustering, classification, forecasting, and optimization. Qualitative data analysis methods include content analysis, narrative analysis, discourse analysis, framework analysis, and/or grounded theory.

Moreover, mathematical techniques include regression, Naïve Bayes, simple exponential smoothing, cohorts, factors, linear discriminants, and more, whereas techniques falling under the AI type include artificial neural networks, decision trees, evolutionary programming, and fuzzy logic. Techniques under qualitative analysis include text analysis, coding, idea pattern analysis, and word frequency.

It’s a lot to remember! Don’t worry, once you understand the relationship and motive behind all these terms, it’ll be like riding a bike.

We’ll move down the list from top to bottom. I encourage you to keep the tree diagram above in mind so you can follow along.

But first, let’s just address the elephant in the room: what’s the difference between methods and techniques anyway?

Difference between methods and techniques

Though often used interchangeably, methods and techniques are not the same. By definition, methods are the processes by which techniques are applied, and techniques are the practical applications of those methods.

For example, consider driving. Methods include staying in your lane, stopping at a red light, and parking in a spot. Techniques include turning the steering wheel, braking, and pushing the gas pedal.

Data sets: observations and fields

It’s important to understand the basic structure of data tables to comprehend the rest of the article. A data set consists of one far-left column containing observations, then a series of columns containing the fields (aka “traits” or “characteristics”) that describe each observation. For example, in a data table for fruit, each row would be one fruit (the observation) and the columns would hold fields like color, weight, and price that describe it.

Now let’s turn to types, methods, and techniques. Each heading below consists of a description, relative importance, the nature of data it explores, and the motivation for using it.

Quantitative Analysis

  • It accounts for more than 50% of all data analysis and is by far the most widespread and well-known type of data analysis.
  • As you have seen, it holds descriptive, diagnostic, predictive, and prescriptive methods, which in turn hold some of the most important techniques available today, such as clustering and forecasting.
  • It can be broken down into mathematical and AI analysis.
  • Importance: Very high. Quantitative analysis is a must for anyone interested in becoming or improving as a data analyst.
  • Nature of Data: data treated under quantitative analysis is, quite simply, quantitative. It encompasses all numeric data.
  • Motive: to extract insights. (Note: we’re at the top of the pyramid, this gets more insightful as we move down.)

Qualitative Analysis

  • It accounts for less than 30% of all data analysis and is common in social sciences .
  • It can refer to the simple recognition of qualitative elements, which is not analytic in any way, but most often refers to methods that assign numeric values to non-numeric data for analysis.
  • Because of this, some argue that it’s ultimately a quantitative type.
  • Importance: Medium. In general, knowing qualitative data analysis is not common or even necessary for corporate roles. However, for researchers working in social sciences, its importance is very high .
  • Nature of Data: data treated under qualitative analysis is non-numeric. However, as part of the analysis, analysts turn non-numeric data into numbers, at which point many argue it is no longer qualitative analysis.
  • Motive: to extract insights. (This will be more important as we move down the pyramid.)

Mathematical Analysis

  • Description: mathematical data analysis is a subtype of quantitative data analysis that designates methods and techniques based on statistics, algebra, and logical reasoning to extract insights. It stands in opposition to artificial intelligence analysis.
  • Importance: Very High. The most widespread methods and techniques fall under mathematical analysis. In fact, it’s so common that many people use “quantitative” and “mathematical” analysis interchangeably.
  • Nature of Data: numeric. By definition, all data under mathematical analysis are numbers.
  • Motive: to extract measurable insights that can be used to act upon.

Artificial Intelligence & Machine Learning Analysis

  • Description: artificial intelligence and machine learning analyses designate techniques based on the titular skills. They are not traditionally mathematical, but they are quantitative since they use numbers. Applications of AI & ML analysis techniques are still developing and are not yet mainstream, though they show promise across the field.
  • Importance: Medium. As of today (September 2020), you don’t need to be fluent in AI & ML data analysis to be a great analyst. BUT, if it’s a field that interests you, learn it. Many believe that in 10 years’ time its importance will be very high.
  • Nature of Data: numeric.
  • Motive: to create calculations that build on themselves in order to extract insights without direct input from a human.

Descriptive Analysis

  • Description: descriptive analysis is a subtype of mathematical data analysis that uses methods and techniques to provide information about the size, dispersion, groupings, and behavior of data sets. This may sound complicated, but just think about mean, median, and mode: all three are types of descriptive analysis. They provide information about the data set. We’ll look at specific techniques below.
  • Importance: Very high. Descriptive analysis is among the most commonly used data analyses in both corporations and research today.
  • Nature of Data: the nature of data under descriptive statistics is sets. A set is simply a collection of numbers that behaves in predictable ways. Data reflects real life, and there are patterns everywhere to be found. Descriptive analysis describes those patterns.
  • Motive: the motive behind descriptive analysis is to understand how numbers in a set group together, how far apart they are from each other, and how often they occur. As with most statistical analysis, the more data points there are, the easier it is to describe the set.

Diagnostic Analysis

  • Description: diagnostic analysis answers the question “why did it happen?” It is an advanced type of mathematical data analysis that manipulates multiple techniques, but does not own any single one. Analysts engage in diagnostic analysis when they try to explain why.
  • Importance: Very high. Diagnostics are probably the most important type of data analysis for people who don’t do analysis because they’re valuable to anyone who’s curious. They’re most common in corporations, as managers often only want to know the “why.”
  • Nature of Data : data under diagnostic analysis are data sets. These sets in themselves are not enough under diagnostic analysis. Instead, the analyst must know what’s behind the numbers in order to explain “why.” That’s what makes diagnostics so challenging yet so valuable.
  • Motive: the motive behind diagnostics is to diagnose — to understand why.

Predictive Analysis

  • Description: predictive analysis uses past data to project future data. It’s very often one of the first kinds of analysis new researchers and corporate analysts use because it is intuitive. It is a subtype of the mathematical type of data analysis, and its three notable techniques are regression, moving average, and exponential smoothing.
  • Importance: Very high. Predictive analysis is critical for any data analyst working in a corporate environment. Companies always want to know what the future will hold — especially for their revenue.
  • Nature of Data: Because past and future imply time, predictive data always includes an element of time. Whether it’s minutes, hours, days, months, or years, we call this time series data . In fact, this data is so important that I’ll mention it twice so you don’t forget: predictive analysis uses time series data .
  • Motive: the motive for investigating time series data with predictive analysis is to predict the future in the most analytical way possible.

Prescriptive Analysis

  • Description: prescriptive analysis is a subtype of mathematical analysis that answers the question “what will happen if we do X?” It’s largely underestimated in the data analysis world because it requires diagnostic and descriptive analyses to be done before it even starts. More than simple predictive analysis, prescriptive analysis builds entire data models to show how a simple change could impact the ensemble.
  • Importance: High. Prescriptive analysis is most common under the finance function in many companies. Financial analysts use it to build financial models of the financial statements that show how results will change given alternative inputs.
  • Nature of Data: the nature of data in prescriptive analysis is data sets. These data sets contain patterns that respond differently to various inputs. Data that is useful for prescriptive analysis contains correlations between different variables. It’s through these correlations that we establish patterns and prescribe action on this basis. This analysis cannot be performed on data that exists in a vacuum — it must be viewed on the backdrop of the tangibles behind it.
  • Motive: the motive for prescriptive analysis is to establish, with an acceptable degree of certainty, what results we can expect given a certain action. As you might expect, this necessitates that the analyst or researcher be aware of the world behind the data, not just the data itself.

Clustering Method

  • Description: the clustering method groups data points together based on their relative closeness to further explore and treat them based on these groupings. There are two ways to group clusters: intuitively and statistically (e.g., with k-means).
  • Importance: Very high. Though most corporate roles group clusters intuitively based on management criteria, a solid understanding of how to group them mathematically is an excellent descriptive and diagnostic approach to allow for prescriptive analysis thereafter.
  • Nature of Data : the nature of data useful for clustering is sets with 1 or more data fields. While most people are used to looking at only two dimensions (x and y), clustering becomes more accurate the more fields there are.
  • Motive: the motive for clustering is to understand how data sets group and to explore them further based on those groups.

Classification Method

  • Description: the classification method aims to separate and group data points based on common characteristics. This can be done intuitively or statistically.
  • Importance: High. While simple on the surface, classification can become quite complex. It’s very valuable in corporate and research environments, but can feel like it’s not worth the work. A good analyst can execute it quickly to deliver results.
  • Nature of Data: the nature of data useful for classification is data sets. As we will see, it can be used on qualitative data as well as quantitative. This method requires knowledge of the substance behind the data, not just the numbers themselves.
  • Motive: the motive for classification is to group data not by mathematical relationships (which would be clustering) but by predetermined outputs. This is why it’s less useful for diagnostic analysis and more useful for prescriptive analysis.

Forecasting Method

  • Description: the forecasting method uses past time series data to forecast the future.
  • Importance: Very high. Forecasting falls under predictive analysis and is arguably the most common and most important method in the corporate world. It is less useful in research, which prefers to understand the known rather than speculate about the future.
  • Nature of Data: data useful for forecasting is time series data, which, as we’ve noted, always includes a variable of time.
  • Motive: the motive for the forecasting method is the same as that of predictive analysis: to confidently estimate future values.

Optimization Method

  • Description: the optimization method maximizes or minimizes values in a set given certain criteria. It is arguably most common in prescriptive analysis. In mathematical terms, it is maximizing or minimizing a function given certain constraints.
  • Importance: Very high. The idea of optimization applies to more analysis types than any other method. In fact, some argue that it is the fundamental driver behind data analysis. You would use it everywhere in research and in a corporation.
  • Nature of Data: the nature of optimizable data is a data set of at least two points.
  • Motive: the motive behind optimization is to achieve the best result possible given certain conditions.
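
A minimal sketch of the idea using linear programming, with invented profit and resource numbers (scipy’s linprog minimizes, so we negate the objective to maximize):

```python
# Maximize profit 3x + 5y subject to invented resource limits.
from scipy.optimize import linprog

result = linprog(
    c=[-3, -5],                     # negate the objective to maximize
    A_ub=[[1, 0], [0, 2], [3, 2]],  # resource usage per unit of x and y
    b_ub=[4, 12, 18],               # available resources
    bounds=[(0, None), (0, None)],  # x, y >= 0
)
print("optimal x, y:", result.x, "| max profit:", -result.fun)
```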

Content Analysis Method

  • Description: content analysis is a method of qualitative analysis that quantifies textual data to track themes across a document. It’s most common in academic fields and in social sciences, where written content is the subject of inquiry.
  • Importance: High. In a corporate setting, content analysis as such is less common. If anything, Naïve Bayes (a technique we’ll look at below) is the closest corporations come to analyzing text. However, it is of the utmost importance for researchers. If you’re a researcher, check out this article on content analysis.
  • Nature of Data: data useful for content analysis is textual data.
  • Motive: the motive behind content analysis is to understand the themes expressed in a large text.

Narrative Analysis Method

  • Description: narrative analysis is a method of qualitative analysis that quantifies stories to trace themes in them. It differs from content analysis because it focuses on stories rather than research documents, and the techniques used are slightly different from those in content analysis (very nuanced and outside the scope of this article).
  • Importance: Low. Unless you are highly specialized in working with stories, narrative analysis is rare.
  • Nature of Data: the nature of the data useful for the narrative analysis method is narrative text.
  • Motive: the motive for narrative analysis is to uncover hidden patterns in narrative text.

Discourse Analysis Method

  • Description: the discourse analysis method falls under qualitative analysis and uses thematic coding to trace patterns in real-life discourse. That said, real-life discourse is oral, so it must first be transcribed into text.
  • Importance: Low. Unless you are focused on understanding real-world idea sharing in a research setting, this kind of analysis is less common than the others on this list.
  • Nature of Data: the nature of data useful in discourse analysis is first audio files, then transcriptions of those audio files.
  • Motive: the motive behind discourse analysis is to trace patterns of real-world discussions. (As a spooky sidenote, have you ever felt like your phone microphone was listening to you and making reading suggestions? If it was, the method was discourse analysis.)

Framework Analysis Method

  • Description: the framework analysis method falls under qualitative analysis and uses similar thematic coding techniques to content analysis. However, where content analysis aims to discover themes, framework analysis starts with a framework and only considers elements that fall in its purview.
  • Importance: Low. As with the other textual analysis methods, framework analysis is less common in corporate settings. Even in the world of research, only some use it. Strangely, it’s very common for legislative and political research.
  • Nature of Data: the nature of data useful for framework analysis is textual.
  • Motive: the motive behind framework analysis is to understand what themes and parts of a text match your search criteria.

Grounded Theory Method

  • Description: the grounded theory method falls under qualitative analysis and uses thematic coding to build theories around those themes.
  • Importance: Low. Like other qualitative analysis techniques, grounded theory is less common in the corporate world. Even among researchers, you would be hard pressed to find many using it. Though powerful, it’s simply too rare to spend time learning.
  • Nature of Data: the nature of data useful in the grounded theory method is textual.
  • Motive: the motive of grounded theory method is to establish a series of theories based on themes uncovered from a text.

Clustering Technique: K-Means

  • Description: k-means is a clustering technique in which data points are grouped into clusters with the closest means. Though not considered AI or ML, it is an unsupervised learning technique that reevaluates clusters as data points are added. Clustering techniques can be used in diagnostic, descriptive, & prescriptive data analyses.
  • Importance: Very important. If you only take 3 things from this article, k-means clustering should be part of it. It is useful in any situation where n observations have multiple characteristics and we want to put them in groups.
  • Nature of Data: the nature of data is at least one characteristic per observation, but the more the merrier.
  • Motive: the motive for clustering techniques such as k-means is to group observations together and either understand or react to them.
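
Here is a short k-means sketch with scikit-learn on a made-up two-field data set:

```python
# Group six observations with two fields each into two clusters.
import numpy as np
from sklearn.cluster import KMeans

points = np.array([[1, 2], [1.5, 1.8], [1, 0.6], [5, 8], [8, 8], [9, 11]])

kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(points)
print("labels:", kmeans.labels_)            # cluster assignment per observation
print("centers:", kmeans.cluster_centers_)  # the cluster means
```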

Regression Technique

  • Description: simple and multivariable regressions use either one independent variable or a combination of multiple independent variables to estimate the relationship with a single dependent variable using fitted constants. Regressions are almost synonymous with correlation today.
  • Importance: Very high. Along with clustering, if you only take 3 things from this article, regression techniques should be part of it. They’re everywhere in corporate and research fields alike.
  • Nature of Data: the nature of data used in regressions is data sets with “n” observations and as many variables as are reasonable. It’s important, however, to distinguish between time series data and regression data. You cannot use regressions on time series data without accounting for time. The easier way is to use techniques under the forecasting method.
  • Motive: The motive behind regression techniques is to understand correlations between independent variable(s) and a dependent one.

Naïve Bayes Technique

  • Description: Naïve Bayes is a classification technique that uses simple probability to classify items based on previous classifications. In plain English, the formula says “the chance that a thing with trait x belongs to class c equals the chance of observing trait x given class c, multiplied by the overall chance of class c, divided by the overall chance of trait x.” As a formula, it’s P(c|x) = P(x|c) * P(c) / P(x).
  • Importance: High. Naïve Bayes is a very common, simple classification technique because it’s effective with large data sets and can be applied to any instance in which there is a class. Google, for example, might use it to group webpages for certain search engine queries.
  • Nature of Data: the nature of data for Naïve Bayes is at least one class and at least two traits in a data set.
  • Motive: the motive behind Naïve Bayes is to classify observations based on previous data. It’s thus considered part of predictive analysis.
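
A hedged sketch of the idea with scikit-learn’s Gaussian variant, on invented traits and classes:

```python
# Classify a new observation from previously classified ones -- toy data.
from sklearn.naive_bayes import GaussianNB

X = [[180, 80], [175, 75], [160, 55], [155, 50]]  # traits (two fields)
y = ["class_a", "class_a", "class_b", "class_b"]  # known classes

clf = GaussianNB().fit(X, y)
print(clf.predict([[170, 70]]))  # P(c|x) drives the predicted class
```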

Cohorts Technique

  • Description: cohorts technique is a type of clustering method used in behavioral sciences to separate users by common traits. As with clustering, it can be done intuitively or mathematically, the latter of which would simply be k-means.
  • Importance: Very high. While it resembles k-means, the cohort technique is more of a high-level counterpart. In fact, most people are familiar with it as a part of Google Analytics. It’s most common in marketing departments in corporations, rather than in research.
  • Nature of Data: the nature of cohort data is data sets in which users are the observations and other fields are used as defining traits for each cohort.
  • Motive: the motive for cohort analysis techniques is to group similar users and analyze how you retain them and how they churn.

Factor Technique

  • Description: the factor analysis technique is a way of grouping many traits into a single factor to expedite analysis. For example, factors can be used as traits for Naïve Bayes classifications instead of more general fields.
  • Importance: High. While not commonly employed in corporations, factor analysis is hugely valuable. Good data analysts use it to simplify their projects and communicate them more clearly.
  • Nature of Data: the nature of data useful in factor analysis techniques is data sets with a large number of fields on their observations.
  • Motive: the motive for using factor analysis techniques is to reduce the number of fields in order to more quickly analyze and communicate findings.

Linear Discriminants Technique

  • Description: linear discriminant analysis techniques are similar to regressions in that they use one or more independent variables to determine a dependent variable; however, the linear discriminant technique falls under a classifier method, since it uses traits as independent variables and class as a dependent variable. In this way, it is both a classifying method AND a predictive method.
  • Importance: High. Though the analyst world speaks of and uses linear discriminants less commonly, it’s a highly valuable technique to keep in mind as you progress in data analysis.
  • Nature of Data: the nature of data useful for the linear discriminant technique is data sets with many fields.
  • Motive: the motive for using linear discriminants is to classify observations that would otherwise be too complex for simple techniques like Naïve Bayes.

Exponential Smoothing Technique

  • Description: exponential smoothing is a technique falling under the forecasting method that uses a smoothing factor on prior data in order to predict future values. It can be linear or adjusted for seasonality. The basic principle behind exponential smoothing is to use a percent weight (a value between 0 and 1 called alpha) on the most recent value in a series and a smaller percent weight on less recent values. The formula is s_t = alpha * x_t + (1 - alpha) * s_(t-1), where x_t is the current period’s value and s_(t-1) is the previous period’s smoothed value.
  • Importance: High. Most analysts still use the moving average technique (covered next) for forecasting because it’s easy to understand, though it is less efficient than exponential smoothing. However, good analysts will have exponential smoothing techniques in their pocket to increase the value of their forecasts.
  • Nature of Data: the nature of data useful for exponential smoothing is time series data. Time series data has time as part of its fields.
  • Motive: the motive for exponential smoothing is to forecast future values with a smoothing variable.
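
The recursion is short enough to implement by hand; here is a minimal sketch:

```python
# Simple exponential smoothing: s_t = alpha * x_t + (1 - alpha) * s_(t-1).
def exponential_smoothing(series, alpha=0.5):
    smoothed = [series[0]]  # seed with the first observation
    for value in series[1:]:
        smoothed.append(alpha * value + (1 - alpha) * smoothed[-1])
    return smoothed

print(exponential_smoothing([3, 10, 12, 13, 12, 10, 12], alpha=0.5))
```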

Moving Average Technique

  • Description: the moving average technique falls under the forecasting method and uses an average of recent values to predict future ones. For example, to predict rainfall in April, you would take the average of rainfall from January to March. It’s simple, yet highly effective.
  • Importance: Very high. While I’m personally not a huge fan of moving averages due to their simplistic nature and lack of consideration for seasonality, they’re the most common forecasting technique and therefore very important.
  • Nature of Data: the nature of data useful for moving averages is time series data .
  • Motive: the motive for moving averages is to predict future values in a simple, easy-to-communicate way.
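
In pandas, the rainfall example above reduces to a one-liner over time series data (the numbers are invented):

```python
# Forecast April rainfall as the average of the last three months -- toy data.
import pandas as pd

rainfall = pd.Series([30, 45, 60], index=["Jan", "Feb", "Mar"])  # mm
april_forecast = rainfall.rolling(window=3).mean().iloc[-1]
print("April forecast (mm):", april_forecast)  # (30 + 45 + 60) / 3 = 45.0
```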

Neural Networks Technique

  • Description: neural networks are a highly complex artificial intelligence technique that replicate a human’s neural analysis through a series of hyper-rapid computations and comparisons that evolve in real time. This technique is so complex that an analyst must use computer programs to perform it.
  • Importance: Medium. While the potential for neural networks is theoretically unlimited, it’s still little understood and therefore uncommon. You do not need to know it by any means in order to be a data analyst.
  • Nature of Data: the nature of data useful for neural networks is data sets of astronomical size, meaning hundreds of thousands of fields and the same number of rows at a minimum.
  • Motive: the motive for neural networks is to understand wildly complex phenomena and data in order to act on them.

Decision Tree Technique

  • Description: the decision tree technique uses artificial intelligence algorithms to rapidly calculate possible decision pathways and their outcomes on a real-time basis. It’s so complex that computer programs are needed to perform it.
  • Importance: Medium. As with neural networks, decision trees with AI are too little understood and are therefore uncommon in corporate and research settings alike.
  • Nature of Data: the nature of data useful for the decision tree technique is hierarchical data sets that show multiple optional fields for each preceding field.
  • Motive: the motive for decision tree techniques is to compute the optimal choices to make in order to achieve a desired result.

Evolutionary Programming Technique

  • Description: the evolutionary programming technique uses a series of neural networks, sees how well each one fits a desired outcome, and selects only the best to test and retest. It’s called evolutionary because it resembles the process of natural selection, weeding out weaker options.
  • Importance: Medium. As with the other AI techniques, evolutionary programming just isn’t well-understood enough to be usable in many cases. Its complexity also makes it hard to explain in corporate settings and difficult to defend in research settings.
  • Nature of Data: the nature of data in evolutionary programming is data sets of neural networks, or data sets of data sets.
  • Motive: the motive for using evolutionary programming is similar to decision trees: understanding the best possible option from complex data.

Fuzzy Logic Technique

  • Description: fuzzy logic is a type of computing based on “approximate truths” rather than simple truths such as “true” and “false.” It is essentially two tiers of classification. For example, to say whether “apples are good,” you first need to classify what “good” means (x, y, z). Only then can you say apples are good. Another way to see it is as helping a computer evaluate truth the way humans do: “definitely true, probably true, maybe true, probably false, definitely false.”
  • Importance: Medium. Like the other AI techniques, fuzzy logic is uncommon in both research and corporate settings, which means it’s less important in today’s world.
  • Nature of Data: the nature of fuzzy logic data is huge data tables that include other huge data tables with a hierarchy including multiple subfields for each preceding field.
  • Motive: the motive of fuzzy logic is to replicate human truth valuations in a computer in order to model human decisions based on past data. The obvious possible application is marketing.

Text Analysis Technique

  • Description: text analysis techniques fall under the qualitative data analysis type and use text to extract insights.
  • Importance: Medium. Text analysis techniques, like all techniques of the qualitative analysis type, are most valuable for researchers.
  • Nature of Data: the nature of data useful in text analysis is words.
  • Motive: the motive for text analysis is to trace themes in a text across sets of very long documents, such as books.

Coding Technique

  • Description: the coding technique is used in textual analysis to turn ideas into uniform phrases and analyze the number of times and the ways in which those ideas appear. For this reason, some consider it a quantitative technique as well. You can learn more about coding and the other qualitative techniques here .
  • Importance: Very high. If you’re a researcher working in social sciences, coding is THE analysis technique, and for good reason. It’s a great way to add rigor to analysis. That said, it’s less common in corporate settings.
  • Nature of Data: the nature of data useful for coding is long text documents.
  • Motive: the motive for coding is to make tracing ideas on paper more than an exercise of the mind, by quantifying it and understanding it through descriptive methods.

Idea Pattern Technique

  • Description: the idea pattern analysis technique fits into coding as the second step of the process. Once themes and ideas are coded, simple descriptive analysis tests may be run. Some people even cluster the ideas!
  • Importance: Very high. If you’re a researcher, idea pattern analysis is as important as the coding itself.
  • Nature of Data: the nature of data useful for idea pattern analysis is already coded themes.
  • Motive: the motive for the idea pattern technique is to trace ideas in otherwise unmanageably large documents.

Word Frequency Technique

  • Description: word frequency is a qualitative technique that stands in opposition to coding and uses an inductive approach to locate specific words in a document in order to understand its relevance. Word frequency is essentially the descriptive analysis of qualitative data because it uses stats like mean, median, and mode to gather insights.
  • Importance: High. As with the other qualitative approaches, word frequency is very important in social science research, but less so in corporate settings.
  • Nature of Data: the nature of data useful for word frequency is long, informative documents.
  • Motive: the motive for word frequency is to locate target words to determine the relevance of a document in question.
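
A tiny sketch of the technique with Python’s standard library, on a sample sentence:

```python
# Count word frequencies in a (very short) sample passage.
from collections import Counter
import re

text = "Data analysis turns raw data into insight; data drives decisions."
words = re.findall(r"[a-z]+", text.lower())
print(Counter(words).most_common(3))  # e.g., [('data', 3), ...]
```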

Types of data analysis in research

Types of data analysis in research methodology include every item discussed in this article. As a list, they are:

  • Quantitative
  • Qualitative
  • Mathematical
  • Machine Learning and AI
  • Descriptive
  • Prescriptive
  • Classification
  • Forecasting
  • Optimization
  • Grounded theory
  • Artificial Neural Networks
  • Decision Trees
  • Evolutionary Programming
  • Fuzzy Logic
  • Text analysis
  • Idea Pattern Analysis
  • Word Frequency Analysis
  • Naïve Bayes
  • Exponential smoothing
  • Moving average
  • Linear discriminant

Types of data analysis in qualitative research

As a list, the types of data analysis in qualitative research are the following methods:

  • Content analysis
  • Narrative analysis
  • Discourse analysis
  • Framework analysis
  • Grounded theory

Types of data analysis in quantitative research

As a list, the types of data analysis in quantitative research are:

  • Mathematical analysis (descriptive, diagnostic, predictive, and prescriptive)
  • Artificial intelligence & machine learning analysis

Data analysis methods

As a list, data analysis methods are:

  • Clustering (quantitative)
  • Classification (quantitative)
  • Forecasting (quantitative)
  • Optimization (quantitative)
  • Content (qualitative)
  • Narrative (qualitative)
  • Discourse (qualitative)
  • Framework (qualitative)
  • Grounded theory (qualitative)

Quantitative data analysis methods

As a list, quantitative data analysis methods are:

  • Clustering
  • Classification
  • Forecasting
  • Optimization



Your Modern Business Guide To Data Analysis Methods And Techniques


Table of Contents

1) What Is Data Analysis?

2) Why Is Data Analysis Important?

3) What Is The Data Analysis Process?

4) Types Of Data Analysis Methods

5) Top Data Analysis Techniques To Apply

6) Quality Criteria For Data Analysis

7) Data Analysis Limitations & Barriers

8) Data Analysis Skills

9) Data Analysis In The Big Data Environment

In our data-rich age, understanding how to analyze and extract true meaning from our business’s digital insights is one of the primary drivers of success.

Despite the colossal volume of data we create every day, a mere 0.5% is actually analyzed and used for data discovery, improvement, and intelligence. While that may not seem like much, considering the amount of digital information we have at our fingertips, half a percent still accounts for a vast amount of data.

With so much data and so little time, knowing how to collect, curate, organize, and make sense of all of this potentially business-boosting information can be a minefield – but online data analysis is the solution.

In science, data analysis uses a more complex approach with advanced techniques to explore and experiment with data. On the other hand, in a business context, data is used to make data-driven decisions that will enable the company to improve its overall performance. In this post, we will cover the analysis of data from an organizational point of view while still going through the scientific and statistical foundations that are fundamental to understanding the basics of data analysis. 

To put all of that into perspective, we will answer a host of important analytical questions, explore analytical methods and techniques, while demonstrating how to perform analysis in the real world with a 17-step blueprint for success.

What Is Data Analysis?

Data analysis is the process of collecting, modeling, and analyzing data using various statistical and logical methods and techniques. Businesses rely on analytics processes and tools to extract insights that support strategic and operational decision-making.

All these various methods are largely based on two core areas: quantitative and qualitative research.


Gaining a better understanding of different techniques and methods in quantitative research as well as qualitative insights will give your analyzing efforts a more clearly defined direction, so it’s worth taking the time to allow this particular knowledge to sink in. Additionally, you will be able to create a comprehensive analytical report that will skyrocket your analysis.

Apart from qualitative and quantitative categories, there are also other types of data that you should be aware of before diving into complex data analysis processes. These categories include: 

  • Big data: Refers to massive data sets that need to be analyzed using advanced software to reveal patterns and trends. It is considered to be one of the best analytical assets as it provides larger volumes of data at a faster rate. 
  • Metadata: Putting it simply, metadata is data that provides insights about other data. It summarizes key information about specific data that makes it easier to find and reuse for later purposes. 
  • Real time data: As its name suggests, real time data is presented as soon as it is acquired. From an organizational perspective, this is the most valuable data as it can help you make important decisions based on the latest developments. Our guide on real time analytics will tell you more about the topic. 
  • Machine data: This is more complex data that is generated solely by machines such as phones, computers, or even websites and embedded systems, without previous human interaction.

Why Is Data Analysis Important?

Before we go into detail about the categories of analysis along with its methods and techniques, you must understand the potential that analyzing data can bring to your organization.

  • Informed decision-making : From a management perspective, you can benefit from analyzing your data as it helps you make decisions based on facts and not simple intuition. For instance, you can understand where to invest your capital, detect growth opportunities, predict your income, or tackle uncommon situations before they become problems. Through this, you can extract relevant insights from all areas in your organization, and with the help of dashboard software , present the data in a professional and interactive way to different stakeholders.
  • Reduce costs : Another great benefit is to reduce costs. With the help of advanced technologies such as predictive analytics, businesses can spot improvement opportunities, trends, and patterns in their data and plan their strategies accordingly. In time, this will help you save money and resources on implementing the wrong strategies. And not just that, by predicting different scenarios such as sales and demand you can also anticipate production and supply. 
  • Target customers better : Customers are arguably the most crucial element in any business. By using analytics to get a 360° vision of all aspects related to your customers, you can understand which channels they use to communicate with you, their demographics, interests, habits, purchasing behaviors, and more. In the long run, it will drive success to your marketing strategies, allow you to identify new potential customers, and avoid wasting resources on targeting the wrong people or sending the wrong message. You can also track customer satisfaction by analyzing your client’s reviews or your customer service department’s performance.

What Is The Data Analysis Process?

Data analysis process graphic

When we talk about analyzing data, there is an order to follow to extract the needed conclusions. The analysis process consists of 5 key stages. We will cover each of them in more detail later in the post, but to provide the context needed to understand what is coming next, here is a rundown of the 5 essential steps of data analysis. 

  • Identify: Before you get your hands dirty with data, you first need to identify why you need it in the first place. The identification is the stage in which you establish the questions you will need to answer. For example, what is the customer's perception of our brand? Or what type of packaging is more engaging to our potential customers? Once the questions are outlined you are ready for the next step. 
  • Collect: As its name suggests, this is the stage where you start collecting the needed data. Here, you define which sources of data you will use and how you will use them. The collection of data can come in different forms such as internal or external sources, surveys, interviews, questionnaires, and focus groups, among others.  An important note here is that the way you collect the data will be different in a quantitative and qualitative scenario. 
  • Clean: Once you have the necessary data, it is time to clean it and leave it ready for analysis. Not all the data you collect will be useful; when collecting big amounts of data in different formats, it is very likely that you will find yourself with duplicate or badly formatted data. To avoid this, before you start working with your data, make sure to erase any white spaces, duplicate records, or formatting errors. This way you avoid hurting your analysis with bad-quality data (see the short sketch after this list). 
  • Analyze : With the help of various techniques such as statistical analysis, regressions, neural networks, text analysis, and more, you can start analyzing and manipulating your data to extract relevant conclusions. At this stage, you find trends, correlations, variations, and patterns that can help you answer the questions you first thought of in the identify stage. Various technologies in the market assist researchers and average users with the management of their data. Some of them include business intelligence and visualization software, predictive analytics, and data mining, among others. 
  • Interpret: Last but not least you have one of the most important steps: it is time to interpret your results. This stage is where the researcher comes up with courses of action based on the findings. For example, here you would understand if your clients prefer packaging that is red or green, plastic or paper, etc. Additionally, at this stage, you can also find some limitations and work on them. 
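
As promised in the cleaning step above, here is a hedged sketch of basic cleaning with pandas; the column names and values are invented for illustration:

```python
# Strip whitespace, fix badly formatted numbers, and drop duplicate records.
import pandas as pd

raw = pd.DataFrame({"name": [" Ana", "Ana ", "Bo"],
                    "spend": ["10", "10", "7"]})

clean = (
    raw.assign(name=raw["name"].str.strip(),       # remove stray white spaces
               spend=pd.to_numeric(raw["spend"]))  # coerce text to numbers
       .drop_duplicates()                          # remove duplicate records
)
print(clean)
```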

Now that you have a basic understanding of the key data analysis steps, let’s look at the top 17 essential methods.

17 Essential Types Of Data Analysis Methods

Before diving into the 17 essential types of methods, it is important to quickly go over the main analysis categories. From descriptive up to prescriptive analysis, the complexity and effort of data evaluation increase, but so does the added value for the company.

a) Descriptive analysis - What happened.

The descriptive analysis method is the starting point for any analytic reflection, and it aims to answer the question: what happened? It does this by ordering, manipulating, and interpreting raw data from various sources to turn it into valuable insights for your organization.

Performing descriptive analysis is essential, as it enables us to present our insights in a meaningful way. That said, this analysis on its own will not allow you to predict future outcomes or answer questions like why something happened; what it will do is leave your data organized and ready for further investigation.

b) Exploratory analysis - How to explore data relationships.

As its name suggests, the main aim of exploratory analysis is to explore. Before it is performed, there is still no settled notion of the relationships between the data and the variables. Once the data is investigated, exploratory analysis helps you find connections and generate hypotheses and solutions for specific problems. A typical area of application for it is data mining.

c) Diagnostic analysis - Why it happened.

Diagnostic data analytics empowers analysts and executives by helping them gain a firm contextual understanding of why something happened. If you know why something happened as well as how it happened, you will be able to pinpoint the exact ways of tackling the issue or challenge.

Designed to provide direct and actionable answers to specific questions, this is one of the most important methods in research, and it also serves key organizational functions in areas such as retail analytics.

d) Predictive analysis - What will happen.

The predictive method allows you to look into the future to answer the question: what will happen? In order to do this, it uses the results of the previously mentioned descriptive, exploratory, and diagnostic analyses, in addition to machine learning (ML) and artificial intelligence (AI). Through this, you can uncover future trends, potential problems or inefficiencies, connections, and causalities in your data.

With predictive analysis, you can unfold and develop initiatives that will not only enhance your various operational processes but also help you gain an all-important edge over the competition. If you understand why a trend, pattern, or event happened through data, you will be able to develop an informed projection of how things may unfold in particular areas of the business.

e) Prescriptive analysis - What should be done.

This is another of the most effective types of analysis methods in research. Prescriptive data techniques build on predictive analysis in that they revolve around using patterns or trends to develop responsive, practical business strategies.

By drilling down into prescriptive analysis, you will play an active role in the data consumption process, taking well-arranged sets of visual data and using them as a powerful fix for emerging issues in a number of key areas, including marketing, sales, customer experience, HR, fulfillment, finance, logistics analytics, and others.

Top 17 data analysis methods

As mentioned at the beginning of the post, data analysis methods can be divided into two big categories: quantitative and qualitative. Each of these categories holds a powerful analytical value that changes depending on the scenario and type of data you are working with. Below, we will discuss 17 methods that are divided into qualitative and quantitative approaches. 

Without further ado, here are the 17 essential types of data analysis methods with some use cases in the business world: 

A. Quantitative Methods 

To put it simply, quantitative analysis refers to all methods that use numerical data, or data that can be turned into numbers (e.g. category variables like gender, age, etc.), to extract valuable insights. It is used to draw conclusions about relationships and differences and to test hypotheses. Below we discuss some of the key quantitative methods.

1. Cluster analysis

Cluster analysis is the action of grouping a set of data elements in such a way that said elements are more similar (in a particular sense) to each other than to those in other groups – hence the term ‘cluster.’ Since there is no target variable when clustering, the method is often used to find hidden patterns in the data. The approach is also used to provide additional context to a trend or dataset.

Let's look at it from an organizational perspective. In a perfect world, marketers would be able to analyze each customer separately and give them the best personalized service, but let's face it, with a large customer base, it is practically impossible to do that. That's where clustering comes in. By grouping customers into clusters based on demographics, purchasing behaviors, monetary value, or any other factor that might be relevant for your company, you will be able to immediately optimize your efforts and give your customers the best experience based on their needs.
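
To make this more tangible, here is a minimal sketch of customer clustering using scikit-learn's KMeans. The column names, the figures, and the choice of three clusters are illustrative assumptions, not a recipe:

```python
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

# Hypothetical customer data: annual spend and number of orders per customer
customers = pd.DataFrame({
    "annual_spend": [120, 950, 870, 60, 1100, 300, 45, 980],
    "orders_per_year": [2, 14, 11, 1, 18, 5, 1, 13],
})

# Scale features so neither dominates the distance calculation
scaled = StandardScaler().fit_transform(customers)

# Group customers into three clusters (the number is an illustrative choice)
kmeans = KMeans(n_clusters=3, n_init=10, random_state=42)
customers["cluster"] = kmeans.fit_predict(scaled)

print(customers.sort_values("cluster"))
```

Each resulting cluster could then be treated as a segment with its own messaging or service level.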

2. Cohort analysis

This type of data analysis approach uses historical data to examine and compare the behavior of a given segment of users, who can then be grouped with others that share similar characteristics. By using this methodology, it's possible to gain a wealth of insight into consumer needs or a firm understanding of a broader target group.

Cohort analysis can be really useful for performing analysis in marketing as it will allow you to understand the impact of your campaigns on specific groups of customers. For example, imagine you send an email campaign encouraging customers to sign up for your site. For this, you create two versions of the campaign with different designs, CTAs, and ad content. Later on, you can use cohort analysis to track the performance of the campaign over a longer period of time and understand which type of content is driving your customers to sign up, repurchase, or engage in other ways.
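
Below is a minimal sketch of how a simple retention cohort can be computed with pandas. The event log, column names, and monthly granularity are all illustrative assumptions:

```python
import pandas as pd

# Hypothetical event log: one row per month in which a user was active
events = pd.DataFrame({
    "user_id":      [1, 1, 2, 2, 3, 3, 3],
    "signup_month": ["2023-01", "2023-01", "2023-02", "2023-02",
                     "2023-01", "2023-01", "2023-01"],
    "active_month": ["2023-01", "2023-02", "2023-02", "2023-03",
                     "2023-01", "2023-02", "2023-03"],
})

# Months elapsed between signup and each activity
signup = pd.PeriodIndex(events["signup_month"], freq="M")
active = pd.PeriodIndex(events["active_month"], freq="M")
events["period"] = [(a - s).n for a, s in zip(active, signup)]

# Distinct active users per cohort and period, turned into retention rates
cohort = (events.groupby(["signup_month", "period"])["user_id"]
                .nunique().unstack(fill_value=0))
retention = cohort.div(cohort[0], axis=0)  # rates relative to cohort size
print(retention)
```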

A useful tool for getting started with cohort analysis is Google Analytics. You can learn more about the benefits and limitations of using cohorts in GA in this useful guide. In the image below, you can see an example of how a cohort is visualized in this tool. The segments (device traffic) are divided into date cohorts (usage of devices) and then analyzed week by week to extract insights into performance.

Cohort analysis chart example from Google Analytics

3. Regression analysis

Regression uses historical data to understand how a dependent variable's value is affected when one (linear regression) or more independent variables (multiple regression) change or stay the same. By understanding each variable's relationship and how it developed in the past, you can anticipate possible outcomes and make better decisions in the future.

Let's break it down with an example. Imagine you did a regression analysis of your sales in 2019 and discovered that variables like product quality, store design, customer service, marketing campaigns, and sales channels affected the overall result. Now you want to use regression to analyze which of these variables changed or whether any new ones appeared during 2020. For example, you couldn’t sell as much in your physical store due to COVID lockdowns. Therefore, your sales could’ve either dropped in general or increased in your online channels. Through this, you can understand which independent variables affected the overall performance of your dependent variable, annual sales.
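
As an illustration, here is a minimal multiple regression sketch using scikit-learn; the variables and figures are invented for demonstration purposes only:

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Hypothetical monthly data: marketing spend (k$) and store visits
X = np.array([
    [10, 500], [12, 520], [15, 580], [9, 460],
    [20, 700], [18, 650], [14, 560], [22, 720],
])
y = np.array([100, 110, 130, 90, 180, 160, 125, 195])  # sales (k$)

model = LinearRegression().fit(X, y)

# Coefficients show how sales respond to each independent variable
print("coefficients:", model.coef_, "intercept:", model.intercept_)

# Predict sales for a planned month with 16k$ spend and 600 expected visits
print("forecast:", model.predict([[16, 600]]))
```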

If you want to go deeper into this type of analysis, check out this article and learn more about how you can benefit from regression.

4. Neural networks

Neural networks form the basis for the intelligent algorithms of machine learning. They are a form of analytics that attempts, with minimal intervention, to understand how the human brain would generate insights and predict values. Neural networks learn from each and every data transaction, meaning that they evolve and advance over time.
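
To give a feel for the mechanics, here is a small, self-contained sketch that trains a basic feed-forward network on a toy revenue series using scikit-learn's MLPRegressor. The sliding-window scheme, architecture, and figures are illustrative assumptions, not a reference implementation:

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

# Hypothetical series: learn to predict next month's revenue from the last three
revenue = np.array([50, 53, 58, 56, 61, 65, 63, 70, 74, 72, 79, 83], dtype=float)
X = np.array([revenue[i:i + 3] for i in range(len(revenue) - 3)])  # windows
y = revenue[3:]  # the value that follows each window

# A small feed-forward network; size and iteration count are arbitrary choices
net = MLPRegressor(hidden_layer_sizes=(8,), max_iter=5000, random_state=1)
net.fit(X, y)

print("next-month forecast:", net.predict([revenue[-3:]]))
```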

A typical area of application for neural networks is predictive analytics. There are BI reporting tools that have this feature implemented within them, such as the Predictive Analytics Tool from datapine. This tool enables users to quickly and easily generate all kinds of predictions. All you have to do is select the data to be processed based on your KPIs, and the software automatically calculates forecasts based on historical and current data. Thanks to its user-friendly interface, anyone in your organization can manage it; there’s no need to be an advanced scientist. 

Here is an example of how you can use the predictive analysis tool from datapine:

Example on how to use predictive analytics tool from datapine


5. Factor analysis

Factor analysis, also called “dimension reduction,” is a type of data analysis used to describe variability among observed, correlated variables in terms of a potentially lower number of unobserved variables called factors. The aim here is to uncover independent latent variables, making it an ideal method for streamlining specific segments.

A good way to understand this data analysis method is a customer evaluation of a product. The initial assessment is based on different variables like color, shape, wearability, current trends, materials, comfort, the place where they bought the product, and frequency of usage. The list can be endless, depending on what you want to track. In this case, factor analysis comes into the picture by summarizing all of these variables into homogenous groups, for example, by grouping the variables color, materials, quality, and trends into a broader latent variable of design.
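
Here is a minimal sketch of how such a reduction might look in code, using scikit-learn's FactorAnalysis on synthetic survey ratings; the variables, figures, and the two-factor choice are illustrative assumptions:

```python
import numpy as np
from sklearn.decomposition import FactorAnalysis

# Hypothetical survey: four observed ratings per customer, secretly driven
# by two latent factors ("design" and "build")
rng = np.random.default_rng(0)
design = rng.normal(7, 1, size=(100, 1))
build = rng.normal(6, 1, size=(100, 1))
ratings = np.hstack([
    design + rng.normal(0, 0.5, (100, 1)),  # color
    design + rng.normal(0, 0.5, (100, 1)),  # trendiness
    build + rng.normal(0, 0.5, (100, 1)),   # materials
    build + rng.normal(0, 0.5, (100, 1)),   # durability
])

# Reduce the four observed variables to two latent factors
fa = FactorAnalysis(n_components=2, random_state=0)
fa.fit(ratings)

# Loadings show which observed variables each factor summarizes
print(fa.components_.round(2))
```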

If you want to start analyzing data using factor analysis we recommend you take a look at this practical guide from UCLA.

6. Data mining

Data mining is an umbrella term for engineering metrics and insights for additional value, direction, and context. By using exploratory statistical evaluation, data mining aims to identify dependencies, relations, patterns, and trends to generate advanced knowledge. When considering how to analyze data, adopting a data mining mindset is essential to success – as such, it’s an area that is worth exploring in greater detail.

An excellent use case of data mining is datapine’s intelligent data alerts. With the help of artificial intelligence and machine learning, they provide automated signals based on particular commands or occurrences within a dataset. For example, if you’re monitoring supply chain KPIs, you could set an intelligent alarm to trigger when invalid or low-quality data appears. By doing so, you will be able to drill down deep into the issue and fix it swiftly and effectively.

In the following picture, you can see how the intelligent alarms from datapine work. By setting up ranges on daily orders, sessions, and revenues, the alarms will notify you if the goal was not completed or if it exceeded expectations.

Example on how to use intelligent alerts from datapine

7. Time series analysis

As its name suggests, time series analysis is used to analyze a set of data points collected over a specified period of time. Analysts use this method to monitor data points over a continuous interval rather than just intermittently, but time series analysis is about more than simply collecting data over time. It allows researchers to understand whether variables changed over the course of the study, how the different variables depend on one another, and how the end result was reached.

In a business context, this method is used to understand the causes of different trends and patterns to extract valuable insights. Another way of using this method is with the help of time series forecasting. Powered by predictive technologies, businesses can analyze various data sets over a period of time and forecast different future events. 

A great use case to put time series analysis into perspective is seasonality effects on sales. By using time series forecasting to analyze sales data of a specific product over time, you can understand if sales rise over a specific period of time (e.g. swimwear during summertime, or candy during Halloween). These insights allow you to predict demand and prepare production accordingly.  
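
To put the idea into code, here is a minimal sketch that decomposes a toy monthly sales series into trend and seasonal components with statsmodels; the figures are invented to mimic a summer swimwear peak:

```python
import pandas as pd
from statsmodels.tsa.seasonal import seasonal_decompose

# Hypothetical monthly swimwear sales over three years
index = pd.date_range("2020-01-01", periods=36, freq="MS")
sales = pd.Series(
    [20, 22, 30, 45, 70, 95, 100, 90, 55, 35, 25, 21] * 3,
    index=index,
) + pd.Series(range(36), index=index)  # mild upward trend

# Split the series into trend, seasonal, and residual components
result = seasonal_decompose(sales, model="additive", period=12)
print(result.seasonal.head(12))  # the recurring summer peak shows up here
```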

8. Decision Trees 

The decision tree analysis aims to act as a support tool for making smart and strategic decisions. By visually displaying potential outcomes, consequences, and costs in a tree-like model, researchers and company users can easily evaluate all factors involved and choose the best course of action. Decision trees are helpful for analyzing quantitative data, and they improve the decision-making process by helping you spot improvement opportunities, reduce costs, and enhance operational efficiency and production.

But how does a decision tree actually work? This method works like a flowchart that starts with the main decision you need to make and branches out based on the different outcomes and consequences of each option. Each outcome will outline its own consequences, costs, and gains, and, at the end of the analysis, you can compare each of them and make the smartest decision.

Businesses can use them to understand which project is more cost-effective and will bring more earnings in the long run. For example, imagine you need to decide if you want to update your software app or build a new app entirely.  Here you would compare the total costs, the time needed to be invested, potential revenue, and any other factor that might affect your decision.  In the end, you would be able to see which of these two options is more realistic and attainable for your company or research.
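
Here is a minimal sketch of a learned decision tree using scikit-learn; the project features, labels, and depth limit are illustrative assumptions:

```python
from sklearn.tree import DecisionTreeClassifier, export_text

# Hypothetical historical projects: [estimated cost (k$), team size, months],
# labeled 1 if the project turned out profitable, 0 if not
X = [[50, 3, 6], [120, 8, 12], [30, 2, 3], [200, 10, 18],
     [80, 5, 9], [40, 3, 4], [150, 9, 14], [60, 4, 7]]
y = [1, 0, 1, 0, 1, 1, 0, 1]

tree = DecisionTreeClassifier(max_depth=2, random_state=0).fit(X, y)

# Print the learned rules as a readable, flowchart-like structure
print(export_text(tree, feature_names=["cost", "team_size", "months"]))
```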

9. Conjoint analysis 

Last but not least, we have conjoint analysis. This approach is usually used in surveys to understand how individuals value different attributes of a product or service, and it is one of the most effective methods for extracting consumer preferences. When it comes to purchasing, some clients might be more price-focused, others more features-focused, and others might have a sustainability focus. Whatever your customers' preferences are, you can find them with conjoint analysis. Through this, companies can define pricing strategies, packaging options, subscription packages, and more.

A great example of conjoint analysis is in marketing and sales. For instance, a cupcake brand might use conjoint analysis and find that its clients prefer gluten-free options and cupcakes with healthier toppings over super sugary ones. Thus, the cupcake brand can turn these insights into advertisements and promotions to increase sales of this particular type of product. And not just that, conjoint analysis can also help businesses segment their customers based on their interests. This allows them to send different messaging that will bring value to each of the segments. 
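
One common way to estimate the “part-worth” of each attribute level is a ratings-based regression over dummy-coded attributes. The sketch below assumes hypothetical cupcake profiles and ratings:

```python
import pandas as pd
from sklearn.linear_model import LinearRegression

# Hypothetical survey: respondents rate cupcake profiles (1-10)
profiles = pd.DataFrame({
    "topping": ["sugary", "healthy", "healthy", "sugary", "healthy", "sugary"],
    "base": ["regular", "gluten_free", "regular", "gluten_free",
             "gluten_free", "regular"],
    "rating": [5, 9, 7, 6, 10, 4],
})

# Dummy-code the attribute levels so they can enter a linear model
X = pd.get_dummies(profiles[["topping", "base"]], drop_first=True)
model = LinearRegression().fit(X, profiles["rating"])

# Coefficients approximate the part-worth of each attribute level
for name, coef in zip(X.columns, model.coef_):
    print(f"{name}: {coef:+.2f}")
```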

10. Correspondence Analysis

Also known as reciprocal averaging, correspondence analysis is a method used to analyze the relationship between categorical variables presented within a contingency table. A contingency table is a table that displays two (simple correspondence analysis) or more (multiple correspondence analysis) categorical variables across rows and columns that show the distribution of the data, which is usually answers to a survey or questionnaire on a specific topic. 

This method starts by calculating an “expected value” for each cell, obtained by multiplying its row total by its column total and dividing by the grand total of the table. The “expected value” is then subtracted from the observed value, resulting in a “residual,” which is what allows you to extract conclusions about relationships and distribution. The results of this analysis are later displayed using a map that represents the relationships between the different values: the closer two values are on the map, the stronger the relationship. Let’s put it into perspective with an example.

Imagine you are carrying out a market research analysis about outdoor clothing brands and how they are perceived by the public. For this analysis, you ask a group of people to match each brand with a certain attribute which can be durability, innovation, quality materials, etc. When calculating the residual numbers, you can see that brand A has a positive residual for innovation but a negative one for durability. This means that brand A is not positioned as a durable brand in the market, something that competitors could take advantage of. 
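
The expected values and residuals described above can be computed in a few lines of NumPy; the contingency table below is invented for illustration:

```python
import numpy as np

# Hypothetical contingency table: brands (rows) x attributes (columns:
# innovation, durability, quality)
observed = np.array([
    [30, 10, 20],   # brand A
    [10, 40, 15],   # brand B
    [15, 20, 40],   # brand C
], dtype=float)

row_totals = observed.sum(axis=1, keepdims=True)
col_totals = observed.sum(axis=0, keepdims=True)
grand_total = observed.sum()

# Expected counts under independence: row total x column total / grand total
expected = row_totals * col_totals / grand_total

# Positive residuals: the brand is associated with the attribute more strongly
# than chance would suggest; negative residuals: less strongly
residuals = observed - expected
print(residuals.round(1))
```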

11. Multidimensional Scaling (MDS)

MDS is a method used to observe the similarities or disparities between objects, which can be colors, brands, people, geographical coordinates, and more. The objects are plotted using an “MDS map” that positions similar objects together and disparate ones far apart. The (dis)similarities between objects are represented using one or more dimensions that can be observed on a numerical scale. For example, if you want to know how people feel about the COVID-19 vaccine, you can use 1 for “don’t believe in the vaccine at all” and 10 for “firmly believe in the vaccine,” with 2 through 9 covering the responses in between. When analyzing an MDS map, the only thing that matters is the distance between the objects; the orientation of the dimensions is arbitrary and has no meaning at all.

Multidimensional scaling is a valuable technique for market research, especially when it comes to evaluating product or brand positioning. For instance, if a cupcake brand wants to know how they are positioned compared to competitors, it can define 2-3 dimensions such as taste, ingredients, shopping experience, or more, and do a multidimensional scaling analysis to find improvement opportunities as well as areas in which competitors are currently leading. 

Another business example is in procurement when deciding on different suppliers. Decision makers can generate an MDS map to see how the different prices, delivery times, technical services, and more of the different suppliers differ and pick the one that suits their needs the best. 
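
For a hands-on flavor of the brand-positioning example above, here is a minimal sketch using scikit-learn's MDS on a hypothetical, precomputed dissimilarity matrix between four brands:

```python
import numpy as np
from sklearn.manifold import MDS

# Hypothetical pairwise dissimilarities between four brands
# (symmetric matrix with zeros on the diagonal)
dissimilarity = np.array([
    [0.0, 0.2, 0.8, 0.7],
    [0.2, 0.0, 0.9, 0.6],
    [0.8, 0.9, 0.0, 0.3],
    [0.7, 0.6, 0.3, 0.0],
])

# Project the brands onto a 2-D map where distance mirrors dissimilarity
mds = MDS(n_components=2, dissimilarity="precomputed", random_state=0)
coords = mds.fit_transform(dissimilarity)

for brand, (x, y) in zip("ABCD", coords):
    print(f"brand {brand}: ({x:+.2f}, {y:+.2f})")
```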

A final example comes from a research paper titled "An Improved Study of Multilevel Semantic Network Visualization for Analyzing Sentiment Word of Movie Review Data". The researchers picked a two-dimensional MDS map to display the distances and relationships between different sentiments in movie reviews. They used 36 sentiment words and distributed them based on their emotional distance, as we can see in the image below, where the words "outraged" and "sweet" sit on opposite sides of the map, marking the distance between the two emotions very clearly.

Example of multidimensional scaling analysis

Aside from being a valuable technique to analyze dissimilarities, MDS also serves as a dimension-reduction technique for large dimensional data. 

B. Qualitative Methods

Qualitative data analysis methods deal with non-numerical data gathered and produced through techniques such as interviews, focus groups, questionnaires, and more. As opposed to quantitative methods, qualitative data is more subjective and highly valuable for analyzing customer retention and product development.

12. Text analysis

Text analysis, also known in the industry as text mining, works by taking large sets of textual data and arranging them in a way that makes it easier to manage. By working through this cleansing process in stringent detail, you will be able to extract the data that is truly relevant to your organization and use it to develop actionable insights that will propel you forward.

Modern software accelerates the application of text analytics. Thanks to the combination of machine learning and intelligent algorithms, you can perform advanced analytical processes such as sentiment analysis. This technique allows you to understand the intentions and emotions of a text, for example, whether it's positive, negative, or neutral, and then give it a score depending on certain factors and categories that are relevant to your brand. Sentiment analysis is often used to monitor brand and product reputation and to understand how successful your customer experience is. To learn more about the topic, check out this insightful article.
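
As a small, hedged example, the sketch below scores review sentiment with NLTK's VADER analyzer; the reviews are invented, and the lexicon download is a one-time setup step:

```python
# One-time setup: pip install nltk, then in Python:
#   import nltk; nltk.download("vader_lexicon")
from nltk.sentiment.vader import SentimentIntensityAnalyzer

reviews = [
    "Absolutely love these sneakers, super comfortable!",
    "The sole fell apart after two weeks. Terrible quality.",
    "Delivery was on time.",
]

analyzer = SentimentIntensityAnalyzer()
for review in reviews:
    scores = analyzer.polarity_scores(review)
    # 'compound' ranges from -1 (very negative) to +1 (very positive)
    print(f"{scores['compound']:+.2f}  {review}")
```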

By analyzing data from various word-based sources, including product reviews, articles, social media communications, and survey responses, you will gain invaluable insights into your audience, as well as their needs, preferences, and pain points. This will allow you to create campaigns, services, and communications that meet your prospects’ needs on a personal level, growing your audience while boosting customer retention. There are various other “sub-methods” that are an extension of text analysis. Each of them serves a more specific purpose and we will look at them in detail next. 

13. Content Analysis

This is a straightforward and very popular method that examines the presence and frequency of certain words, concepts, and subjects in different content formats such as text, image, audio, or video. For example, the number of times the name of a celebrity is mentioned on social media or online tabloids. It does this by coding text data that is later categorized and tabulated in a way that can provide valuable insights, making it the perfect mix of quantitative and qualitative analysis.

There are two types of content analysis. The first is conceptual analysis, which focuses on explicit data, for instance, the number of times a concept or word is mentioned in a piece of content. The second is relational analysis, which focuses on the relationships between different concepts or words and how they are connected within a specific context.

Content analysis is often used by marketers to measure brand reputation and customer behavior, for example, by analyzing customer reviews. It can also be used to analyze customer interviews and find directions for new product development. It is also important to note that, in order to extract the maximum potential out of this analysis method, you need a clearly defined research question.
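
A conceptual content analysis can start as simply as counting coded terms. Here is a minimal sketch using only Python's standard library; the reviews and stopword list are illustrative assumptions:

```python
import re
from collections import Counter

# Hypothetical customer reviews to scan for recurring concepts
reviews = [
    "Great quality and fast delivery, will buy again.",
    "Delivery was slow but the quality is great.",
    "Poor quality packaging, though delivery was fast.",
]

# Simple coding step: tokenize, lowercase, and drop very common words
stopwords = {"and", "but", "the", "is", "was", "will", "though", "a"}
tokens = [
    word
    for review in reviews
    for word in re.findall(r"[a-z']+", review.lower())
    if word not in stopwords
]

# Frequency of each concept across the corpus
print(Counter(tokens).most_common(5))
```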

14. Thematic Analysis

Very similar to content analysis, thematic analysis also helps identify and interpret patterns in qualitative data, with the main difference being that content analysis can also be applied to quantitative data. The thematic method analyzes large pieces of text data, such as focus group transcripts or interviews, and groups them into themes or categories that come up frequently within the text. It is a great method when trying to figure out people's views and opinions about a certain topic. For example, if you are a brand that cares about sustainability, you can survey your customers to analyze their views and opinions about sustainability and how they apply it to their lives. You can also analyze customer service call transcripts to find common issues and improve your service.

Thematic analysis is a very subjective technique that relies on the researcher’s judgment. Therefore, to avoid bias, it follows six steps: familiarization, coding, generating themes, reviewing themes, defining and naming themes, and writing up. It is also important to note that, because it is a flexible approach, the data can be interpreted in multiple ways, and it can be hard to decide which data to emphasize.

15. Narrative Analysis 

A bit more complex in nature than the two previous ones, narrative analysis is used to explore the meaning behind the stories that people tell and most importantly, how they tell them. By looking into the words that people use to describe a situation you can extract valuable conclusions about their perspective on a specific topic. Common sources for narrative data include autobiographies, family stories, opinion pieces, and testimonials, among others. 

From a business perspective, narrative analysis can be useful for analyzing customer behaviors and feelings towards a specific product, service, or feature. It provides unique and deep insights that can be extremely valuable. However, it has some drawbacks.

The biggest weakness of this method is that the sample sizes are usually very small due to the complexity and time-consuming nature of the collection of narrative data. Plus, the way a subject tells a story will be significantly influenced by his or her specific experiences, making it very hard to replicate in a subsequent study. 

16. Discourse Analysis

Discourse analysis is used to understand the meaning behind any type of written, verbal, or symbolic discourse based on its political, social, or cultural context. It mixes the analysis of languages and situations together. This means that the way the content is constructed and the meaning behind it is significantly influenced by the culture and society it takes place in. For example, if you are analyzing political speeches you need to consider different context elements such as the politician's background, the current political context of the country, the audience to which the speech is directed, and so on. 

From a business point of view, discourse analysis is a great market research tool. It allows marketers to understand how the norms and ideas of the specific market work and how their customers relate to those ideas. It can be very useful to build a brand mission or develop a unique tone of voice. 

17. Grounded Theory Analysis

Traditionally, researchers decide on a method and hypothesis and start to collect data to prove that hypothesis. Grounded theory is the only method in this list that doesn’t require an initial research question or hypothesis, as its value lies in the generation of new theories. With the grounded theory method, you can go into the analysis process with an open mind and explore the data to generate new theories through tests and revisions. In fact, it is not necessary to collect all the data before starting to analyze it; researchers usually begin finding valuable insights while they are still gathering the data.

All of these elements make grounded theory a very valuable method, as theories are fully backed by data instead of initial assumptions. It is a great technique for analyzing poorly researched topics or finding the causes behind specific company outcomes. For example, product managers and marketers might use grounded theory to find the causes of high levels of customer churn, looking into customer surveys and reviews to develop new theories about the causes.

How To Analyze Data? Top 17 Data Analysis Techniques To Apply

17 top data analysis techniques by datapine

Now that we’ve answered the questions “what is data analysis’”, why is it important, and covered the different data analysis types, it’s time to dig deeper into how to perform your analysis by working through these 17 essential techniques.

1. Collaborate your needs

Before you begin analyzing or drilling down into any techniques, it’s crucial to sit down collaboratively with all key stakeholders within your organization, decide on your primary campaign or strategic goals, and gain a fundamental understanding of the types of insights that will best benefit your progress or provide you with the level of vision you need to evolve your organization.

2. Establish your questions

Once you’ve outlined your core objectives, you should consider which questions will need answering to help you achieve your mission. This is one of the most important techniques as it will shape the very foundations of your success.

To ensure your data works for you, you have to ask the right data analysis questions.

3. Data democratization

After giving your data analytics methodology some real direction, and knowing which questions need answering to extract optimum value from the information available to your organization, you should continue with democratization.

Data democratization is an action that aims to connect data from various sources efficiently and quickly so that anyone in your organization can access it at any given moment. You can extract data in text, images, videos, numbers, or any other format, and then perform cross-database analysis to achieve more advanced insights to share with the rest of the company interactively.

Once you have decided on your most valuable sources, you need to bring all of this into a structured format to start collecting your insights. For this purpose, datapine offers an easy all-in-one data connectors feature to integrate all your internal and external sources and manage them at will. Additionally, datapine’s end-to-end solution automatically updates your data, allowing you to save time and focus on performing the right analysis to grow your company.

data connectors from datapine

4. Think of governance 

When collecting data in a business or research context, you always need to think about security and privacy. With data breaches becoming a topic of concern for businesses, the need to protect your clients’ or subjects’ sensitive information becomes critical.

To ensure that all of this is taken care of, you need to think of a data governance strategy. According to Gartner, this concept refers to “the specification of decision rights and an accountability framework to ensure the appropriate behavior in the valuation, creation, consumption, and control of data and analytics.” In simpler words, data governance is a collection of processes, roles, and policies that ensure the efficient use of data while still achieving the main company goals. It ensures that clear roles are in place for who can access the information and how they can access it. In time, this not only ensures that sensitive information is protected but also allows for more efficient analysis as a whole.

5. Clean your data

After harvesting from so many sources you will be left with a vast amount of information that can be overwhelming to deal with. At the same time, you can be faced with incorrect data that can be misleading to your analysis. The smartest thing you can do to avoid dealing with this in the future is to clean the data. This is fundamental before visualizing it, as it will ensure that the insights you extract from it are correct.

There are many things that you need to look for in the cleaning process. The most important one is to eliminate duplicate observations; these usually appear when using multiple internal and external sources of information. You can also add any missing codes, fix empty fields, and eliminate incorrectly formatted data.

Another usual form of cleaning is done with text data. As we mentioned earlier, most companies today analyze customer reviews, social media comments, questionnaires, and several other text inputs. In order for algorithms to detect patterns, text data needs to be revised to avoid invalid characters or any syntax or spelling errors. 
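
Here is a minimal pandas sketch of the cleaning steps just described (duplicates, stray whitespace, empty fields, and inconsistent formatting); the raw table is invented for illustration:

```python
import pandas as pd

# Hypothetical raw export with typical quality problems
raw = pd.DataFrame({
    "customer": ["  Ana ", "Ben", "Ben", "Cleo", None],
    "country": ["de", "DE", "DE", "us", "US"],
    "revenue": ["1,200", "950", "950", "700", "430"],
})

clean = (
    raw
    .dropna(subset=["customer"])                       # remove empty records
    .assign(
        customer=lambda d: d["customer"].str.strip(),  # erase stray whitespace
        country=lambda d: d["country"].str.upper(),    # unify formatting
        revenue=lambda d: d["revenue"].str.replace(",", "").astype(float),
    )
    .drop_duplicates()                                 # remove duplicate rows
)
print(clean)
```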

Most importantly, the aim of cleaning is to prevent you from arriving at false conclusions that can damage your company in the long run. By using clean data, you will also help BI solutions to interact better with your information and create better reports for your organization.

6. Set your KPIs

Once you’ve set your sources, cleaned your data, and established clear-cut questions you want your insights to answer, you need to set a host of key performance indicators (KPIs) that will help you track, measure, and shape your progress in a number of key areas.

KPIs are critical to both qualitative and quantitative analysis research. This is one of the primary methods of data analysis you certainly shouldn’t overlook.

To help you set the best possible KPIs for your initiatives and activities, here is an example of a relevant logistics KPI: transportation-related costs. If you want to see more, explore our collection of key performance indicator examples.

Transportation costs logistics KPIs

7. Omit useless data

Having bestowed your data analysis tools and techniques with true purpose and defined your mission, you should explore the raw data you’ve collected from all sources and use your KPIs as a reference for chopping out any information you deem to be useless.

Trimming the informational fat is one of the most crucial methods of analysis as it will allow you to focus your analytical efforts and squeeze every drop of value from the remaining ‘lean’ information.

Any stats, facts, figures, or metrics that don’t align with your business goals or fit with your KPI management strategies should be eliminated from the equation.

8. Build a data management roadmap

While, at this point, this particular step is optional (you will have already gained a wealth of insight and formed a fairly sound strategy by now), creating a data governance roadmap will help your data analysis methods and techniques become successful on a more sustainable basis. These roadmaps, if developed properly, are also built so they can be tweaked and scaled over time.

Invest ample time in developing a roadmap that will help you store, manage, and handle your data internally, and you will make your analysis techniques all the more fluid and functional – one of the most powerful types of data analysis methods available today.

9. Integrate technology

There are many ways to analyze data, but one of the most vital aspects of analytical success in a business context is integrating the right decision support software and technology.

Robust analysis platforms will not only allow you to pull critical data from your most valuable sources while working with dynamic KPIs that offer actionable insights; they will also present it in a digestible, visual, interactive format from one central, live dashboard. A data methodology you can count on.

By integrating the right technology within your data analysis methodology, you’ll avoid fragmenting your insights, saving you time and effort while allowing you to enjoy the maximum value from your business’s most valuable insights.

For a look at the power of software for the purpose of analysis and to enhance your methods of analyzing, glance over our selection of dashboard examples .

10. Answer your questions

By considering each of the above efforts, working with the right technology, and fostering a cohesive internal culture where everyone buys into the different ways to analyze data as well as the power of digital intelligence, you will swiftly start to answer your most burning business questions. Arguably, the best way to make your data concepts accessible across the organization is through data visualization.

11. Visualize your data

Online data visualization is a powerful tool as it lets you tell a story with your metrics, allowing users across the organization to extract meaningful insights that aid business evolution – and it covers all the different ways to analyze data.

The purpose of analyzing is to make your entire organization more informed and intelligent, and with the right platform or dashboard, this is simpler than you think, as demonstrated by our marketing dashboard .

An executive dashboard example showcasing high-level marketing KPIs such as cost per lead, MQL, SQL, and cost per customer.

This visual, dynamic, and interactive online dashboard is a data analysis example designed to give Chief Marketing Officers (CMO) an overview of relevant metrics to help them understand if they achieved their monthly goals.

In detail, this example generated with a modern dashboard creator displays interactive charts for monthly revenues, costs, net income, and net income per customer; all of them are compared with the previous month so that you can understand how the data fluctuated. In addition, it shows a detailed summary of the number of users, customers, SQLs, and MQLs per month to visualize the whole picture and extract relevant insights or trends for your marketing reports .

The CMO dashboard is perfect for c-level management as it can help them monitor the strategic outcome of their marketing efforts and make data-driven decisions that can benefit the company exponentially.

12. Be careful with the interpretation

We already dedicated an entire post to data interpretation as it is a fundamental part of the process of data analysis. It gives meaning to the analytical information and aims to drive a concise conclusion from the analysis results. Since most of the time companies are dealing with data from many different sources, the interpretation stage needs to be done carefully and properly in order to avoid misinterpretations. 

To help you through the process, here we list three common practices that you need to avoid at all costs when looking at your data:

  • Correlation vs. causation: The human brain is wired to find patterns. This tendency leads to one of the most common mistakes when performing interpretation: confusing correlation with causation. Although these two aspects can exist simultaneously, it is not correct to assume that because two things happened together, one provoked the other. A piece of advice for avoiding this mistake: never trust intuition alone; trust the data. If there is no objective evidence of causation, then always stick to correlation.
  • Confirmation bias: This phenomenon describes the tendency to select and interpret only the data necessary to prove one hypothesis, often ignoring the elements that might disprove it. Even if it's not done on purpose, confirmation bias can represent a real problem, as excluding relevant information can lead to false conclusions and, therefore, bad business decisions. To avoid it, always try to disprove your hypothesis instead of proving it, share your analysis with other team members, and avoid drawing any conclusions before the entire analytical project is finalized.
  • Statistical significance: To put it briefly, statistical significance helps analysts understand whether a result is actually accurate or whether it happened because of a sampling error or pure chance. The level of statistical significance needed might depend on the sample size and the industry being analyzed. In any case, ignoring the significance of a result when it might influence decision-making can be a huge mistake (see the sketch after this list).
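
To illustrate the last point, here is a minimal significance-test sketch using SciPy; the conversion figures are invented, and the 0.05 threshold is a convention, not a law:

```python
from scipy import stats

# Hypothetical daily conversion rates (%) before and after a site redesign
before = [2.1, 2.4, 1.9, 2.2, 2.0, 2.3, 2.1, 1.8]
after = [2.6, 2.9, 2.4, 2.8, 2.5, 2.7, 3.0, 2.6]

# Two-sample t-test: is the difference likely due to chance?
t_stat, p_value = stats.ttest_ind(after, before)
print(f"t = {t_stat:.2f}, p = {p_value:.4f}")

# A common (but context-dependent) convention: p < 0.05 counts as significant
if p_value < 0.05:
    print("The uplift is unlikely to be a sampling artifact.")
```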

13. Build a narrative

Now, we’re going to look at how you can bring all of these elements together in a way that will benefit your business - starting with a little something called data storytelling.

The human brain responds incredibly well to strong stories or narratives. Once you’ve cleansed, shaped, and visualized your most invaluable data using various BI dashboard tools , you should strive to tell a story - one with a clear-cut beginning, middle, and end.

By doing so, you will make your analytical efforts more accessible, digestible, and universal, empowering more people within your organization to use your discoveries to their actionable advantage.

14. Consider autonomous technology

Autonomous technologies, such as artificial intelligence (AI) and machine learning (ML), play a significant role in the advancement of understanding how to analyze data more effectively.

Gartner predicts that by the end of this year, 80% of emerging technologies will be developed with AI foundations. This is a testament to the ever-growing power and value of autonomous technologies.

At the moment, these technologies are revolutionizing the analysis industry. Some examples that we mentioned earlier are neural networks, intelligent alarms, and sentiment analysis.

15. Share the load

If you work with the right tools and dashboards, you will be able to present your metrics in a digestible, value-driven format, allowing almost everyone in the organization to connect with and use relevant data to their advantage.

Modern dashboards consolidate data from various sources, providing access to a wealth of insights in one centralized location, no matter if you need to monitor recruitment metrics or generate reports that need to be sent across numerous departments. Moreover, these cutting-edge tools offer access to dashboards from a multitude of devices, meaning that everyone within the business can connect with practical insights remotely - and share the load.

Once everyone is able to work with a data-driven mindset, you will catalyze the success of your business in ways you never thought possible. And when it comes to knowing how to analyze data, this kind of collaborative approach is essential.

16. Data analysis tools

In order to perform high-quality analysis of data, it is fundamental to use tools and software that will ensure the best results. Here we leave you a small summary of four fundamental categories of data analysis tools for your organization.

  • Business Intelligence: BI tools allow you to process significant amounts of data from several sources in any format. Through this, you can not only analyze and monitor your data to extract relevant insights but also create interactive reports and dashboards to visualize your KPIs and use them for your company's benefit. datapine is an amazing online BI software that is focused on delivering powerful online analysis features that are accessible to beginner and advanced users alike. As such, it offers a full-service solution that includes cutting-edge analysis of data, KPI visualization, live dashboards, reporting, and artificial intelligence technologies to predict trends and minimize risk.
  • Statistical analysis: These tools are usually designed for scientists, statisticians, market researchers, and mathematicians, as they allow them to perform complex statistical analyses with methods like regression analysis, predictive analysis, and statistical modeling. A good tool for performing this type of analysis is R-Studio, as it offers powerful data modeling and hypothesis testing features that can cover both academic and general data analysis. It is one of the industry favorites thanks to its capabilities for data cleaning, data reduction, and advanced analysis with several statistical methods. Another relevant tool to mention is SPSS from IBM. The software offers advanced statistical analysis for users of all skill levels. Thanks to a vast library of machine learning algorithms, text analysis, and a hypothesis testing approach, it can help your company find relevant insights to drive better decisions. SPSS also works as a cloud service that enables you to run it anywhere.
  • SQL Consoles: SQL is a programming language often used to handle structured data in relational databases. Tools like these are popular among data scientists as they are extremely effective in unlocking these databases' value (see the sketch after this list). Undoubtedly, one of the most widely used SQL tools on the market is MySQL Workbench. It offers several features such as a visual tool for database modeling and monitoring, complete SQL optimization, administration tools, and visual performance dashboards to keep track of KPIs.
  • Data Visualization: These tools are used to represent your data through charts, graphs, and maps that allow you to find patterns and trends in the data. datapine's already mentioned BI platform also offers a wealth of powerful online data visualization tools with several benefits. Some of them include: delivering compelling data-driven presentations to share with your entire company, the ability to see your data online with any device wherever you are, an interactive dashboard design feature that enables you to showcase your results in an interactive and understandable way, and to perform online self-service reports that can be used simultaneously with several other people to enhance team productivity.
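
Referring back to the SQL Consoles entry above, here is a self-contained illustration of the kind of query such consoles run, using Python's built-in sqlite3 module; the orders table and figures are invented:

```python
import sqlite3

# In-memory database standing in for a company's relational store
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (customer TEXT, amount REAL, month TEXT)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [("Ana", 120.0, "2023-01"), ("Ben", 80.0, "2023-01"),
     ("Ana", 200.0, "2023-02"), ("Cleo", 50.0, "2023-02")],
)

# Aggregate revenue per month: the bread and butter of SQL analysis
for row in conn.execute(
    "SELECT month, SUM(amount) AS revenue FROM orders "
    "GROUP BY month ORDER BY month"
):
    print(row)
conn.close()
```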

17. Refine your process constantly 

Last is a step that might seem obvious to some people, but it can be easily ignored if you think you are done. Once you have extracted the needed results, you should always take a retrospective look at your project and think about what you can improve. As you saw throughout this long list of techniques, data analysis is a complex process that requires constant refinement. For this reason, you should always go one step further and keep improving. 

Quality Criteria For Data Analysis

So far we’ve covered a list of methods and techniques that should help you perform efficient data analysis. But how do you measure the quality and validity of your results? This is done with the help of some science quality criteria. Here we will go into a more theoretical area that is critical to understanding the fundamentals of statistical analysis in science. However, you should also be aware of these steps in a business context, as they will allow you to assess the quality of your results in the correct way. Let’s dig in. 

  • Internal validity: The results of a survey are internally valid if they measure what they are supposed to measure and thus provide credible results. In other words, internal validity measures the trustworthiness of the results and how they can be affected by factors such as the research design, operational definitions, how the variables are measured, and more. For instance, imagine you are conducting an interview to ask people if they brush their teeth twice a day. While most of them will answer yes, their answers may simply correspond to what is socially acceptable, which is to brush your teeth at least twice a day. In this case, you can’t be 100% sure whether respondents actually brush their teeth twice a day or just say that they do; therefore, the internal validity of this interview is very low.
  • External validity: Essentially, external validity refers to the extent to which the results of your research can be applied to a broader context. It basically aims to prove that the findings of a study can be applied in the real world. If the research can be applied to other settings, individuals, and times, then the external validity is high. 
  • Reliability: If your research is reliable, it means that it can be reproduced. If your measurement were repeated under the same conditions, it would produce similar results. This means that your measuring instrument consistently produces reliable results. For example, imagine a doctor building a symptoms questionnaire to detect a specific disease in a patient. Then, various other doctors use this questionnaire but end up diagnosing the same patient with a different condition. This means the questionnaire is not reliable in detecting the initial disease. Another important note here is that in order for your research to be reliable, it also needs to be objective. If the results of a study are the same, independent of who assesses them or interprets them, the study can be considered reliable. Let’s see the objectivity criteria in more detail now.
  • Objectivity: In data science, objectivity means that the researcher needs to stay fully objective during the analysis. The results of a study need to be determined by objective criteria and not by the beliefs, personality, or values of the researcher. Objectivity needs to be ensured when you are gathering the data; for example, when interviewing individuals, the questions need to be asked in a way that doesn't influence the results. Paired with this, objectivity also needs to be considered when interpreting the data. If different researchers reach the same conclusions, then the study is objective. For this last point, you can set predefined criteria for interpreting the results to ensure all researchers follow the same steps.

The discussed quality criteria cover mostly potential influences in a quantitative context. Analysis in qualitative research has by default additional subjective influences that must be controlled in a different way. Therefore, there are other quality criteria for this kind of research such as credibility, transferability, dependability, and confirmability. You can see each of them more in detail on this resource . 

Data Analysis Limitations & Barriers

Analyzing data is not an easy task. As you’ve seen throughout this post, there are many steps and techniques that you need to apply in order to extract useful information from your research. While a well-performed analysis can bring various benefits to your organization it doesn't come without limitations. In this section, we will discuss some of the main barriers you might encounter when conducting an analysis. Let’s see them more in detail. 

  • Lack of clear goals: No matter how good your data or analysis might be, if you don’t have clear goals or a hypothesis, the process might be worthless. While we mentioned some methods that don’t require a predefined hypothesis, it is always better to enter the analytical process with some clear guidelines about what you expect to get out of it, especially in a business context in which data is used to support important strategic decisions.
  • Objectivity: Arguably one of the biggest barriers when it comes to data analysis in research is to stay objective. When trying to prove a hypothesis, researchers might find themselves, intentionally or unintentionally, directing the results toward an outcome that they want. To avoid this, always question your assumptions and avoid confusing facts with opinions. You can also show your findings to a research partner or external person to confirm that your results are objective. 
  • Data representation: A fundamental part of the analytical procedure is the way you represent your data. You can use various graphs and charts to represent your findings, but not all of them will work for all purposes. Choosing the wrong visual can not only damage your analysis but can mislead your audience, therefore, it is important to understand when to use each type of data depending on your analytical goals. Our complete guide on the types of graphs and charts lists 20 different visuals with examples of when to use them. 
  • Flawed correlation : Misleading statistics can significantly damage your research. We’ve already pointed out a few interpretation issues previously in the post, but it is an important barrier that we can't avoid addressing here as well. Flawed correlations occur when two variables appear related to each other but they are not. Confusing correlations with causation can lead to a wrong interpretation of results which can lead to building wrong strategies and loss of resources, therefore, it is very important to identify the different interpretation mistakes and avoid them. 
  • Sample size: A very common barrier to a reliable and efficient analysis process is the sample size. In order for the results to be trustworthy, the sample size should be representative of what you are analyzing. For example, imagine you have a company of 1,000 employees and you ask the question “do you like working here?” to 50 employees, of which 48 say yes, which means 96%. Now, imagine you ask the same question to all 1,000 employees and 960 say yes, which also means 96%. Saying that 96% of employees like working in the company when the sample size was only 50 is not a representative or trustworthy conclusion. The significance of the results is far more accurate when surveying a bigger sample size (see the sketch after this list).
  • Privacy concerns: In some cases, data collection can be subjected to privacy regulations. Businesses gather all kinds of information from their customers from purchasing behaviors to addresses and phone numbers. If this falls into the wrong hands due to a breach, it can affect the security and confidentiality of your clients. To avoid this issue, you need to collect only the data that is needed for your research and, if you are using sensitive facts, make it anonymous so customers are protected. The misuse of customer data can severely damage a business's reputation, so it is important to keep an eye on privacy. 
  • Lack of communication between teams : When it comes to performing data analysis on a business level, it is very likely that each department and team will have different goals and strategies. However, they are all working for the same common goal of helping the business run smoothly and keep growing. When teams are not connected and communicating with each other, it can directly affect the way general strategies are built. To avoid these issues, tools such as data dashboards enable teams to stay connected through data in a visually appealing way. 
  • Innumeracy : Businesses are working with data more and more every day. While there are many BI tools available to perform effective analysis, data literacy is still a constant barrier. Not all employees know how to apply analysis techniques or extract insights from them. To prevent this from happening, you can implement different training opportunities that will prepare every relevant user to deal with data. 
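
The sketch below makes the sample-size point concrete by computing the 95% margin of error for the same observed proportion at two sample sizes; the formula assumes a simple random sample:

```python
import math

def margin_of_error(p: float, n: int, z: float = 1.96) -> float:
    """95% margin of error for a sample proportion (simple random sample)."""
    return z * math.sqrt(p * (1 - p) / n)

# Same observed share of "yes" answers, very different certainty
for n in (50, 1000):
    moe = margin_of_error(0.96, n)
    print(f"n={n}: 96% +/- {moe * 100:.1f} percentage points")
```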

Key Data Analysis Skills

As you've learned throughout this lengthy guide, analyzing data is a complex task that requires a lot of knowledge and skills. That said, thanks to the rise of self-service tools, the process is way more accessible and agile than it once was. Regardless, there are still some key skills that are valuable to have when working with data; we list the most important ones below.

  • Critical and statistical thinking: To successfully analyze data you need to be creative and think outside the box. Yes, that might sound like a strange statement considering that data is often tied to facts. However, a great level of critical thinking is required to uncover connections, come up with a valuable hypothesis, and extract conclusions that go a step beyond the surface. This, of course, needs to be complemented by statistical thinking and an understanding of numbers.
  • Data cleaning: Anyone who has ever worked with data will tell you that the cleaning and preparation process accounts for 80% of a data analyst's work; the skill is therefore fundamental. What's more, cleaning the data inadequately can significantly damage the analysis, which can lead to poor decision-making in a business scenario. While there are multiple tools that automate the cleaning process and reduce the possibility of human error, it is still a valuable skill to master.
  • Data visualization: Visuals make the information easier to understand and analyze, not only for professional users but especially for non-technical ones. Having the necessary skills to not only choose the right chart type but know when to apply it correctly is key. This also means being able to design visually compelling charts that make the data exploration process more efficient. 
  • SQL: The Structured Query Language or SQL is a programming language used to communicate with databases. It is fundamental knowledge as it enables you to update, manipulate, and organize data from relational databases which are the most common databases used by companies. It is fairly easy to learn and one of the most valuable skills when it comes to data analysis. 
  • Communication skills: This is a skill that is especially valuable in a business environment. Being able to clearly communicate analytical outcomes to colleagues is incredibly important, especially when the information you are trying to convey is complex for non-technical people. This applies to in-person communication as well as written format, for example, when generating a dashboard or report. While this might be considered a “soft” skill compared to the other ones we mentioned, it should not be ignored as you most likely will need to share analytical findings with others no matter the context. 

Data Analysis In The Big Data Environment

Big data is invaluable to today’s businesses, and by using different methods for data analysis, it’s possible to view your data in a way that can help you turn insight into positive action.

To inspire your efforts and put the importance of big data into context, here are some insights that you should know:

  • By 2026, the big data industry is expected to be worth approximately $273.4 billion.
  • 94% of enterprises say that analyzing data is important for their growth and digital transformation. 
  • Companies that exploit the full potential of their data can increase their operating margins by 60% .
  • We have already discussed the benefits of artificial intelligence throughout this article; this industry's financial impact is expected to grow to $40 billion by 2025.

Data analysis concepts may come in many forms, but fundamentally, any solid methodology will help to make your business more streamlined, cohesive, insightful, and successful than ever before.

Key Takeaways From Data Analysis 

As we reach the end of our data analysis journey, here is a brief summary of the main methods and techniques for performing excellent analysis and growing your business.

17 Essential Types of Data Analysis Methods:

  • Cluster analysis
  • Cohort analysis
  • Regression analysis
  • Factor analysis
  • Neural networks
  • Data mining
  • Text analysis
  • Time series analysis
  • Decision trees
  • Conjoint analysis
  • Correspondence analysis
  • Multidimensional scaling
  • Content analysis
  • Thematic analysis
  • Narrative analysis
  • Grounded theory analysis
  • Discourse analysis

Top 17 Data Analysis Techniques:

  • Collaborate on your needs
  • Establish your questions
  • Data democratization
  • Think of data governance
  • Clean your data
  • Set your KPIs
  • Omit useless data
  • Build a data management roadmap
  • Integrate technology
  • Answer your questions
  • Visualize your data
  • Interpret your data
  • Consider autonomous technology
  • Build a narrative
  • Share the load
  • Use data analysis tools
  • Refine your process constantly

We’ve pondered the data analysis definition and drilled down into the practical applications of data-centric analytics, and one thing is clear: by taking measures to arrange your data and make your metrics work for you, it’s possible to transform raw information into action, the kind that will push your business to the next level.

Yes, good data analytics techniques result in enhanced business intelligence (BI). To help you understand this notion in more detail, read our exploration of business intelligence reporting.

And, if you’re ready to perform your own analysis, drill down into your facts and figures while interacting with your data on astonishing visuals, you can try our software for a free, 14-day trial.

Data Analysis


What is Data Analysis?

According to the federal government, data analysis is "the process of systematically applying statistical and/or logical techniques to describe and illustrate, condense and recap, and evaluate data" (Responsible Conduct in Data Management). Important components of data analysis include searching for patterns, remaining unbiased in drawing inference from data, practicing responsible data management, and maintaining "honest and accurate analysis" (Responsible Conduct in Data Management).

In order to understand data analysis further, it can be helpful to take a step back and ask the question "What is data?". Many of us associate data with spreadsheets of numbers and values; however, data can encompass much more than that. According to the federal government, data is "the recorded factual material commonly accepted in the scientific community as necessary to validate research findings" (OMB Circular 110). This broad definition can include information in many formats.

Some examples of types of data are as follows:

  • Photographs 
  • Hand-written notes from field observation
  • Machine learning training data sets
  • Ethnographic interview transcripts
  • Sheet music
  • Scripts for plays and musicals 
  • Observations from laboratory experiments (CMU Data 101)

Thus, data analysis includes the processing and manipulation of these data sources in order to gain additional insight from data, answer a research question, or confirm a research hypothesis. 

Data analysis falls within the larger research data lifecycle (University of Virginia).

Why Analyze Data?

Through data analysis, a researcher can gain additional insight from data and draw conclusions to address the research question or hypothesis. Use of data analysis tools helps researchers understand and interpret data. 

What are the Types of Data Analysis?

Data analysis can be quantitative, qualitative, or mixed methods. 

Quantitative research typically involves numbers and "close-ended questions and responses" (Creswell & Creswell, 2018, p. 3). Quantitative research tests variables against objective theories, usually measured and collected on instruments and analyzed using statistical procedures (Creswell & Creswell, 2018, p. 4). Quantitative analysis usually uses deductive reasoning.

Qualitative research typically involves words and "open-ended questions and responses" (Creswell & Creswell, 2018, p. 3). According to Creswell & Creswell, "qualitative research is an approach for exploring and understanding the meaning individuals or groups ascribe to a social or human problem" (2018, p. 4). Thus, qualitative analysis usually invokes inductive reasoning.

Mixed methods research uses methods from both quantitative and qualitative research approaches. Mixed methods research works under the "core assumption... that the integration of qualitative and quantitative data yields additional insight beyond the information provided by either the quantitative or qualitative data alone" (Creswell & Creswell, 2018, p. 4).


The 4 Types of Data Analysis [Ultimate Guide]


The most successful businesses and organizations are those that constantly learn and adapt.

No matter what industry you’re operating in, it’s essential to understand what has happened in the past, what’s going on now, and to anticipate what might happen in the future. So how do companies do that?

The answer lies in data analytics. Most companies are collecting data all the time—but, in its raw form, this data doesn’t really mean anything. It’s what you do with the data that counts. Data analytics is the process of analyzing raw data in order to draw out patterns, trends, and insights that can tell you something meaningful about a particular area of the business. These insights are then used to make smart, data-driven decisions.

The kinds of insights you get from your data depend on the type of analysis you perform. In data analytics and data science, there are four main types of data analysis: descriptive, diagnostic, predictive, and prescriptive.

In this post, we’ll explain each of the four and consider why they’re useful. If you’re interested in a particular type of analysis, jump straight to the relevant section using the clickable menu below.

  • Types of data analysis: Descriptive
  • Types of data analysis: Diagnostic
  • Types of data analysis: Predictive
  • Types of data analysis: Prescriptive
  • Key takeaways and further reading

So, what are the four main types of data analysis? Let’s find out.


1. Types of data analysis: Descriptive (What happened?)

Descriptive analytics looks at what has happened in the past.

As the name suggests, the purpose of descriptive analytics is to simply describe what has happened; it doesn’t try to explain why this might have happened or to establish cause-and-effect relationships. The aim is solely to provide an easily digestible snapshot.

Google Analytics is a good example of descriptive analytics in action; it provides a simple overview of what’s been going on with your website, showing you how many people visited in a given time period, for example, or where your visitors came from. Similarly, tools like HubSpot will show you how many people opened a particular email or engaged with a certain campaign.


There are two main techniques used in descriptive analytics: Data aggregation and data mining.

Data aggregation

Data aggregation is the process of gathering data and presenting it in a summarized format.

Let’s imagine an ecommerce company collects all kinds of data relating to their customers and people who visit their website. The aggregate data, or summarized data, would provide an overview of this wider dataset—such as the average customer age, for example, or the average number of purchases made.
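As a rough illustration, here's what that aggregation step might look like in Python with pandas; the dataset and figures below are invented for the example.

import pandas as pd

# Invented customer data standing in for the wider dataset
customers = pd.DataFrame({
    "customer_id": [101, 102, 103, 104],
    "age": [34, 27, 45, 31],
    "purchases": [5, 2, 9, 4],
})

# Aggregate the dataset into simple summary figures
summary = customers.agg({"age": "mean", "purchases": "mean"})
print(summary)  # average customer age and average number of purchases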

Data mining

Data mining is the analysis part. This is when the analyst explores the data in order to uncover any patterns or trends. The outcome of descriptive analysis is a visual representation of the data—as a bar graph, for example, or a pie chart.

So: Descriptive analytics condenses large volumes of data into a clear, simple overview of what has happened. This is often the starting point for more in-depth analysis, as we’ll now explore.

2. Types of data analysis: Diagnostic (Why did it happen?)

Diagnostic analytics seeks to delve deeper in order to understand why something happened. The main purpose of diagnostic analytics is to identify and respond to anomalies within your data. For example: If your descriptive analysis shows that there was a 20% drop in sales for the month of March, you’ll want to find out why. The next logical step is to perform a diagnostic analysis.

In order to get to the root cause, the analyst will start by identifying any additional data sources that might offer further insight into why the drop in sales occurred. They might drill down to find that, despite a healthy volume of website visitors and a good number of “add to cart” actions, very few customers proceeded to actually check out and make a purchase.

Upon further inspection, it comes to light that the majority of customers abandoned ship at the point of filling out their delivery address. Now we’re getting somewhere! It’s starting to look like there was a problem with the address form; perhaps it wasn’t loading properly on mobile, or was simply too long and frustrating. With a little bit of digging, you’re closer to finding an explanation for your data anomaly.
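A first diagnostic pass over funnel data can be as simple as comparing conversion rates between consecutive steps. The sketch below uses made-up counts to show how the address-form drop-off would stand out.

# Made-up monthly counts for each step of the checkout funnel
funnel = [
    ("visited site", 50_000),
    ("added to cart", 6_000),
    ("entered delivery address", 5_400),
    ("completed purchase", 900),
]

# The step with the lowest step-to-step conversion is the likely culprit
for (step, count), (next_step, next_count) in zip(funnel, funnel[1:]):
    print(f"{step} -> {next_step}: {next_count / count:.1%}")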

Diagnostic analytics isn’t just about fixing problems, though; you can also use it to see what’s driving positive results. Perhaps the data tells you that website traffic was through the roof in October—a whopping 60% increase compared to the previous month! When you drill down, it seems that this spike in traffic corresponds to a celebrity mentioning one of your skincare products in their Instagram story.

This opens your eyes to the power of influencer marketing, giving you something to think about for your future marketing strategy.

When running diagnostic analytics, there are a number of different techniques that you might employ, such as probability theory, regression analysis, filtering, and time-series analysis. You can learn more about each of these techniques in our introduction to data analytics.

So: While descriptive analytics looks at what happened, diagnostic analytics explores why it happened.

3. Types of data analysis: Predictive (What is likely to happen in the future?)

Predictive analytics seeks to predict what is likely to happen in the future. Based on past patterns and trends, data analysts can devise predictive models which estimate the likelihood of a future event or outcome. This is especially useful as it enables businesses to plan ahead.

Predictive models use the relationship between a set of variables to make predictions; for example, you might use the correlation between seasonality and sales figures to predict when sales are likely to drop. If your predictive model tells you that sales are likely to go down in summer, you might use this information to come up with a summer-related promotional campaign, or to decrease expenditure elsewhere to make up for the seasonal dip.

Perhaps you own a restaurant and want to predict how many takeaway orders you’re likely to get on a typical Saturday night. Based on what your predictive model tells you, you might decide to get an extra delivery driver on hand.
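As a toy version of that kind of forecast, the sketch below fits a straight-line trend to invented weekly order counts with NumPy and extrapolates one week ahead. A real predictive model would account for seasonality, holidays and far more variables.

import numpy as np

# Invented Saturday-night takeaway order counts for recent weeks
orders = np.array([38, 41, 40, 44, 47, 45, 49, 52])
weeks = np.arange(len(orders))

# Fit a linear trend and extrapolate to next week
slope, intercept = np.polyfit(weeks, orders, deg=1)
forecast = slope * len(orders) + intercept
print(f"Expected orders next Saturday: ~{forecast:.0f}")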

In addition to forecasting, predictive analytics is also used for classification. A commonly used classification algorithm is logistic regression, which is used to predict a binary outcome based on a set of independent variables. For example: A credit card company might use a predictive model, and specifically logistic regression, to predict whether or not a given customer will default on their payments—in other words, to classify them in one of two categories: “will default” or “will not default”.

Based on these predictions of what category the customer will fall into, the company can quickly assess who might be a good candidate for a credit card. You can learn more about logistic regression and other types of regression analysis.
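Here is a minimal sketch of that classification idea using scikit-learn's LogisticRegression. The features (income and credit utilization) and the tiny training set are invented purely for illustration.

import numpy as np
from sklearn.linear_model import LogisticRegression

# Invented training data: [income_in_thousands, credit_utilization]
X = np.array([[25, 0.90], [60, 0.20], [35, 0.70], [80, 0.10],
              [30, 0.80], [55, 0.40], [20, 0.95], [70, 0.30]])
y = np.array([1, 0, 1, 0, 1, 0, 1, 0])  # 1 = defaulted, 0 = did not default

model = LogisticRegression().fit(X, y)

# Classify a new applicant and inspect the estimated default probability
applicant = np.array([[40, 0.60]])
print(model.predict(applicant))        # predicted class
print(model.predict_proba(applicant))  # probability of each class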

Machine learning (ML)

Machine learning is a branch of predictive analytics. Just as humans use predictive analytics to devise models and forecast future outcomes, machine learning models are designed to recognize patterns in the data and automatically evolve in order to make accurate predictions. If you’re interested in learning more, there are some useful guides to the similarities and differences between (human-led) predictive analytics and machine learning.

Learn more in our full guide to machine learning.

As you can see, predictive analytics is used to forecast all sorts of future outcomes, and while it can never be one hundred percent accurate, it does eliminate much of the guesswork. This is crucial when it comes to making business decisions and determining the most appropriate course of action.

So: Predictive analytics builds on what happened in the past and why to predict what is likely to happen in the future.

4. Types of data analysis: Prescriptive (What’s the best course of action?)

Prescriptive analytics looks at what has happened, why it happened, and what might happen in order to determine what should be done next.

In other words, prescriptive analytics shows you how you can best take advantage of the future outcomes that have been predicted. What steps can you take to avoid a future problem? What can you do to capitalize on an emerging trend?

Prescriptive analytics is, without doubt, the most complex type of analysis, involving algorithms, machine learning, statistical methods, and computational modeling procedures. Essentially, a prescriptive model considers all the possible decision patterns or pathways a company might take, and their likely outcomes.

This enables you to see how each combination of conditions and decisions might impact the future, and allows you to measure the impact a certain decision might have. Based on all the possible scenarios and potential outcomes, the company can decide what is the best “route” or action to take.

An oft-cited example of prescriptive analytics in action is maps and traffic apps. When figuring out the best way to get you from A to B, Google Maps will consider all the possible modes of transport (e.g. bus, walking, or driving), the current traffic conditions and possible roadworks in order to calculate the best route. In much the same way, prescriptive models are used to calculate all the possible “routes” a company might take to reach their goals in order to determine the best possible option.
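At its simplest, the "best route" idea boils down to scoring every candidate action and picking the highest score. The toy sketch below compares invented actions by expected value; production prescriptive systems rely on far richer optimization and simulation, but the principle is the same.

# Invented decision pathways with estimated payoffs and success probabilities
actions = {
    "discount campaign": {"payoff": 120_000, "probability": 0.45},
    "influencer push":   {"payoff": 200_000, "probability": 0.30},
    "loyalty program":   {"payoff":  90_000, "probability": 0.70},
}

# Score each pathway by expected value and prescribe the best one
for name, a in actions.items():
    print(f"{name}: expected value = {a['payoff'] * a['probability']:,.0f}")
best = max(actions, key=lambda n: actions[n]["payoff"] * actions[n]["probability"])
print(f"Recommended action: {best}")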

Knowing what actions to take for the best chances of success is a major advantage for any type of organization, so it’s no wonder that prescriptive analytics has a huge role to play in business.

So: Prescriptive analytics looks at what has happened, why it happened, and what might happen in order to determine the best course of action for the future.

5. Key takeaways and further reading

In some ways, data analytics is a bit like a treasure hunt; based on clues and insights from the past, you can work out what your next move should be.

With the right type of analysis, all kinds of businesses and organizations can use their data to make smarter decisions, invest more wisely, improve internal processes, and ultimately increase their chances of success. To summarize, there are four main types of data analysis to be aware of:

  • Descriptive analytics: What happened?
  • Diagnostic analytics: Why did it happen?
  • Predictive analytics: What is likely to happen in the future?
  • Prescriptive analytics: What is the best course of action to take?

Now you’re familiar with the different types of data analysis, you can start to explore specific analysis techniques, such as time series analysis, cohort analysis, and regression—to name just a few! We explore some of the most useful data analysis techniques in this guide.

If you’re not already familiar, it’s also worth learning about the different levels of measurement (nominal, ordinal, interval, and ratio) for data.

Ready for a hands-on introduction to the field? Give this free, five-day data analytics short course a go! And, if you’d like to learn more, check out some of these excellent free courses for beginners. Then, to see what it takes to start a career in the field, check out the following:

  • How to become a data analyst: Your five-step plan
  • What are the key skills every data analyst needs?
  • What’s it actually like to work as a data analyst?

Grad Coach

Qualitative Data Analysis Methods 101:

The “big 6” methods + examples.

By: Kerryn Warren (PhD) | Reviewed By: Eunice Rautenbach (D.Tech) | May 2020 (Updated April 2023)

Qualitative data analysis methods. Wow, that’s a mouthful. 

If you’re new to the world of research, qualitative data analysis can look rather intimidating. So much bulky terminology and so many abstract, fluffy concepts. It certainly can be a minefield!

Don’t worry – in this post, we’ll unpack the most popular analysis methods, one at a time, so that you can approach your analysis with confidence and competence – whether that’s for a dissertation, thesis or really any kind of research project.


What (exactly) is qualitative data analysis?

To understand qualitative data analysis, we need to first understand qualitative data – so let’s step back and ask the question, “what exactly is qualitative data?”.

Qualitative data refers to pretty much any data that’s “not numbers”. In other words, it’s not the stuff you measure using a fixed scale or complex equipment, nor do you analyse it using complex statistics or mathematics.

So, if it’s not numbers, what is it?

Words, you guessed? Well… sometimes, yes. Qualitative data can, and often does, take the form of interview transcripts, documents and open-ended survey responses – but it can also involve the interpretation of images and videos. In other words, qualitative isn’t just limited to text-based data.

So, how’s that different from quantitative data, you ask?

Simply put, qualitative research focuses on words, descriptions, concepts or ideas – while quantitative research focuses on numbers and statistics. Qualitative research investigates the “softer side” of things to explore and describe, while quantitative research focuses on the “hard numbers”, to measure differences between variables and the relationships between them. If you’re keen to learn more about the differences between qual and quant, we’ve got a detailed post over here.


So, qualitative analysis is easier than quantitative, right?

Not quite. In many ways, qualitative data can be challenging and time-consuming to analyse and interpret. At the end of your data collection phase (which itself takes a lot of time), you’ll likely have many pages of text-based data or hours upon hours of audio to work through. You might also have subtle nuances of interactions or discussions that have danced around in your mind, or that you scribbled down in messy field notes. All of this needs to work its way into your analysis.

Making sense of all of this is no small task and you shouldn’t underestimate it. Long story short – qualitative analysis can be a lot of work! Of course, quantitative analysis is no piece of cake either, but it’s important to recognise that qualitative analysis still requires a significant investment in terms of time and effort.


In this post, we’ll explore qualitative data analysis by looking at some of the most common analysis methods we encounter. We’re not going to cover every possible qualitative method and we’re not going to go into heavy detail – we’re just going to give you the big picture. That said, we will of course include links to loads of extra resources so that you can learn more about whichever analysis method interests you.

Without further delay, let’s get into it.

The “Big 6” Qualitative Analysis Methods 

There are many different types of qualitative data analysis, all of which serve different purposes and have unique strengths and weaknesses. We’ll start by outlining the analysis methods and then we’ll dive into the details for each.

The 6 most popular methods (or at least the ones we see at Grad Coach) are:

  • Content analysis
  • Narrative analysis
  • Discourse analysis
  • Thematic analysis
  • Grounded theory (GT)
  • Interpretive phenomenological analysis (IPA)

Let’s take a look at each of them…

QDA Method #1: Qualitative Content Analysis

Content analysis is possibly the most common and straightforward QDA method. At the simplest level, content analysis is used to evaluate patterns within a piece of content (for example, words, phrases or images) or across multiple pieces of content or sources of communication. For example, a collection of newspaper articles or political speeches.

With content analysis, you could, for instance, identify the frequency with which an idea is shared or spoken about – like the number of times a Kardashian is mentioned on Twitter. Or you could identify patterns of deeper underlying interpretations – for instance, by identifying phrases or words in tourist pamphlets that highlight India as an ancient country.

Because content analysis can be used in such a wide variety of ways, it’s important to go into your analysis with a very specific question and goal, or you’ll get lost in the fog. With content analysis, you’ll group large amounts of text into codes, summarise these into categories, and possibly even tabulate the data to calculate the frequency of certain concepts or variables. Because of this, content analysis provides a small splash of quantitative thinking within a qualitative method.
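That quantitative splash can be as simple as a frequency count. The sketch below tallies words across a few invented pamphlet snippets in Python; real content analysis would count researcher-defined codes rather than raw words, so treat this purely as an illustration of the tabulation step.

import re
from collections import Counter

# Invented snippets standing in for a corpus of tourist pamphlets
documents = [
    "India is an ancient land of timeless heritage",
    "Explore ancient temples and heritage sites",
    "A timeless journey through ancient history",
]

# Tokenise the corpus and count how often each word appears
words = re.findall(r"[a-z]+", " ".join(documents).lower())
print(Counter(words).most_common(5))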

Naturally, while content analysis is widely useful, it’s not without its drawbacks. One of the main issues with content analysis is that it can be very time-consuming, as it requires lots of reading and re-reading of the texts. Also, because of its multidimensional focus on both qualitative and quantitative aspects, it is sometimes accused of losing important nuances in communication.

Content analysis also tends to concentrate on a very specific timeline and doesn’t take into account what happened before or after that timeline. This isn’t necessarily a bad thing though – just something to be aware of. So, keep these factors in mind if you’re considering content analysis. Every analysis method has its limitations, so don’t be put off by these – just be aware of them! If you’re interested in learning more about content analysis, the video below provides a good starting point.

QDA Method #2: Narrative Analysis 

As the name suggests, narrative analysis is all about listening to people telling stories and analysing what that means. Since stories serve a functional purpose of helping us make sense of the world, we can gain insights into the ways that people deal with and make sense of reality by analysing their stories and the ways they’re told.

You could, for example, use narrative analysis to explore whether how something is being said is important. For instance, the narrative of a prisoner trying to justify their crime could provide insight into their view of the world and the justice system. Similarly, analysing the ways entrepreneurs talk about the struggles in their careers or cancer patients telling stories of hope could provide powerful insights into their mindsets and perspectives. Simply put, narrative analysis is about paying attention to the stories that people tell – and more importantly, the way they tell them.

Of course, the narrative approach has its weaknesses, too. Sample sizes are generally quite small due to the time-consuming process of capturing narratives. Because of this, along with the multitude of social and lifestyle factors which can influence a subject, narrative analysis can be quite difficult to reproduce in subsequent research. This means that it’s difficult to test the findings of some of this research.

Similarly, researcher bias can have a strong influence on the results here, so you need to be particularly careful about the potential biases you can bring into your analysis when using this method. Nevertheless, narrative analysis is still a very useful qualitative analysis method – just keep these limitations in mind and be careful not to draw broad conclusions. If you’re keen to learn more about narrative analysis, the video below provides a great introduction to this qualitative analysis method.

QDA Method #3: Discourse Analysis 

Discourse is simply a fancy word for written or spoken language or debate. So, discourse analysis is all about analysing language within its social context. In other words, analysing language – such as a conversation, a speech, etc. – within the culture and society it takes place in. For example, you could analyse how a janitor speaks to a CEO, or how politicians speak about terrorism.

To truly understand these conversations or speeches, the culture and history of those involved in the communication are important factors to consider. For example, a janitor might speak more casually with a CEO in a company that emphasises equality among workers. Similarly, a politician might speak more about terrorism if there was a recent terrorist incident in the country.

So, as you can see, by using discourse analysis, you can identify how culture, history or power dynamics (to name a few) have an effect on the way concepts are spoken about. So, if your research aims and objectives involve understanding culture or power dynamics, discourse analysis can be a powerful method.

Because there are many social influences in terms of how we speak to each other, the potential use of discourse analysis is vast. Of course, this also means it’s important to have a very specific research question (or questions) in mind when analysing your data and looking for patterns and themes, or you might land up going down a winding rabbit hole.

Discourse analysis can also be very time-consuming as you need to sample the data to the point of saturation – in other words, until no new information and insights emerge. But this is, of course, part of what makes discourse analysis such a powerful technique. So, keep these factors in mind when considering this QDA method. Again, if you’re keen to learn more, the video below presents a good starting point.

QDA Method #4: Thematic Analysis

Thematic analysis looks at patterns of meaning in a data set – for example, a set of interviews or focus group transcripts. But what exactly does that… mean? Well, a thematic analysis takes bodies of data (which are often quite large) and groups them according to similarities – in other words, themes. These themes help us make sense of the content and derive meaning from it.

Let’s take a look at an example.

With thematic analysis, you could analyse 100 online reviews of a popular sushi restaurant to find out what patrons think about the place. By reviewing the data, you would then identify the themes that crop up repeatedly within the data – for example, “fresh ingredients” or “friendly wait staff”.
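Purely to illustrate the counting side of this, here is a keyword-matching sketch over a few invented reviews. Genuine thematic analysis relies on careful human coding rather than keyword searches, so treat this as a caricature of the tabulation, not the interpretation.

# Invented reviews and candidate themes with illustrative keywords
reviews = [
    "Loved the fresh ingredients and the friendly wait staff",
    "Wait staff were friendly but the queue was long",
    "Fresh fish and great value",
]
themes = {
    "fresh ingredients": ["fresh"],
    "friendly wait staff": ["friendly", "wait staff"],
}

# Count how many reviews touch on each theme
for theme, keywords in themes.items():
    hits = sum(any(k in review.lower() for k in keywords) for review in reviews)
    print(f"{theme}: mentioned in {hits} of {len(reviews)} reviews")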

So, as you can see, thematic analysis can be pretty useful for finding out about people’s experiences, views, and opinions. Therefore, if your research aims and objectives involve understanding people’s experience or view of something, thematic analysis can be a great choice.

Since thematic analysis is a bit of an exploratory process, it’s not unusual for your research questions to develop, or even change, as you progress through the analysis. While this is somewhat natural in exploratory research, it can also be seen as a disadvantage as it means that data needs to be re-reviewed each time a research question is adjusted. In other words, thematic analysis can be quite time-consuming – but for a good reason. So, keep this in mind if you choose to use thematic analysis for your project and budget extra time for unexpected adjustments.

Thematic analysis takes bodies of data and groups them according to similarities (themes), which help us make sense of the content.

QDA Method #5: Grounded theory (GT) 

Grounded theory is a powerful qualitative analysis method where the intention is to create a new theory (or theories) using the data at hand, through a series of “tests” and “revisions”. Strictly speaking, GT is more a research design type than an analysis method, but we’ve included it here as it’s often referred to as a method.

What’s most important with grounded theory is that you go into the analysis with an open mind and let the data speak for itself – rather than dragging existing hypotheses or theories into your analysis. In other words, your analysis must develop from the ground up (hence the name). 

Let’s look at an example of GT in action.

Assume you’re interested in developing a theory about what factors influence students to watch a YouTube video about qualitative analysis. Using grounded theory, you’d start with this general overarching question about the given population (i.e., graduate students). First, you’d approach a small sample – for example, five graduate students in a department at a university. Ideally, this sample would be reasonably representative of the broader population. You’d interview these students to identify what factors lead them to watch the video.

After analysing the interview data, a general pattern could emerge. For example, you might notice that graduate students are more likely to watch a video about qualitative methods if they are just starting on their dissertation journey, or if they have an upcoming test about research methods.

From here, you’ll look for another small sample – for example, five more graduate students in a different department – and see whether this pattern holds true for them. If not, you’ll look for commonalities and adapt your theory accordingly. As this process continues, the theory would develop. As we mentioned earlier, what’s important with grounded theory is that the theory develops from the data – not from some preconceived idea.

So, what are the drawbacks of grounded theory? Well, some argue that there’s a tricky circularity to grounded theory. For it to work, in principle, you should know as little as possible regarding the research question and population, so that you reduce the bias in your interpretation. However, in many circumstances, it’s also thought to be unwise to approach a research question without knowledge of the current literature. In other words, it’s a bit of a “chicken or the egg” situation.

Regardless, grounded theory remains a popular (and powerful) option. Naturally, it’s a very useful method when you’re researching a topic that is completely new or has very little existing research about it, as it allows you to start from scratch and work your way from the ground up.

Grounded theory is used to create a new theory (or theories) by using the data at hand, as opposed to existing theories and frameworks.

QDA Method #6: Interpretive Phenomenological Analysis (IPA)

Interpretive. Phenomenological. Analysis. IPA. Try saying that three times fast…

Let’s just stick with IPA, okay?

IPA is designed to help you understand the personal experiences of a subject (for example, a person or group of people) concerning a major life event, an experience or a situation. This event or experience is the “phenomenon” that makes up the “P” in IPA. Such phenomena may range from relatively common events – such as motherhood, or being involved in a car accident – to those which are extremely rare – for example, someone’s personal experience in a refugee camp. So, IPA is a great choice if your research involves analysing people’s personal experiences of something that happened to them.

It’s important to remember that IPA is subject-centred. In other words, it’s focused on the experiencer. This means that, while you’ll likely use a coding system to identify commonalities, it’s important not to lose the depth of experience or meaning by trying to reduce everything to codes. Also, keep in mind that since your sample size will generally be very small with IPA, you often won’t be able to draw broad conclusions about the generalisability of your findings. But that’s okay as long as it aligns with your research aims and objectives.

Another thing to be aware of with IPA is personal bias. While researcher bias can creep into all forms of research, self-awareness is critically important with IPA, as it can have a major impact on the results. For example, a researcher who was a victim of a crime himself could insert his own feelings of frustration and anger into the way he interprets the experience of someone who was kidnapped. So, if you’re going to undertake IPA, you need to be very self-aware or you could muddy the analysis.

IPA can help you understand the personal experiences of a person or group concerning a major life event, an experience or a situation.

How to choose the right analysis method

In light of all of the qualitative analysis methods we’ve covered so far, you’re probably asking yourself the question, “How do I choose the right one?”

Much like all the other methodological decisions you’ll need to make, selecting the right qualitative analysis method largely depends on your research aims, objectives and questions. In other words, the best tool for the job depends on what you’re trying to build. For example:

  • Perhaps your research aims to analyse the use of words and what they reveal about the intention of the storyteller and the cultural context of the time.
  • Perhaps your research aims to develop an understanding of the unique personal experiences of people that have experienced a certain event, or
  • Perhaps your research aims to develop insight regarding the influence of a certain culture on its members.

As you can probably see, each of these research aims is distinctly different, and therefore different analysis methods would be suitable for each one. For example, narrative analysis would likely be a good option for the first aim, while grounded theory wouldn’t be as relevant.

It’s also important to remember that each method has its own set of strengths, weaknesses and general limitations. No single analysis method is perfect. So, depending on the nature of your research, it may make sense to adopt more than one method (this is called triangulation). Keep in mind though that this will of course be quite time-consuming.

As we’ve seen, all of the qualitative analysis methods we’ve discussed make use of coding and theme-generating techniques, but the intent and approach of each analysis method differ quite substantially. So, it’s very important to come into your research with a clear intention before you decide which analysis method (or methods) to use.

Start by reviewing your research aims, objectives and research questions to assess what exactly you’re trying to find out – then select a qualitative analysis method that fits. Never pick a method just because you like it or have experience using it – your analysis method (or methods) must align with your broader research aims and objectives.

No single analysis method is perfect, so it can often make sense to adopt more than one  method (this is called triangulation).

Let’s recap on QDA methods…

In this post, we looked at six popular qualitative data analysis methods:

  • First, we looked at content analysis, a straightforward method that blends a little bit of quant into a primarily qualitative analysis.
  • Then we looked at narrative analysis, which is about analysing how stories are told.
  • Next up was discourse analysis – which is about analysing conversations and interactions.
  • Then we moved on to thematic analysis – which is about identifying themes and patterns.
  • From there, we went south with grounded theory – which is about starting from scratch with a specific question and using the data alone to build a theory in response to that question.
  • And finally, we looked at IPA – which is about understanding people’s unique experiences of a phenomenon.

Of course, these aren’t the only options when it comes to qualitative data analysis, but they’re a great starting point if you’re dipping your toes into qualitative research for the first time.

If you’re still feeling a bit confused, consider our private coaching service, where we hold your hand through the research process to help you develop your best work.


  • Journal List
  • Indian J Anaesth
  • v.60(9); 2016 Sep

Basic statistical tools in research and data analysis

Zulfiqar Ali

Department of Anaesthesiology, Division of Neuroanaesthesiology, Sheri Kashmir Institute of Medical Sciences, Soura, Srinagar, Jammu and Kashmir, India

S Bala Bhaskar

1 Department of Anaesthesiology and Critical Care, Vijayanagar Institute of Medical Sciences, Bellary, Karnataka, India

Statistical methods involved in carrying out a study include planning, designing, collecting data, analysing, drawing meaningful interpretations and reporting the research findings. Statistical analysis gives meaning to otherwise meaningless numbers, thereby breathing life into lifeless data. The results and inferences are precise only if proper statistical tests are used. This article will try to acquaint the reader with the basic research tools that are utilised while conducting various studies. The article covers a brief outline of the variables, an understanding of quantitative and qualitative variables and the measures of central tendency. An idea of the sample size estimation, power analysis and the statistical errors is given. Finally, there is a summary of parametric and non-parametric tests used for data analysis.

INTRODUCTION

Statistics is a branch of science that deals with the collection, organisation, analysis of data and drawing of inferences from the samples to the whole population.[1] This requires a proper design of the study, an appropriate selection of the study sample and choice of a suitable statistical test. An adequate knowledge of statistics is necessary for proper designing of an epidemiological study or a clinical trial. Improper statistical methods may result in erroneous conclusions which may lead to unethical practice.[2]

A variable is a characteristic that varies from one individual member of a population to another.[3] Variables such as height and weight are measured by some type of scale, convey quantitative information and are called quantitative variables. Sex and eye colour give qualitative information and are called qualitative variables[3] [Figure 1].

[Figure 1: Classification of variables]

Quantitative variables

Quantitative or numerical data are subdivided into discrete and continuous measurements. Discrete numerical data are recorded as a whole number such as 0, 1, 2, 3,… (integer), whereas continuous data can assume any value. Observations that can be counted constitute the discrete data and observations that can be measured constitute the continuous data. Examples of discrete data are number of episodes of respiratory arrests or the number of re-intubations in an intensive care unit. Similarly, examples of continuous data are the serial serum glucose levels, partial pressure of oxygen in arterial blood and the oesophageal temperature.

A hierarchical scale of increasing precision can be used for observing and recording the data which is based on categorical, ordinal, interval and ratio scales [ Figure 1 ].

Categorical or nominal variables are unordered. The data are merely classified into categories and cannot be arranged in any particular order. If only two categories exist (as in gender: male and female), it is called dichotomous (or binary) data. The various causes of re-intubation in an intensive care unit due to upper airway obstruction, impaired clearance of secretions, hypoxemia, hypercapnia, pulmonary oedema and neurological impairment are examples of categorical variables.

Ordinal variables have a clear ordering between the variables. However, the ordered data may not have equal intervals. Examples are the American Society of Anesthesiologists status or Richmond agitation-sedation scale.

Interval variables are similar to an ordinal variable, except that the intervals between the values of the interval variable are equally spaced. A good example of an interval scale is the Fahrenheit degree scale used to measure temperature. With the Fahrenheit scale, the difference between 70° and 75° is equal to the difference between 80° and 85°: The units of measurement are equal throughout the full range of the scale.

Ratio scales are similar to interval scales, in that equal differences between scale values have equal quantitative meaning. However, ratio scales also have a true zero point, which gives them an additional property. For example, the system of centimetres is an example of a ratio scale. There is a true zero point and the value of 0 cm means a complete absence of length. The thyromental distance of 6 cm in an adult may be twice that of a child in whom it may be 3 cm.

STATISTICS: DESCRIPTIVE AND INFERENTIAL STATISTICS

Descriptive statistics[4] try to describe the relationship between variables in a sample or population. Descriptive statistics provide a summary of data in the form of mean, median and mode. Inferential statistics[4] use a random sample of data taken from a population to describe and make inferences about the whole population. They are valuable when it is not possible to examine each member of an entire population. Examples of descriptive and inferential statistics are illustrated in Table 1.

[Table 1: Example of descriptive and inferential statistics]

Descriptive statistics

The extent to which the observations cluster around a central location is described by the central tendency and the spread towards the extremes is described by the degree of dispersion.

Measures of central tendency

The measures of central tendency are mean, median and mode.[6] The mean (or the arithmetic average) is the sum of all the scores divided by the number of scores. The mean may be influenced profoundly by extreme values; for example, the average stay of organophosphorus poisoning patients in ICU may be influenced by a single patient who stays in ICU for around 5 months because of septicaemia. The extreme values are called outliers. The formula for the mean is

$\bar{x} = \frac{\sum x}{n}$

where $x$ = each observation and $n$ = number of observations. The median[6] is defined as the middle of a distribution in ranked data (with half of the variables in the sample above and half below the median value), while the mode is the most frequently occurring variable in a distribution. The range defines the spread, or variability, of a sample.[7] It is described by the minimum and maximum values of the variables. If we rank the data and, after ranking, group the observations into percentiles, we can get better information on the pattern of spread of the variables. In percentiles, we rank the observations into 100 equal parts. We can then describe the 25%, 50%, 75% or any other percentile amount. The median is the 50th percentile. The interquartile range is the observations in the middle 50% of the observations about the median (25th-75th percentile). Variance[7] is a measure of how spread out the distribution is. It gives an indication of how closely an individual observation clusters about the mean value. The variance of a population is defined by the following formula:

$\sigma^2 = \frac{\sum (X_i - X)^2}{N}$

where $\sigma^2$ is the population variance, $X$ is the population mean, $X_i$ is the $i$th element from the population and $N$ is the number of elements in the population. The variance of a sample is defined by a slightly different formula:

$s^2 = \frac{\sum (x_i - x)^2}{n - 1}$

where $s^2$ is the sample variance, $x$ is the sample mean, $x_i$ is the $i$th element from the sample and $n$ is the number of elements in the sample. Note that the formula for the variance of a population has $N$ as the denominator, whereas the sample variance uses $n - 1$. The expression $n - 1$ is known as the degrees of freedom and is one less than the number of observations: each observation is free to vary, except the last one, which must take a defined value. The variance is measured in squared units. To make the interpretation of the data simple and to retain the basic unit of observation, the square root of the variance is used. The square root of the variance is the standard deviation (SD).[8] The SD of a population is defined by the following formula:

$\sigma = \sqrt{\frac{\sum (X_i - X)^2}{N}}$

where $\sigma$ is the population SD, $X$ is the population mean, $X_i$ is the $i$th element from the population and $N$ is the number of elements in the population. The SD of a sample is defined by a slightly different formula:

$s = \sqrt{\frac{\sum (x_i - x)^2}{n - 1}}$

where $s$ is the sample SD, $x$ is the sample mean, $x_i$ is the $i$th element from the sample and $n$ is the number of elements in the sample. An example of the calculation of the variance and SD is illustrated in Table 2.

[Table 2: Example of mean, variance and standard deviation]
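For readers who prefer code to hand calculation, Python's standard statistics module reproduces these quantities directly; the sample values below are invented.

import statistics

# Invented sample of observations
sample = [10, 12, 23, 23, 16, 23, 21, 16]

print(statistics.mean(sample))       # arithmetic mean
print(statistics.pvariance(sample))  # population variance (N in the denominator)
print(statistics.variance(sample))   # sample variance (n - 1 in the denominator)
print(statistics.stdev(sample))      # sample standard deviation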

Normal distribution or Gaussian distribution

Most of the biological variables usually cluster around a central value, with symmetrical positive and negative deviations about this point.[1] The standard normal distribution curve is a symmetrical, bell-shaped curve. In a normal distribution, about 68% of the scores are within 1 SD of the mean, around 95% of the scores are within 2 SDs of the mean and 99% within 3 SDs of the mean [Figure 2].

[Figure 2: Normal distribution curve]
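These coverage figures can be checked numerically. A short sketch with scipy.stats (assuming SciPy is available) computes the mass of the standard normal curve within 1, 2 and 3 SDs of the mean.

from scipy.stats import norm

# Probability mass of the standard normal within k SDs of the mean
for k in (1, 2, 3):
    coverage = norm.cdf(k) - norm.cdf(-k)
    print(f"within {k} SD: {coverage:.1%}")
# Prints roughly 68.3%, 95.4% and 99.7%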

Skewed distribution

It is a distribution with an asymmetry of the variables about its mean. In a negatively skewed distribution [Figure 3], the mass of the distribution is concentrated on the right of the figure, leading to a longer left tail. In a positively skewed distribution [Figure 3], the mass of the distribution is concentrated on the left of the figure, leading to a longer right tail.

[Figure 3: Curves showing negatively skewed and positively skewed distributions]

Inferential statistics

In inferential statistics, data are analysed from a sample to make inferences in the larger collection of the population. The purpose is to answer or test the hypotheses. A hypothesis (plural hypotheses) is a proposed explanation for a phenomenon. Hypothesis tests are thus procedures for making rational decisions about the reality of observed effects.

Probability is the measure of the likelihood that an event will occur. Probability is quantified as a number between 0 and 1 (where 0 indicates impossibility and 1 indicates certainty).

In inferential statistics, the term ‘null hypothesis’ ($H_0$, ‘H-naught’, ‘H-null’) denotes that there is no relationship (difference) between the population variables in question.[9]

The alternative hypothesis ($H_1$ or $H_a$) denotes that a relationship between the variables is expected to be true.[9]

The P value (or the calculated probability) is the probability of the event occurring by chance if the null hypothesis is true. The P value is a number between 0 and 1 and is interpreted by researchers in deciding whether to reject or retain the null hypothesis [Table 3].

[Table 3: P values with interpretation]

If the P value is less than the arbitrarily chosen value (known as α, or the significance level), the null hypothesis (H₀) is rejected [ Table 4 ]. However, if the null hypothesis (H₀) is incorrectly rejected, this is known as a Type I error.[ 11 ] Further details regarding the alpha error, the beta error, sample size calculation and the factors influencing them are dealt with in another section of this issue by Das S et al .[ 12 ]

[Table 4: Illustration for the null hypothesis]

PARAMETRIC AND NON-PARAMETRIC TESTS

Numerical data (quantitative variables) that are normally distributed are analysed with parametric tests.[ 13 ]

The two most basic prerequisites for parametric statistical analysis are:

  • The assumption of normality which specifies that the means of the sample group are normally distributed
  • The assumption of equal variance which specifies that the variances of the samples and of their corresponding population are equal.

However, if the distribution of the sample is skewed towards one side or the distribution is unknown due to a small sample size, non-parametric[ 14 ] statistical techniques are used. Non-parametric tests are also used to analyse ordinal and categorical data.
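
In practice, the normality assumption is often checked before choosing between the two families of tests. A minimal sketch using SciPy's Shapiro-Wilk test; the skewed sample is simulated purely for illustration.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
sample = rng.exponential(scale=2.0, size=40)  # deliberately skewed data

w_stat, p = stats.shapiro(sample)  # Shapiro-Wilk test of normality
print(f"W = {w_stat:.3f}, P = {p:.4f}")

# A small P value (e.g., < 0.05) suggests the normality assumption is
# violated, pointing towards a non-parametric test.
```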

Parametric tests

The parametric tests assume that the data are on a quantitative (numerical) scale, with a normal distribution of the underlying population. The samples have the same variance (homogeneity of variances). The samples are randomly drawn from the population, and the observations within a group are independent of each other. The commonly used parametric tests are the Student's t -test, analysis of variance (ANOVA) and repeated measures ANOVA.

Student's t -test

Student's t -test is used to test the null hypothesis that there is no difference between the means of the two groups. It is used in three circumstances:

  • To test if a sample mean (as an estimate of a population mean) differs significantly from a given population mean (the one-sample t -test). The formula is:

$$ t = \frac{\bar{X} - \mu}{SE} $$

where X̄ is the sample mean, μ is the population mean and SE is the standard error of the mean.

  • To test if the population means estimated by two independent samples differ significantly (the unpaired t -test). The formula is:

$$ t = \frac{\bar{X}_1 - \bar{X}_2}{SE} $$

where X̄₁ − X̄₂ is the difference between the means of the two groups and SE denotes the standard error of the difference.

  • To test if the population means estimated by two dependent samples differ significantly (the paired t -test). A usual setting for paired t -test is when measurements are made on the same subjects before and after a treatment.

The formula for paired t -test is:

$$ t = \frac{\bar{d}}{SE_{\bar{d}}} $$

where d̄ is the mean of the paired differences and SE denotes the standard error of this difference.

The group variances can be compared using the F -test, which is the ratio of the two variances (var₁/var₂). If F differs significantly from 1.0, it is concluded that the group variances differ significantly.
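
The three circumstances map directly onto three SciPy functions. A minimal sketch with hypothetical measurements; ttest_1samp, ttest_ind and ttest_rel correspond to the one-sample, unpaired and paired t-tests, respectively.

```python
from scipy import stats

# Hypothetical haemoglobin values (g/dL) for two independent groups
group_a = [12.1, 13.4, 11.8, 12.9, 13.0, 12.5]
group_b = [11.2, 12.0, 11.5, 11.9, 12.2, 11.4]

# One-sample t-test: does group_a's mean differ from a reference of 12.0?
t1, p1 = stats.ttest_1samp(group_a, popmean=12.0)

# Unpaired t-test: do the two independent group means differ?
t2, p2 = stats.ttest_ind(group_a, group_b)

# Paired t-test: hypothetical systolic BP before and after treatment
before = [140, 152, 138, 147, 155, 143]
after = [132, 148, 135, 141, 150, 139]
t3, p3 = stats.ttest_rel(before, after)

print(f"one-sample P = {p1:.4f}, unpaired P = {p2:.4f}, paired P = {p3:.4f}")
```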

Analysis of variance

The Student's t -test cannot be used for comparison of three or more groups. The purpose of ANOVA is to test if there is any significant difference between the means of two or more groups.

In ANOVA, we study two variances – (a) between-group variability and (b) within-group variability. The within-group variability (error variance) is the variation that cannot be accounted for in the study design. It is based on random differences present in our samples.

The between-group variability (or effect variance), however, is the result of our treatment. These two estimates of variance are compared using the F-test.

A simplified formula for the F statistic is:

$$ F = \frac{MS_b}{MS_w} $$

where MSb is the mean square between the groups and MSw is the mean square within the groups.
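
A minimal one-way ANOVA sketch with SciPy; the pain scores for the three hypothetical treatment groups are invented for illustration.

```python
from scipy import stats

# Hypothetical pain scores under three analgesic regimens
drug_a = [4, 5, 3, 4, 6]
drug_b = [6, 7, 5, 6, 7]
drug_c = [3, 2, 4, 3, 3]

f_stat, p = stats.f_oneway(drug_a, drug_b, drug_c)  # F = MSb / MSw
print(f"F = {f_stat:.2f}, P = {p:.4f}")
```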

Repeated measures analysis of variance

As with ANOVA, repeated measures ANOVA analyses the equality of means of three or more groups. However, repeated measures ANOVA is used when all variables of a sample are measured under different conditions or at different points in time.

As the variables are measured from a sample at different points of time, the measurement of the dependent variable is repeated. Using a standard ANOVA in this case is not appropriate because it fails to model the correlation between the repeated measures: The data violate the ANOVA assumption of independence. Hence, in the measurement of repeated dependent variables, repeated measures ANOVA should be used.

Non-parametric tests

When the assumptions of normality are not met and the sample means are not normally distributed, parametric tests can lead to erroneous results. Non-parametric tests (distribution-free tests) are used in such situations, as they do not require the normality assumption.[ 15 ] Non-parametric tests may, however, fail to detect a significant difference when compared with a parametric test; that is, they usually have less power.

As is done for the parametric tests, the test statistic is compared with known values for the sampling distribution of that statistic and the null hypothesis is accepted or rejected. The types of non-parametric analysis techniques and the corresponding parametric analysis techniques are delineated in Table 5 .

[Table 5: Analogues of parametric and non-parametric tests]

Median test for one sample: The sign test and Wilcoxon's signed rank test

The sign test and Wilcoxon's signed rank test are used for median tests of one sample. These tests examine whether one instance of sample data is greater or smaller than the median reference value.

The sign test examines a hypothesis about the median θ₀ of a population. It tests the null hypothesis H₀: θ = θ₀. When the observed value (Xᵢ) is greater than the reference value θ₀, it is marked with a + sign; when it is smaller, it is marked with a − sign. If the observed value is equal to the reference value θ₀, it is eliminated from the sample.

If the null hypothesis is true, there will be an equal number of + signs and − signs.

The sign test ignores the actual values of the data and only uses + or − signs. Therefore, it is useful when it is difficult to measure the values.
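
Because the sign test reduces to asking whether + signs occur with probability 1/2, it can be run as an exact binomial test. A minimal sketch, assuming SciPy >= 1.7 (older versions used the since-removed binom_test); the signs are hypothetical.

```python
from scipy import stats

# Hypothetical signs of (observation - reference median); ties already dropped
signs = [+1, +1, -1, +1, +1, +1, -1, +1, +1, +1]
n_plus = sum(1 for s in signs if s > 0)

# Under H0 the number of + signs follows Binomial(n, 0.5)
result = stats.binomtest(n_plus, n=len(signs), p=0.5)
print(f"P = {result.pvalue:.4f}")
```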

Wilcoxon's signed rank test

A major limitation of the sign test is that we lose the quantitative information in the data and merely use the + or − signs. Wilcoxon's signed rank test not only examines the observed values in comparison with θ₀ but also takes their relative sizes into consideration, adding more statistical power to the test. As in the sign test, if an observed value is equal to the reference value θ₀, it is eliminated from the sample.

Wilcoxon's rank sum test, the two-sample counterpart, ranks all data points in order, calculates the rank sum of each sample and compares the difference in the rank sums.
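
A minimal sketch of Wilcoxon's signed rank test on hypothetical before/after scores from the same subjects, using scipy.stats.wilcoxon.

```python
from scipy import stats

# Hypothetical pain scores before and after an intervention (same subjects)
before = [8.0, 7.2, 9.1, 6.5, 7.8, 8.4, 6.9, 7.5]
after = [7.1, 6.8, 8.2, 6.7, 7.0, 7.6, 6.2, 7.3]

w_stat, p = stats.wilcoxon(before, after)  # tests the paired differences
print(f"W = {w_stat}, P = {p:.4f}")
```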

Mann-Whitney test

It is used to test the null hypothesis that two samples have the same median or, alternatively, whether observations in one sample tend to be larger than observations in the other.

The Mann–Whitney test compares all data (xᵢ) belonging to the X group with all data (yᵢ) belonging to the Y group and calculates the probability of xᵢ being greater than yᵢ: P(xᵢ > yᵢ). The null hypothesis states that P(xᵢ > yᵢ) = P(xᵢ < yᵢ) = 1/2, while the alternative hypothesis states that P(xᵢ > yᵢ) ≠ 1/2.
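
A minimal sketch of the Mann-Whitney test on two hypothetical independent samples, using scipy.stats.mannwhitneyu.

```python
from scipy import stats

# Hypothetical recovery times (days) in two independent groups
x = [3.1, 4.5, 2.8, 5.0, 3.9, 4.2]
y = [5.5, 6.1, 4.8, 6.4, 5.9, 5.2]

u_stat, p = stats.mannwhitneyu(x, y)  # two-sided by default in recent SciPy
print(f"U = {u_stat}, P = {p:.4f}")
```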

Kolmogorov-Smirnov test

The two-sample Kolmogorov-Smirnov (KS) test was designed as a generic method to test whether two random samples are drawn from the same distribution. The null hypothesis of the KS test is that both distributions are identical. The statistic of the KS test is a distance between the two empirical distributions, computed as the maximum absolute difference between their cumulative curves.
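
A minimal sketch of the two-sample KS test on simulated data; the statistic D is the maximum distance between the two empirical cumulative distribution curves.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
x = rng.normal(loc=0.0, scale=1.0, size=200)
y = rng.normal(loc=0.5, scale=1.0, size=200)  # shifted distribution

d_stat, p = stats.ks_2samp(x, y)
print(f"D = {d_stat:.3f}, P = {p:.4g}")
```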

Kruskal-Wallis test

The Kruskal–Wallis test is the non-parametric analogue of the analysis of variance.[ 14 ] It analyses whether there is any difference in the median values of three or more independent samples. The data values are ranked in increasing order, the rank sums are calculated, and the test statistic is then computed.
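
A minimal Kruskal-Wallis sketch with three hypothetical independent samples, using scipy.stats.kruskal.

```python
from scipy import stats

# Hypothetical sedation scores in three independent groups
group1 = [7, 5, 6, 8, 6]
group2 = [9, 8, 10, 9, 11]
group3 = [4, 5, 3, 6, 4]

h_stat, p = stats.kruskal(group1, group2, group3)
print(f"H = {h_stat:.2f}, P = {p:.4f}")
```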

Jonckheere test

In contrast to the Kruskal–Wallis test, the Jonckheere test assumes an a priori ordering of the groups, which gives it more statistical power than the Kruskal–Wallis test.[ 14 ]

Friedman test

The Friedman test is a non-parametric test for testing the difference between several related samples. It is an alternative to repeated measures ANOVA, used when the same parameter has been measured under different conditions on the same subjects.[ 13 ]

Tests to analyse the categorical data

The Chi-square test, Fisher's exact test and McNemar's test are used to analyse categorical or nominal variables. The Chi-square test compares frequencies and tests whether the observed data differ significantly from the expected data if there were no differences between groups (i.e., under the null hypothesis). It is calculated as the sum of the squared difference between the observed ( O ) and the expected ( E ) data (the deviation, d ), divided by the expected data, using the following formula:

$$ \chi^2 = \sum \frac{(O - E)^2}{E} $$

A Yates correction factor is applied when the sample size is small. Fisher's exact test is used to determine whether there are non-random associations between two categorical variables. It does not assume random sampling; instead of referring a calculated statistic to a sampling distribution, it calculates an exact probability. McNemar's test is used for paired nominal data. It is applied to a 2 × 2 table with paired, dependent samples and determines whether the row and column frequencies are equal (that is, whether there is 'marginal homogeneity'). The null hypothesis is that the paired proportions are equal. The Mantel–Haenszel Chi-square test is a multivariate test, as it analyses multiple grouping variables. It stratifies according to the nominated confounding variables and identifies any that affect the primary outcome variable. If the outcome variable is dichotomous, logistic regression is used.
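
A minimal sketch of the Chi-square and Fisher's exact tests on a hypothetical 2 × 2 table, using SciPy; chi2_contingency applies the Yates correction to 2 × 2 tables by default.

```python
from scipy import stats

# Hypothetical 2 x 2 table: treatment vs. control, improved vs. not improved
table = [[20, 10],
         [12, 18]]

chi2, p, dof, expected = stats.chi2_contingency(table)
print(f"chi2 = {chi2:.2f}, P = {p:.4f}")

odds_ratio, p_exact = stats.fisher_exact(table)  # computes an exact probability
print(f"Fisher's exact P = {p_exact:.4f}")
```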

SOFTWARE AVAILABLE FOR STATISTICS, SAMPLE SIZE CALCULATION AND POWER ANALYSIS

Numerous statistical software systems are currently available. Commonly used packages include the Statistical Package for the Social Sciences (SPSS, IBM Corporation), the Statistical Analysis System (SAS, SAS Institute, North Carolina, United States of America), R (designed by Ross Ihaka and Robert Gentleman, maintained by the R Core Team), Minitab (Minitab Inc.), Stata (StataCorp) and MS Excel (Microsoft).

There are a number of web resources which are related to statistical power analyses. A few are:

  • StatPages.net – provides links to a number of online power calculators
  • G*Power – provides a downloadable power analysis program that runs under DOS
  • Power analysis for ANOVA designs – an interactive site that calculates the power or sample size needed to attain a given power for one effect in a factorial ANOVA design
  • SamplePower (from SPSS) – produces a complete on-screen report that can be cut and pasted into another document.

It is important that a researcher knows the concepts of the basic statistical methods used in the conduct of a research study. This will help in conducting an appropriately well-designed study, leading to valid and reliable results. Inappropriate use of statistical techniques may lead to faulty conclusions, inducing errors and undermining the significance of the article. Bad statistics may lead to bad research, and bad research may lead to unethical practice. Hence, adequate knowledge of statistics and the appropriate use of statistical tests are important. A sound grasp of basic statistical methods will go a long way in improving research designs and producing quality medical research that can be utilised for formulating evidence-based guidelines.

Financial support and sponsorship

Conflicts of interest

There are no conflicts of interest.


Qualitative Data Analysis: What is it, Methods + Examples

Explore qualitative data analysis with diverse methods and real-world examples. Uncover the nuances of human experiences with this guide.

In a world rich with information and narrative, understanding the deeper layers of human experiences requires a unique vision that goes beyond numbers and figures. This is where the power of qualitative data analysis comes to light.

In this blog, we’ll learn about qualitative data analysis, explore its methods, and provide real-life examples showcasing its power in uncovering insights.

What is Qualitative Data Analysis?

Qualitative data analysis is a systematic process of examining non-numerical data to extract meaning, patterns, and insights.

In contrast to quantitative analysis, which focuses on numbers and statistical metrics, qualitative analysis focuses on non-numerical aspects of data, such as text, images, audio, and video. It seeks to understand human experiences, perceptions, and behaviors by examining the data’s richness.

Companies frequently conduct this analysis on customer feedback. You can collect qualitative data from reviews, complaints, chat messages, interactions with support centers, customer interviews, case notes, or even social media comments. This kind of data holds the key to understanding customer sentiments and preferences in a way that goes beyond mere numbers.

Importance of Qualitative Data Analysis

Qualitative data analysis plays a crucial role in your research and decision-making process across various disciplines. Let’s explore some key reasons that underline the significance of this analysis:

In-Depth Understanding

It enables you to explore complex and nuanced aspects of a phenomenon, delving into the ‘how’ and ‘why’ questions. This method provides you with a deeper understanding of human behavior, experiences, and contexts that quantitative approaches might not capture fully.

Contextual Insight

You can use this analysis to give context to numerical data. It will help you understand the circumstances and conditions that influence participants’ thoughts, feelings, and actions. This contextual insight becomes essential for generating comprehensive explanations.

Theory Development

You can generate or refine hypotheses via qualitative data analysis. As you analyze the data attentively, you can form hypotheses, concepts, and frameworks that will drive your future research and contribute to theoretical advances.

Participant Perspectives

When performing qualitative research, you can highlight participant voices and opinions. This approach is especially useful for understanding marginalized or underrepresented people, as it allows them to communicate their experiences and points of view.

Exploratory Research

The analysis is frequently used at the exploratory stage of your project. It assists you in identifying important variables, developing research questions, and designing quantitative studies that will follow.

Types of Qualitative Data

When conducting qualitative research, you can use several qualitative data collection methods, and you will encounter many kinds of qualitative data that can provide unique insights into your study topic. These data types add new views and angles to your understanding and analysis.

Interviews and Focus Groups

Interviews and focus groups will be among your key methods for gathering qualitative data. Interviews are one-on-one talks in which participants can freely share their thoughts, experiences, and opinions.

Focus groups, on the other hand, are discussions in which members interact with one another, resulting in dynamic exchanges of ideas. Both methods provide rich qualitative data and direct access to participant perspectives.

Observations and Field Notes

Observations and field notes are another useful sort of qualitative data. You can immerse yourself in the research environment through direct observation, carefully documenting behaviors, interactions, and contextual factors.

These observations are recorded in your field notes, providing a complete picture of the environment and the behaviors you’re researching. This data type is especially important for comprehending behaviors in their natural setting.

Textual and Visual Data

Textual and visual data include a wide range of resources that can be qualitatively analyzed. Documents, written narratives, and transcripts from various sources, such as interviews or speeches, are examples of textual data.

Photographs, films, and even artwork add a visual layer to your research. These forms of data allow you to investigate not only what is said but also the underlying emotions, details, and symbols expressed through language or images.

When to Choose Qualitative Data Analysis over Quantitative Data Analysis

As you begin your research journey, understanding when qualitative data analysis matters will guide your approach to complex phenomena. Analyzing qualitative data yields insights that complement quantitative methodologies, giving you a broader understanding of your study topic.

It is critical to know when to use qualitative analysis over quantitative procedures. You can prefer qualitative data analysis when:

  • Complexity Reigns: When your research questions involve deep human experiences, motivations, or emotions, qualitative research excels at revealing these complexities.
  • Exploration is Key: Qualitative analysis is ideal for exploratory research. It will assist you in understanding a new or poorly understood topic before formulating quantitative hypotheses.
  • Context Matters: If you want to understand how context affects behaviors or results, qualitative data analysis provides the depth needed to grasp these relationships.
  • Unanticipated Findings: When your study provides surprising new viewpoints or ideas, qualitative analysis helps you to delve deeply into these emerging themes.
  • Subjective Interpretation is Vital: When it comes to understanding people’s subjective experiences and interpretations, qualitative data analysis is the way to go.

You can make informed decisions regarding the right approach for your research objectives if you understand the importance of qualitative analysis and recognize the situations where it shines.

Qualitative Data Analysis Methods and Examples

Exploring various qualitative data analysis methods will provide you with a wide collection for making sense of your research findings. Once the data has been collected, you can choose from several analysis methods based on your research objectives and the data type you’ve collected.

There are five main methods for analyzing qualitative data. Each method takes a distinct approach to identifying patterns, themes, and insights within your qualitative data. They are:

Method 1: Content Analysis

Content analysis is a methodical technique for analyzing textual or visual data in a structured manner. In this method, you categorize qualitative data by splitting it into manageable units and manually assigning codes to those units.

As you go, you’ll notice recurring codes and patterns that allow you to draw conclusions about the content. This method is very beneficial for detecting common ideas, concepts, or themes in your data without losing the context.

Steps to Do Content Analysis

Follow these steps when conducting content analysis:

  • Collect and Immerse: Begin by collecting the necessary textual or visual data. Immerse yourself in this data to fully understand its content, context, and complexities.
  • Assign Codes and Categories: Assign codes to relevant data sections that systematically represent major ideas or themes. Arrange comparable codes into groups that cover the major themes.
  • Analyze and Interpret: Develop a structured framework from the categories and codes. Then, evaluate the data in the context of your research question, investigate relationships between categories, discover patterns, and draw meaning from these connections.

Benefits & Challenges

There are various advantages to using content analysis:

  • Structured Approach: It offers a systematic approach to dealing with large data sets and ensures consistency throughout the research.
  • Objective Insights: This method promotes objectivity, which helps to reduce potential biases in your study.
  • Pattern Discovery: Content analysis can help uncover hidden trends, themes, and patterns that are not always obvious.
  • Versatility: You can apply content analysis to various data formats, including text, internet content, images, etc.

However, keep in mind the challenges that arise:

  • Subjectivity: Even with the best attempts, a certain bias may remain in coding and interpretation.
  • Complexity: Analyzing huge data sets requires time and great attention to detail.
  • Contextual Nuances: Content analysis may not capture all of the contextual richness that qualitative data analysis highlights.

Example of Content Analysis

Suppose you’re conducting market research and looking at customer feedback on a product. As you collect relevant data and analyze feedback, you’ll see repeating codes like “price,” “quality,” “customer service,” and “features.” These codes are organized into categories such as “positive reviews,” “negative reviews,” and “suggestions for improvement.”

According to your findings, themes such as “price” and “customer service” stand out and show that pricing and customer service greatly impact customer satisfaction. This example highlights the power of content analysis for obtaining significant insights from large textual data collections.
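
As a minimal sketch of this counting step, the snippet below tallies keyword-style codes across a batch of comments; the code list and comments are hypothetical, and real content analysis would rely on richer, human-assigned codes rather than substring matching.

```python
from collections import Counter

codes = ["price", "quality", "customer service", "features"]
comments = [
    "Great quality but the price is too high.",
    "Customer service was slow; the features are fine.",
    "Love the features and the quality!",
]

counts = Counter()
for comment in comments:
    text = comment.lower()
    for code in codes:
        if code in text:  # naive keyword matching stands in for manual coding
            counts[code] += 1

print(counts.most_common())  # e.g. [('quality', 2), ('features', 2), ...]
```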

Method 2: Thematic Analysis

Thematic analysis is a well-structured procedure for identifying and analyzing recurring themes in your data. As you become more engaged in the data, you’ll generate codes or short labels representing key concepts. These codes are then organized into themes, providing a consistent framework for organizing and comprehending the substance of the data.

The analysis allows you to organize complex narratives and perspectives into meaningful categories, which will allow you to identify connections and patterns that may not be visible at first.

Steps to Do Thematic Analysis

Follow these steps when conducting a thematic analysis:

  • Code the Data: Start by thoroughly examining the data and assigning initial codes that identify notable segments.
  • Group into Themes: Combine related codes to construct initial themes.
  • Analyze and Report: Analyze the data within each theme to derive relevant insights. Organize the topics into a consistent structure and explain your findings, along with data extracts that represent each theme.

Thematic analysis has various benefits:

  • Structured Exploration: It is a method for identifying patterns and themes in complex qualitative data.
  • Comprehensive Understanding: Thematic analysis promotes an in-depth understanding of the complexities and meanings of the data.
  • Application Flexibility: This method can be customized to various research situations and data kinds.

However, challenges may arise, such as:

  • Interpretive Nature: Thematic analysis is inherently interpretive, so it is critical to manage researcher bias.
  • Time-consuming: The study can be time-consuming, especially with large data sets.
  • Subjectivity: The selection of codes and topics might be subjective.

Example of Thematic Analysis

Assume you’re conducting a thematic analysis on job satisfaction interviews. Following your immersion in the data, you assign initial codes such as “work-life balance,” “career growth,” and “colleague relationships.” As you organize these codes, you’ll notice themes develop, such as “Factors Influencing Job Satisfaction” and “Impact on Work Engagement.”

Further investigation reveals the tales and experiences included within these themes and provides insights into how various elements influence job satisfaction. This example demonstrates how thematic analysis can reveal meaningful patterns and insights in qualitative data.

Method 3: Narrative Analysis

Narrative analysis focuses on the stories that people share. You’ll investigate the narratives in your data, looking at how stories are constructed and the meanings they express. This method is excellent for learning how people make sense of their experiences through storytelling.

Steps to Do Narrative Analysis

The following steps are involved in narrative analysis:

  • Gather and Analyze: Start by collecting narratives, such as first-person tales, interviews, or written accounts. Analyze the stories, focusing on the plot, feelings, and characters.
  • Find Themes: Look for recurring themes or patterns in various narratives. Think about the similarities and differences between these topics and personal experiences.
  • Interpret and Extract Insights: Contextualize the narratives within their larger context. Accept the subjective nature of each narrative and analyze the narrator’s voice and style. Extract insights from the tales by diving into the emotions, motivations, and implications communicated by the stories.

There are various advantages to narrative analysis:

  • Deep Exploration: It lets you look deeply into people’s personal experiences and perspectives.
  • Human-Centered: This method prioritizes the human perspective, allowing individuals to express themselves.

However, difficulties may arise, such as:

  • Interpretive Complexity: Analyzing narratives requires dealing with the complexities of meaning and interpretation.
  • Time-consuming: Because of the richness and complexities of tales, working with them can be time-consuming.

Example of Narrative Analysis

Assume you’re conducting narrative analysis on refugee interviews. As you read the stories, you’ll notice common themes of toughness, loss, and hope. The narratives provide insight into the obstacles that refugees face, their strengths, and the dreams that guide them.

The analysis can provide a deeper insight into the refugees’ experiences and the broader social context they navigate by examining the narratives’ emotional subtleties and underlying meanings. This example highlights how narrative analysis can reveal important insights into human stories.

Method 4: Grounded Theory Analysis

Grounded theory analysis is an iterative and systematic approach that allows you to create theories directly from data without being limited by pre-existing hypotheses. With an open mind, you collect data and generate early codes and labels that capture essential ideas or concepts within the data.

As you progress, you refine these codes and increasingly connect them, eventually developing a theory based on the data. Grounded theory analysis is a dynamic process for developing new insights and hypotheses based on details in your data.

Steps to Do Grounded Theory Analysis

Grounded theory analysis requires the following steps:

  • Initial Coding: First, immerse yourself in the data, producing initial codes that represent major concepts or patterns.
  • Categorize and Connect: Using axial coding, organize the initial codes, which establish relationships and connections between topics.
  • Build the Theory: Focus on creating a core category that connects the codes and themes. Regularly refine the theory by comparing and integrating new data, ensuring that it evolves organically from the data.

Grounded theory analysis has various benefits:

  • Theory Generation: It provides a one-of-a-kind opportunity to generate hypotheses straight from data and promotes new insights.
  • In-depth Understanding: The analysis allows you to deeply analyze the data and reveal complex relationships and patterns.
  • Flexible Process: This method is customizable and ongoing, which allows you to enhance your research as you collect additional data.

However, challenges might arise with:

  • Time and Resources: Because grounded theory analysis is a continuous process, it requires a large commitment of time and resources.
  • Theoretical Development: Creating a grounded theory involves a thorough understanding of qualitative data analysis software and theoretical concepts.
  • Interpretation of Complexity: Interpreting and incorporating a newly developed theory into existing literature can be intellectually hard.

Example of Grounded Theory Analysis

Assume you’re performing a grounded theory analysis on workplace collaboration interviews. As you open code the data, you will discover notions such as “communication barriers,” “team dynamics,” and “leadership roles.” Axial coding demonstrates links between these notions, emphasizing the significance of efficient communication in developing collaboration.

You create the core “Integrated Communication Strategies” category through selective coding, which unifies the emerging themes.

This theory-driven category serves as the framework for understanding how numerous aspects contribute to effective team collaboration. This example shows how grounded theory analysis allows you to generate a theory directly from the inherent nature of the data.

Method 5: Discourse Analysis

Discourse analysis focuses on language and communication. You’ll look at how language produces meaning and how it reflects power relations, identities, and cultural influences. This strategy examines not only what is said but how it is said: the words, phrasing, and larger context of communication.

The analysis is particularly valuable when investigating power dynamics, identities, and cultural influences encoded in language. By evaluating the language used in your data, you can identify underlying assumptions, cultural standards, and how individuals negotiate meaning through communication.

Steps to Do Discourse Analysis

Conducting discourse analysis entails the following steps:

  • Select Discourse: For analysis, choose language-based data such as texts, speeches, or media content.
  • Analyze Language: Immerse yourself in the conversation, examining language choices, metaphors, and underlying assumptions.
  • Discover Patterns: Recognize the dialogue’s reoccurring themes, ideologies, and power dynamics. To fully understand the effects of these patterns, put them in their larger context.

There are various advantages of using discourse analysis:

  • Understanding Language: It provides an extensive understanding of how language builds meaning and influences perceptions.
  • Uncovering Power Dynamics: The analysis reveals how power dynamics appear via language.
  • Cultural Insights: This method identifies cultural norms, beliefs, and ideologies stored in communication.

However, the following challenges may arise:

  • Complexity of Interpretation: Language analysis involves navigating multiple levels of nuance and interpretation.
  • Subjectivity: Interpretation can be subjective, so controlling researcher bias is important.
  • Time-Intensive: Discourse analysis can take a lot of time because careful linguistic study is required in this analysis.

Example of Discourse Analysis

Consider doing discourse analysis on media coverage of a political event. You notice repeating linguistic patterns in news articles that depict the event as a conflict between opposing parties. Through deconstruction, you can expose how this framing supports particular ideologies and power relations.

You can illustrate how language choices influence public perceptions and contribute to building the narrative around the event by analyzing the speech within the broader political and social context. This example shows how discourse analysis can reveal hidden power dynamics and cultural influences on communication.

How to do Qualitative Data Analysis with the QuestionPro Research suite?

QuestionPro is a popular survey and research platform that offers tools for collecting and analyzing qualitative and quantitative data. Follow these general steps for conducting qualitative data analysis using the QuestionPro Research Suite:

  • Collect Qualitative Data: Set up your survey to capture qualitative responses. It might involve open-ended questions, text boxes, or comment sections where participants can provide detailed responses.
  • Export Qualitative Responses: Export the responses once you’ve collected qualitative data through your survey. QuestionPro typically allows you to export survey data in various formats, such as Excel or CSV.
  • Prepare Data for Analysis: Review the exported data and clean it if necessary. Remove irrelevant or duplicate entries to ensure your data is ready for analysis (see the sketch after this list).
  • Code and Categorize Responses: Segment and label data, letting new patterns emerge naturally, then develop categories through axial coding to structure the analysis.
  • Identify Themes: Analyze the coded responses to identify recurring themes, patterns, and insights. Look for similarities and differences in participants’ responses.
  • Generate Reports and Visualizations: Utilize the reporting features of QuestionPro to create visualizations, charts, and graphs that help communicate the themes and findings from your qualitative research.
  • Interpret and Draw Conclusions: Interpret the themes and patterns you’ve identified in the qualitative data. Consider how these findings answer your research questions or provide insights into your study topic.
  • Integrate with Quantitative Data (if applicable): If you’re also conducting quantitative research using QuestionPro, consider integrating your qualitative findings with quantitative results to provide a more comprehensive understanding.
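
As a minimal illustration of steps 3 and 4, the sketch below cleans an exported CSV and applies a crude keyword-based first pass at coding; the file name, column name, and keyword rules are assumptions made for illustration, not QuestionPro’s actual export schema.

```python
import pandas as pd

# Hypothetical export; "survey_export.csv" and the column name are assumptions
df = pd.read_csv("survey_export.csv")

# Step 3: basic cleaning - drop empty and duplicate open-ended answers
responses = (
    df["open_ended_response"]
    .dropna()
    .drop_duplicates()
    .str.strip()
)

# Step 4: a crude first-pass coding using keyword rules
keyword_codes = {"price": "pricing", "support": "customer service", "slow": "performance"}

def code_response(text):
    text = text.lower()
    matched = [code for kw, code in keyword_codes.items() if kw in text]
    return matched or ["uncoded"]

coded = responses.to_frame(name="response")
coded["codes"] = coded["response"].map(code_response)
print(coded.head())
```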

Qualitative data analysis is vital in uncovering various human experiences, views, and stories. If you’re ready to transform your research journey and apply the power of qualitative analysis, now is the moment to do it. Book a demo with QuestionPro today and begin your journey of exploration.


What is Data Collection? Definition, Types, Tools, and Techniques

Data collection is the process of gathering and analyzing accurate data from various sources to find answers to research problems, identify trends and probabilities, and evaluate possible outcomes. Knowledge is power, information is knowledge, and data is information in digitized form, at least as defined in IT. Hence, data is power. But before you can leverage that data into a successful strategy for your organization or business, you need to gather it. That’s your first step.

So, to help you get the process started, we shine a spotlight on data collection. What exactly is it? Believe it or not, it’s more than just doing a Google search! Furthermore, what are the different types of data collection? And what kinds of data collection tools and data collection techniques exist?

If you want to get up to speed on the data collection process, you’ve come to the right place.


Data collection is the process of collecting and evaluating information or data from multiple sources to find answers to research problems, answer questions, evaluate outcomes, and forecast trends and probabilities. It is an essential phase in all types of research, analysis, and decision-making, including that done in the social sciences, business, and healthcare.

Accurate data collection is necessary to make informed business decisions, ensure quality assurance, and keep research integrity.

During data collection, researchers must identify the data types, the sources of data, and the methods being used. We will soon see that there are many different data collection methods. Research, commercial, and government fields all rely heavily on data collection.

Before an analyst begins collecting data, they must answer three questions first:

  • What’s the goal or purpose of this research?
  • What kinds of data are they planning on gathering?
  • What methods and procedures will be used to collect, store, and process the information?

Additionally, we can break up data into qualitative and quantitative types. Qualitative data covers descriptions such as color, size, quality, and appearance. Quantitative data, unsurprisingly, deals with numbers, such as statistics, poll numbers, percentages, etc.

Before a judge makes a ruling in a court case or a general creates a plan of attack, they must have as many relevant facts as possible. The best courses of action come from informed decisions, and information and data are synonymous.

The concept of data collection isn’t a new one, as we’ll see later, but the world has changed. There is far more data available today, and it exists in forms that were unheard of a century ago. The data collection process has had to change and grow with the times, keeping pace with technology.

Whether you’re in the world of academia, trying to conduct research, or part of the commercial sector, thinking of how to promote a new product, you need data collection to help you make better choices.

Now that you know what data collection is and why we need it, let’s take a look at the different methods of data collection. While the phrase “data collection” may sound high-tech and digital, it doesn’t necessarily entail computers, big data, and the internet. Data collection could mean a telephone survey, a mail-in comment card, or even someone with a clipboard asking passersby some questions. But let’s see if we can sort the different data collection methods into a semblance of organized categories.

Primary and secondary methods of data collection are two approaches used to gather information for research or analysis purposes. Let's explore each data collection method in detail:

1. Primary Data Collection:

Primary data collection involves the collection of original data directly from the source or through direct interaction with the respondents. This method allows researchers to obtain firsthand information specifically tailored to their research objectives. There are various techniques for primary data collection, including:

a. Surveys and Questionnaires: Researchers design structured questionnaires or surveys to collect data from individuals or groups. These can be conducted through face-to-face interviews, telephone calls, mail, or online platforms.

b. Interviews: Interviews involve direct interaction between the researcher and the respondent. They can be conducted in person, over the phone, or through video conferencing. Interviews can be structured (with predefined questions), semi-structured (allowing flexibility), or unstructured (more conversational).

c. Observations: Researchers observe and record behaviors, actions, or events in their natural setting. This method is useful for gathering data on human behavior, interactions, or phenomena without direct intervention.

d. Experiments: Experimental studies involve the manipulation of variables to observe their impact on the outcome. Researchers control the conditions and collect data to draw conclusions about cause-and-effect relationships.

e. Focus Groups: Focus groups bring together a small group of individuals who discuss specific topics in a moderated setting. This method helps in understanding opinions, perceptions, and experiences shared by the participants.

2. Secondary Data Collection:

Secondary data collection involves using existing data collected by someone else for a purpose different from the original intent. Researchers analyze and interpret this data to extract relevant information. Secondary data can be obtained from various sources, including:

a. Published Sources: Researchers refer to books, academic journals, magazines, newspapers, government reports, and other published materials that contain relevant data.

b. Online Databases: Numerous online databases provide access to a wide range of secondary data, such as research articles, statistical information, economic data, and social surveys.

c. Government and Institutional Records: Government agencies, research institutions, and organizations often maintain databases or records that can be used for research purposes.

d. Publicly Available Data: Data shared by individuals, organizations, or communities on public platforms, websites, or social media can be accessed and utilized for research.

e. Past Research Studies: Previous research studies and their findings can serve as valuable secondary data sources. Researchers can review and analyze the data to gain insights or build upon existing knowledge.

Now that we’ve explained the various techniques, let’s narrow our focus even further by looking at some specific tools. For example, we mentioned interviews as a technique, but we can further break that down into different interview types (or “tools”).

Word Association

The researcher gives the respondent a set of words and asks them what comes to mind when they hear each word.

Sentence Completion

Researchers use sentence completion to understand what kind of ideas the respondent has. This tool involves giving an incomplete sentence and seeing how the interviewee finishes it.

Role-Playing

Respondents are presented with an imaginary situation and asked how they would act or react if it was real.

In-Person Surveys

The researcher asks questions in person.

Online/Web Surveys

These surveys are easy to accomplish, but some users may be unwilling to answer truthfully, if at all.

Mobile Surveys

These surveys take advantage of the increasing proliferation of mobile technology. Mobile collection surveys rely on mobile devices like tablets or smartphones to conduct surveys via SMS or mobile apps.

Phone Surveys

No researcher can call thousands of people at once, so they need a third party to handle the chore. However, many people have call screening and won’t answer.

Observation

Sometimes, the simplest method is the best. Researchers who make direct observations collect data quickly and easily, with little intrusion or third-party bias. Naturally, it’s only effective in small-scale situations.

Accurate data collecting is crucial to preserving the integrity of research, regardless of the subject of study or preferred method for defining data (quantitative, qualitative). Errors are less likely to occur when the right data gathering tools are used (whether they are brand-new ones, updated versions of them, or already available).

The effects of incorrectly collected data include the following:

  • Erroneous conclusions that squander resources
  • Decisions that compromise public policy
  • Incapacity to correctly respond to research inquiries
  • Bringing harm to participants who are humans or animals
  • Deceiving other researchers into pursuing futile research avenues
  • The study's inability to be replicated and validated

Although the degree of influence from flawed data collection varies by discipline and the type of investigation, there is the potential for disproportionate harm when such findings are used to support public policy recommendations.

Let us now look at the various issues that we might face while maintaining the integrity of data collection.

The main justification for maintaining data integrity is to support the detection of errors in the data gathering process, whether they are made intentionally (deliberate falsifications) or unintentionally (systematic or random errors).

Quality assurance and quality control are two strategies that help protect data integrity and guarantee the scientific validity of study results.

Each strategy is used at a different stage of the research timeline:

  • Quality assurance – activities that take place before data gathering begins
  • Quality control – activities that take place during and after data collection

Let us explore each of them in more detail now.

Quality Assurance

As quality assurance precedes data collection, its primary goal is “prevention” (i.e., forestalling problems with data collection). Prevention is the most cost-effective way to protect the accuracy of data collection. This proactive step is best exemplified by the uniformity of protocol created in a thorough and exhaustive procedures manual for data collection.

The likelihood of failing to spot issues and mistakes early in the research effort increases when manuals are written poorly. These shortcomings can show up in several ways:

  • Failure to specify the exact subjects and methods for training or retraining staff members involved in data collection
  • A partial list of the items to be collected
  • No system in place to track modifications to procedures that may occur as the investigation progresses
  • A vague description of the data gathering instruments to be used, instead of detailed, step-by-step instructions on administering tests
  • Uncertainty regarding the timing, procedure, and identity of the person or people responsible for examining the data
  • Incomprehensible guidelines for using, adjusting, and calibrating the data collection equipment

Now, let us look at how to ensure Quality Control.


Quality Control

Although quality control actions (detection/monitoring and intervention) take place during and after data collection, the specifics should be meticulously detailed in the procedures manual. Establishing monitoring systems requires a specific communication structure as a prerequisite. Once data collection problems are discovered, there should be no ambiguity regarding the information flow between the principal investigators and staff personnel. A poorly designed communication system promotes lax oversight and reduces opportunities for error detection.

Detection or monitoring can take the form of direct staff observation during site visits or conference calls, or frequent and routine assessments of data reports to spot inconsistencies, extreme values, or invalid codes. Site visits might not be appropriate for all disciplines. Still, without routine auditing of records, whether qualitative or quantitative, it will be challenging for investigators to confirm that data gathering is taking place in accordance with the methods defined in the manual. Additionally, quality control determines the appropriate solutions, or “actions,” to fix flawed data gathering procedures and reduce recurrences.

Problems with data collection, for instance, that call for immediate action include:

  • Fraud or misbehavior
  • Systematic errors and procedure violations
  • Individual data items with errors
  • Issues with certain staff members or a site's performance 

Researchers are trained to include one or more secondary measures that can be used to verify the quality of information being obtained from the human subject in the social and behavioral sciences where primary data collection entails using human subjects. 

For instance, a researcher conducting a survey would be interested in learning more about the prevalence of risky behaviors among young adults as well as the social factors that influence these risky behaviors' propensity for and frequency. Let us now explore the common challenges with regard to data collection.

There are some prevalent challenges faced while collecting data, let us explore a few of them to understand them better and avoid them.

Data Quality Issues

The main threat to the broad and successful application of machine learning is poor data quality. Data quality must be your top priority if you want to make technologies like machine learning work for you. Let's talk about some of the most prevalent data quality problems in this blog article and how to fix them.

Inconsistent Data

When working with various data sources, it's conceivable that the same information will have discrepancies between sources. The differences could be in formats, units, or occasionally spellings. The introduction of inconsistent data might also occur during firm mergers or relocations. Inconsistencies in data have a tendency to accumulate and reduce the value of data if they are not continually resolved. Organizations that have heavily focused on data consistency do so because they only want reliable data to support their analytics.

Data Downtime

Data is the driving force behind the decisions and operations of data-driven businesses. However, there may be brief periods when their data is unreliable or not prepared. This data unavailability can have a significant impact on businesses, from customer complaints to subpar analytical outcomes. A data engineer spends about 80% of their time updating, maintaining, and guaranteeing the integrity of the data pipeline. The lengthy operational lead time from data capture to insight imposes a high marginal cost on asking the next business question.

Schema modifications and migration problems are just two examples of the causes of data downtime. Data pipelines can be difficult due to their size and complexity. Data downtime must be continuously monitored, and it must be reduced through automation.

Ambiguous Data

Even with thorough oversight, some errors can still occur in massive databases or data lakes. For data streaming at a fast speed, the issue becomes more overwhelming. Spelling mistakes can go unnoticed, formatting difficulties can occur, and column heads might be deceptive. This unclear data might cause a number of problems for reporting and analytics.


Duplicate Data

Streaming data, local databases, and cloud data lakes are just a few of the sources of data that modern enterprises must contend with. They might also have application and system silos. These sources are likely to duplicate and overlap each other quite a bit. For instance, duplicate contact information has a substantial impact on customer experience. If certain prospects are ignored while others are engaged repeatedly, marketing campaigns suffer. The likelihood of biased analytical outcomes increases when duplicate data are present. It can also result in ML models with biased training data.

Too Much Data

While we emphasize data-driven analytics and its advantages, a data quality problem with excessive data exists. There is a risk of getting lost in an abundance of data when searching for information pertinent to your analytical efforts. Data scientists, data analysts, and business users devote 80% of their work to finding and organizing the appropriate data. With an increase in data volume, other problems with data quality become more serious, particularly when dealing with streaming data and big files or databases.

Inaccurate Data

For highly regulated businesses like healthcare, data accuracy is crucial. Given the current experience, it is more important than ever to increase the data quality for COVID-19 and later pandemics. Inaccurate information does not provide you with a true picture of the situation and cannot be used to plan the best course of action. Personalized customer experiences and marketing strategies underperform if your customer data is inaccurate.

Data inaccuracies can be attributed to a number of things, including data degradation, human error, and data drift. Worldwide data decay occurs at a rate of about 3% per month, which is quite concerning. Data integrity can be compromised as data is transferred between different systems, and data quality might deteriorate over time.

Hidden Data

The majority of businesses only utilize a portion of their data, with the remainder sometimes being lost in data silos or discarded in data graveyards. For instance, the customer service team might not receive client data from sales, missing an opportunity to build more precise and comprehensive customer profiles. Missing out on possibilities to develop novel products, enhance services, and streamline procedures is caused by hidden data.

Finding Relevant Data

Finding relevant data is not easy. There are several factors we need to consider while trying to find relevant data, including:

  • Relevant domain
  • Relevant demographics
  • Relevant time period

Data that does not fit our study on any of these factors is rendered unusable, and we cannot effectively proceed with its analysis. This could lead to incomplete research or analysis, collecting data again and again, or shutting down the study.

Deciding the Data to Collect

Determining what data to collect is one of the most important steps in the collection process and should be addressed at the outset. We must choose the subjects the data will cover, the sources we will use to gather it, and the quantity of information we will require. Our answers to these questions will depend on our aims, or what we expect to achieve using the data. As an illustration, we may choose to gather information on the categories of articles that website visitors between the ages of 20 and 50 most frequently access. We can also decide to compile data on the average age of all the clients who made a purchase from our business over the previous month.

Not addressing this could lead to duplicated work, collection of irrelevant data, or the ruin of your study as a whole.

Dealing With Big Data

Big data refers to exceedingly massive data sets with more intricate and diversified structures. These traits typically result in greater challenges in storing and analyzing the data and in applying additional methods of extracting results. The term refers especially to data sets that are so enormous or intricate that conventional data processing tools are insufficient: the overwhelming amount of structured and unstructured data that a business faces on a daily basis.

The amount of data produced by healthcare applications, the internet, social networking sites, sensor networks, and many other sources is growing rapidly as a result of recent technological advancements. Big data refers to the vast volume of data created from numerous sources, in a variety of formats, at extremely fast rates. Dealing with this kind of data is one of the major challenges of data collection and is a crucial step toward collecting effective data.
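When a single file no longer fits in memory, one conventional workaround, before reaching for distributed tools such as Spark or Dask, is to stream it in chunks and aggregate incrementally. The file and column names below are hypothetical.

```python
# Streaming a large CSV in chunks and aggregating incrementally, so memory
# use stays bounded. File and column names are hypothetical.
import pandas as pd

totals: dict[str, int] = {}
for chunk in pd.read_csv("clickstream_logs.csv", chunksize=1_000_000):
    for category, n in chunk["page_category"].value_counts().items():
        totals[category] = totals.get(category, 0) + int(n)

# Top ten categories by total page views across all chunks.
top = sorted(totals.items(), key=lambda kv: kv[1], reverse=True)[:10]
print(top)
```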

Low Response and Other Research Issues

Poor design and low response rates have been shown to be two recurring issues with data collection, particularly in health surveys that use questionnaires. These can leave a study with an insufficient or inadequate supply of data. Creating an incentivized data collection program can be beneficial in such cases to attract more responses.

Now, let us look at the key steps in the data collection process.

There are five key steps in the data collection process, explained briefly below:

1. Decide What Data You Want to Gather

The first thing that we need to do is decide what information we want to gather. We must choose the subjects the data will cover, the sources we will use to gather it, and the quantity of information that we would require. For instance, we may choose to gather information on the categories of products that an average e-commerce website visitor between the ages of 30 and 45 most frequently searches for. 

2. Establish a Deadline for Data Collection

The process of creating a strategy for data collection can now begin. We should set a deadline for our data collection at the outset of the planning phase. Some forms of data we may want to collect continuously: we might, for instance, set up a technique for tracking transactional data and website visitor statistics over the long term. If we are tracking data for a particular campaign, however, we will track it over a defined time frame, with a schedule for when we will begin and finish gathering data.

3. Select a Data Collection Approach

At this stage, we select the data collection method that will serve as the foundation of our data gathering plan. To choose the best method, we must take into account the type of information we wish to collect, the time frame over which we will collect it, and the other factors we have already decided on.

4. Gather Information

Once our plan is complete, we can put it into action and begin gathering data. We can store and organize our data in a data management platform (DMP). We need to be careful to follow the plan and keep an eye on its progress. Especially if we are collecting data regularly, it helps to set up a schedule for checking in on how the data gathering is going. As circumstances change and we learn new details, we may need to amend the plan.

5. Examine the Information and Apply Your Findings

Once we have gathered all of our information, it is time to examine the data and organize our findings. The analysis stage is essential because it transforms raw data into insightful knowledge that can be applied to improve our marketing plans, products, and business decisions. The analytics tools included in our DMP can assist with this phase. Once we have discovered the patterns and insights in our data, we can put those discoveries to work improving the business.
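As a small, self-contained illustration of this step, the snippet below summarizes purchases by age group with pandas. The CSV file and column names are hypothetical.

```python
# Illustrative version of step 5: summarizing collected data to surface a
# pattern. The CSV file and column names are hypothetical.
import pandas as pd

df = pd.read_csv("campaign_responses.csv")  # assumed columns: age, purchase_amount

# Bucket respondents into age groups, then compare purchase behavior.
df["age_group"] = pd.cut(df["age"], bins=[0, 29, 45, 120],
                         labels=["<30", "30-45", "46+"])
summary = (df.groupby("age_group", observed=True)["purchase_amount"]
             .agg(["count", "mean"])
             .rename(columns={"count": "buyers", "mean": "avg_spend"}))
print(summary)
```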

Let us now look at some data collection considerations and best practices that one might follow.

We must plan carefully before spending time and money traveling to the field to gather data. Effective data collection strategies can help us collect richer and more accurate data while saving time and resources.

Below are some of the best practices we can follow for the best results:

1. Take Into Account the Price of Each Extra Data Point

Once we have decided on the data we want to gather, we need to take the expense of collecting it into account. Each additional data point or survey question adds cost for both surveyors and respondents.

2. Plan How to Gather Each Data Piece

Freely accessible data is scarce. Sometimes the data exists, but we may not have access to it; for instance, we cannot openly view another person's medical records without a compelling reason. Several types of information can also be challenging to measure.

Consider how time-consuming and difficult it will be to gather each piece of information while deciding what data to acquire.

3. Think About Your Choices for Data Collecting Using Mobile Devices

Mobile-based data collection can be divided into three categories:

  • IVRS (interactive voice response systems) - Call respondents and ask them pre-recorded questions. 
  • SMS data collection - Sends a text message to the respondent, who can then answer questions by text on their phone. 
  • Field surveyors - Can enter data directly into an interactive questionnaire while speaking to each respondent, thanks to smartphone apps.

Each tool has its own advantages and disadvantages, so we need to make sure we select the one appropriate for our survey and respondents.

4. Carefully Consider the Data You Need to Gather

It's all too easy to get information about anything and everything, but it's crucial to only gather the information that we require. 

It is helpful to consider these three questions:

  • What details will be helpful?
  • What details are available?
  • What specific details do you require?

5. Remember to Consider Identifiers

Identifiers, or details describing the context and source of a survey response, are just as crucial as the information about the subject or program that we are actually researching.

In general, adding more identifiers will enable us to pinpoint our program's successes and failures with greater accuracy, but moderation is the key.
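A tiny sketch of this practice follows: it wraps each raw response with metadata about who collected it, where, and when. The identifier fields shown are illustrative, not a prescribed schema.

```python
# Wrapping raw survey answers with contextual identifiers so results can
# later be broken down by collector, site, and time. Field names are
# illustrative.
import uuid
from datetime import datetime, timezone

def record_response(answers: dict, surveyor: str, site: str) -> dict:
    """Attach identifiers to a single survey response."""
    return {
        "response_id": str(uuid.uuid4()),                        # unique per response
        "surveyor": surveyor,                                    # who collected it
        "site": site,                                            # where it was collected
        "collected_at": datetime.now(timezone.utc).isoformat(),  # when
        **answers,                                               # the answers themselves
    }

print(record_response({"q1_rating": 4}, surveyor="field_team_3", site="clinic_north"))
```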

6. Data Collecting Through Mobile Devices is the Way to Go

Although collecting data on paper is still common, modern technology relies heavily on mobile devices. They enable us to gather many different types of data at relatively low cost, and they are both accurate and fast. With the boom in low-cost Android devices available nowadays, there are few reasons not to pick mobile-based data collection.

Frequently Asked Questions (FAQs)

1. What is data collection with example?

Data collection is the process of collecting and analyzing information on relevant variables in a predetermined, methodical way so that one can respond to specific research questions, test hypotheses, and assess results. Data collection can be either qualitative or quantitative. Example: A company collects customer feedback through online surveys and social media monitoring to improve their products and services.

2. What are the primary data collection methods?

As is well known, gathering primary data is costly and time intensive. The main techniques for gathering data are observation, interviews, questionnaires, schedules, and surveys.

3. What are data collection tools?

The term "data collecting tools" refers to the tools/devices used to gather data, such as a paper questionnaire or a system for computer-assisted interviews. Tools used to gather data include case studies, checklists, interviews, occasionally observation, surveys, and questionnaires.

4. What’s the difference between quantitative and qualitative methods?

While qualitative research focuses on words and meanings, quantitative research deals with figures and statistics. You can systematically measure variables and test hypotheses using quantitative methods. You can delve deeper into ideas and experiences using qualitative methodologies.

5. What are quantitative data collection methods?

While there are numerous ways to gather quantitative information, the most common and frequently employed methods, whether collecting information offline or online, are probability sampling, interviews, questionnaires, observation, and document review.

6. What is mixed methods research?

User research that includes both qualitative and quantitative techniques is known as mixed methods research. For deeper user insights, mixed methods research combines insightful user data with useful statistics.

7. What are the benefits of collecting data?

Collecting data offers several benefits, including:

  • Knowledge and Insight
  • Evidence-Based Decision Making
  • Problem Identification and Solution
  • Validation and Evaluation
  • Identifying Trends and Predictions
  • Support for Research and Development
  • Policy Development
  • Quality Improvement
  • Personalization and Targeting
  • Knowledge Sharing and Collaboration

8. What’s the difference between reliability and validity?

Reliability is about consistency and stability, while validity is about accuracy and appropriateness. Reliability focuses on the consistency of results, while validity focuses on whether the results are actually measuring what they are intended to measure. Both reliability and validity are crucial considerations in research to ensure the trustworthiness and meaningfulness of the collected data and measurements.

Are you thinking about pursuing a career in the field of data science? Simplilearn's Data Science courses are designed to provide you with the necessary skills and expertise to excel in this rapidly changing field. Here's a detailed comparison for your reference:

Program comparison:

  • Data Scientist Master's Program (Simplilearn) - Geo: all geos; duration: 11 months; coding experience required: basic; skills: 10+ including data structures, data manipulation, NumPy, Scikit-Learn, Tableau and more; additional benefits: applied learning via capstone and 25+ data science projects; cost: $$.
  • Post Graduate Program in Data Science (Purdue) - Geo: all geos; duration: 11 months; coding experience required: basic; skills: 8+ including exploratory data analysis, descriptive statistics, inferential statistics and more; additional benefits: Purdue Alumni Association membership, free IIMJobs Pro membership for 6 months, resume-building assistance; cost: $$$$.
  • Post Graduate Program in Data Science (Caltech) - Geo: not applicable in the US; duration: 11 months; coding experience required: none; skills: 8+ including supervised and unsupervised learning, deep learning, data visualization and more; additional benefits: up to 14 CEU credits, Caltech CTME Circle membership; cost: $$$$.

We live in the Data Age, and if you want a career that fully takes advantage of this, you should consider a career in data science. Simplilearn offers a Caltech Post Graduate Program in Data Science that will train you in everything you need to know to secure the perfect position. This Data Science PG program is ideal for all working professionals, covering job-critical topics like R, Python programming, machine learning algorithms, NLP concepts, and data visualization with Tableau in great detail. This is all provided via our interactive learning model with live sessions by global practitioners, practical labs, and industry projects.

Data Science & Business Analytics Courses Duration and Fees

Data Science & Business Analytics programs typically range from a few weeks to several months, with fees varying based on program and institution.



A scoping review of continuous quality improvement in healthcare system: conceptualization, models and tools, barriers and facilitators, and impact

Aklilu Endalamaw, Resham B Khatri, Tesfaye Setegn Mengistu, Daniel Erku, Eskinder Wolka, Anteneh Zewdie and Yibeltal Assefa

BMC Health Services Research, volume 24, article number 487 (2024). Open access; published 19 April 2024.

Abstract

Background

The growing adoption of continuous quality improvement (CQI) initiatives in healthcare has generated a surge in research interest to gain a deeper understanding of CQI. However, comprehensive evidence regarding the diverse facets of CQI in healthcare has been limited. Our review sought to comprehensively grasp the conceptualization and principles of CQI, explore existing models and tools, analyze barriers and facilitators, and investigate its overall impacts.

Methods

This qualitative scoping review was conducted using Arksey and O'Malley's methodological framework. We searched for articles in the PubMed, Web of Science, Scopus, and EMBASE databases and additionally accessed articles from Google Scholar. We used mixed-methods analysis, combining qualitative content analysis with descriptive summaries of quantitative findings, and reported the overall work using the PRISMA extension for scoping reviews (PRISMA-ScR).

Results

A total of 87 articles, covering 14 CQI models, were included in the review. Nineteen tools were used across CQI models and initiatives, and the Plan-Do-Study/Check-Act cycle was the most commonly employed model for understanding the CQI implementation process. The main reported purposes of using CQI, reflected in its positive impacts, are to improve the structure of the health system (e.g., leadership, health workforce, health technology use, supplies, and costs), enhance healthcare delivery processes and outputs (e.g., care coordination and linkages, satisfaction, accessibility, continuity of care, safety, and efficiency), and improve treatment outcomes (reduced morbidity and mortality). The implementation of CQI is not without challenges. Cultural (i.e., resistance or reluctance toward a quality-focused culture and fear of blame or punishment), technical, structural (related to organizational structure, processes, and systems), and strategic (inadequate planning and inappropriate goals) barriers were commonly reported during the implementation of CQI.

Conclusions

Implementing CQI initiatives necessitates a thorough comprehension of key principles such as teamwork and timelines. To address challenges effectively, it is crucial to identify obstacles and implement optimal interventions proactively. Healthcare professionals and leaders need to be mentally equipped and cognizant of the significant role CQI initiatives play in achieving quality of care.


Background

Continuous quality improvement (CQI) is a crucial initiative aimed at enhancing quality in the health system that has gradually been adopted in the healthcare industry. In the early 20th century, Shewhart laid the foundation for quality improvement by describing three essential steps for process improvement: specification, production, and inspection [ 1 , 2 ]. Then, Deming expanded Shewhart's three-step model into the 'plan, do, study/check, and act' (PDSA or PDCA) cycle, which was applied to management practices in Japan in the 1950s [ 3 ] and was gradually translated into the health system. In 1991, Kuperman applied a CQI approach to healthcare, comprising selecting a process to be improved, assembling a team of expert clinicians that understands the process and the outcomes, determining key steps in the process and expected outcomes, collecting data that measure the key process steps and outcomes, and providing data feedback to the practitioners [ 4 ]. These philosophies have served as the baseline for the foundation of principles for continuous improvement [ 5 ].

Continuous quality improvement fosters a culture of continuous learning, innovation, and improvement. It encourages proactive identification and resolution of problems, promotes employee engagement and empowerment, encourages trust and respect, and aims for better quality of care [ 6 , 7 ]. These characteristics drive the interaction of CQI with other quality improvement projects, such as quality assurance and total quality management [ 8 ]. Quality assurance primarily focuses on identifying deviations or errors through inspections, audits, and formal reviews, often settling for what is considered ‘good enough’, rather than pursuing the highest possible standards [ 9 , 10 ], while total quality management is implemented as the management philosophy and system to improve all aspects of an organization continuously [ 11 ].

Continuous quality improvement has been implemented to provide quality care. However, providing effective healthcare is a complicated and complex task in achieving the desired health outcomes and the overall wellbeing of individuals and populations. It necessitates tackling long-standing issues, including access, patient safety, medical advances, care coordination, patient-centered care, and quality monitoring [ 12 , 13 ]. The history of quality improvement in healthcare is assumed to have started in 1854, when Florence Nightingale introduced quality improvement documentation [ 14 ]. In the decades that followed, Donabedian introduced structure, processes, and outcomes as quality of care components in 1966 [ 15 ]. More comprehensively, the Institute of Medicine in the United States of America (USA) has identified effectiveness, efficiency, equity, patient-centredness, safety, and timeliness as the components of quality of care [ 16 ]. Moreover, quality of care has recently been considered an integral part of universal health coverage (UHC) [ 17 ], which requires initiatives to mobilise essential inputs [ 18 ].

While the overall objective of CQI in the health system is to enhance the quality of care, it is important to note that the purposes and principles of CQI can vary across contexts [ 19 , 20 ]. This variation has sparked growing research interest. For instance, a review of CQI approaches for capacity building addressed its role in health workforce development [ 21 ]. Another systematic review, based on randomized controlled studies, assessed the effectiveness of CQI using training as an intervention and the PDSA model [ 22 ]. As a research gap, the former review was not directly related to the comprehensive elements of quality of care, while the latter focused solely on the impact of training using the PDSA model, among other potential models. Additionally, a review conducted in 2015 aimed to identify barriers and facilitators of CQI in Canadian contexts [ 23 ]. However, all these reviews presented different perspectives and investigated distinct outcomes, suggesting that there is still much to explore in comprehensively understanding the various aspects of CQI initiatives in healthcare.

As a result, we conducted a scoping review to address several aspects of CQI. Scoping reviews serve as a valuable tool for systematically mapping the existing literature on a specific topic. They are instrumental when dealing with heterogeneous or complex bodies of research. Scoping reviews provide a comprehensive overview by summarizing and disseminating findings across multiple studies, even when evidence varies significantly [ 24 ]. In our specific scoping review, we included various types of literature, including systematic reviews, to enhance our understanding of CQI.

This scoping review examined how CQI is conceptualized and measured and investigated models and tools for its application while identifying implementation challenges and facilitators. It also analyzed the purposes and impact of CQI on the health systems, providing valuable insights for enhancing healthcare quality.

Methods

Protocol registration and results reporting

Protocol registration for this scoping review was not conducted. Arksey and O’Malley’s methodological framework was utilized to conduct this scoping review [ 25 ]. The scoping review procedures start by defining the research questions, identifying relevant literature, selecting articles, extracting data, and summarizing the results. The review findings are reported using the PRISMA extension for a scoping review (PRISMA-ScR) [ 26 ]. McGowan and colleagues also advised researchers to report findings from scoping reviews using PRISMA-ScR [ 27 ].

Defining the research problems

This review aims to comprehensively explore the conceptualization, models, tools, barriers, facilitators, and impacts of CQI within the healthcare system worldwide. Specifically, we address the following research questions: (1) How has CQI been defined across various contexts? (2) What are the diverse approaches to implementing CQI in healthcare settings? (3) Which tools are commonly employed for CQI implementation? (4) What barriers hinder and facilitators support successful CQI initiatives? and (5) What effects do CQI initiatives have on overall care quality?

Information source and search strategy

We conducted the search in the PubMed, Web of Science, Scopus, and EMBASE databases, and the Google Scholar search engine. The search terms were selected based on three distinct concepts: one group of CQI-related terms, a second group of terms related to the purposes for which CQI has been implemented, and a third group covering processes and impact. These terms were selected based on the Donabedian framework of structure, process, and outcome [ 28 ]. Additionally, detailed keywords were drawn from the primary health care framework, which describes dimensions under the process, output, outcome, and health system goals of any intervention for health [ 29 ]. The detailed search strategy is presented in Supplementary file 1 (Search strategy). The search for articles was initiated on August 12, 2023, and the last search was conducted on September 01, 2023.

Eligibility criteria and article selection

Based on the scoping review's population, concept, and context framework [ 30 ], the population included any patients or clients. The concepts explored in the review encompassed definitions, implementation, models, tools, barriers, facilitators, and impacts of CQI, and the review considered contexts at any level of the health system. We included articles if they reported the results of qualitative or quantitative empirical studies, case studies, analytic or descriptive syntheses, any type of review, or other written documents; were published in peer-reviewed journals; and were designed to address at least one of the identified research questions or implementation outcomes (or their synonymous taxonomy, as described in the search strategy). We included articles published in English without geographic or time limitations. We excluded abstract-only articles, conference abstracts, letters to editors, commentaries, and corrections.

We exported all citations to EndNote x20 to remove duplicates and screen relevant articles. The article selection process included automatic duplicate removal using EndNote x20, removal of unmatched titles and abstracts, removal of citation- and abstract-only materials, and full-text assessment. Article selection was mainly conducted by the first author (AE) and reported to the team during weekly meetings. Papers whose inclusion or exclusion was unclear were discussed with the last author (YA) before final decisions were made. Whenever disagreements arose, they were resolved by discussion and reconsideration of the review questions in relation to the written documents of the article. Further statistical analysis, such as calculating kappa, was not performed to determine article inclusion or exclusion.

Data extraction and data items

We extracted first author, publication year, country, settings, health problem, the purpose of the study, study design, types of intervention if applicable, CQI approaches/steps if applicable, CQI tools and procedures if applicable, and main findings using a customized Microsoft Excel form.

Summarizing and reporting the results

The main findings were summarized and described based on the main themes, including concepts under conceptualization, principles, teams, timelines, models, tools, barriers, facilitators, and impacts of CQI. Results-based convergent synthesis, achieved through mixed-methods analysis, involved content analysis to identify the thematic presentation of findings. Additionally, a narrative description was used for quantitative findings, aligning them with the appropriate theme. The authors meticulously reviewed the primary findings from each included material and contextualized them with respect to the main themes. This approach provides a comprehensive understanding of complex interventions and health systems, acknowledging both quantitative and qualitative evidence.

Search results

A total of 11,251 documents were identified from various databases: SCOPUS ( n  = 4,339), PubMed ( n  = 2,893), Web of Science ( n  = 225), EMBASE ( n  = 3,651), and Google Scholar ( n  = 143). After removing duplicates ( n  = 5,061), 6,190 articles were evaluated by title and abstract. Subsequently, 208 articles were assessed for full-text eligibility. Following the eligibility criteria, 121 articles were excluded, leaving 87 included in the current review (Fig.  1 ).

Figure 1: Article selection process

Operationalizing continuous quality improvement

Continuous Quality Improvement (CQI) is operationalized as a cyclic process that requires commitment to implementation, teamwork, time allocation, and celebrating successes and failures.

CQI is a cyclic, ongoing process that follows reflexive, analytical, and iterative steps, including identifying gaps, generating data, developing and implementing action plans, evaluating performance, providing feedback to implementers and leaders, and proposing necessary adjustments [ 31 , 32 , 33 , 34 , 35 , 36 , 37 , 38 ].

CQI requires committing to the philosophy, involving continuous improvement [ 19 , 38 ], establishing a mission statement [ 37 ], and understanding quality definition [ 19 ].

CQI involves a wide range of patient-oriented measures and performance indicators, specifically satisfying internal and external customers, developing quality assurance, adopting common quality measures, and selecting process measures [ 8 , 19 , 35 , 36 , 37 , 39 , 40 ].

CQI requires celebrating success and failure without personalization, leading each team member to develop an error-free attitude [ 19 ]. Success and failure are attributed to underlying organizational processes and systems rather than to individuals [ 8 ], because CQI is process-focused, based on collaborative, data-driven, responsive, rigorous, and problem-solving statistical analysis [ 8 , 19 , 38 ]. Furthermore, a gap or failure opens another opportunity for establishing a data-driven learning organization [ 41 ].

CQI cannot be implemented without a CQI team [ 8 , 19 , 37 , 39 , 42 , 43 , 44 , 45 , 46 ]. A CQI team comprises individuals from various disciplines, often comprising a team leader, a subject matter expert (physician or other healthcare provider), a data analyst, a facilitator, frontline staff, and stakeholders [ 39 , 43 , 47 , 48 , 49 ]. It is also important to note that inviting stakeholders or partners as part of the CQI support intervention is crucial [ 19 , 38 , 48 ].

The timeline is another distinct feature of CQI because the results of CQI vary based on the implementation duration of each cycle [ 35 ]. There is no specific time limit for CQI implementation, although there is a general consensus that a cycle of CQI should be relatively short [ 35 ]. For instance, a CQI implementation took 2 months [ 42 ], 4 months [ 50 ], 9 months [ 51 , 52 ], 12 months [ 53 , 54 , 55 ], and one year and 5 months [ 49 ] duration to achieve the desired positive outcome, while bi-weekly [ 47 ] and monthly data reviews and analyses [ 44 , 48 , 56 ], and activities over 3 months [ 57 ] have also resulted in a positive outcome.

Continuous quality improvement models and tools

Several models have been utilized. The Plan-Do-Study/Check-Act cycle is a stepwise process involving project initiation, situation analysis, root cause identification, solution generation and selection, implementation, result evaluation, standardization, and future planning [ 7 , 36 , 37 , 45 , 47 , 48 , 49 , 50 , 51 , 53 , 56 , 57 , 58 , 59 , 60 , 61 , 62 , 63 , 64 , 65 , 66 , 67 , 68 , 69 , 70 ]. The FOCUS-PDCA cycle extends the PDCA process with steps to find a process to improve (F), organize a knowledgeable team (O), clarify the process (C), understand variations (U), and select improvements (S) [ 55 , 71 , 72 , 73 ]. The FADE cycle involves identifying a problem (Focus), understanding it through data analysis (Analyze), devising solutions (Develop), and implementing the plan (Execute) [ 74 ]. The Logic Framework involves brainstorming to identify improvement areas, conducting root cause analysis to develop a problem tree, reasoning logically to create an objective tree, formulating the framework, and executing improvement projects [ 75 ]. The Breakthrough Series approach requires CQI teams to meet in quarterly collaborative learning sessions, share learning experiences, and continue the discussion by telephone and cross-site visits to strengthen learning and idea exchange [ 47 ].

Another family of CQI models is the Lean approach, which has been conducted with Kaizen principles [ 52 ], 5S principles, and the Six Sigma model. The 5S method (Sort, Set/Straighten, Shine, Standardize, Sustain) systematically organizes and improves the workplace, focusing on sorting, setting in order, shining, standardizing, and sustaining the improvement [ 54 , 76 ]. Kaizen principles guide CQI by advocating continuous improvement, valuing all ideas, solving problems, focusing on practical, low-cost improvements, using data to drive change, acknowledging process defects, reducing variability and waste, treating every interaction as a customer-supplier relationship, empowering workers, responding to all ideas, and maintaining a disciplined workplace [ 77 ]. Lean Six Sigma applies the DMAIC methodology, which involves defining (D) and measuring (M) the problem, analyzing root causes (A), improving by finding solutions (I), and controlling by assessing process stability (C) [ 78 , 79 ].

The 5C cyclic model (consultation, collection, consideration, collaboration, and celebration), the first CQI framework for volunteer dental services in Aboriginal communities, ensures quality care based on community needs [ 80 ]. One study used structured meetings involving activities such as reviewing objectives, assigning roles, discussing the agenda, completing tasks, retaining key outputs, planning future steps, and evaluating the meeting's effectiveness [ 81 ].
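To make the PDSA loop concrete, here is a schematic sketch, not drawn from any of the reviewed studies, of how one cycle's Study and Act steps might be expressed in code; the change, metric, and target are invented.

```python
# A schematic PDSA cycle in code (illustrative only; not taken from the
# reviewed studies). Plan and Do are assumed to have produced `measured`;
# Study compares it against the target; Act adopts or triggers a new cycle.
from dataclasses import dataclass

@dataclass
class PDSACycle:
    change: str       # the improvement being trialled
    baseline: float   # performance before the change
    target: float     # performance the team aims for

    def study_and_act(self, measured: float) -> str:
        if measured >= self.target:
            return (f"Adopt '{self.change}': measured {measured} meets "
                    f"target {self.target} (baseline was {self.baseline})")
        return (f"Adjust '{self.change}' and run another cycle: "
                f"measured {measured} is below target {self.target}")

cycle = PDSACycle(change="pre-visit reminder calls", baseline=0.62, target=0.75)
print(cycle.study_and_act(measured=0.71))
```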

Various tools are involved in the implementation or evaluation of CQI initiatives: checklists [ 53 , 82 ], flowcharts [ 81 , 82 , 83 ], cause-and-effect (fishbone or Ishikawa) diagrams [ 60 , 62 , 79 , 81 , 82 ], fuzzy Pareto diagrams [ 82 ], process maps [ 60 ], time series charts [ 48 ], why-why analysis [ 79 ], affinity diagrams and multivoting [ 81 ], and run charts [ 47 , 48 , 51 , 60 , 84 ], among others listed in Table 1.
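As an illustration of one of these tools, the snippet below draws a basic run chart with matplotlib, plotting an invented monthly infection rate against its median.

```python
# A basic run chart: observed values over time against the series median.
# The monthly infection rates below are invented for illustration.
from statistics import median
import matplotlib.pyplot as plt

months = list(range(1, 13))
infection_rate = [5.1, 4.8, 5.0, 4.2, 3.9, 4.0, 3.5, 3.2, 3.4, 3.0, 2.8, 2.9]
center = median(infection_rate)

plt.plot(months, infection_rate, marker="o", label="Observed rate")
plt.axhline(center, linestyle="--", label=f"Median = {center}")
plt.xlabel("Month of CQI initiative")
plt.ylabel("Infections per 1,000 patient-days")
plt.title("Run chart (illustrative data)")
plt.legend()
plt.show()
```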

Barriers and facilitators of continuous quality improvement implementation

Implementing CQI initiatives is shaped by various barriers and facilitators, which can be grouped into four dimensions: cultural, technical, structural, and strategic.

Continuous quality improvement initiatives face various cultural, strategic, technical, and structural barriers. Cultural dimension barriers involve resistance to change (e.g., not accepting online technology), lack of quality-focused culture, staff reporting apprehensiveness, and fear of blame or punishment [ 36 , 41 , 85 , 86 ]. The technical dimension barriers of CQI can include various factors that hinder the effective implementation and execution of CQI processes [ 36 , 86 , 87 , 88 , 89 ]. Structural dimension barriers of CQI arise from the organization structure, process, and systems that can impede the effective implementation and sustainability of CQI [ 36 , 85 , 86 , 87 , 88 ]. Strategic dimension barriers are, for example, the inability to select proper CQI goals and failure to integrate CQI into organizational planning and goals [ 36 , 85 , 86 , 87 , 88 , 90 ].

Facilitators are likewise grouped into cultural, structural, technical, and strategic dimensions to provide solutions to CQI barriers. Cultural challenges were addressed by developing a group culture around CQI and offering rewards [ 39 , 41 , 80 , 85 , 86 , 87 , 90 , 91 , 92 ]. Technical facilitators are pivotal to overcoming technical barriers [ 39 , 42 , 53 , 69 , 86 , 90 , 91 ]. Structural facilitators relate to improving communication, infrastructure, and systems [ 86 , 92 , 93 ]. Strategic facilitators include strengthening leadership and improving decision-making skills [ 43 , 53 , 67 , 86 , 87 , 92 , 94 , 95 ] (Table 2).

Impact of continuous quality improvement

Continuous quality improvement initiatives can significantly impact the quality of healthcare across a wide range of health areas, improving structure and the health service delivery process, enhancing client wellbeing, and reducing mortality.

Structure components

These are health leadership, financing, workforce, technology, and equipment and supplies. CQI has improved planning, monitoring and evaluation [ 48 , 53 ], and leadership and planning [ 48 ], indicating improvement in leadership perspectives. Implementing CQI in primary health care (PHC) settings has shown potential for maintaining or reducing operating costs [ 67 ]. Findings from another study indicate that the costs associated with implementing CQI interventions per facility ranged from approximately $2,000 to $10,500 per year, with an average cost of approximately $10 to $60 per admitted client [ 57 ]. However, based on model predictions, the average cost savings after implementing CQI were estimated at $5,430 [ 31 ]. CQI can also be applied to health workforce development [ 32 ]. In institutional systems, CQI improved medical education [ 66 , 96 , 97 ] and human resources management [ 53 ], motivated staff [ 76 ], and increased staff health awareness [ 69 ], although concerns were raised about CQI impartiality, independence, and public accountability [ 96 ]. Regarding health technology, CQI also improved registration and documentation [ 48 , 53 , 98 ]. Furthermore, CQI initiatives increased cleanliness [ 54 ] and improved logistics, supplies, and equipment [ 48 , 53 , 68 ].

Process and output components

The process component focuses on the activities and actions involved in delivering healthcare services.

Service delivery

CQI interventions improved service delivery [ 53 , 56 , 99 ], notably with a significant 18% increase in overall quality of service performance [ 48 ], improved patient counselling, adherence to appropriate procedures, and infection prevention [ 48 , 68 ], and optimized workflow [ 52 ].

Coordination and collaboration

CQI initiatives improved coordination and collaboration through collecting and analysing data, onsite technical support, training, supportive supervision [ 53 ] and facilitating linkages between work processes and a quality control group [ 65 ].

Patient satisfaction

The CQI initiatives increased patient satisfaction and improved quality of life by optimizing care quality management, improving the quality of clinical nursing, reducing nursing defects and enhancing the wellbeing of clients [ 54 , 76 , 100 ], although CQI was not associated with changes in adolescent and young adults’ satisfaction [ 51 ].

Safety

CQI initiatives reduced medication error reports from 16 to 6 [ 101 ], significantly reduced the administration of inappropriate prophylactic antibiotics [ 44 ], decreased errors in inpatient care [ 52 ], decreased the overall episiotomy rate from 44.5% to 33.3% [ 83 ], reduced the overall incidence of unplanned endotracheal extubation [ 102 ], improved the appropriate use of computed tomography angiography [ 103 ], and improved appropriate diagnosis and treatment selection [ 47 ].

Continuity of care

CQI initiatives effectively improve continuity of care by improving client and physician interaction. For instance, provider continuity levels showed a 64% increase [ 55 ]. Modifying electronic medical record templates, scheduling, staff and parental education, standardization of work processes, and birth to 1-year age-specific incentives in post-natal follow-up care increased continuity of care to 74% in 2018 compared to baseline 13% in 2012 [ 84 ].

Efficiency

The CQI initiative yielded enhanced efficiency in the cardiac catheterization laboratory, as evidenced by improved punctuality in procedure starts and increased efficiency in manual sheath-pulls [ 78 ].

Accessibility

CQI initiatives were effective in improving accessibility by increasing service coverage and utilization rates. Reported gains include improved screening for cigarette use, nutrition counselling, folate prescription, maternal care, and immunization coverage [ 53 , 81 , 104 , 105 ]; a reduction in the percentage of patients not attending surgery from 3.9% at baseline to 0.9% [ 43 ]; an increase in Chlamydia screening rates from 29% to 60% [ 45 ]; increased HIV care continuum coverage [ 51 , 59 , 60 ]; an increase in the uptake of postpartum long-acting reversible contraception from 6.9% at baseline to 25.4% [ 42 ]; an increase in post-caesarean section prophylaxis from 36% to 89% [ 62 ]; a 31% increase in kangaroo care practice [ 50 ]; and increased follow-up [ 65 ]. Similarly, a QI intervention increased the quality of antenatal care by 29.3%, correct partograph use by 51.7%, and correct active third-stage labour management by 19.6% from baseline, but was not significantly associated with improvement in contraceptive service uptake [ 61 ].

Timely access

CQI interventions improved the timeliness of care provision [ 52 ] and reduced waiting times [ 62 , 74 , 76 , 106 ]. For instance, the discharge process waiting time in the emergency department decreased from 76 minutes to 22 minutes [ 79 ]. CQI also reduced the mean postprocedural length of stay from 2.8 days to 2.0 days [ 31 ].

Acceptability

Acceptability of CQI by healthcare providers was satisfactory. For instance, 88% of the faculty, 64% of the residents, and 82% of the staff believed CQI to be useful in the healthcare clinic [ 107 ].

Outcome components

Morbidity and mortality

CQI efforts have demonstrated better management outcomes among diabetic patients [ 40 ], patients with oral mucositis [ 71 ], and anaemic patients [ 72 ]. They have also reduced infection rates in post-caesarean sections [ 62 ], reduced post-peritoneal dialysis peritonitis [ 49 , 108 ], and prevented pressure ulcers [ 70 ]. This is illustrated by a peritonitis incidence that improved from once every 40.1 patient-months at baseline to once every 70.8 patient-months after CQI [ 49 ], and a 63% reduction in pressure ulcer prevalence within two years, from 2008 to 2010 [ 70 ]. Furthermore, CQI initiatives significantly reduced in-hospital deaths [ 31 ] and increased patient survival rates [ 108 ]. Figure 2 displays the overall process of the CQI implementations.

Figure 2: The overall mechanisms of continuous quality improvement implementation

Discussion

In this review, we examined the fundamental concepts and principles underlying CQI, the factors that either hinder or assist its successful application and implementation, and the purpose of CQI in enhancing quality of care across various health issues.

Our findings have brought attention to the application and implementation of CQI, emphasizing its underlying concepts and principles, as evident in the existing literature [ 31 , 32 , 33 , 34 , 35 , 36 , 39 , 40 , 43 , 45 , 46 ]. Continuous quality improvement shares the principles of continuous improvement, such as a customer-driven focus, effective leadership, active participation of individuals, a process-oriented approach, systematic implementation, emphasis on design improvement and prevention, evidence-based decision-making, and fostering partnership [ 5 ]. Moreover, Deming's 14 principles laid the foundation for CQI principles [ 109 ]. These principles have been adapted and put into practice in various ways: ten [ 19 ] and five [ 38 ] principles in hospitals, five principles for capacity building [ 38 ], and two principles for medication error prevention [ 41 ]. As a principle, the application of CQI can be process-focused [ 8 , 19 ] or impact-focused [ 38 ]. Impact-focused CQI concentrates on achieving specific outcomes or impacts, whereas process-focused CQI prioritizes and improves the underlying processes and systems. These approaches complement each other and can be utilized based on the objectives of quality improvement initiatives in healthcare settings. Overall, CQI is an ongoing educational process that requires top management's involvement, demands coordination across departments, encourages the incorporation of views beyond the clinical area, and provides non-judgemental evidence based on objective data [ 110 ].

The current review recognized that implementing CQI is not easy. It requires sensible use of various models and tools, whose application varies with the health problem studied and the purpose of the CQI initiative [ 111 ] as well as with context, content, structure, and usability [ 112 ]. It also requires overcoming cultural, technical, structural, and strategic barriers. These barriers have emerged from the perspectives of clinical staff, managers, and health systems. Among the cultural obstacles, staff non-involvement, resistance to change, and reluctance to report errors were staff-related, whereas others, such as the absence of celebration of success and hierarchical and rational cultures, may require both staff and manager involvement. Staff members may be reluctant to report errors due to various cultural factors, including lack of trust, hierarchical structures, fear of retribution, and a blame-oriented culture. These challenges pose obstacles to implementing standardized CQI practices, as observed, for instance, in community pharmacy settings [ 85 ]. The hierarchical culture, characterized by clearly defined levels of power, authority, and decision-making, posed challenges to implementing CQI initiatives in public health [ 41 , 86 ]. Although rational culture, a type of organizational culture, emphasizes logical thinking and rational decision-making, it can also create challenges for CQI implementation [ 41 , 86 ], because hierarchical and rational cultures, which emphasize bureaucratic norms and narrow definitions of achievement, were found to act as barriers to the implementation of CQI [ 86 ]. These could be addressed by developing a shared mindset and collective commitment, establishing a shared purpose, developing group norms, and cultivating psychological preparedness among staff, managers, and clients to implement and sustain CQI initiatives. Furthermore, reversing cultural barriers necessitates cultural solutions: developing a group culture around CQI [ 41 , 86 ], positive comprehensive perception [ 91 ], commitment [ 85 ], involving patients, families, leaders, and staff [ 39 , 92 ], collaborating for a common goal [ 80 , 86 ], effective teamwork [ 86 , 87 ], and rewarding and celebrating successes [ 80 , 90 ].

The technical dimension barriers of CQI can include inadequate capitalization of a project and insufficient support for CQI facilitators and data entry managers [ 36 ], immature electronic medical records or poor information systems [ 36 , 86 ], and the lack of training and skills [ 86 , 87 , 88 ]. These challenges may cause the CQI team to rely on outdated information and technologies. The presence of barriers on the technical dimension may challenge the solid foundation of CQI expertise among staff, the ability to recognize opportunities for improvement, a comprehensive understanding of how services are produced and delivered, and routine use of expertise in daily work. Addressing these technical barriers requires knowledge creation activities (training, seminar, and education) [ 39 , 42 , 53 , 69 , 86 , 90 , 91 ], availability of quality data [ 86 ], reliable information [ 92 ], and a manual-online hybrid reporting system [ 85 ].

Structural dimension barriers of CQI include inadequate communication channels and a lack of standardized processes, specifically weak physician-to-physician synergies [ 36 ], lack of mechanisms for disseminating knowledge, and limited use of communication mechanisms [ 86 ]. A lack of communication mechanisms endangers the sharing of ideas and feedback among CQI teams, leading to misunderstandings, limited participation, misinterpretations, and a lack of learning [ 113 ]. Knowledge translation facilitates the co-production of research, the subsequent diffusion of knowledge, and the development of stakeholders' capacity and skills [ 114 ]. Thus, the absence of a knowledge translation mechanism may cause missed opportunities for learning, inefficient problem-solving, and limited creativity. To overcome these challenges, organizations should establish effective communication and information systems [ 86 , 93 ] and learning systems [ 92 ]. Though CQI and knowledge translation interact with each other, it is essential to recognize that they are distinct. CQI focuses on process improvement within healthcare systems, aiming to optimize existing processes, reduce errors, and enhance efficiency.

In contrast, knowledge translation bridges the gap between research evidence and clinical practice, translating research findings into actionable knowledge for practitioners. While both CQI and knowledge translation aim to enhance health care quality and patient outcomes, they employ different strategies: CQI utilizes tools like Plan-Do-Study-Act cycles and statistical process control, while knowledge translation involves knowledge synthesis and dissemination. Additionally, knowledge translation can also serve as a strategy to enhance CQI. Both concepts share the same principle: continuous improvement is essential for both. Therefore, effective strategies on the structural dimension may build efficient and effective steering councils, information systems, and structures to diffuse learning throughout the organization.

Strategic factors, such as goals, planning, funds, and resources, determine the overall purpose of CQI initiatives. Specific barriers were improper goals and poor planning [ 36 , 86 , 88 ], fragmentation of quality assurance policies [ 87 ], inadequate reinforcement to staff [ 36 , 90 ], time constraints [ 85 , 86 ], resource inadequacy [ 86 ], and work overload [ 86 ]. These barriers can be addressed through strengthening leadership [ 86 , 87 ], CQI-based mentoring [ 94 ], periodic monitoring, supportive supervision and coaching [ 43 , 53 , 87 , 92 , 95 ], participation, empowerment, and accountability [ 67 ], involving all stakeholders in decision-making [ 86 , 87 ], a provider-payer partnership [ 64 ], and compensating staff for after-hours meetings on CQI [ 85 ]. The strategic dimension, characterized by a strategic plan and integrated CQI efforts, is devoted to processes that are central to achieving strategic priorities. Roles and responsibilities are defined in terms of integrated strategic and quality-related goals [ 115 ].

The utmost goal of CQI has been to improve the quality of care, which is usually revealed by structure, process, and outcome. After resolving challenges and effectively using tools and running models, the goal of CQI reflects the ultimate reason and purpose of its implementation. First, effectively implemented CQI initiatives can improve leadership, health financing, health workforce development, health information technology, and availability of supplies as the building blocks of a health system [ 31 , 48 , 53 , 68 , 98 ]. Second, effectively implemented CQI initiatives improved care delivery process (counselling, adherence with standards, coordination, collaboration, and linkages) [ 48 , 53 , 65 , 68 ]. Third, the CQI can improve outputs of healthcare delivery, such as satisfaction, accessibility (timely access, utilization), continuity of care, safety, efficiency, and acceptability [ 52 , 54 , 55 , 76 , 78 ]. Finally, the effectiveness of the CQI initiatives has been tested in enhancing responses related to key aspects of the HIV response, maternal and child health, non-communicable disease control, and others (e.g., surgery and peritonitis). However, it is worth noting that CQI initiative has not always been effective. For instance, CQI using a two- to nine-times audit cycle model through systems assessment tools did not bring significant change to increase syphilis testing performance [ 116 ]. This study was conducted within the context of Aboriginal and Torres Strait Islander people’s primary health care settings. Notably, ‘the clinics may not have consistently prioritized syphilis testing performance in their improvement strategies, as facilitated by the CQI program’ [ 116 ]. Additionally, by applying CQI-based mentoring, uptake of facility-based interventions was not significantly improved, though it was effective in increasing community health worker visits during pregnancy and the postnatal period, knowledge about maternal and child health and exclusive breastfeeding practice, and HIV disclosure status [ 117 ]. The study conducted in South Africa revealed no significant association between the coverage of facility-based interventions and Continuous Quality Improvement (CQI) implementation. This lack of association was attributed to the already high antenatal and postnatal attendance rates in both control and intervention groups at baseline, leaving little room for improvement. Additionally, the coverage of HIV interventions remained consistently high throughout the study period [ 117 ].

Regarding health care and policy implications, CQI has played a vital role in advancing PHC and fostering the realization of UHC goals worldwide. The indicators found in Donabedian’s framework that are positively influenced by CQI efforts are comparable to those included in the PHC performance initiative’s conceptual framework [ 29 , 118 , 119 ]. It is clearly explained that PHC serves as the roadmap to realizing the vision of UHC [ 120 , 121 ]. Given these circumstances, implementing CQI can contribute to the achievement of PHC principles and the objectives of UHC. For instance, by implementing CQI methods, countries have enhanced the accessibility, affordability, and quality of PHC services, leading to better health outcomes for their populations. CQI has facilitated identifying and resolving healthcare gaps and inefficiencies, enabling countries to optimize resource allocation and deliver more effective and patient-centered care. However, it is crucial to recognize that the successful implementation of Continuous Quality Improvement (CQI) necessitates optimizing the duration of each cycle, understanding challenges and barriers that extend beyond the health system and settings, and acknowledging that its effectiveness may be compromised if these challenges are not adequately addressed.

Despite abundant literature, there are still gaps regarding the relationship between CQI and other dimensions within the healthcare system. No studies have examined the impact of CQI initiatives on catastrophic health expenditure, effective service coverage, patient-centredness, comprehensiveness, equity, health security, and responsiveness.

Limitations

This review has some limitations to consider. Firstly, only articles published in English were included, which may have excluded relevant non-English articles. Additionally, as this review follows a scoping methodology, the focus is on synthesizing available evidence rather than critically evaluating or scoring the quality of the included articles.

Conclusions

Continuous quality improvement is investigated as a continuous and ongoing intervention, where the implementation time can vary across cycles. The CQI team and implementation timelines were critical elements of CQI in different models. Among the commonly used approaches, the PDSA or PDCA cycle is the most frequently employed. In most CQI models, a wide range of tools, nineteen in this review, are utilized to support the improvement process. Cultural, technical, structural, and strategic barriers and facilitators are significant in implementing CQI initiatives. Implementing a CQI initiative aims to improve the health system building blocks, enhance the health service delivery process and outputs, and ultimately prevent morbidity and reduce mortality. For future researchers, considering that CQI is a context-dependent approach, conducting scale-up implementation research on catastrophic health expenditure, effective service coverage, patient-centredness, comprehensiveness, equity, health security, and responsiveness across various settings and health issues would be valuable.

Availability of data and materials

The data used and/or analyzed during the current study are available in this manuscript and/or the supplementary file.


Acknowledgements

Not applicable.

Funding

The authors received no funding.

Author information

Authors and Affiliations

School of Public Health, The University of Queensland, Brisbane, Australia

Aklilu Endalamaw, Resham B Khatri, Tesfaye Setegn Mengistu, Daniel Erku & Yibeltal Assefa

College of Medicine and Health Sciences, Bahir Dar University, Bahir Dar, Ethiopia

Aklilu Endalamaw & Tesfaye Setegn Mengistu

Health Social Science and Development Research Institute, Kathmandu, Nepal

Resham B Khatri

Centre for Applied Health Economics, School of Medicine, Griffith University, Brisbane, Australia

Daniel Erku

Menzies Health Institute Queensland, Griffith University, Brisbane, Australia

International Institute for Primary Health Care in Ethiopia, Addis Ababa, Ethiopia

Eskinder Wolka & Anteneh Zewdie


Contributions

AE conceptualized the study, developed the first draft of the manuscript, and managed feedback from co-authors. YA conceptualized the study, provided feedback, and supervised the whole process. RBK, TSM, DE, EW and AZ provided feedback throughout. All authors read and approved the final manuscript.

Corresponding author

Correspondence to Aklilu Endalamaw.

Ethics declarations

Ethics approval and consent to participate

Not applicable because this research is based on publicly available articles.

Consent for publication

Not applicable.

Competing interests

The authors declare that they have no competing interests.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Supplementary Material 1.

Supplementary Material 2.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/ . The Creative Commons Public Domain Dedication waiver ( http://creativecommons.org/publicdomain/zero/1.0/ ) applies to the data made available in this article, unless otherwise stated in a credit line to the data.


About this article

Cite this article

Endalamaw, A., Khatri, R.B., Mengistu, T.S. et al. A scoping review of continuous quality improvement in healthcare system: conceptualization, models and tools, barriers and facilitators, and impact. BMC Health Serv Res 24, 487 (2024). https://doi.org/10.1186/s12913-024-10828-0


Received: 27 December 2023

Accepted: 05 March 2024

Published: 19 April 2024

DOI: https://doi.org/10.1186/s12913-024-10828-0


Keywords

  • Continuous quality improvement
  • Quality of care




What the data says about crime in the U.S.

A growing share of Americans say reducing crime should be a top priority for the president and Congress to address this year. Around six-in-ten U.S. adults (58%) hold that view today, up from 47% at the beginning of Joe Biden’s presidency in 2021.

We conducted this analysis to learn more about U.S. crime patterns and how those patterns have changed over time.

The analysis relies on statistics published by the FBI, which we accessed through the Crime Data Explorer, and the Bureau of Justice Statistics (BJS), which we accessed through the National Crime Victimization Survey data analysis tool.

To measure public attitudes about crime in the U.S., we relied on survey data from Pew Research Center and Gallup.

Additional details about each data source, including survey methodologies, are available by following the links in the text of this analysis.

Chart: Since 2021, concerns about crime have grown among both Republicans and Democrats.

With the issue likely to come up in this year’s presidential election, here’s what we know about crime in the United States, based on the latest available data from the federal government and other sources.

How much crime is there in the U.S.?

It’s difficult to say for certain. The two primary sources of government crime statistics – the Federal Bureau of Investigation (FBI) and the Bureau of Justice Statistics (BJS) – paint an incomplete picture.

The FBI publishes annual data on crimes that have been reported to law enforcement, but not crimes that haven’t been reported. Historically, the FBI has also only published statistics about a handful of specific violent and property crimes, but not many other types of crime, such as drug crime. And while the FBI’s data is based on information from thousands of federal, state, county, city and other police departments, not all law enforcement agencies participate every year. In 2022, the most recent full year with available statistics, the FBI received data from 83% of participating agencies.

BJS, for its part, tracks crime by fielding a large annual survey of Americans ages 12 and older and asking them whether they were the victim of certain types of crime in the past six months. One advantage of this approach is that it captures both reported and unreported crimes. But the BJS survey has limitations of its own. Like the FBI, it focuses mainly on a handful of violent and property crimes. And since the BJS data is based on after-the-fact interviews with crime victims, it cannot provide information about one especially high-profile type of offense: murder.

All those caveats aside, looking at the FBI and BJS statistics side by side does give researchers a good picture of U.S. violent and property crime rates and how they have changed over time. In addition, the FBI is transitioning to a new data collection system – known as the National Incident-Based Reporting System – that eventually will provide national information on a much larger set of crimes, as well as details such as the time and place they occur and the types of weapons involved, if applicable.

Which kinds of crime are most and least common?

Chart: Theft is the most common property crime, and assault is the most common violent crime.

Property crime in the U.S. is much more common than violent crime. In 2022, the FBI reported a total of 1,954.4 property crimes per 100,000 people, compared with 380.7 violent crimes per 100,000 people.  
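Rates like these are simple normalizations: incidents divided by population, scaled to 100,000. A minimal sketch in Python, using made-up counts rather than FBI figures:

```python
def rate_per_100k(incidents: int, population: int) -> float:
    """Normalize a raw incident count to a rate per 100,000 residents."""
    return incidents / population * 100_000

# Hypothetical city, not FBI data: 6,500 property crimes among 350,000 residents.
print(round(rate_per_100k(6_500, 350_000), 1))  # 1857.1
```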

By far the most common form of property crime in 2022 was larceny/theft, followed by motor vehicle theft and burglary. Among violent crimes, aggravated assault was the most common offense, followed by robbery, rape, and murder/nonnegligent manslaughter.

BJS tracks a slightly different set of offenses from the FBI, but it finds the same overall patterns, with theft the most common form of property crime in 2022 and assault the most common form of violent crime.

How have crime rates in the U.S. changed over time?

Both the FBI and BJS data show dramatic declines in U.S. violent and property crime rates since the early 1990s, when crime spiked across much of the nation.

According to the FBI data, the violent crime rate fell 49% between 1993 and 2022, with large decreases in the rates of robbery (-74%), aggravated assault (-39%) and murder/nonnegligent manslaughter (-34%). It’s not possible to calculate the change in the rape rate during this period because the FBI revised its definition of the offense in 2013.

Chart: U.S. violent and property crime rates have plunged since the 1990s, regardless of data source.

The FBI data also shows a 59% reduction in the U.S. property crime rate between 1993 and 2022, with big declines in the rates of burglary (-75%), larceny/theft (-54%) and motor vehicle theft (-53%).

According to the BJS statistics, the declines in the violent and property crime rates are even steeper than those captured in the FBI data. Per BJS, the U.S. violent and property crime rates each fell 71% between 1993 and 2022.
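Percent changes like the ones above follow the standard formula (new - old) / old * 100. A minimal sketch, pairing the FBI's 2022 violent crime rate (380.7 per 100,000, cited above) with an illustrative 1993 baseline of 747 per 100,000; the baseline is an assumption chosen to be consistent with the 49% decline, not a figure quoted in this analysis:

```python
def pct_change(old: float, new: float) -> float:
    """Percent change from old to new; a negative result means a decline."""
    return (new - old) / old * 100

# 747.0 is an illustrative 1993 baseline; 380.7 is the cited 2022 rate.
print(round(pct_change(747.0, 380.7)))  # -49
```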

While crime rates have fallen sharply over the long term, the decline hasn’t always been steady. There have been notable increases in certain kinds of crime in some years, including recently.

In 2020, for example, the U.S. murder rate saw its largest single-year increase on record – and by 2022, it remained considerably higher than before the coronavirus pandemic. Preliminary data for 2023, however, suggests that the murder rate fell substantially last year.

How do Americans perceive crime in their country?

Americans tend to believe crime is up, even when official data shows it is down.

In 23 of 27 Gallup surveys conducted since 1993, at least 60% of U.S. adults have said there is more crime nationally than there was the year before, despite the downward trend in crime rates during most of that period.

Chart: Americans tend to believe crime is up nationally, less so locally.

While perceptions of rising crime at the national level are common, fewer Americans believe crime is up in their own communities. In every Gallup crime survey since the 1990s, Americans have been much less likely to say crime is up in their area than to say the same about crime nationally.

Public attitudes about crime differ widely by Americans’ party affiliation, race and ethnicity, and other factors. For example, Republicans and Republican-leaning independents are much more likely than Democrats and Democratic leaners to say reducing crime should be a top priority for the president and Congress this year (68% vs. 47%), according to a recent Pew Research Center survey.

How does crime in the U.S. differ by demographic characteristics?

Some groups of Americans are more likely than others to be victims of crime. In the 2022 BJS survey, for example, younger people and those with lower incomes were far more likely to report being the victim of a violent crime than older and higher-income people.

There were no major differences in violent crime victimization rates between male and female respondents or between those who identified as White, Black or Hispanic. But the victimization rate among Asian Americans (a category that includes Native Hawaiians and other Pacific Islanders) was substantially lower than among other racial and ethnic groups.

The same BJS survey asks victims about the demographic characteristics of the offenders in the incidents they experienced.

In 2022, males, younger people and Black Americans accounted for considerably larger shares of perceived offenders in violent incidents than their respective shares of the U.S. population. Men, for instance, accounted for 79% of perceived offenders in violent incidents, compared with 49% of the nation’s 12-and-older population that year. Black Americans accounted for 25% of perceived offenders in violent incidents, about twice their share of the 12-and-older population (12%).
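The "about twice their share" comparison is just a ratio of two percentages; a quick check with the figures above:

```python
offender_share = 25    # % of perceived offenders in violent incidents (cited above)
population_share = 12  # % of the U.S. 12-and-older population (cited above)
print(round(offender_share / population_share, 1))  # 2.1, i.e., about twice
```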

As with all surveys, however, there are several potential sources of error, including the possibility that crime victims’ perceptions about offenders are incorrect.

How does crime in the U.S. differ geographically?

There are big geographic differences in violent and property crime rates.

For example, in 2022, there were more than 700 violent crimes per 100,000 residents in New Mexico and Alaska. That compares with fewer than 200 per 100,000 people in Rhode Island, Connecticut, New Hampshire and Maine, according to the FBI.

The FBI notes that various factors might influence an area’s crime rate, including its population density and economic conditions.

What percentage of crimes are reported to police? What percentage are solved?

Chart: Fewer than half of crimes in the U.S. are reported, and fewer than half of reported crimes are solved.

Most violent and property crimes in the U.S. are not reported to police, and most of the crimes that are reported are not solved.

In its annual survey, BJS asks crime victims whether they reported their crime to police. It found that in 2022, only 41.5% of violent crimes and 31.8% of household property crimes were reported to authorities. BJS notes that there are many reasons why crime might not be reported, including fear of reprisal or of “getting the offender in trouble,” a feeling that police “would not or could not do anything to help,” or a belief that the crime is “a personal issue or too trivial to report.”

Most of the crimes that are reported to police, meanwhile, are not solved, at least based on an FBI measure known as the clearance rate. That’s the share of cases each year that are closed, or “cleared,” through the arrest, charging and referral of a suspect for prosecution, or due to “exceptional” circumstances such as the death of a suspect or a victim’s refusal to cooperate with a prosecution. In 2022, police nationwide cleared 36.7% of violent crimes that were reported to them and 12.1% of the property crimes that came to their attention.
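Both the reporting rate (the share of crimes that victims say were reported) and the clearance rate (the share of reported crimes that police cleared) are percentages of different base counts. A minimal sketch with hypothetical counts, chosen only to land near the 2022 figures cited above:

```python
def share_pct(part: int, whole: int) -> float:
    """Express part as a percentage of whole."""
    return part / whole * 100

# Hypothetical counts, not FBI/BJS data.
violent_occurred, violent_reported, violent_cleared = 1_000, 415, 152

print(round(share_pct(violent_reported, violent_occurred), 1))  # 41.5 (% reported)
print(round(share_pct(violent_cleared, violent_reported), 1))   # 36.6 (% of reported crimes cleared)
```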

Which crimes are most likely to be reported to police? Which are most likely to be solved?

Chart: Most vehicle thefts are reported to police, but relatively few result in arrest.

Around eight-in-ten motor vehicle thefts (80.9%) were reported to police in 2022, making them by far the most commonly reported property crime tracked by BJS. Household burglaries and trespassing offenses were reported to police at much lower rates (44.9% and 41.2%, respectively), while personal theft/larceny and other types of theft were only reported around a quarter of the time.

Among violent crimes – excluding homicide, which BJS doesn’t track – robbery was the most likely to be reported to law enforcement in 2022 (64.0%). It was followed by aggravated assault (49.9%), simple assault (36.8%) and rape/sexual assault (21.4%).

The list of crimes cleared by police in 2022 looks different from the list of crimes reported. Law enforcement officers were generally much more likely to solve violent crimes than property crimes, according to the FBI.

The most frequently solved violent crime tends to be homicide. Police cleared around half of murders and nonnegligent manslaughters (52.3%) in 2022. The clearance rates were lower for aggravated assault (41.4%), rape (26.1%) and robbery (23.2%).

When it comes to property crime, law enforcement agencies cleared 13.0% of burglaries, 12.4% of larcenies/thefts and 9.3% of motor vehicle thefts in 2022.

Are police solving more or fewer crimes than they used to?

Nationwide clearance rates for both violent and property crime are at their lowest levels since at least 1993, the FBI data shows.

Police cleared a little over a third (36.7%) of the violent crimes that came to their attention in 2022, down from nearly half (48.1%) as recently as 2013. During the same period, there were decreases for each of the four types of violent crime the FBI tracks:

Chart: Police clearance rates for violent crimes have declined in recent years.

  • Police cleared 52.3% of reported murders and nonnegligent homicides in 2022, down from 64.1% in 2013.
  • They cleared 41.4% of aggravated assaults, down from 57.7%.
  • They cleared 26.1% of rapes, down from 40.6%.
  • They cleared 23.2% of robberies, down from 29.4%.

The pattern is less pronounced for property crime. Overall, law enforcement agencies cleared 12.1% of reported property crimes in 2022, down from 19.7% in 2013. The clearance rate for burglary didn’t change much, but it fell for larceny/theft (to 12.4% in 2022 from 22.4% in 2013) and motor vehicle theft (to 9.3% from 14.2%).

Note: This is an update of a post originally published on Nov. 20, 2020.


John Gramlich is an associate director at Pew Research Center



National Center for Science and Engineering Statistics


Business R&D Performance in the United States Tops $600 Billion in 2021

September 28, 2023

Table 1. Funds spent for business R&D performed in the United States, by type of R&D, source of funds, and size of company: 2018–21.

Notes: Domestic R&D performance is the cost of R&D performed by the respondent company, whether paid for by the company or by others outside it. R&D comprises creative and systematic work undertaken to increase the stock of knowledge and devise new applications of available knowledge: basic research (acquiring new knowledge or understanding without specific immediate commercial applications), applied research (solving a specific problem or meeting a specific commercial objective) and development (systematic work directed at producing new or improved products or processes). Statistics include only companies with 10 or more domestic employees; detail may not add to total because of rounding.

Source: National Center for Science and Engineering Statistics and Census Bureau, Business Enterprise Research and Development Survey.

R&D Performance, by Type of R&D, Industry Sector, and Source of Funding

In 2021, of the $602 billion that companies spent on R&D, $40 billion (7%) was spent on basic research, $86 billion (14%) on applied research, and $476 billion (79%) on development (table 1). In 2021, companies in manufacturing industries performed $326 billion (54%) of domestic R&D, defined as R&D performed in the 50 states and Washington, DC (table 2). Most of the funding came from these companies’ own funds (88%). Companies in nonmanufacturing industries performed $276 billion of domestic R&D (46% of total domestic R&D performance), 87% of which was paid for from companies’ own funds.
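The type-of-R&D percentages are shares of the $602 billion total; a quick consistency check of the figures in the paragraph above:

```python
# Business R&D by type, in $billions (figures from the text above).
rd_billions = {"basic research": 40, "applied research": 86, "development": 476}
total = sum(rd_billions.values())  # 602

for kind, amount in rd_billions.items():
    print(f"{kind}: {amount / total:.0%}")  # basic 7%, applied 14%, development 79%
```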

The U.S. federal government was a large source of external funding for R&D (also referred to as R&D paid for by others) across all industries. Of the $75 billion paid for by others, the federal government accounted for $24 billion. Seventy-four percent of federal government funding went to three industry groups: aerospace products and parts (North American Industry Classification System [NAICS] code 3364) ($11 billion), scientific research and development services (NAICS 5417) ($4 billion), and computer and electronic products (NAICS 334) ($3 billion). Companies also received funds from other U.S. companies ($27 billion) and foreign companies, including foreign parent companies of U.S. subsidiaries ($23 billion). Eighteen billion dollars (69%) of all business R&D funded by other U.S. companies was for scientific research and development services (NAICS 5417). The distribution of foreign company R&D funding was spread more broadly across multiple industries (table 2). (See “Survey Information and Data Availability” for information on the availability of data tables with full industry detail.)

Table 2. Funds spent for business R&D performed in the United States, by source of funds, selected industry, and company size: 2021.

Notes: Statistics are representative of companies located in the United States that performed or funded $50,000 or more of R&D and include only companies with 10 or more domestic employees. Industry classification was based on the dominant business code for domestic R&D performance, where available; detail may not add to total because of rounding.

Source: National Center for Science and Engineering Statistics and Census Bureau, Business Enterprise Research and Development Survey, 2021.

Sales, R&D Intensity, and Employment of Companies That Performed or Funded R&D

U.S. companies that performed or funded R&D reported domestic net sales of $13 trillion in 2021 (table 3). Determining the amount of domestic net sales and operating revenues was left to the reporting company; however, guidance was given to include revenues from foreign operations and subsidiaries and from discontinued operations, and to exclude intracompany transfers, returns, allowances, freight charges, and excise, sales, and other revenue-based taxes. For all industries, the R&D intensity (R&D-to-sales ratio) was 4.6%; for manufacturers, 5.0%; and for nonmanufacturers, 4.2%. Manufacturing industries with high levels of R&D intensity in 2021 were pharmaceuticals and medicines (NAICS 3254) (16.1%) and computer and electronic products (NAICS 334) (13.0%). Among the nonmanufacturing industries, industries with high levels of R&D intensity were scientific research and development services (NAICS 5417) (41.2%), software publishers (NAICS 5112) (12.9%), and computer systems design and related services (NAICS 5415) (10.2%).
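R&D intensity is the ratio of domestic R&D performance to domestic net sales, expressed as a percentage. A minimal sketch reproducing the all-industries figure from the totals above ($602 billion of R&D against roughly $13 trillion in sales):

```python
def rd_intensity_pct(domestic_rd_billions: float, net_sales_billions: float) -> float:
    """R&D-to-sales ratio, expressed as a percentage."""
    return domestic_rd_billions / net_sales_billions * 100

# $602 billion of R&D over $13,000 billion ($13 trillion) of net sales.
print(round(rd_intensity_pct(602, 13_000), 1))  # 4.6
```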

Businesses that performed or funded R&D employed 23.7 million people in the United States in 2021 (table 3). Employment statistics in this InfoBrief are headcounts unless they are designated as full-time equivalent (FTE) estimates. R&D employees include researchers (defined as R&D scientists and engineers and their managers) and the technicians, technologists, and support staff members who work on R&D or who provide direct support to R&D activities. Approximately 2.1 million (9%) were business R&D employees. The number of persons employed who were assigned full time to R&D, plus a prorated number of employees who worked on R&D only part of the time, was 1.9 million FTEs, of which 1.3 million FTEs were R&D researchers.

Of the 2.1 million people working on R&D in companies that performed or funded business R&D in 2021, 1.5 million were men and 0.6 million were women; 48% of the men and 45% of the women worked in manufacturing industries ( table 4 ). Researchers—that is, scientists, engineers, and their managers—accounted for 1.4 million of the 2.1 million R&D workers (67%). Of the R&D workers, 130,000 (9%) held PhD degrees. R&D technicians numbered 501,000, and 205,000 were grouped as other supporting staff.

Table 3. Sales, R&D, R&D intensity, and employment for companies that performed or funded business R&D in the United States, by selected industry and company size: 2021.

Table 4. Domestic employment, R&D employment by sex and work activity, R&D researchers by level of education, and full-time equivalent researcher employment for companies that performed or funded business R&D in the United States, by industrial sector: 2021.

Notes: Sales figures cover goods sold or services rendered by R&D-performing or R&D-funding companies located in the United States to customers outside of the company, excluding intracompany transfers, returns, allowances, freight charges, and excise, sales, and other revenue-based taxes. R&D intensity is domestic R&D divided by the domestic net sales of companies that performed or funded R&D. Employment data recorded on 12 March represent figures for the year; R&D employment headcounts include researchers, R&D managers, technicians, clerical staff, and others assigned to R&D groups. Statistics include only companies with 10 or more domestic employees and exclude federally funded research and development centers; detail may not add to total because of rounding.

R&D Performance, by Company Size

Small- and medium-sized companies (10–249 domestic employees) performed 9.8% of the nation’s total business R&D in 2021 (table 3). (Company size classifications changed for 2017 and subsequent years in response to the revised Frascati Manual; see Organisation for Economic Co-operation and Development (OECD). 2015. Frascati Manual: Guidelines for Collecting and Reporting Data on Research and Experimental Development. Paris: OECD Publishing. Available at https://www.oecd-ilibrary.org/science-and-technology/frascati-manual-2015_9789264239012-en .) For these companies as a group, the R&D intensity was 8.8%. These companies accounted for 5% of sales and employed 7% of the 23.7 million employees who worked for R&D-performing or R&D-funding companies. They employed 18% of the 2.1 million employees engaged in business R&D in the United States.

Large companies with 250–24,999 domestic employees performed 52% of the nation’s total business R&D in 2021, and their R&D intensity was 4.7%. They accounted for 51% of sales, employed 42% of those who worked for R&D-performing or R&D-funding companies, and employed 51% of R&D employees in the United States.

The largest companies (25,000 or more domestic employees) performed 38% of the nation’s total business R&D in 2021, and their R&D intensity was 4.0%. They accounted for 44% of sales, employed 51% of those who worked for R&D-performing or R&D-funding companies, and employed 31% of business R&D employees in the United States.

R&D Performance, by State

In 2021, of the $602 billion of R&D performed in the United States, businesses in California alone accounted for 35.1% (table 5). Other states with large amounts of business R&D were Washington (8.1% of the national total in 2021), Massachusetts (6.6%), Texas (4.7%), New York (4.4%), and New Jersey (4.2%). (In addition to statistics for all states and for all states by industry, below-state-level statistics are available in the full set of data tables and in other InfoBriefs; see Shackelford B, Wolfe R; NCSES. 2019. Over Half of U.S. Business R&D Performed in 10 Metropolitan Areas in 2015. InfoBrief NSF 19-322. Available at https://www.nsf.gov/statistics/2019/nsf19322/ ; and Shackelford B, Wolfe R; NCSES. 2020. Businesses Performed 60% of Their U.S. R&D in 10 Metropolitan Areas in 2018. InfoBrief NSF 21-331. Available at https://ncses.nsf.gov/pubs/nsf21331 .)

Funds spent for business R&D performed in the United States, by state and source of funds: 2021

a All R&D is the cost of domestic R&D paid for by the respondent company and others outside of the company and performed by the respondent company.
b Includes data reported that were not allocated to a specific state by multi-establishment companies. For single-establishment companies, data reported were allocated to the state in the address used to mail the survey form.

Capital Expenditures

Companies that performed or funded R&D in the United States in 2021 spent $793 billion on capital, that is, assets with expected useful lives of more than 1 year (table 6). Of this amount, $53 billion (7%) was for assets used for domestic R&D operations (i.e., land acquisitions, buildings and land improvement, equipment, capitalized software, and other assets). Companies in manufacturing industries spent $28 billion on capital for domestic R&D, and companies in nonmanufacturing industries spent $24 billion. Industries with high levels of capital expenditures on assets used for domestic R&D in 2021 were pharmaceuticals and medicines (NAICS 3254) ($7.5 billion, or 14% of national capital expenditures on assets used for R&D) and semiconductor and other electronic products (NAICS 3344) ($5 billion, or 9%). Among all types of capital assets, manufacturing industries spent the most on equipment ($15 billion, or 53% of their total capital assets used for domestic R&D), and nonmanufacturing industries spent the most on capitalized software ($13.7 billion, or 56%).

Capital expenditures in the United States, total and used for domestic R&D, by type of expenditure, industry, and company size: 2021

* = amount < $500,000; i = more than 50% of the estimate is a combination of imputation and reweighting to account for nonresponse; r = relative standard error is more than 50%.

a Domestic R&D is the R&D paid for by the respondent company and others outside of the company and performed by the company.
b Capital expenditures are payments by a business for assets that usually have a useful life of more than 1 year. The value of assets acquired or improved through capital expenditures is recorded on a company’s balance sheet. BERD Survey statistics exclude the cost of assets acquired through mergers and acquisitions.
c Capital expenditures for long-lived assets used in a company’s R&D operations are not included in its R&D expense, but any depreciation recorded for those assets is included in its R&D expense. For 2021, depreciation associated with domestic R&D paid for and performed by the company was $18.4 billion and with domestic R&D performed by the company and paid for by others was $2.7 billion.
d Includes the cost of purchased or improved buildings and other facilities that are fixed to the land.
e Includes the cost of other capital expenditures, including purchased patents and other intangible assets, and expenditures not distributed among the categories shown.
f Includes only companies with 10 or more domestic employees.

Detail may not add to total because of rounding. Industry classification was based on dominant business code for domestic R&D performance, where available. For companies that did not report business codes, the classification used for sampling was assigned. An estimate range may be displayed in place of a single estimate to avoid disclosing operations of individual companies.

National Center for Science and Engineering Statistics and U.S. Census Bureau, Business Enterprise Research and Development Survey, 2021.

Survey Information and Data Availability

The sample for the BERD Survey was selected to represent all for-profit, nonfarm companies that were publicly or privately held, had 10 or more employees in the United States, and performed or funded R&D either domestically or abroad. The estimates in this InfoBrief are based on responses from a sample of the population and may differ from actual values because of sampling variability or other factors. As a result, apparent differences between the estimates for two or more groups may not be statistically significant. All comparative statements in this InfoBrief have undergone statistical testing and are significant at the 90% confidence level unless otherwise noted. The variances of estimates in this report were calculated using design-based formulas. Also, because the statistics from the survey are based on a sample, they are subject to both sampling and nonsampling errors. (See the 2021 “Technical Notes” at https://ncses.nsf.gov/surveys/business-enterprise-research-development/.)

Beginning in survey year 2018, companies that performed or funded less than $50,000 of R&D were excluded from tabulation.

In this InfoBrief, money amounts are expressed in current U.S. dollars and are not adjusted for inflation. A company is defined as a business organization located in the United States, either U.S. owned or a U.S. affiliate of a foreign parent company, of one or more establishments under common ownership or control.

For 2020, a total of 47,500 companies were sampled to represent the population of 1,140,000 companies; for 2021, a total of 47,500 companies were sampled, representing 1,137,000 companies. The actual numbers of reporting units in the sample that remained within the scope of the survey between sample selection and tabulation were 44,500 for 2020 and 44,000 for 2021. These lower counts represent the number of reporting units that were determined to be within the scope of the survey after all data collected were processed. Reasons for the reduced counts include mergers, acquisitions, and instances where companies had fewer than 10 employees in the United States or had gone out of business in the interim. Of these in-scope reporting units, 67% were considered to have met the criteria for a complete response to the 2020 survey; 69% fulfilled the 2021 complete response criteria. Coverage of the previous year’s known positive R&D stratum for 2020 was 92%; the coverage rate for 2021 was also 92%. Industry classification was based on the dominant business activity for domestic R&D performance, where available. For reporting units that did not report business activity codes for R&D, the classification used for sampling was assigned.

The estimation methodology for state estimates in the BERD Survey takes the form of a hybrid estimator, combining the unweighted reported amount, by state, with a weighted amount apportioned (or raked) across states with relevant industrial activity. The hybrid estimator smooths the estimate over states with R&D activity, by industry, and accounts for real observed change within a state. Table 5 shows the adjusted state estimates after this estimation methodology was applied.
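The InfoBrief does not publish the estimator’s formula, but the raking step it references is a standard technique (iterative proportional fitting). The sketch below illustrates raking alone, not NCSES’s full hybrid estimator, and every number in it is hypothetical:

```python
import numpy as np

def rake(seed, state_totals, industry_totals, iters=500):
    """Iterative proportional fitting: rescale a seed table of reported
    amounts until its margins match the control totals."""
    t = seed.astype(float).copy()
    for _ in range(iters):
        t *= (state_totals / t.sum(axis=1))[:, None]     # match state margins
        t *= (industry_totals / t.sum(axis=0))[None, :]  # match industry margins
        if np.allclose(t.sum(axis=1), state_totals):
            return t
    return t

# Hypothetical reported R&D ($ millions): 3 states x 2 industries
reported = np.array([[120.0, 40.0], [60.0, 30.0], [10.0, 5.0]])
state_totals = np.array([180.0, 100.0, 20.0])  # assumed control totals
industry_totals = np.array([210.0, 90.0])      # margins must sum consistently
print(rake(reported, state_totals, industry_totals).round(1))
```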

The full set of data tables from the 2021 survey will be available at the BERD Survey page. Individual data tables and tables with relative standard errors and imputation rates from the 2021 survey are available from the author in advance of the full release. To minimize reporting burden, survey items are rotated on and off the survey on an odd- and even-numbered year schedule. Statistics on patents, intellectual property, and technology transfer activities were rotated off the survey for 2021. Items rotated on the survey for 2021 include questions on R&D performed by others by type of performer, federal R&D by government agency, and R&D by application area.

The BERD Survey contains confidential data that are protected under Title 13 and Title 26 of the U.S. Code. Restricted microdata can be accessed at the secure Federal Statistical Research Data Centers (FSRDCs) administered by the Census Bureau. FSRDCs are partnerships between federal statistical agencies and leading research institutions. FSRDCs provide secure environments supporting qualified researchers using restricted-access data while protecting respondent confidentiality. Researchers interested in using the microdata can submit a proposal to the Census Bureau, which evaluates proposals based on their benefit to the Census Bureau, scientific merit, feasibility, and risk of disclosure. To learn more about the FSRDCs and how to apply, please visit https://www.census.gov/about/adrm/fsrdc.html .

Suggested Citation

Britt R; National Center for Science and Engineering Statistics (NCSES). 2023. Business R&D Performance in the United States Tops $600 Billion in 2021 . NSF 23-350. Alexandria, VA: National Science Foundation. Available at http://ncses.nsf.gov/pubs/nsf23350 .

1 NSF has cosponsored an annual business R&D survey since 1953. The Survey of Industrial Research and Development (SIRD) collected data for 1953–2007, and its successor, the Business R&D and Innovation Survey (BRDIS), collected data for 2008–16. Beginning with 2017, the collection of innovation data was moved to the Annual Business Survey (ABS), another survey cosponsored with the Census Bureau, and BRDIS became the Business Research and Development Survey (BRDS). Beginning with 2019, the business R&D data collection reported here was renamed the Business Enterprise Research and Development (BERD) Survey for international comparability.

2 Determining the amount of domestic net sales and operating revenues was left to the reporting company. However, guidance was given to include revenues from foreign operations and subsidiaries and from discontinued operations and to exclude intracompany transfers, returns, allowances, freight charges, and excise, sales, and other revenue-based taxes.

3 Employment statistics in this InfoBrief are headcounts unless they are designated as full-time equivalent (FTE) estimates. R&D employees include researchers (defined as R&D scientists and engineers and their managers) and the technicians, technologists, and support staff members who work on R&D or who provide direct support to R&D activities.

4 The number of persons employed who were assigned full time to R&D plus a prorated number of employees who worked on R&D only part of the time was 1.9 million FTEs, of which 1.3 million FTEs were R&D researchers.

5 Company size classifications changed for 2017 and subsequent years in response to the revised Frascati Manual; see Organisation for Economic Co-operation and Development (OECD). 2015. Frascati Manual: Guidelines for Collecting and Reporting Data on Research and Experimental Development. The Measurement of Scientific, Technological, and Innovation Activities. Paris: OECD Publishing. Available at https://www.oecd-ilibrary.org/science-and-technology/frascati-manual-2015_9789264239012-en. Anderson and Kindlon (2019) provide estimates of R&D performance and employment using these new classifications over 2008–15. The authors also compare the trends to those observed in SIRD for the time prior to 2008. The ABS, also cosponsored by NCSES and the Census Bureau, collects R&D data from companies with fewer than 10 employees for 2017 and beyond. See Anderson G, Kindlon A; NCSES. 2019. Indicators of R&D in Small Businesses: Data from the 2009–15 Business R&D and Innovation Survey. InfoBrief NSF 19-316. Alexandria, VA: National Science Foundation. Available at https://www.nsf.gov/statistics/2019/nsf19316/.

6 In addition to statistics for all states and for all states by industry, below-state level statistics are available in the full set of data tables and in other InfoBriefs; see Shackelford B, Wolfe R; NCSES. 2019. Over Half of U.S. Business R&D Performed in 10 Metropolitan Areas in 2015. InfoBrief NSF 19-322. Alexandria, VA: National Science Foundation. Available at https://www.nsf.gov/statistics/2019/nsf19322/. Also see Shackelford B, Wolfe R; NCSES. 2020. Businesses Performed 60% of Their U.S. R&D in 10 Metropolitan Areas in 2018. InfoBrief NSF 21-331. Alexandria, VA: National Science Foundation. Available at https://ncses.nsf.gov/pubs/nsf21331. Information and statistics on U.S. state trends in R&D, science and engineering education, workforce, patents and publications, and knowledge-intensive industries are also available in the Science and Engineering State Indicators data tool at https://ncses.nsf.gov/indicators/states.

7 The Census Bureau reviewed the information in this InfoBrief for unauthorized disclosure of confidential information and approved the disclosure avoidance practices applied (Project No. P-7504682, Disclosure Review Board (DRB) approval number: CBDRB-FY23-0161).

Report Author

Ronda Britt, Survey Manager, NCSES. Tel: (703) 292-7765. E-mail: [email protected]

National Center for Science and Engineering Statistics, Directorate for Social, Behavioral and Economic Sciences, National Science Foundation, 2415 Eisenhower Avenue, Suite W14200, Alexandria, VA 22314. Tel: (703) 292-8780. FIRS: (800) 877-8339. TDD: (800) 281-8749. E-mail: [email protected]

Source Data & Analysis

Data Tables (NSF 23-351)


A New Use for Wegovy Opens the Door to Medicare Coverage for Millions of People with Obesity

Juliette Cubanski, Tricia Neuman, Nolan Sroczynski, and Anthony Damico. Published: Apr 24, 2024

The FDA recently approved a new use for Wegovy (semaglutide), the blockbuster anti-obesity drug, to reduce the risk of heart attacks and stroke in people with cardiovascular disease who are overweight or obese. Wegovy belongs to a class of medications called GLP-1 (glucagon-like peptide-1) agonists that were initially approved to treat type 2 diabetes but are also highly effective anti-obesity drugs. The new FDA-approved indication for Wegovy paves the way for Medicare coverage of this drug and broader coverage by other insurers. Medicare is currently prohibited by law from covering Wegovy and other medications when used specifically for obesity. However, semaglutide is covered by Medicare as a treatment for diabetes, branded as Ozempic.

What does the FDA’s decision mean for Medicare coverage of Wegovy?

The FDA’s decision opens the door to Medicare coverage of Wegovy, which was first approved by the FDA as an anti-obesity medication. Soon after the FDA’s approval of the new use for Wegovy, the Centers for Medicare & Medicaid Services (CMS) issued a memo indicating that Medicare Part D plans can add Wegovy to their formularies now that it has a medically accepted indication that is not specifically excluded from Medicare coverage. Because Wegovy is a self-administered injectable drug, coverage will be provided under Part D, Medicare’s outpatient drug benefit offered by private stand-alone drug plans and Medicare Advantage plans, not Part B, which covers physician-administered drugs.

How many Medicare beneficiaries could be eligible for coverage of Wegovy for its new use?

Figure 1: An Estimated 1 in 4 Medicare Beneficiaries With Obesity or Overweight Could Be Eligible for Medicare Part D Coverage of Wegovy to Reduce the Risk of Serious Heart Problems

As shown in Figure 1, an estimated 3.6 million Medicare beneficiaries, or 1 in 4 beneficiaries with obesity or overweight, could be eligible for Part D coverage of Wegovy under the new indication. Of these 3.6 million beneficiaries, 1.9 million also had diabetes (other than Type 1) and may already have been eligible for Medicare coverage of GLP-1s as diabetes treatments prior to the FDA’s approval of the new use of Wegovy.

Not all people who are eligible based on the new indication are likely to take Wegovy, however. Some might be dissuaded by the potential side effects and adverse reactions. Out-of-pocket costs could also be a barrier. Based on the list price of $1,300 per month (not including rebates or other discounts negotiated by pharmacy benefit managers), Wegovy could be covered as a specialty tier drug, where Part D plans are allowed to charge coinsurance of 25% to 33%. Because coinsurance amounts are pegged to the list price, Medicare beneficiaries required to pay coinsurance could face monthly costs of $325 to $430 before they reach the new cap on annual out-of-pocket drug spending established by the Inflation Reduction Act – around $3,300 in 2024, based on brand drugs only, and $2,000 in 2025. But even paying $2,000 out of pocket would still be beyond the reach of many people with Medicare who live on modest incomes. Ultimately, how much beneficiaries pay out of pocket will depend on Part D plan coverage and formulary tier placement of Wegovy.
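To make the coinsurance arithmetic concrete, here is a quick back-of-the-envelope calculation in Python using only the figures cited above (actual amounts will vary by plan and formulary tier):

```python
list_price = 1300  # Wegovy list price per month, USD

# Specialty-tier coinsurance of 25% to 33% is pegged to the list price
for coinsurance in (0.25, 0.33):
    print(f"{coinsurance:.0%} coinsurance: ${list_price * coinsurance:,.0f}/month")
# -> $325/month and $429/month, i.e., the $325 to $430 range cited above

# Months of coinsurance payments before hitting the $2,000 cap (2025)
months_to_cap = 2000 / (list_price * 0.25)
print(f"Months to reach the 2025 out-of-pocket cap at 25%: {months_to_cap:.1f}")
```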

Further, some people may have difficulty accessing Wegovy if Part D plans apply prior authorization and step therapy tools to manage costs and ensure appropriate use. These factors could have a dampening effect on use by Medicare beneficiaries, even among the target population.

When will Medicare Part D plans begin covering Wegovy?

Some Part D plans have already announced that they will begin covering Wegovy this year, although it is not yet clear how widespread coverage will be in 2024. While Medicare drug plans can add new drugs to their formularies during the year to reflect new approvals and expanded indications, plans are not required to cover every new drug that comes to market. Part D plans are required to cover at least two drugs in each category or class and all or substantially all drugs in six protected classes. However, facing a relatively high price and potentially large patient population for Wegovy, many Part D plans might be reluctant to expand coverage now, since they can’t adjust their premiums mid-year to account for higher costs associated with use of this drug. So, broader coverage in 2025 could be more likely.

How might expanded coverage of Wegovy affect Medicare spending?

The impact on Medicare spending associated with expanded coverage of Wegovy will depend in part on how many Part D plans add coverage for it and the extent to which plans apply restrictions on use like prior authorization; how many people who qualify to take the drug use it; and negotiated prices paid by plans. For example, if plans receive a 50% rebate on the list price of $1,300 per month (or $15,600 per year), that could mean annual net costs per person around $7,800. If 10% of the target population (an estimated 360,000 people) uses Wegovy for a full year, that would amount to additional net Medicare Part D spending of $2.8 billion for one year for this one drug alone.
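The spending scenario in this paragraph can be reproduced directly; the sketch below simply restates its assumptions (a 50% rebate and 10% uptake among an estimated 3.6 million eligible beneficiaries) as Python:

```python
gross_per_year = 1300 * 12            # $15,600 gross cost per person per year
net_per_year = gross_per_year * 0.5   # assumed 50% rebate -> $7,800 net

eligible = 3_600_000                  # estimated target population
users = int(eligible * 0.10)          # assumed 10% uptake -> 360,000 people

added_spending = users * net_per_year
print(f"Added net Part D spending: ${added_spending / 1e9:.1f} billion")  # $2.8 billion
```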

It’s possible that Medicare could select semaglutide for drug price negotiation as early as 2025, based on the earliest FDA approval of Ozempic in late 2017. For small-molecule drugs like semaglutide, at least seven years must have passed from its FDA approval date to be eligible for selection, and for drugs with multiple FDA approvals, CMS will use the earliest approval date to make this determination. If semaglutide is selected for negotiation next year, a negotiated price would be available beginning in 2027. This could help to lower Medicare and out-of-pocket spending on semaglutide products, including Wegovy as well as Ozempic and Rybelsus, the oral formulation approved for type 2 diabetes. As of 2022, gross Medicare spending on Ozempic alone placed it sixth among the 10 top-selling drugs in Medicare Part D, with annual gross spending of $4.6 billion, based on KFF analysis. This estimate does not include rebates, which Medicare’s actuaries estimated to be 31.5% overall in 2022 but could be as high as 69% for Ozempic, according to one estimate.

What does this mean for Medicare coverage of anti-obesity drugs?

For now, use of GLP-1s specifically for obesity continues to be excluded from Medicare coverage by law. But the FDA’s decision signals a turning point for broader Medicare coverage of GLP-1s, since Wegovy can now be used to reduce the risk of heart attack and stroke by people with cardiovascular disease and obesity or overweight, and not only as an anti-obesity drug. And more pathways to Medicare coverage could open up if these drugs gain FDA approval for other uses. For example, Eli Lilly has just reported clinical trial results showing the benefits of its GLP-1, Zepbound (tirzepatide), in reducing the occurrence of sleep apnea events among people with obesity or overweight. Lilly reportedly plans to seek FDA approval for this use; if approved, the drug would be the first pharmaceutical treatment on the market for sleep apnea.

If more Medicare beneficiaries with obesity or overweight gain access to GLP-1s based on other approved uses for these medications, that could reduce the cost of proposed legislation to lift the statutory prohibition on Medicare coverage of anti-obesity drugs. This is because the Congressional Budget Office (CBO), Congress’s official scorekeeper for proposed legislation, would incorporate the cost of coverage for these other uses into its baseline estimates for Medicare spending. The incremental cost of changing the law to allow Medicare coverage for anti-obesity drugs would therefore be lower than it would be without FDA approval of these drugs for other uses. Ultimately, how widely Medicare Part D coverage of GLP-1s expands could have far-reaching effects on people with obesity and on Medicare spending.


Quantitative Data – Types, Methods and Examples

Definition:

Quantitative data refers to numerical data that can be measured or counted. This type of data is often used in scientific research and is typically collected through methods such as surveys and experiments, then examined using statistical analysis.

Quantitative Data Types

There are two main types of quantitative data: discrete and continuous (a short example follows the list below).

  • Discrete data: Discrete data refers to numerical values that can only take on specific, distinct values. This type of data is typically represented as whole numbers and cannot be broken down into smaller units. Examples of discrete data include the number of students in a class, the number of cars in a parking lot, and the number of children in a family.
  • Continuous data: Continuous data refers to numerical values that can take on any value within a certain range or interval. This type of data is typically represented as decimal or fractional values and can be broken down into smaller units. Examples of continuous data include measurements of height, weight, temperature, and time.
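The distinction is easy to see in code. Here is a minimal illustration in Python with pandas; the column names and values are made up for the example:

```python
import pandas as pd

df = pd.DataFrame({
    "students_in_class": [28, 31, 25],       # discrete: countable whole numbers
    "height_cm": [162.5, 175.2, 180.9],      # continuous: any value in a range
})
print(df.dtypes)  # int64 vs. float64 roughly mirrors discrete vs. continuous
```

Note that the dtype is only a rough proxy: a float column can still hold discrete data (e.g., counts stored as floats), so the discrete/continuous judgment ultimately rests on what the variable measures, not on how it is stored.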

Quantitative Data Collection Methods

There are several common methods for collecting quantitative data. Some of these methods include:

  • Surveys : Surveys involve asking a set of standardized questions to a large number of people. Surveys can be conducted in person, over the phone, via email or online, and can be used to collect data on a wide range of topics.
  • Experiments : Experiments involve manipulating one or more variables and observing the effects on a specific outcome. Experiments can be conducted in a controlled laboratory setting or in the real world.
  • Observational studies : Observational studies involve observing and collecting data on a specific phenomenon without intervening or manipulating any variables. Observational studies can be conducted in a natural setting or in a laboratory.
  • Secondary data analysis : Secondary data analysis involves using existing data that was collected for a different purpose to answer a new research question. This method can be cost-effective and efficient, but it is important to ensure that the data is appropriate for the research question being studied.
  • Physiological measures: Physiological measures involve collecting data on biological or physiological processes, such as heart rate, blood pressure, or brain activity.
  • Computerized tracking: Computerized tracking involves collecting data automatically from electronic sources, such as social media, online purchases, or website analytics.

Quantitative Data Analysis Methods

There are several methods for analyzing quantitative data, including the following (a short sketch of the first two methods appears after this list):

  • Descriptive statistics: Descriptive statistics are used to summarize and describe the basic features of the data, such as the mean, median, mode, standard deviation, and range.
  • Inferential statistics : Inferential statistics are used to make generalizations about a population based on a sample of data. These methods include hypothesis testing, confidence intervals, and regression analysis.
  • Data visualization: Data visualization involves creating charts, graphs, and other visual representations of the data to help identify patterns and trends. Common types of data visualization include histograms, scatterplots, and bar charts.
  • Time series analysis: Time series analysis involves analyzing data that is collected over time to identify patterns and trends in the data.
  • Multivariate analysis : Multivariate analysis involves analyzing data with multiple variables to identify relationships between the variables.
  • Factor analysis : Factor analysis involves identifying underlying factors or dimensions that explain the variation in the data.
  • Cluster analysis: Cluster analysis involves identifying groups or clusters of observations that are similar to each other based on multiple variables.
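As referenced above, here is a minimal Python sketch of descriptive and inferential analysis on two made-up samples (the numbers are illustrative only; the t-test assumes SciPy is installed):

```python
import statistics
from scipy import stats  # SciPy, for the inferential step

sample_a = [12, 15, 14, 10, 18, 16, 13]
sample_b = [20, 22, 19, 24, 21, 23, 18]

# Descriptive statistics: summarize each sample's basic features
for name, s in (("A", sample_a), ("B", sample_b)):
    print(f"{name}: mean={statistics.mean(s):.1f}, "
          f"median={statistics.median(s)}, stdev={statistics.stdev(s):.2f}")

# Inferential statistics: test whether the two group means differ
t, p = stats.ttest_ind(sample_a, sample_b)
print(f"t = {t:.2f}, p = {p:.4f}")  # a small p suggests a real difference
```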

Quantitative Data Formats

Quantitative data can be represented in different formats, depending on the nature of the data and the purpose of the analysis. Here are some common formats (a small charting example follows the list):

  • Tables : Tables are a common way to present quantitative data, particularly when the data involves multiple variables. Tables can be used to show the frequency or percentage of data in different categories or to display summary statistics.
  • Charts and graphs: Charts and graphs are useful for visualizing quantitative data and can be used to highlight patterns and trends in the data. Some common types of charts and graphs include line charts, bar charts, scatterplots, and pie charts.
  • Databases : Quantitative data can be stored in databases, which allow for easy sorting, filtering, and analysis of large amounts of data.
  • Spreadsheets : Spreadsheets can be used to organize and analyze quantitative data, particularly when the data is relatively small in size. Spreadsheets allow for calculations and data manipulation, as well as the creation of charts and graphs.
  • Statistical software : Statistical software, such as SPSS, R, and SAS, can be used to analyze quantitative data. These programs allow for more advanced statistical analyses and data modeling, as well as the creation of charts and graphs.
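As an example of the charts-and-graphs format, here is a minimal bar chart in Python with matplotlib; the sales figures are invented for illustration:

```python
import matplotlib.pyplot as plt

# Hypothetical monthly sales figures to visualize
months = ["Jan", "Feb", "Mar", "Apr", "May", "Jun"]
sales = [120, 135, 128, 150, 162, 158]

fig, ax = plt.subplots()
ax.bar(months, sales)  # bar chart: good for comparing categories
ax.set_xlabel("Month")
ax.set_ylabel("Units sold")
ax.set_title("Monthly sales (hypothetical data)")
plt.show()
```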

Quantitative Data Gathering Guide

Here is a basic guide for gathering quantitative data:

  • Define the research question: The first step in gathering quantitative data is to clearly define the research question. This will help determine the type of data to be collected, the sample size, and the methods of data analysis.
  • Choose the data collection method: Select the appropriate method for collecting data based on the research question and available resources. This could include surveys, experiments, observational studies, or other methods.
  • Determine the sample size: Determine the appropriate sample size for the research question. This will depend on the level of precision needed and the variability of the population being studied (a sample-size sketch appears after this list).
  • Develop the data collection instrument: Develop a questionnaire or survey instrument that will be used to collect the data. The instrument should be designed to gather the specific information needed to answer the research question.
  • Pilot test the data collection instrument : Before collecting data from the entire sample, pilot test the instrument on a small group to identify any potential problems or issues.
  • Collect the data: Collect the data from the selected sample using the chosen data collection method.
  • Clean and organize the data : Organize the data into a format that can be easily analyzed. This may involve checking for missing data, outliers, or errors.
  • Analyze the data: Analyze the data using appropriate statistical methods. This may involve descriptive statistics, inferential statistics, or other types of analysis.
  • Interpret the results: Interpret the results of the analysis in the context of the research question. Identify any patterns, trends, or relationships in the data and draw conclusions based on the findings.
  • Communicate the findings: Communicate the findings of the analysis in a clear and concise manner, using appropriate tables, graphs, and other visual aids as necessary. The results should be presented in a way that is accessible to the intended audience.
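For the sample-size step above, one common textbook approach for estimating a proportion is n = z²p(1−p)/e². A minimal Python sketch under that assumption (this is one formula among many; the right calculation depends on the study design):

```python
import math

def sample_size_for_proportion(p=0.5, margin=0.05, z=1.96):
    """Minimum n to estimate a proportion p within +/- margin,
    where z is the normal critical value (1.96 for ~95% confidence)."""
    return math.ceil(z**2 * p * (1 - p) / margin**2)

print(sample_size_for_proportion())             # 385: conservative default, p=0.5
print(sample_size_for_proportion(margin=0.03))  # 1068: tighter margin needs more
```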

Examples of Quantitative Data

Here are some examples of quantitative data:

  • Height of a person (measured in inches or centimeters)
  • Weight of a person (measured in pounds or kilograms)
  • Temperature (measured in Fahrenheit or Celsius)
  • Age of a person (measured in years)
  • Number of cars sold in a month
  • Amount of rainfall in a specific area (measured in inches or millimeters)
  • Number of hours worked in a week
  • GPA (grade point average) of a student
  • Sales figures for a product
  • Time taken to complete a task
  • Distance traveled (measured in miles or kilometers)
  • Speed of an object (measured in miles per hour or kilometers per hour)
  • Number of people attending an event
  • Price of a product (measured in dollars or other currency)
  • Blood pressure (measured in millimeters of mercury)
  • Amount of sugar in a food item (measured in grams)
  • Test scores (measured on a numerical scale)
  • Number of website visitors per day
  • Stock prices (measured in dollars)
  • Crime rates (measured by the number of crimes per 100,000 people)

Applications of Quantitative Data

Quantitative data has a wide range of applications across various fields, including:

  • Scientific research: Quantitative data is used extensively in scientific research to test hypotheses and draw conclusions. For example, in biology, researchers might use quantitative data to measure the growth rate of cells or the effectiveness of a drug treatment.
  • Business and economics: Quantitative data is used to analyze business and economic trends, forecast future performance, and make data-driven decisions. For example, a company might use quantitative data to analyze sales figures and customer demographics to determine which products are most popular among which segments of their customer base.
  • Education: Quantitative data is used in education to measure student performance, evaluate teaching methods, and identify areas where improvement is needed. For example, a teacher might use quantitative data to track the progress of their students over the course of a semester and adjust their teaching methods accordingly.
  • Public policy: Quantitative data is used in public policy to evaluate the effectiveness of policies and programs, identify areas where improvement is needed, and develop evidence-based solutions. For example, a government agency might use quantitative data to evaluate the impact of a social welfare program on poverty rates.
  • Healthcare : Quantitative data is used in healthcare to evaluate the effectiveness of medical treatments, track the spread of diseases, and identify risk factors for various health conditions. For example, a doctor might use quantitative data to monitor the blood pressure levels of their patients over time and adjust their treatment plan accordingly.

Purpose of Quantitative Data

The purpose of quantitative data is to provide a numerical representation of a phenomenon or observation. Quantitative data is used to measure and describe the characteristics of a population or sample, and to test hypotheses and draw conclusions based on statistical analysis. Some of the key purposes of quantitative data include:

  • Measuring and describing : Quantitative data is used to measure and describe the characteristics of a population or sample, such as age, income, or education level. This allows researchers to better understand the population they are studying.
  • Testing hypotheses: Quantitative data is often used to test hypotheses and theories by collecting numerical data and analyzing it using statistical methods. This can help researchers determine whether there is a statistically significant relationship between variables or whether there is support for a particular theory.
  • Making predictions : Quantitative data can be used to make predictions about future events or trends based on past data. This is often done through statistical modeling or time series analysis.
  • Evaluating programs and policies: Quantitative data is often used to evaluate the effectiveness of programs and policies. This can help policymakers and program managers identify areas where improvements can be made and make evidence-based decisions about future programs and policies.

When to use Quantitative Data

Quantitative data is appropriate to use when you want to collect and analyze numerical data that can be measured and analyzed using statistical methods. Here are some situations where quantitative data is typically used:

  • When you want to measure a characteristic or behavior : If you want to measure something like the height or weight of a population or the number of people who smoke, you would use quantitative data to collect this information.
  • When you want to compare groups: If you want to compare two or more groups, such as comparing the effectiveness of two different medical treatments, you would use quantitative data to collect and analyze the data.
  • When you want to test a hypothesis : If you have a hypothesis or theory that you want to test, you would use quantitative data to collect data that can be analyzed statistically to determine whether your hypothesis is supported by the data.
  • When you want to make predictions: If you want to make predictions about future trends or events, such as predicting sales for a new product, you would use quantitative data to collect and analyze data from past trends to make your prediction.
  • When you want to evaluate a program or policy : If you want to evaluate the effectiveness of a program or policy, you would use quantitative data to collect data about the program or policy and analyze it statistically to determine whether it has had the intended effect.

Characteristics of Quantitative Data

Quantitative data is characterized by several key features, including:

  • Numerical values : Quantitative data consists of numerical values that can be measured and counted. These values are often expressed in terms of units, such as dollars, centimeters, or kilograms.
  • Continuous or discrete : Quantitative data can be either continuous or discrete. Continuous data can take on any value within a certain range, while discrete data can only take on certain values.
  • Objective: Quantitative data is objective, meaning that it is not influenced by personal biases or opinions. It is based on empirical evidence that can be measured and analyzed using statistical methods.
  • Large sample size: Quantitative data is often collected from a large sample size in order to ensure that the results are statistically significant and representative of the population being studied.
  • Statistical analysis: Quantitative data is typically analyzed using statistical methods to determine patterns, relationships, and other characteristics of the data. This allows researchers to make more objective conclusions based on empirical evidence.
  • Precision : Quantitative data is often very precise, with measurements taken to multiple decimal points or significant figures. This precision allows for more accurate analysis and interpretation of the data.

Advantages of Quantitative Data

Some advantages of quantitative data are:

  • Objectivity : Quantitative data is usually objective because it is based on measurable and observable variables. This means that different people who collect the same data will generally get the same results.
  • Precision : Quantitative data provides precise measurements of variables. This means that it is easier to make comparisons and draw conclusions from quantitative data.
  • Replicability : Since quantitative data is based on objective measurements, it is often easier to replicate research studies using the same or similar data.
  • Generalizability : Quantitative data allows researchers to generalize findings to a larger population. This is because quantitative data is often collected using random sampling methods, which help to ensure that the data is representative of the population being studied.
  • Statistical analysis : Quantitative data can be analyzed using statistical methods, which allows researchers to test hypotheses and draw conclusions about the relationships between variables.
  • Efficiency : Quantitative data can often be collected quickly and efficiently using surveys or other standardized instruments, which makes it a cost-effective way to gather large amounts of data.

Limitations of Quantitative Data

Some limitations of quantitative data are as follows:

  • Limited context: Quantitative data does not provide information about the context in which the data was collected. This can make it difficult to understand the meaning behind the numbers.
  • Limited depth: Quantitative data is often limited to predetermined variables and questions, which may not capture the complexity of the phenomenon being studied.
  • Difficulty in capturing qualitative aspects: Quantitative data is unable to capture the subjective experiences and qualitative aspects of human behavior, such as emotions, attitudes, and motivations.
  • Possibility of bias: The collection and interpretation of quantitative data can be influenced by biases, such as sampling bias, measurement bias, or researcher bias.
  • Simplification of complex phenomena: Quantitative data may oversimplify complex phenomena by reducing them to numerical measurements and statistical analyses.
  • Lack of flexibility: Quantitative data collection methods may not allow for changes or adaptations in the research process, which can limit the ability to respond to unexpected findings or new insights.

About the author


Muhammad Hassan

Researcher, Academic Writer, Web developer


Published on 25.4.2024 in Vol 26 (2024)

Effect of Prosocial Behaviors on e-Consultations in a Web-Based Health Care Community: Panel Data Analysis

Authors of this article:


Original Paper

  • Xiaoxiao Liu 1,2, PhD;
  • Huijing Guo 3, PhD;
  • Le Wang 4, PhD;
  • Mingye Hu 5, PhD;
  • Yichan Wei 1, BBM;
  • Fei Liu 6, PhD;
  • Xifu Wang 7, MCM

1 School of Management, Xi’an Jiaotong University, Xi'an, China

2 China Institute of Hospital Development and Reform, Xi'an Jiaotong University, Xi'an, China

3 School of Economics and Management, China University of Mining and Technology, Xuzhou, China

4 College of Business, City University of Hong Kong, Hong Kong, China (Hong Kong)

5 School of Economics and Management, Xi’an University of Technology, Xi'an, China

6 School of Management, Harbin Engineering University, Harbin, China

7 Healthcare Simulation Center, Guangzhou First People’s Hospital, Guangzhou, China

Corresponding Author:

Xifu Wang, MCM

Healthcare Simulation Center

Guangzhou First People’s Hospital

1 Pan Fu Road

Yuexiu District

Guangzhou, 510180

Phone: 86 13560055951

Email: [email protected]

Background: Patients using web-based health care communities for e-consultation services have the option to choose their service providers from an extensive digital market. To stand out in this crowded field, doctors in web-based health care communities often engage in prosocial behaviors, such as proactive and reactive actions, to attract more users. However, the effect of these behaviors on the volume of e-consultations remains unclear and warrants further exploration.

Objective: This study investigates the impact of various prosocial behaviors on doctors’ e-consultation volume in web-based health care communities and the moderating effects of doctors’ digital and offline reputations.

Methods: A panel data set containing information on 2880 doctors over a 22-month period was obtained from one of the largest web-based health care communities in China. Data analysis was conducted using a 2-way fixed effects model with robust clustered SEs. A series of robustness checks were also performed, including alternative measurements of independent variables and estimation methods.

Results: Results indicated that both types of doctors’ prosocial behaviors, namely, proactive and reactive actions, positively impacted their e-consultation volume. In terms of the moderating effects of external reputation, doctors’ offline professional titles were found to negatively moderate the relationship between their proactive behaviors and their e-consultation volume. However, these titles did not significantly affect the relationship between doctors’ reactive behaviors and their e-consultation volume (P=.45). Additionally, doctors’ digital recommendations from patients negatively moderated both the relationship between doctors’ proactive behaviors and e-consultation volume and the relationship between doctors’ reactive behaviors and e-consultation volume.

Conclusions: Drawing upon functional motives theory and social exchange theory, this study categorizes doctors’ prosocial behaviors into proactive and reactive actions. It provides empirical evidence that prosocial behaviors can lead to an increase in e-consultation volume. This study also illuminates the moderating roles doctors’ digital and offline reputations play in the relationships between prosocial behaviors and e-consultation volume.

Introduction

e-Consultations, offered through web-based health care communities [ 1 ], are increasingly becoming vital complements to traditional hospital services [ 2 - 4 ]. In hospital consultations, patients can only passively accept treatment [ 5 ] from a limited pool of medical resources within a geographical radius. However, when engaging with web-based health care communities, patients can search for primary care solutions [ 6 ] from an extensive digital market in a relatively short time [ 7 ]. Given that the diagnostic accuracy of e-consultations matches that of hospital consultations [ 8 - 10 ], e-consultations are becoming increasingly attractive to patients [ 3 , 11 ].

Doctors are also showing a growing interest in e-consultations, motivated by economic and social benefits. First, doctors can achieve economic gains by participating in e-consultations [ 7 , 12 ]. Web-based consultation platforms facilitate an efficient reputation system, enabling patients to easily provide feedback about doctors. Consequently, doctors can use e-consultation to strengthen their relationship with patients [ 13 , 14 ] and foster positive word-of-mouth [ 15 ]. More e-consultations can benefit doctors by retaining current patients, attracting new ones, and boosting in-person hospital visits [ 16 , 17 ]. Second, doctors could also receive social returns from engaging in e-consultation [ 7 ]. Active participation in e-consultations allows doctors to demonstrate their skills, attitude, and experience, aiding in accumulating professional capital [ 7 ], building their reputation [ 18 ], and increasing their social influence [ 19 ]. Given these tangible and intangible benefits, it is essential for doctors to diligently provide the desired e-consultations and make additional efforts to highlight their service attributes to stand out [ 6 , 20 , 21 ]. This involves engaging in prosocial behaviors in web-based health care communities, which is the primary research focus of this study.

Prior studies have examined the effects of prosocial behaviors on financial outcomes, such as actions reflecting social responsibility in the workplace [ 22 ]. In the health care sector, previous research has explored doctors’ prosocial behaviors within traditional, offline medical services. Doctors, working in established medical institutes and serving patients with limited choices of clinical service providers, often aim for self-satisfaction and patient satisfaction with their offline prosocial behaviors. For example, research indicates that doctors may act prosocially to regulate their self-oriented feelings [ 23 ] and foster a caring and understanding attitude toward patients [ 24 , 25 ]. Additionally, doctors who demonstrate more empathy and care can elicit positive emotions in patients and improve the doctor-patient relationship [ 26 , 27 ].

Compared to the offline context, doctors’ prosocial behaviors in a digital context may differ in 2 aspects. First, the internet allows patients to choose from a broader, more diverse range of doctors without the constraints of time and space [ 7 ]. However, the uncertainty inherent in the digital environment creates a more pronounced information asymmetry between patients and doctors [ 28 ], consequently making it more challenging for patients to establish trust. Therefore, doctors’ prosocial behaviors are crucial in building their self-image, establishing patients’ trust, and assisting patients in identifying suitable doctors [ 29 , 30 ]. Second, unlike offline environments, web-based medical platforms offer a range of functions, including asynchronous activities such as publishing articles, as well as real-time interactional actions such as answering questions during live streams. This array of functions facilitates the adoption of more diverse prosocial behaviors by doctors.

Although these differences underscore the importance of studying doctors’ prosocial behavior, there has been limited research focusing on the impact of such behaviors in the digital context. One previous study has scrutinized the impact of prosocial behaviors, such as answering patients’ questions freely, on patient engagement within web-based health care communities [ 31 ]. An aspect that requires further exploration is how doctors’ motivations and patients’ involvement vary in doctors’ helping behaviors. Consequently, studies on web-based health care communities should differentiate between diverse prosocial actions to understand their effects on doctors’ web-based service outcomes. This study aims to contribute new knowledge regarding the full breadth of doctors’ prosocial behaviors.

Unlike the previous study that exclusively investigated doctors’ asynchronous behaviors in web-based health care communities [ 31 ], this study also explores the role of synchronous reactive actions in achieving optimal doctors’ e-consultation volume. Recently, web-based health care communities have developed and released live-streaming functions to assist doctors in providing voluntary interactions with patients. The effect of doctors’ engagement in medical live streaming on e-consultation services remains unexplored. While these behaviors can demonstrate doctors’ ethical traits and their ability to sustain an e-consultation workflow, engaging in prosocial behaviors may also involve a trade-off with the time and effort doctors can devote to e-consultations.

In summary, this study examines the effects of doctors’ proactive and reactive prosocial behaviors, considering their digital and offline reputations as potential moderating factors. First, drawing from functional motives theory (FMT), we explore the impact of doctors’ web-based proactive actions on their e-consultation volume. Proactive behaviors are actions in which individuals exceed their assigned work, focusing on long-term goals to prevent future problems [ 32 , 33 ]. According to FMT, these behaviors reflect helping actions that satisfy personal needs [ 34 ], driven by self-focused motivations [ 35 ], such as impression management and the realization of self-worth goals. For example, knowledge-based proactive behaviors, such as disseminating expertise to preempt future issues, are self-initiated and not reactions to immediate requests [ 36 ]. This study categorizes doctors’ sharing of professional articles as a form of proactive behavior that creates a professional image for their patient audience. This is because these actions aim to assist patients with future health concerns rather than directly responding to patients’ immediate needs.

Second, this study explores the role of doctors’ reactive prosocial behaviors in increasing e-consultations, guided by social exchange theory (SET). Unlike proactive behaviors, reactive behaviors are characterized by instances of individuals engaging in helping activities [ 35 ], typically in response to others’ needs [ 34 ]. SET posits that individuals incurring additional social costs in relationships may anticipate reciprocal value [ 37 , 38 ]. Reactive prosocial behaviors, per SET, are initiated by the motivation to satisfy others’ desires, leading to the development of cooperative social values. In our context, medical live streams facilitate real-time, synchronized interactions, enabling patients to ask questions and doctors to provide immediate responses. Patients’ health questions during these streams indicate their immediate needs. Thus, a higher frequency of live streams within a certain period suggests doctors are increasingly responding to patients’ needs during that time. Therefore, this study uses the number of medical live-streaming sessions conducted by doctors as a measure for their synchronous reactive behaviors.

Finally, considering that doctors’ reputations play a crucial role in their workflow on web-based health care communities [ 39 , 40 ], we test the moderating roles of offline and digital reputation—measured by doctors’ offline professional titles and patients’ recommendations in the digital context, respectively—on the main effects.

Based on previous studies and practices within web-based health care communities, we aim to extend the literature by testing the impact of 2 types of web-based prosocial behaviors by doctors: proactive and synchronous reactive actions on e-consultation volume. We then explore the moderating roles of doctors’ offline and digital reputations on these main effects.

Research Framework and Hypothesis Development

We have developed a research framework, shown in Figure 1 , to identify effective prosocial strategies used by doctors within web-based health care communities to achieve a preferred e-consultation volume from the supply side.

Figure 1. Research framework.

Primarily, we explore the relationships between doctors’ prosocial behaviors and e-consultation volume, drawing on FMT and SET. These theories are widely adopted for measuring and classifying the outcomes of prosocial behaviors from 2 fundamental perspectives based on human nature [ 34 ]. While doctors’ offline prosocial behaviors may help satisfy patients [ 24 , 25 ], who are already service acceptors, the outcomes of doctors’ web-based prosocial behaviors still need careful distinction. It is essential to clearly differentiate between various types of doctors’ prosocial behaviors to identify their nature. In this study, following the leads of FMT and SET, we test 2 kinds of prosocial behavior: proactive (posting professional articles to achieve self-worth) and reactive (conducting medical live streaming to create cooperative social values).

Subsequently, we examine how doctors’ external reputation moderates the impacts of doctors’ proactive and reactive prosocial behaviors. This examination is conducted from the perspectives of reducing uncertainty and building trust, respectively.

Doctors’ Proactive Behaviors and e-Consultation Volume

FMT places emphasis on the primary motivations behind individuals’ behaviors, adopting an atheoretical stance [ 41 ]. Through the exploratory process, previous studies have provided examples to identify the functional motivations behind prosocial behaviors [ 42 ], such as expressing important personal values. In web-based health care communities, doctors have the opportunity to demonstrate personal traits through proactive behaviors. According to FMT, these proactive behaviors stem from the actors’ active efforts to satisfy their own needs and achieve self-worth [ 34 , 35 ].

Doctors might post professional articles, such as clinical notes and scientific papers, on web-based health care communities to help patient readers handle future health problems. These proactive prosocial behaviors are primarily driven by a desire to showcase personal medical competence, a crucial characteristic of a professional image [ 43 ], in medical consultations. By posting professional articles, doctors can display their medical knowledge, care delivery capability, and service quality, thereby enhancing their professional image. We hypothesize that this effort will lead to an increase in the e-consultation volume. Therefore, we propose the following hypothesis:

  • Hypothesis 1: The posting of professional articles by doctors positively impacts their e-consultation volume on web-based health care communities.

Doctors’ Reactive Behaviors and e-Consultation Volume

Considering the social environment in the working context, SET suggests that reactive prosocial behaviors stem from responding to others’ needs [ 34 ]. Engaging in such behaviors can foster positive perceptions among the audience and build cooperative social values [ 44 ] through reactive social exchange. People with a high orientation toward cooperative social values act to maximize mutual interests [ 45 ], a trait highly valued in the medical field.

We use medical live streaming as a measure of doctors’ reactive behaviors on web-based health care communities. Volunteering to provide interactional live streaming, a typical reactive behavior that may generate cooperative social value, gives the patient audience the impression that the doctors will prioritize demand-side interests during e-consultation services. Additionally, engaging in medical live streaming allows doctors to present themselves as authentic and recognized experts. This enhances their social presence [ 46 ], potentially leading to increased service use [ 47 ] and greater popularity [ 48 ]. Consequently, patients are more likely to perceive doctors who participate in medical live streaming as trustworthy for consultations. Given that e-consultations are closely related to the health conditions of the demand side, a credible doctor is likely to attract more e-consultations. Therefore, we propose the following hypothesis:

  • Hypothesis 2: The conduct of medical live streaming by doctors positively impacts their e-consultation volume on web-based health care communities.

Moderating Roles of Offline and Digital Reputation

As doctors’ proactive and reactive behaviors potentially affect their consultation performance, based on 2 distinct theoretical foundations of human nature, there exists a discrepancy in how doctors’ reputations influence the relationship between various prosocial behaviors and e-consultation.

We formulate hypotheses regarding the moderating effects within the context of digital health care, by taking into account the inherent information asymmetry and the significance of establishing patient trust. Specifically, our hypotheses explore the influence of reputation on the relationship between doctors’ proactive behaviors and e-consultation volume, with a focus on reducing uncertainty. Additionally, we examine how reputation moderates the impact of doctors’ reactive behaviors, emphasizing the perspective of trust building.

First, in the marketing literature, service providers’ reputations, which can reduce information asymmetry and purchase uncertainty [ 49 ], are key factors influencing purchasing behavior and sales performance in the digital context [ 50 - 52 ]. Similarly, for doctors, reputations are related to the experiences and beliefs of other stakeholders [ 53 ]. As health care services are credence goods [ 54 ]—whose quality patients cannot discern even after experiencing the services—and given the nature of web-based platforms (eg, the absence of face-to-face meetings), there is a significant information asymmetry [ 51 ]. This increases patients’ uncertainty regarding the quality of doctors. Consequently, doctors’ reputations play crucial roles in patients’ decision-making processes [ 18 , 39 ]. We use doctors’ professional titles and patients’ recommendations on web-based health care communities to measure doctors’ offline and digital reputations.

Proactive behaviors can create deeper professional impressions [ 34 , 35 ], and thereby reduce more of the uncertainty surrounding e-consultations, for low-reputation doctors than for high-reputation doctors, whose service quality patients already perceive as less uncertain. Doctors’ reputations—measured by offline professional titles and digital patient recommendations on web-based health care communities—should therefore negatively moderate the relationship between proactive behavior and e-consultation volume. Thus, we propose the following hypotheses:

  • Hypothesis 3a: Doctors’ offline professional titles negatively moderate the relationship between the posting of professional articles and e-consultation volume on web-based health care communities.
  • Hypothesis 3b: Doctors’ digital recommendations from patients negatively moderate the relationship between the posting of professional articles and e-consultation volume on web-based health care communities.

Second, one of the central elements of SET is the concept of trust between actors in the exchange process [ 55 - 58 ]. In the context of digital health, patients’ trust in doctors must be established to strengthen the doctor-patient relationship. Doctors’ reputations can reflect their personality traits [ 39 ] and promote trust from patients [ 53 ]. Conducting medical live streaming, a form of reactive prosocial behavior, signals doctors’ cooperative social value orientations, which patients prefer in e-consultations. For low-reputation doctors, such as those with relatively junior professional titles and few digital patient recommendations, conducting medical live streaming will build patients’ confidence in e-consultations to a greater extent than it will for high-reputation doctors, who are usually already highly trusted. Offline and digital reputation may therefore negatively moderate the relationship between engaging in medical live streaming and e-consultation volume. Thus, we propose the final hypotheses:

  • Hypothesis 4a: Doctors’ offline professional titles will negatively moderate the relationship between conducting medical live streaming and e-consultation volume on web-based health care communities.
  • Hypothesis 4b: Doctors’ digital recommendations from patients will negatively moderate the relationship between conducting medical live streaming and e-consultation volume on web-based health care communities.

Research Context and Data Collection

Our research context is one of the largest web-based health care communities in China. This platform, established in 2006, offers e-consultation services to patients. As of July 2023, it boasts over 260,000 active doctors from 10,000 hospitals nationwide and has provided web-based medical services to 79 million patients.

The platform allows doctors to create home pages where they can display relevant information such as offline professional titles, experiences shared by other patients, and personal introductions. Patients can select doctors for e-consultation by browsing this information. Besides e-consultation, doctors can engage in prosocial behavior primarily focused on knowledge sharing. This includes posting professional articles in various formats (text, voice, and short videos) and conducting medical live streams for real-time interaction with patients.

We collected data over a 22-month period, from January 2021 to October 2022, focusing on common diseases such as diabetes, depression, infertility, skin diseases, and gynecological diseases. To ensure that our findings generalize to typical, active doctors on the platform, we included in our analysis only doctors who had posted at least 1 article and conducted at least 1 live stream before the end of the study period [ 59 - 61 ]. Our sample consists of 2880 doctors and includes the following information for each doctor: professional title, patient recommendations, records of experiences shared by the doctor’s patients, records of professional articles posted, records of live streams conducted, and records of the doctor’s e-consultations.
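The inclusion rule above is straightforward to reproduce. Below is a minimal sketch in Python of the sample-construction step, assuming a hypothetical long-format file doctor_month.csv with one row per doctor-month; the column names (doctor_id, month, articles, live_streams) are illustrative, not the platform’s actual fields.

```python
# A minimal sketch of the sample-construction step (hypothetical file and
# column names; the platform's actual data schema is not public).
import pandas as pd

panel = pd.read_csv("doctor_month.csv")  # one row per doctor-month

# Keep doctors with >= 1 article AND >= 1 live stream over the study window,
# mirroring the inclusion rule described above.
totals = panel.groupby("doctor_id")[["articles", "live_streams"]].sum()
active_ids = totals[(totals["articles"] >= 1) & (totals["live_streams"] >= 1)].index
panel = panel[panel["doctor_id"].isin(active_ids)]

print(f"{panel['doctor_id'].nunique()} doctors retained")  # 2880 in the paper's sample
```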

Variable Operationalization

Our unit of analysis is each doctor. We investigate how doctors’ prosocial behaviors, including proactive behaviors (posting professional articles) and reactive behaviors (conducting medical live streams), influence their e-consultation volume.

Dependent Variable

Our dependent variable is doctors’ e-consultation volume, denoted as Consultation_it, which is measured as the number of e-consultations of doctor i in month t.

Independent Variables

Our independent variables are doctors’ proactive behaviors and reactive behaviors. Doctors’ proactive behavior is operationalized as the posting of professional articles and denoted as Articles_it, the number of professional articles posted by doctor i in month t. Doctors’ reactive behavior is operationalized as medical live streaming and denoted as LiveStreaming_it, the number of medical live streams conducted by doctor i in month t.

Moderating Variables

We are also interested in how doctors’ external reputation, including their offline professional titles and digital recommendations from patients, influences the relationship between prosocial behaviors and e-consultation volume. A doctor’s offline professional title is denoted as Title_i, a dummy variable indicating whether doctor i is a chief doctor (Title_i=1 if the doctor is a chief doctor; Title_i=0 if the doctor holds a lower-ranked title). Digital recommendations are captured by Recommendations_i, the digital recommendation level of doctor i as calculated by the platform from the recommendations provided by the doctor’s past patients.

Control Variables

We incorporated several control variables to account for factors that may influence patients’ choices of doctors in the digital context. The shared experiences of patients regarding a doctor’s treatment [ 39 ], as well as the number of patients who have previously consulted with the doctor [ 17 , 62 ], can indicate the doctor’s overall popularity, which in turn may affect patient choice. Therefore, we controlled for (1) the total number of patients who consulted with doctor i in the digital context before month t (TotalPatients_it) and (2) the total number of patient-shared experiences about offline treatment by doctor i before month t (TotalExperiences_it). Furthermore, doctors’ past behaviors, including article posting and live streaming, can influence their current practices in posting articles and conducting live streams. Simultaneously, these factors may also act as signals affecting patients’ judgments and selection of doctors [ 12 ]. To account for these influences, we also controlled for (1) the total number of articles posted by doctor i before month t (TotalArticles_it) and (2) the total number of medical live streams conducted by doctor i before month t (TotalLiveStreaming_it).
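As a sketch of how such “before month t” stocks can be derived, the hypothetical doctor-month panel from the previous snippet can be aggregated with a one-month-lagged cumulative sum, so that month t’s own activity is excluded; the flow column names are again illustrative.

```python
# Build the "before month t" stock controls as one-month-lagged cumulative
# sums, so that month t's own activity is excluded from each stock.
panel = panel.sort_values(["doctor_id", "month"])
stocks = {"patients": "TotalPatients", "experiences": "TotalExperiences",
          "articles": "TotalArticles", "live_streams": "TotalLiveStreaming"}
for flow, stock in stocks.items():
    panel[stock] = (panel.groupby("doctor_id")[flow]
                         .transform(lambda s: s.cumsum().shift(1, fill_value=0)))
```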

To control for both observed and unobserved doctor-specific factors that do not change over time, individual-fixed effects were added. Additionally, time-fixed effects were introduced into our analysis to account for both observed and unobserved factors that vary over time but remain constant across doctors. Table 1 shows the variables and their definitions.

Estimation Model

To estimate the direct impact of doctors’ proactive behaviors and reactive behaviors on their e-consultation volume, the following 2-way fixed effects regression model was used:

$$\mathit{Consultation}_{it} = \beta_0 + \beta_1 \mathit{Articles}_{it} + \beta_2 \mathit{LiveStreaming}_{it} + \beta_3 \mathit{TotalPatients}_{it} + \beta_4 \mathit{TotalExperiences}_{it} + \beta_5 \mathit{TotalArticles}_{it} + \beta_6 \mathit{TotalLiveStreaming}_{it} + \alpha_i + \delta_t + \mu_{it} \quad (1)$$

where i denotes the doctor; t denotes the month; α_i is the doctor-fixed effect; δ_t is the month-fixed effect; Consultation_it is the number of e-consultations of doctor i in month t; Articles_it is the number of professional articles posted by doctor i in month t; LiveStreaming_it is the number of medical live streams conducted by doctor i in month t; TotalPatients_it is the total number of patients who consulted doctor i in the digital context before month t; TotalExperiences_it is the total number of patient-shared experiences about offline treatment by doctor i before month t; TotalArticles_it is the total number of articles posted by doctor i before month t; TotalLiveStreaming_it is the total number of medical live streams conducted by doctor i before month t; the β terms are coefficients; and μ_it is the error term. We applied a log transformation to the continuous variables in the model to reduce their skewness [ 63 ].
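For readers who wish to replicate this specification, the following is a minimal sketch using the linearmodels package, continuing the hypothetical panel from the snippets above (a consultations count column is assumed); log1p stands in for the paper’s log transformation of continuous variables.

```python
# A sketch of equation (1): two-way fixed effects with doctor-clustered SEs.
import numpy as np
import pandas as pd
from linearmodels.panel import PanelOLS

df = panel.copy()
df["month"] = pd.to_datetime(df["month"])
df = df.set_index(["doctor_id", "month"])  # (entity, time) panel index

# log1p stands in for the paper's log transformation of continuous counts.
for col in ["consultations", "articles", "live_streams", "TotalPatients",
            "TotalExperiences", "TotalArticles", "TotalLiveStreaming"]:
    df[f"ln_{col}"] = np.log1p(df[col])

model = PanelOLS(
    df["ln_consultations"],
    df[["ln_articles", "ln_live_streams", "ln_TotalPatients",
        "ln_TotalExperiences", "ln_TotalArticles", "ln_TotalLiveStreaming"]],
    entity_effects=True,   # alpha_i: doctor fixed effects
    time_effects=True,     # delta_t: month fixed effects
)
result = model.fit(cov_type="clustered", cluster_entity=True)  # SEs clustered by doctor
print(result.summary)
```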

Next, the moderating effects of doctors’ offline professional titles and digital recommendations by patients were investigated based on the following specification:

$$\mathit{Consultation}_{it} = \beta_0 + \beta_1 \mathit{Articles}_{it} + \beta_2 \mathit{LiveStreaming}_{it} + \beta_3 \mathit{Articles}_{it} \times \mathit{Title}_i + \beta_4 \mathit{LiveStreaming}_{it} \times \mathit{Title}_i + \beta_5 \mathit{Articles}_{it} \times \mathit{Recommendations}_i + \beta_6 \mathit{LiveStreaming}_{it} \times \mathit{Recommendations}_i + \beta_7 \mathit{TotalPatients}_{it} + \beta_8 \mathit{TotalExperiences}_{it} + \beta_9 \mathit{TotalArticles}_{it} + \beta_{10} \mathit{TotalLiveStreaming}_{it} + \alpha_i + \delta_t + \mu_{it} \quad (2)$$

where Title_i indicates whether doctor i is a chief doctor (Title_i=1 if the doctor is a chief doctor; Title_i=0 if the doctor holds a lower-ranked title), and Recommendations_i is the digital recommendation level of doctor i based on past patients’ recommendations.
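Continuing the sketch above, equation (2) can be estimated by adding the interaction terms; note that the assumed Title and Recommendations columns are time invariant, so their main effects are absorbed by the doctor fixed effects and only the interactions enter.

```python
# Equation (2): add behavior x reputation interactions. Title and
# Recommendations (assumed column names) are time invariant, so their main
# effects are absorbed by the doctor fixed effects.
for b in ["ln_articles", "ln_live_streams"]:
    df[f"{b}_x_Title"] = df[b] * df["Title"]
    df[f"{b}_x_Recs"] = df[b] * df["Recommendations"]

exog = ["ln_articles", "ln_live_streams",
        "ln_articles_x_Title", "ln_live_streams_x_Title",
        "ln_articles_x_Recs", "ln_live_streams_x_Recs",
        "ln_TotalPatients", "ln_TotalExperiences",
        "ln_TotalArticles", "ln_TotalLiveStreaming"]
result_mod = PanelOLS(df["ln_consultations"], df[exog],
                      entity_effects=True, time_effects=True
                      ).fit(cov_type="clustered", cluster_entity=True)
```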

Ethical Considerations

This study used secondary, publicly available data obtained from a website and did not involve the collection of original data pertaining to human participants. As such, the study raised no ethical concerns, and ethics approval by an ethics committee or institutional review board was not deemed necessary.

Empirical Results

In this section, we present our empirical results. The descriptive statistics are shown in Table 2 , and the correlation matrix is shown in Table 3 .

Results for Direct Effects

The analysis was conducted progressively. We first estimated the equation without control variables (model 1) and then added control variables in model 2. The estimated results are shown in Table 4 . From the results, we can see that the coefficient of Articles is significant and positive in model 2 (β=.093; P <.001), indicating that doctors’ proactive behaviors (ie, posting professional articles) can help them obtain more e-consultations. Thus, hypothesis 1 is supported. Regarding doctors’ engagement in medical live streaming, the results show that the coefficient of LiveStreaming is significantly positive (β=.214; P <.001), which suggests that doctors’ reactive behaviors (ie, conducting medical live streaming) can increase their e-consultation volume. This supports hypothesis 2.

a All models include doctor-fixed effects and month-fixed effects; robust SEs clustered by doctors are reported; the number of doctors is 2880, and the number of observations is 63,360.

b R 2 =0.843; F 2,2879 =175.98; P <.001.

c R 2 =0.851; F 6,2879 =119.72; P <.001.

d N/A: not applicable.
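As an interpretive aside: if the dependent variable is among the log-transformed continuous variables, as the model section suggests, the coefficients of this log-log specification can be read approximately as elasticities. A worked reading of the Articles estimate, under that assumption:

$$\frac{\partial \ln \mathit{Consultation}_{it}}{\partial \ln \mathit{Articles}_{it}} = \beta_1 = 0.093 \;\Rightarrow\; \text{a 1\% increase in articles posted is associated with an increase of about } 0.093\% \text{ in e-consultations}$$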

Results for Moderating Effects

The results for moderating effects are shown in Table 5 . In model 1, interaction terms between Title and Articles and between Title and LiveStreaming were introduced to estimate the moderating effect of doctors’ offline professional titles. In model 2, interaction terms between Recommendations and Articles and between Recommendations and LiveStreaming were added to estimate the moderating effect of doctors’ digital recommendations. Finally, a full model was estimated by incorporating all interaction terms. The results are consistent across all models. Wald tests and likelihood ratio tests were used to compare the fit of the nested models [ 64 , 65 ]; the results show that including the moderating variables significantly improves model fit.
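A sketch of how such nested-model comparisons can be run on the fits from the earlier snippets is shown below; it assumes linearmodels’ Wald test on coefficient restrictions and forms a Gaussian likelihood ratio statistic from residual sums of squares, which may differ in detail from the tests the authors used.

```python
# Nested-model comparison on the fits above (restricted: result; full: result_mod).
import numpy as np
from scipy import stats

# Wald test that the four interaction coefficients are jointly zero.
restrictions = ["ln_articles_x_Title = 0", "ln_live_streams_x_Title = 0",
                "ln_articles_x_Recs = 0", "ln_live_streams_x_Recs = 0"]
print(result_mod.wald_test(formula=restrictions))

# Gaussian likelihood ratio statistic from residual sums of squares;
# df = number of added interaction terms.
n = result.resids.shape[0]
lr = n * np.log(float((result.resids ** 2).sum() / (result_mod.resids ** 2).sum()))
print("LR p-value:", stats.chi2.sf(lr, df=4))
```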

Regarding the moderating effect of doctors’ offline professional titles, we find that the coefficient of Articles × Title in model 1 of Table 5 is significantly negative (β=–.058; P <.001), which supports hypothesis 3a that doctors’ offline professional titles have a negative moderating effect on the relationship between doctors’ proactive behaviors and e-consultation volume. However, the coefficient of LiveStreaming × Title is insignificant (β=–.024; P =.45), which suggests that doctors’ offline professional titles have no moderating effect on the relationship between doctors’ reactive behaviors and e-consultation volume. Thus, hypothesis 4a is not supported.

b R 2 =0.851; F 8,2879 =89.98; P <.001; Wald test: P <.001; likelihood ratio: P <.001.

c R 2 =0.851; F 8,2879 =89.13; P <.001; Wald test: P <.001; likelihood ratio: P <.001.

d R 2 =0.852; F 10,2879 =71.44; P <.001; Wald test: P <.001; likelihood ratio: P <.001.

e N/A: not applicable.

For the moderating effect of digital patient recommendations, we find that the coefficients of both Articles × Recommendations and LiveStreaming × Recommendations are negative and significant (β=–.055; P <.001 and β=–.100; P <.001, respectively, in model 2 of Table 5 ). This indicates that digital recommendations from patients negatively moderate the relationship between doctors’ proactive behaviors and e-consultation volume as well as the relationship between doctors’ reactive behaviors and e-consultation volume, supporting hypotheses 3b and 4b.

Robustness Check

First, additional analysis was performed to check whether our findings are robust to different measures of doctors’ reactive behaviors. In the main analysis, we used the number of medical live streams to measure doctors’ reactive behaviors. In the robustness check, we used 2 alternative measures: (1) the time spent in medical live streaming (LSDuration_it), calculated as the total duration of all medical live streams conducted by doctor i in month t; and (2) the number of doctor-patient interactions in medical live streams (LSInteractions_it), calculated as the total number of interactions between doctor i and patients in medical live streams in month t. The latter measure is likely to capture the reactive element of the behavior more effectively. The estimated results, shown in Table 6 , are consistent with the main results.

Second, the total number of articles posted was used in the above analysis to measure doctors’ proactive behaviors. Because doctors can post either their own original work or reposts from others, we replaced this measure with the number of original articles posted by doctor i in month t (OriArticles_it). Models 1 and 2 in Table 7 show that this alternative measure of proactive behavior does not materially change the results.

Third, as our dependent variable is a nonnegative count, negative binomial regression was used to re-estimate our models. The results (models 3 and 4 in Table 7 ) are similar to the main results.
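A sketch of this count-model robustness check using statsmodels’ formula interface is below; doctor and month fixed effects are approximated with dummy variables here, which is computationally heavy with 2880 doctors and is intended only as an illustration, not as the authors’ exact estimator.

```python
# Count-model robustness check: negative binomial on the raw consultation
# counts. Fixed effects are approximated with dummies, which is slow with
# 2880 doctors; illustrative only.
import statsmodels.formula.api as smf

nb = smf.negativebinomial(
    "consultations ~ ln_articles + ln_live_streams + ln_TotalPatients"
    " + ln_TotalExperiences + ln_TotalArticles + ln_TotalLiveStreaming"
    " + C(doctor_id) + C(month)",
    data=df.reset_index(),
).fit(maxiter=500)
print(nb.summary())
```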

Fourth, to further enhance the robustness and validity of our findings, article quality was used as a measure of doctors’ proactive behaviors. This approach is based on the premise that article quality more accurately reflects the effort and time invested by doctors in content creation. Specifically, we assessed article quality based on either the length of each article or the number of likes it received and then re-estimated our model. As indicated in Table 8 , the results remain consistent with our main findings, thereby further reinforcing the validity of our conclusions.
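These quality proxies could be constructed along the following lines, assuming a hypothetical article-level table articles.csv with columns doctor_id, month, text, and likes (names illustrative).

```python
# Hypothetical article-level table -> monthly quality proxies per doctor.
import pandas as pd

articles = pd.read_csv("articles.csv")  # doctor_id, month, text, likes
articles["length"] = articles["text"].str.len()

quality = (articles.groupby(["doctor_id", "month"])[["length", "likes"]].sum()
                   .rename(columns={"length": "ArticleLength",
                                    "likes": "ArticleLikes"}))
```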

b R 2 =0.851; F 6,2879 =131.71; P <.001.

c R 2 =0.851; F 10,2879 =79.29; P <.001; Wald test: P <.001; likelihood ratio: P <.001.

d R 2 =0.850; F 6,2879 =112.52; P <.001.

e R 2 =0.851; F 10,2879 =68.43; P <.001; Wald test: P <.001; likelihood ratio: P <.001.

f N/A: not applicable.

a All models include doctor-fixed effects and month-fixed effects; robust SEs clustered by doctors are reported in models 1 and 2; bootstrap SEs in models 3 and 4.

b R 2 =0.851; F 6,2879 =118.99; P <.001.

c R 2 =0.852; F 10,2879 =71.04; P <.001; Wald test: P <.001; likelihood ratio: P <.001.

d Log likelihood=–150,015.36.

e Log likelihood=–149,888.24.

b R 2 =0.851; F 6,2879 =127.75; P <.001.

c R 2 =0.851; F 10,2879 =89.97; P <.001; Wald test: P <.001; likelihood ratio: P <.001.

d R 2 =0.851; F 6,2879 =133.39; P <.001.

e R 2 =0.852; F 10,2879 =84.94; P <.001; Wald test: P <.001; likelihood ratio: P <.001.

Analysis of Results

Web-based medical platforms offer a variety of functions to support doctors’ engagement in different types of prosocial behaviors. However, few studies have investigated the effects of these behaviors. Drawing on FMT and SET, this study categorized doctors’ prosocial practices in web-based health care communities into proactive and reactive actions and examined their effects on e-consultation volume. Briefly, prosocial behaviors positively impact e-consultation volume, and a doctor’s digital and offline reputation moderates the relationship between prosocial behavior and e-consultation, albeit with some nuances.

First, we expanded upon the existing literature on proactive prosocial behaviors, concluding that these actions can help doctors create professional images [ 43 ] in the medical consultation context. Our panel data analysis reveals that doctors’ posting of professional articles, which contributes to their professional image in the digital context, attracts more e-consultations. This finding aligns with a prior study [ 31 ], which observed that a health professional’s previous asynchronous prosocial behavior positively influences their future economic performance.

Second, drawing from SET, we analyzed the impact of synchronous reactive prosocial behaviors, a less explored area in prior literature. Our findings confirm that engaging in medical live streaming, a form of reactive prosocial behavior, leads to higher e-consultation volumes. Interestingly, we found that the positive impact of conducting a live stream exceeds that of posting an article.

Third, we expanded our research by testing the moderating roles of digital and offline reputations, measured by patients’ recommendations on web-based health care communities and doctors’ offline professional titles, respectively. We found that digital reputations significantly moderate the relationships between both types of prosocial behaviors and e-consultation volume. Specifically, doctors who post professional articles or conduct medical live streams attract more e-consultations when they have fewer patient recommendations than when they have more. Regarding offline professional titles, our results indicate a significant moderating effect on the relationship between proactive prosocial behaviors and e-consultation volume. Notably, junior doctors should focus more on posting articles in web-based health care communities to compensate for limitations associated with their titles [ 66 ]. However, the moderating effect of offline titles on the impact of reactive prosocial behaviors was found to be insignificant. We attribute this to the unique dynamics of trust conversion in Chinese health care settings. Because doctors’ offline titles are granted by medical institutions, these titles can enhance patients’ trust in doctors only if trust in the organization is converted into trust in the individual doctor, and these represent different types of trust [ 67 ]. Consequently, doctors with the same offline titles from different hospitals may be perceived differently. For example, a senior doctor from a 3-A hospital is usually seen as highly professional in their clinical field, while a doctor with the same title in a 1-A hospital might typically handle primary diseases. Because of this trust conversion phenomenon, patients may not uniformly trust doctors from different hospitals with the same offline titles, which leads to the insignificant moderating effect of offline titles on the impact of reactive prosocial behaviors.

In summary, this study underscores the importance of prosocial behaviors and reputation in shaping doctors’ e-consultation volumes on web-based health care communities, offering valuable insights for health care professionals aiming to increase their consultation outreach.

Implications

This study has several theoretical implications. First, this study contributes to web-based health care community literature by offering a nuanced understanding of how doctors’ prosocial behaviors enhance e-consultation volume. While a limited number of studies have examined the effects of doctors’ freely provided behaviors in the digital context [ 31 ], the specific impact of different types of prosocial behaviors on e-consultation volume remains largely unexplored. This study addresses this knowledge gap by theoretically categorizing doctors’ prosocial behaviors in web-based health care communities into proactive and reactive types and exploring their impacts on e-consultations.

Second, this study enriches web-based health care communities and live streaming literature by validating the role of medical live streaming in web-based health services. Prior research on live streaming has mainly concentrated on e-commerce [ 68 ], web-based gaming [ 69 ], and web-based learning [ 70 ]. Our study extends this research to the health care context, highlighting the importance of live streaming on web-based health care platforms. Specifically, this study delves into how doctors’ synchronous, reactive volunteer interactions via live streaming influence patient decision-making.

Finally, this study advances FMT and SET by highlighting the importance of context in theory development and providing guidance for context-specific theorizing on web-based health platforms. It also sheds light on how the impact of different prosocial behaviors on e-consultation volume varies depending on a doctor’s offline and digital reputations. Notably, this study validates that proactive behaviors work more effectively in promoting e-consultations for doctors with lower titles or fewer digital recommendations, while reactive behaviors are more effective for doctors with fewer digital recommendations.

This study offers several practical implications for doctors and platform managers. First, the beneficial effects of prosocial behaviors suggest that doctors should adapt their engagement activities when participating in web-based health care platforms. Nowadays, an increasing number of doctors are joining web-based health care communities and focusing on e-consultations, attracted by the economic and social benefits. Based on our results, posting professional articles can help doctors establish a professional image, potentially leading to more e-consultations. Additionally, conducting medical live streams can bolster e-consultations by fostering cooperative social value for doctors and enhancing their credibility among patient audiences. Therefore, doctors may prefer engaging in both proactive and reactive prosocial activities in web-based health care communities to attract more patients to their e-consultation services.

Second, the boundary conditions of the effects of prosocial behaviors imply that doctors should strategically leverage the beneficial effects of proactive and reactive behaviors according to their offline and digital reputations. Doctors with fewer digital recommendations should focus more on prosocial behavior to attract patients to e-consultations. Meanwhile, doctors with lower titles should devote their efforts to proactive behaviors to demonstrate their capability to fulfill e-consultations, thereby reducing information asymmetry between patients and themselves.

Third, our findings offer implications for web-based health care platform managers in designing effective functions. An increasing number of platforms are launching various features to serve doctors and patients more effectively. Our empirical findings suggest that doctors’ proactive and reactive prosocial behaviors, such as posting professional articles and conducting medical live streams, can help them establish a professional image and enhance patient trust, leading to improved performance. Importantly, these behaviors also benefit patients by enhancing their health knowledge and literacy. Thus, platform managers could introduce functions (eg, article posting, live streaming, and doctor-driven communities) to encourage more prosocial behaviors by doctors. Additionally, platform managers might consider incorporating guidelines or incentive mechanisms for prosocial behaviors into their platforms. For example, platforms could collect and analyze doctors’ proactive and reactive prosocial behaviors and guide doctors on how to effectively use these functions and engage in different types of activities.

Limitations

Despite its contributions, this study also presents several limitations that future research should consider. First, various classifications of prosocial behavior are available; for instance, Richaud et al [ 71 ] classified such behavior as altruistic, compliant, emotional, public, anonymous, or dire actions. Given the intricacy of web-based medical services, future studies would benefit from further exploring the roles of these other types of prosocial behavior exhibited by doctors on web-based health care communities. Second, our research model was constructed primarily from the doctor’s perspective and thus did not investigate the influence of doctors’ prosocial behaviors on patients’ satisfaction and well-being. Future research should delve into these relationships to obtain a more comprehensive understanding of the impacts of doctors’ prosocial behaviors. Finally, this study focused only on the quantity of medical live-streaming sessions, overlooking the quality aspect, which could be a crucial factor influencing e-consultation volume. Future research should explore this aspect.

Conclusions

Building upon prior studies on doctors’ prosocial behaviors on web-based health care communities, this study further delineates doctors’ beneficial actions into proactive and synchronous reactive behaviors. This distinction is based on the divergence in doctors’ motives for engaging and patients’ levels of involvement. Drawing from FMT and SET, this study offers insights that could aid doctors in increasing their e-consultation volume by adopting these beneficial behaviors. Concurrently, this research augments our understanding of the roles a doctor’s reputation plays in the relationships between various prosocial behaviors—specifically, proactive and reactive actions—and their e-consultation volume. This study may inspire doctors with comparatively lower offline professional titles and digital popularity to achieve their desired e-consultation volume.

Acknowledgments

This work was supported by the National Natural Science Foundation of China (72001170, 72102179, and 72004042), the Fundamental Research Funds for the Central Universities (SK2024028), the Ministry of Education in China Project of Humanities and Social Sciences (21XJC630003), the China Postdoctoral Science Foundation (2022T150515, 2023M742818, and 2020M673432), the Heilongjiang Natural Science Foundation (YQ2023G003), and grants from City University of Hong Kong (projects 7005959, 7006152, and 7200725).

Conflicts of Interest

None declared.

  1. Qi M, Cui J, Li X, Han Y. Influence of e-consultation on the intention of first-visit patients to select medical services: results of a scenario survey. J Med Internet Res. 2023;25:e40993.
  2. Cao B, Huang W, Chao N, Yang G, Luo N. Patient activeness during online medical consultation in China: multilevel analysis. J Med Internet Res. 2022;24(5):e35557.
  3. Jiang S. Talk to your doctors online: an internet-based intervention in China. Health Commun. 2021;36(4):405-411.
  4. Jiang J, Cameron AF, Yang M. Analysis of massive online medical consultation service data to understand physicians' economic return: observational data mining study. JMIR Med Inform. 2020;8(2):e16765.
  5. Street RL, Millay B. Analyzing patient participation in medical encounters. Health Commun. 2001;13(1):61-73.
  6. Ren D, Ma B. Effectiveness of interactive tools in online health care communities: social exchange theory perspective. J Med Internet Res. 2021;23(3):e21892.
  7. Guo S, Guo X, Fang Y, Vogel D. How doctors gain social and economic returns in online health-care communities: a professional capital perspective. J Manag Inf Syst. 2017;34(2):487-519.
  8. Tande AJ, Berbari EF, Ramar P, Ponamgi SP, Sharma U, Philpot L, et al. Association of a remotely offered infectious diseases eConsult service with improved clinical outcomes. Open Forum Infect Dis. 2020;7(1):ofaa003.
  9. Richardson BR, Truter P, Blumke R, Russell TG. Physiotherapy assessment and diagnosis of musculoskeletal disorders of the knee via telerehabilitation. J Telemed Telecare. 2017;23(1):88-95.
  10. Castaneda PR, Duffy B, Andraska EA, Stevens J, Reschke K, Osborne N, et al. Outcomes and safety of electronic consult use in vascular surgery. J Vasc Surg. 2020;71(5):1726-1732.
  11. Jiang S. The relationship between face-to-face and online patient-provider communication: examining the moderating roles of patient trust and patient satisfaction. Health Commun. 2020;35(3):341-349.
  12. Li J, Tang J, Jiang L, Yen DC, Liu X. Economic success of physicians in the online consultation market: a signaling theory perspective. Int J Electron Commer. 2019;23(2):244-271.
  13. Guo S, Guo X, Zhang X, Vogel D. Doctor–patient relationship strength's impact in an online healthcare community. Inf Technol Dev. 2017;24(2):279-300.
  14. Audrain-Pontevia AF, Menvielle L. Do online health communities enhance patient-physician relationship? an assessment of the impact of social support and patient empowerment. Health Serv Manage Res. 2018;31(3):154-162.
  15. Yang Y, Zhang X, Lee PK. Improving the effectiveness of online healthcare platforms: an empirical study with multi-period patient-doctor consultation data. Int J Prod Econ. 2019;207:70-80.
  16. Fan W, Zhou Q, Qiu L, Kumar S. Should doctors open online consultation services? An empirical investigation of their impact on offline appointments. Inf Syst Res. 2023;34(2):629-651.
  17. Liu QB, Liu X, Guo X. The effects of participating in a physician-driven online health community in managing chronic disease: evidence from two natural experiments. MIS Q. 2020;44(1):391-419.
  18. Yan Z, Wang T, Chen Y, Zhang H. Knowledge sharing in online health communities: a social exchange theory perspective. Inf Manag. 2016;53(5):643-653.
  19. Luo P, Chen K, Wu C, Li Y. Exploring the social influence of multichannel access in an online health community. J Assoc Inf Sci Technol. 2018;69(1):98-109.
  20. Chai S, Bagchi-Sen S, Morrell C, Rao HR, Upadhyaya SJ. Internet and online information privacy: an exploratory study of preteens and early teens. IEEE Trans Profess Commun. 2009;52(2):167-182.
  21. Ahearne M, Rapp A, Hughes DE, Jindal R. Managing sales force product perceptions and control systems in the success of new product introductions. J Mark Res. 2010;47(4):764-776.
  22. Awaysheh A, Heron RA, Perry T, Wilson JI. On the relation between corporate social responsibility and financial performance. Strateg Manag J. 2020;41(6):965-987.
  23. Coll MP, Grégoire M, Eugène F, Jackson PL. Neural correlates of prosocial behavior towards persons in pain in healthcare providers. Biol Psychol. 2017;128:1-10.
  24. Kapıkıran NA. Sources of ethnocultural empathy: personality, intergroup relations, affects. Curr Psychol. 2021;42(14):11510-11528.
  25. Yin Y, Wang Y. Is empathy associated with more prosocial behaviour? a meta-analysis. Asian J Soc Psychol. 2023;26(1):3-22.
  26. Finset A. "I am worried, Doctor!" emotions in the doctor-patient relationship. Patient Educ Couns. 2012;88(3):359-363.
  27. Larson EB, Yao X. Clinical empathy as emotional labor in the patient-physician relationship. JAMA. 2005;293(9):1100-1106.
  28. Bloom G, Standing H, Lloyd R. Markets, information asymmetry and health care: towards new social contracts. Soc Sci Med. 2008;66(10):2076-2087.
  29. Guo F, Zou B, Guo J, Shi Y, Bo Q, Shi L. What determines academic entrepreneurship success? A social identity perspective. Int Entrep Manag J. 2019;15(3):929-952.
  30. Yang H, Zhang X. Investigating the effect of paid and free feedback about physicians' telemedicine services on patients' and physicians' behaviors: panel data analysis. J Med Internet Res. 2019;21(3):e12156.
  31. Yan Z, Kuang L, Qiu L. Prosocial behaviors and economic performance: evidence from an online mental healthcare platform. Prod Oper Manag. 2022;31(10):3859-3876.
  32. Frese M, Fay D. Personal initiative: an active performance concept for work in the 21st century. Res Organ Behav. 2001;23:133-187.
  33. Frese M, Kring W, Soose A, Zempel J. Personal initiative at work: differences between East and West Germany. Acad Manage J. 1996;39(1):37-63.
  34. Spitzmuller M, Van Dyne L. Proactive and reactive helping: contrasting the positive consequences of different forms of helping. J Organ Behav. 2013;34(4):560-580.
  35. Bandura A. Social cognitive theory: an agentic perspective. Annu Rev Psychol. 2001;52(1):1-26.
  36. Mittal S, Sengupta A, Agrawal NM, Gupta S. How prosocial is proactive: developing and validating a scale and process model of knowledge-based proactive helping. J Manag Organ. 2018;26(4):625-650.
  37. Walter A, Ritter T, Gemünden HG. Value creation in buyer–seller relationships: theoretical considerations and empirical results from a supplier's perspective. Ind Mark Manag. 2001;30(4):365-377.
  38. Eggert A, Ulaga W, Schultz F. Value creation in the relationship life cycle: a quasi-longitudinal analysis. Ind Mark Manag. 2006;35(1):20-27.
  39. Deng Z, Hong Z, Zhang W, Evans R, Chen Y. The effect of online effort and reputation of physicians on patients' choice: 3-wave data analysis of China's good doctor website. J Med Internet Res. 2019;21(3):e10170.
  40. Kurihara H, Maeno T, Maeno T. Importance of physicians' attire: factors influencing the impression it makes on patients, a cross-sectional study. Asia Pac Fam Med. 2014;13(1):2.
  41. Cooper ML, Shapiro CM, Powers AM. Motivations for sex and risky sexual behavior among adolescents and young adults: a functional perspective. J Pers Soc Psychol. 1998;75(6):1528-1558.
  42. Clary EG, Snyder M, Ridge RD, Copeland J, Stukas AA, Haugen J, et al. Understanding and assessing the motivations of volunteers: a functional approach. J Pers Soc Psychol. 1998;74(6):1516-1530.
  43. Epstein RM, Hundert EM. Defining and assessing professional competence. JAMA. 2002;287(2):226-235.
  44. Zhang L, Qiu Y, Zhang N, Li S. How difficult doctor-patient relationships impair physicians' work engagement: the roles of prosocial motivation and problem-solving pondering. Psychol Rep. 2020;123(3):885-902.
  45. Dabholkar PA, Johnston WJ, Cathey AS. The dynamics of long-term business-to-business exchange relationships. J Acad Mark Sci. 1994;22(2):130-145.
  46. Ochs M, Mestre D, de Montcheuil G, Pergandi JM, Saubesty J, Lombardo E, et al. Training doctors' social skills to break bad news: evaluation of the impact of virtual environment displays on the sense of presence. J Multimodal User Interfaces. 2019;13(1):41-51.
  47. Schroeder T, Seaman K, Nguyen A, Gewald H, Georgiou A. Enablers and inhibitors to the adoption of mHealth apps by patients—a qualitative analysis of German doctors' perspectives. Patient Educ Couns. 2023;114:107865.
  48. Liu X, Hu M, Xiao BS, Shao J. Is my doctor around me? investigating the impact of doctors' presence on patients' review behaviors on an online health platform. J Assoc Inf Sci Technol. 2022;73(9):1279-1296.
  49. Chen Y, Xie J. Online consumer review: word-of-mouth as a new element of marketing communication mix. Manage Sci. 2008;54(3):477-491.
  50. Ye Q, Li Y, Kiang M, Wu W. The impact of seller reputation on the performance of online sales: evidence from TaoBao Buy-It-Now (BIN) data. SIGMIS Database. 2009;40(1):12-19.
  51. Liu X, Guo X, Wu H, Wu T. The impact of individual and organizational reputation on physicians' appointments online. Int J Electron Commer. 2016;20(4):551-577.
  52. Dewan S, Hsu V. Adverse selection in electronic markets: evidence from online stamp auctions. J Ind Econ. 2004;52(4):497-516.
  53. Torres E, Vasquez-Parraga AZ, Barra C. The path of patient loyalty and the role of doctor reputation. Health Mark Q. 2009;26(3):183-197.
  54. Gottschalk F, Mimra W, Waibel C. Health services as credence goods: a field experiment. Econ J. 2020;130(629):1346-1383.
  55. Nunkoo R, Ramkissoon H. Power, trust, social exchange and community support. Ann Tour Res. 2012;39(2):997-1023.
  56. Blau PM. Exchange and Power in Social Life. New Brunswick: Transaction Publishers; 1964.
  57. Molm LD, Takahashi N, Peterson G. Risk and trust in social exchange: an experimental test of a classical proposition. Am J Sociol. 2000;105(5):1396-1427.
  58. Lioukas CS, Reuer JJ. Isolating trust outcomes from exchange relationships: social exchange and learning benefits of prior ties in alliances. Acad Manage J. 2015;58(6):1826-1847.
  59. Malgonde OS, Saldanha TJ, Mithas S. Resilience in the open source software community: how pandemic and unemployment shocks influence contributions to others' and one's own projects. MIS Q. 2023;47(1):361-390.
  60. Mayya R, Ye S, Viswanathan S, Agarwal R. Who forgoes screening in online markets and why? Evidence from Airbnb. MIS Q. 2021;45(4):1745-1776.
  61. Moqri M, Mei X, Qiu L, Bandyopadhyay S. Effect of "following" on contributions to open source communities. J Manag Inf Syst. 2018;35(4):1188-1217.
  62. Chen Q, Jin J, Zhang T, Yan X. The effects of log-in behaviors and web reviews on patient consultation in online health communities: longitudinal study. J Med Internet Res. 2021;23(6):e25367.
  63. Mo J, Sarkar S, Menon S. Competing tasks and task quality: an empirical study of crowdsourcing contests. MIS Q. 2021;45(4):1921-1948.
  64. Gourieroux C, Holly A, Monfort A. Likelihood ratio test, Wald test, and Kuhn-Tucker test in linear models with inequality constraints on the regression parameters. Econometrica. 1982;50(1):63.
  65. Peng CH, Yin D, Zhang H. More than words in medical question-and-answer sites: a content-context congruence perspective. Inf Syst Res. 2020;31(3):913-928.
  66. Guo L, Jin B, Yao C, Yang H, Huang D, Wang F. Which doctor to trust: a recommender system for identifying the right doctors. J Med Internet Res. 2016;18(7):e186.
  67. Zheng S, Hui SF, Yang Z. Hospital trust or doctor trust? A fuzzy analysis of trust in the health care setting. J Bus Res. 2017;78:217-225.
  68. Mao Z, Du Z, Yuan R, Miao Q. Short-term or long-term cooperation between retailer and MCN? new launched products sales strategies in live streaming e-commerce. J Retail Consum Serv. 2022;67:102996.
  69. Hilvert-Bruce Z, Neill JT, Sjöblom M, Hamari J. Social motivations of live-streaming viewer engagement on Twitch. Comput Hum Behav. 2018;84:58-67.
  70. Tang YM, Chen PC, Law KMY, Wu CH, Lau YY, Guan J, et al. Comparative analysis of student's live online learning readiness during the coronavirus (COVID-19) pandemic in the higher education sector. Comput Educ. 2021;168:104211.
  71. Richaud MC, Mesurado B, Cortada AK. Analysis of dimensions of prosocial behavior in an Argentinean sample of children. Psychol Rep. 2012;111(3):687-696.

Abbreviations

FMT: functional motivation theory
SET: social exchange theory

Edited by G Eysenbach; submitted 11.09.23; peer-reviewed by P Luo, Y Zhu, C Fu; comments to author 05.10.23; revised version received 30.12.23; accepted 09.03.24; published 25.04.24.

©Xiaoxiao Liu, Huijing Guo, Le Wang, Mingye Hu, Yichan Wei, Fei Liu, Xifu Wang. Originally published in the Journal of Medical Internet Research (https://www.jmir.org), 25.04.2024.

This is an open-access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work, first published in the Journal of Medical Internet Research, is properly cited. The complete bibliographic information, a link to the original publication on https://www.jmir.org/, as well as this copyright and license information must be included.
