What is causal research design?

Last updated: 14 May 2023

Causal research design examines cause-and-effect relationships between variables. Examining these relationships gives researchers valuable insights into the mechanisms that drive the phenomena they are investigating.

Organizations primarily use causal research design to identify and explore the impact of changes within an organization and the market. You can use a causal research design to evaluate the effects of certain changes on existing procedures, norms, and more.

This article explores causal research design, including its elements, advantages, and disadvantages.

  • Components of causal research

You can demonstrate the existence of cause-and-effect relationships between two factors or variables using specific causal information, allowing you to produce more meaningful results and research implications.

These are the key inputs for causal research:

The timeline of events

The cause must occur before the effect. You should review the timeline of two or more separate events to distinguish the independent variables (cause) from the dependent variables (effect) before developing a hypothesis.

If the cause occurs before the effect, you can link cause and effect and develop a hypothesis.

For instance, an organization may notice a sales increase. Determining the cause would help them reproduce these results. 

Upon review, the business realizes that the sales boost occurred right after an advertising campaign. The business can leverage this time-based data to determine whether the advertising campaign is the independent variable that caused a change in sales. 
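
This timeline check can be sketched in a few lines of Python. The campaign and observation dates below are invented for illustration:

```python
from datetime import date

# Hypothetical dates for the two events (assumed for illustration).
campaign_start = date(2023, 3, 1)
sales_increase_observed = date(2023, 3, 15)

# Temporal precedence: the suspected cause must occur before the effect.
if campaign_start < sales_increase_observed:
    print("Temporal precedence holds: the campaign is a candidate cause.")
else:
    print("The campaign cannot have caused the earlier sales increase.")
```

Temporal precedence alone only makes the campaign a candidate cause; the other criteria below still need to hold.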

Evaluation of confounding variables

In most cases, you need to pinpoint the variables that make up a cause-and-effect relationship when using a causal research design. This leads to a more accurate conclusion.

The covariation between cause and effect must be genuine, and no third factor should relate to both the cause and the effect.

Observing changes

Variation links between two variables must be clear. A quantitative change in effect must happen solely due to a quantitative change in the cause. 

You can test whether the independent variable changes the dependent variable to evaluate the validity of a cause-and-effect relationship. A steady change between the two variables must occur to back up your hypothesis of a genuine causal effect. 

  • Why is causal research useful?

Causal research allows market researchers to predict hypothetical occurrences and outcomes while enhancing existing strategies. Organizations can use this concept to develop beneficial plans. 

Causal research is also useful as market researchers can immediately deduce the effect of the variables on each other under real-world conditions. 

Once researchers complete their first experiment, they can use their findings. Applying them to alternative scenarios or repeating the experiment to confirm its validity can produce further insights. 

Businesses widely use causal research to identify and comprehend the effect of strategic changes on their profits. 

  • How does causal research compare and differ from other research types?

Other research types that identify relationships between variables include exploratory and descriptive research.

Here’s how they compare and differ from causal research designs:

Exploratory research

An exploratory research design evaluates situations where a problem or opportunity's boundaries are unclear. You can use this research type to test various hypotheses and assumptions to establish facts and understand a situation more clearly.

You can also use exploratory research design to navigate a topic and discover the relevant variables. This research type allows flexibility and adaptability as the experiment progresses, particularly since no area is off-limits.

It’s worth noting that exploratory research is unstructured and typically involves collecting qualitative data. This provides the freedom to tweak and amend the research approach according to your ongoing thoughts and assessments.

Unfortunately, this exposes the findings to the risk of bias and may limit the extent to which a researcher can explore a topic. 

This table compares the key characteristics of causal and exploratory research:

Descriptive research

This research design involves capturing and describing the traits of a population, situation, or phenomenon. Descriptive research focuses more on the “what” of the research subject and less on the “why.”

Since descriptive research typically happens in a real-world setting, variables can cross-contaminate others. This increases the challenge of isolating cause-and-effect relationships. 

You may require further research if you need to establish causal links.

This table compares the key characteristics of causal and descriptive research:

Causal research examines a research question’s variables and how they interact. It’s easier to pinpoint cause and effect since the experiment often happens in a controlled setting. 

Researchers can conduct causal research at any stage, but they typically use it once they know more about the topic.

In contrast, causal research tends to be more structured and can be combined with exploratory and descriptive research to help you attain your research goals. 

  • How can you use causal research effectively?

Here are common ways that market researchers leverage causal research effectively:

Market and advertising research

Do you want to know if your new marketing campaign is affecting your organization positively? You can use causal research to determine the variables causing negative or positive impacts on your campaign. 

Improving customer experiences and loyalty levels

Consumers generally enjoy purchasing from brands aligned with their values. They’re more likely to purchase from such brands and positively represent them to others. 

You can use causal research to identify the variables contributing to increased or reduced customer acquisition and retention rates. 

Could the cause of increased customer retention rates be a streamlined checkout?

Perhaps you introduced a new solution geared towards directly solving their immediate problem. 

Whatever the reason, causal research can help you identify the cause-and-effect relationship. You can use this to enhance your customer experiences and loyalty levels.

Improving problematic employee turnover rates

Is your organization experiencing skyrocketing attrition rates? 

You can leverage the features and benefits of causal research to narrow down the possible explanations or variables with significant effects on employees quitting. 

This way, you can prioritize interventions, focusing on the highest priority causal influences, and begin to tackle high employee turnover rates. 

  • Advantages of causal research

The main benefits of causal research include the following:

Effectively test new ideas

If causal research can pinpoint the precise outcome through combinations of different variables, researchers can test ideas in the same manner to form viable proof of concepts.

Achieve more objective results

Market researchers typically use random sampling techniques to choose experiment participants or subjects in causal research. This reduces the possibility of exterior, sample, or demography-based influences, generating more objective results. 
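
A random assignment like this can be sketched with Python's standard library; the participant pool here is a made-up placeholder:

```python
import random

# Hypothetical participant pool; in practice this comes from a panel or database.
pool = [f"participant_{i}" for i in range(1000)]

random.seed(42)  # seeded only so the draw can be audited and reproduced
treatment = set(random.sample(pool, 100))  # exposed to the change
control = set(random.sample([p for p in pool if p not in treatment], 100))

# Random assignment reduces sample- and demography-based influences.
assert not treatment & control
```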

Improved business processes

Causal research helps businesses understand which variables positively impact target variables, such as customer loyalty or sales revenues. This helps them improve their processes, ROI, and customer and employee experiences.

Guarantee reliable and accurate results

Upon identifying the correct variables, researchers can replicate cause and effect effortlessly. This creates reliable data and results to draw insights from. 

Internal organization improvements

Businesses that conduct causal research can make informed decisions about improving their internal operations and enhancing employee experiences. 

  • Disadvantages of causal research

Like any other research method, causal research has its drawbacks, including:

Extra research to ensure validity

Researchers can't simply rely on the outcomes of causal research since it isn't always accurate. There may be a need to conduct other research types alongside it to ensure accurate output.

Coincidence

Coincidence tends to be the most significant error in causal research. Researchers often misinterpret a coincidental link between a cause and effect as a direct causal link. 

Administration challenges

Causal research can be challenging to administer since it's impossible to control the impact of extraneous variables.

Giving away your competitive advantage

If you intend to publish your research, it exposes your information to the competition. 

Competitors may use your research outcomes to identify your plans and strategies to enter the market before you. 

  • Causal research examples

Causal research serves different purposes across multiple fields, such as:

Customer loyalty research

Organizations and employees can use causal research to determine the best customer attraction and retention approaches. 

They monitor interactions between customers and employees to identify cause-and-effect patterns. That could be a product demonstration technique resulting in higher or lower sales from the same customers. 

Example: Business X introduces a new individual marketing strategy for a small customer group and notices a measurable increase in monthly subscriptions. 

Upon getting identical results from different groups, the business concludes that the individual marketing strategy resulted in the intended causal relationship.

Advertising research

Businesses can also use causal research to implement and assess advertising campaigns. 

Example: Business X notices a 7% increase in sales revenue a few months after introducing a new advertisement in a certain region. The business can run the same ad in random regions to compare sales data over the same period.

This will help the company determine whether the ad caused the sales increase. If sales increase in these randomly selected regions, the business could conclude that advertising campaigns and sales share a cause-and-effect relationship. 
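
The region comparison above amounts to a difference in means between ad and no-ad regions. A minimal sketch, with made-up weekly sales figures:

```python
from statistics import mean

# Hypothetical weekly sales per region over the same period (assumed figures).
ad_regions = [107, 112, 109, 115]      # regions where the ad ran
control_regions = [100, 98, 103, 101]  # comparable regions without the ad

lift = mean(ad_regions) - mean(control_regions)
print(f"Average lift: {lift:.2f} units per week")
# A consistent lift across randomly selected regions supports, but does not
# by itself prove, a cause-and-effect relationship.
```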

Educational research

Academics, teachers, and learners can use causal research to explore the impact of politics on learners and pinpoint learner behavior trends. 

Example: College X notices that more IT students drop out of their program in their second year; the second-year dropout rate is 8% higher than in any other year.

The college administration can interview a random group of IT students to identify factors leading to this situation, including personal factors and influences. 

With the help of in-depth statistical analysis, the institution's researchers can uncover the main factors causing dropout. They can create immediate solutions to address the problem.

Is a causal variable dependent or independent?

When two variables have a cause-and-effect relationship, the cause is often called the independent variable. As such, the effect variable is dependent, i.e., it depends on the independent causal variable. An independent variable is only causal under experimental conditions. 

What are the three criteria for causality?

The three conditions for causality are:

Temporality/temporal precedence: The cause must precede the effect.

Rationality: One event predicts the other with an explanation, and the effect must vary in proportion to changes in the cause.

Control for extraneous variables: The observed covariation must not result from other, extraneous variables.

Is causal research experimental?

Causal research is mostly explanatory. Causal studies focus on analyzing a situation to explore and explain the patterns of relationships between variables. 

Further, experiments are the primary data collection methods in studies with causal research design. However, as a research design, causal research isn't entirely experimental.

What is the difference between experimental and causal research design?

One of the main differences between causal and experimental research is that in causal research, the research subjects are already in groups since the event has already happened. 

On the other hand, researchers randomly choose subjects in experimental research before manipulating the variables.

Causal research: definition, examples and how to use it.

16 min read Causal research enables market researchers to predict hypothetical occurrences & outcomes while improving existing strategies. Discover how this research can decrease employee turnover & increase customer success for your business.

What is causal research?

Causal research, also known as explanatory research or causal-comparative research, identifies the extent and nature of cause-and-effect relationships between two or more variables.

It’s often used by companies to determine the impact of changes in products, features, services, or processes on critical company metrics. Some examples:

  • How does rebranding of a product influence intent to purchase?
  • How would expansion to a new market segment affect projected sales?
  • What would be the impact of a price increase or decrease on customer loyalty?

To maintain the accuracy of causal research, ‘confounding variables’ — outside influences that could distort the results — are controlled, either by keeping them constant during data collection or by using statistical methods. These variables are identified before the start of the research experiment.
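
One simple statistical way to control a confounder is stratification: compare exposed and unexposed cases within each level of the confounding variable. A minimal sketch with invented records, where region is the assumed confounder:

```python
from collections import defaultdict
from statistics import mean

# Hypothetical records: (saw_campaign, region, purchases).
# Region is a potential confounder: it may drive both exposure and purchases.
records = [
    (True, "urban", 9), (True, "urban", 8), (False, "urban", 7),
    (True, "rural", 5), (False, "rural", 4), (False, "rural", 3),
]

# Stratify on the confounder, then compare exposed vs. unexposed within strata.
strata = defaultdict(lambda: {True: [], False: []})
for exposed, region, purchases in records:
    strata[region][exposed].append(purchases)

effects = {}
for region, groups in strata.items():
    effects[region] = mean(groups[True]) - mean(groups[False])
    print(f"{region}: within-stratum effect = {effects[region]:+.2f}")
```

Because the comparison happens within each region, a region-driven difference in purchasing can no longer masquerade as a campaign effect.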

As well as the above, research teams will outline several other variables and principles in causal research:

  • Independent variables

The variables that may cause direct changes in another variable. For example, when studying the effect of truancy on a student’s grade point average, the independent variable is class attendance.

  • Control variables

These are the components that remain unchanged during the experiment so researchers can better understand what conditions create a cause-and-effect relationship.  

  • Causation

This describes the cause-and-effect relationship. When researchers find causation (or the cause), they’ve conducted all the processes necessary to prove it exists.

  • Correlation

Any relationship between two variables in the experiment. It’s important to note that correlation doesn’t automatically mean causation. Researchers will typically establish correlation before proving cause-and-effect.

  • Experimental design

Researchers use experimental design to define the parameters of the experiment — e.g. categorizing participants into different groups.

  • Dependent variables

These are measurable variables that may change or are influenced by the independent variable. For example, in an experiment about whether or not terrain influences running speed, your dependent variable is running speed (the terrain is the independent variable).

Why is causal research useful?

It’s useful because it enables market researchers to predict hypothetical occurrences and outcomes while improving existing strategies. This allows businesses to create plans that benefit the company. It’s also a great research method because researchers can immediately see how variables affect each other and under what circumstances.

Also, once the first experiment has been completed, researchers can use the learnings from the analysis to repeat the experiment or apply the findings to other scenarios. Because of this, it’s widely used to help understand the impact of changes in internal or commercial strategy to the business bottom line.

Some examples include:

  • Understanding how overall training levels are improved by introducing new courses
  • Examining which variations in wording make potential customers more interested in buying a product
  • Testing a market’s response to a brand-new line of products and/or services

So, how does causal research compare and differ from other research types?

Well, there are a few research types that are used to find answers to some of the examples above:

1. Exploratory research

As its name suggests, exploratory research involves assessing a situation (or situations) where the problem isn’t clear. Through this approach, researchers can test different avenues and ideas to establish facts and gain a better understanding.

Researchers can also use it to first navigate a topic and identify which variables are important. Because no area is off-limits, the research is flexible and adapts to the investigations as it progresses.

Finally, this approach is unstructured and often involves gathering qualitative data, giving the researcher freedom to progress the research according to their thoughts and assessment. However, this may make results susceptible to researcher bias and may limit the extent to which a topic is explored.

2. Descriptive research

Descriptive research is all about describing the characteristics of the population, phenomenon or scenario studied. It focuses more on the “what” of the research subject than the “why”.

For example, a clothing brand wants to understand the fashion purchasing trends amongst buyers in California — so they conduct a demographic survey of the region, gather population data and then run descriptive research. The study will help them to uncover purchasing patterns amongst fashion buyers in California, but not necessarily why those patterns exist.

As the research happens in a natural setting, variables can cross-contaminate other variables, making it harder to isolate cause and effect relationships. Therefore, further research will be required if more causal information is needed.

How is causal research different from the other two methods above?

Well, causal research looks at what variables are involved in a problem and ‘why’ they act a certain way. As the experiment takes place in a controlled setting (thanks to controlled variables) it’s easier to identify cause-and-effect amongst variables.

Furthermore, researchers can carry out causal research at any stage in the process, though it’s usually carried out in the later stages once more is known about a particular topic or situation.

Finally, compared to the other two methods, causal research is more structured, and researchers can combine it with exploratory and descriptive research to assist with research goals.

Summary of three research types

What are the advantages of causal research?

  • Improve experiences

By understanding which variables have positive impacts on target variables (like sales revenue or customer loyalty), businesses can improve their processes, return on investment, and the experiences they offer customers and employees.

  • Help companies improve internally

By conducting causal research, management can make informed decisions about improving their employee experience and internal operations. For example, understanding which variables led to an increase in staff turnover.

  • Repeat experiments to enhance reliability and accuracy of results

When variables are identified, researchers can replicate cause-and-effect with ease, providing them with reliable data and results to draw insights from.

  • Test out new theories or ideas

If causal research is able to pinpoint the exact outcome of mixing together different variables, research teams have the ability to test out ideas in the same way to create viable proof of concepts.

  • Fix issues quickly

Once an undesirable effect’s cause is identified, researchers and management can take action to reduce the impact of it or remove it entirely, resulting in better outcomes.

What are the disadvantages of causal research?

  • Provides information to competitors

If you plan to publish your research, it provides information about your plans to your competitors. For example, they might use your research outcomes to identify what you are up to and enter the market before you.

  • Difficult to administer

Causal research is often difficult to administer because it’s not possible to control the effects of extraneous variables.

  • Time and money constraints

Budgetary and time constraints can make this type of research expensive to conduct and repeat. Also, if an initial attempt doesn’t reveal a cause-and-effect relationship, the investment is wasted and could reduce the appetite for future experiments.

  • Requires additional research to ensure validity

You can’t rely solely on the outcomes of causal research, as it isn’t always accurate. It’s best to conduct other types of research alongside it to confirm its output.

  • Trouble establishing cause and effect

Researchers might identify that two variables are connected, but struggle to determine which is the cause and which variable is the effect.

  • Risk of contamination

There’s always the risk that people outside your market or area of study could affect the results of your research. For example, if you’re conducting a retail store study, shoppers outside your ‘test parameters’ could shop at your store and skew the results.

How can you use causal research effectively?

To better highlight how you can use causal research across functions or markets, here are a few examples:

Market and advertising research

A company might want to know if their new advertising campaign or marketing campaign is having a positive impact. So, their research team can carry out a causal research project to see which variables cause a positive or negative effect on the campaign.

For example, a cold-weather apparel company in a winter ski-resort town may see an increase in sales generated after a targeted campaign to skiers. To see if one caused the other, the research team could set up a duplicate experiment to see if the same campaign would generate sales from non-skiers. If the results reduce or change, then it’s likely that the campaign had a direct effect on skiers to encourage them to purchase products.

Improving customer experiences and loyalty levels

Customers enjoy shopping with brands that align with their own values, and they’re more likely to buy and present the brand positively to other potential shoppers as a result. So, it’s in your best interest to deliver great experiences and retain your customers.

For example, the Harvard Business Review found that increasing customer retention rates by 5% increased profits by 25% to 95%. But let’s say you want to increase your own retention rate: how can you identify which variables contribute to it?

Using causal research, you can test hypotheses about which processes, strategies, or changes influence customer retention. For example, is it the streamlined checkout? What about the personalized product suggestions? Or maybe it was a new solution that solved their problem? Causal research will help you find out.

Improving problematic employee turnover rates

If your company has a high attrition rate, causal research can help you narrow down the variables or reasons which have the greatest impact on people leaving. This allows you to prioritize your efforts on tackling the issues in the right order, for the best positive outcomes.

For example, through causal research, you might find that employee dissatisfaction due to a lack of communication and transparency from upper management leads to poor morale, which in turn influences employee retention.

To rectify the problem, you could implement a routine feedback loop or session that enables your people to talk to your company’s C-level executives so that they feel heard and understood.

How to conduct causal research

The first steps to getting started are:

1. Define the purpose of your research

What questions do you have? What do you expect to come out of your research? Think about which variables you need to test out the theory.

2. Pick a random sample if participants are needed

Using a technology solution to support your sampling, like a database, can help you define who you want your target audience to be, and how random or representative they should be.

3. Set up the controlled experiment

Once you’ve defined which variables you’d like to measure to see if they interact, think about how best to set up the experiment. This could be in-person or in-house via interviews, or it could be done remotely using online surveys.

4. Carry out the experiment

Make sure to keep all irrelevant variables the same, and only change the causal variable (the one that causes the effect) to gather the correct data. Depending on your method, you could be collecting qualitative or quantitative data, so make sure you record your findings for each regularly.

5. Analyze your findings

Either manually or using technology, analyze your data to see if any trends, patterns or correlations emerge. By looking at the data, you’ll be able to see what changes you might need to make next time, or if there are questions that require further research.

6. Verify your findings

Your first attempt gives you the baseline figures to compare the new results to. You can then run another experiment to verify your findings.

7. Do follow-up or supplemental research

You can supplement your original findings by carrying out research that goes deeper into causes or explores the topic in more detail. One of the best ways to do this is to use a survey. See ‘Use surveys to help your experiment’.
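
Steps 3-6 can be sketched end to end in Python. The data below is invented, and the permutation test is just one simple way to check that an observed difference is unlikely to be coincidence:

```python
import random
from statistics import mean

random.seed(0)  # reproducible shuffles

# Steps 3-4: hypothetical outcomes from a controlled experiment.
treatment = [12, 14, 11, 15, 13, 16]  # causal variable changed
control = [10, 9, 11, 10, 8, 9]       # everything else held constant

# Step 5: the observed effect.
observed = mean(treatment) - mean(control)

# Step 6: permutation test - how often would randomly shuffled group labels
# produce a difference at least as large as the one observed?
pooled = treatment + control
n = len(treatment)
extreme = 0
trials = 10_000
for _ in range(trials):
    random.shuffle(pooled)
    if mean(pooled[:n]) - mean(pooled[n:]) >= observed:
        extreme += 1

p_value = extreme / trials
print(f"observed difference = {observed:.2f}, p = {p_value:.4f}")
```

A small p-value says the difference is unlikely to be a labeling coincidence; it still doesn't rule out confounding, which the experimental controls must handle.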

Identifying causal relationships between variables

To verify if a causal relationship exists, you have to satisfy the following criteria:

  • Nonspurious association

A clear correlation exists between the cause and the effect. In other words, no ‘third’ variable that relates to both the cause and the effect should exist.

  • Temporal sequence

The cause occurs before the effect. For example, increased ad spend on product marketing would contribute to higher product sales.

  • Concomitant variation

The variation between the two variables is systematic. For example, if a company doesn’t change its IT policies and technology stack, then changes in employee productivity were not caused by IT policies or technology.

How do surveys help your causal research experiments?

There are some surveys that are perfect for assisting researchers with understanding cause and effect. These include:

  • Employee Satisfaction Survey – An introductory employee satisfaction survey that provides you with an overview of your current employee experience.
  • Manager Feedback Survey – An introductory manager feedback survey geared toward improving your skills as a leader with valuable feedback from your team.
  • Net Promoter Score (NPS) Survey – Measure customer loyalty and understand how your customers feel about your product or service using one of the world’s best-recognized metrics.
  • Employee Engagement Survey – An entry-level employee engagement survey that provides you with an overview of your current employee experience.
  • Customer Satisfaction Survey – Evaluate how satisfied your customers are with your company, including the products and services you provide and how they are treated when they buy from you.
  • Employee Exit Interview Survey – Understand why your employees are leaving and how they’ll speak about your company once they’re gone.
  • Product Research Survey – Evaluate your consumers’ reaction to a new product or product feature across every stage of the product development journey.
  • Brand Awareness Survey – Track the level of brand awareness in your target market, including current and potential future customers.
  • Online Purchase Feedback Survey – Find out how well your online shopping experience performs against customer needs and expectations.

That covers the fundamentals of causal research and should give you a foundation for ongoing studies to assess opportunities, problems, and risks across your market, product, customer, and employee segments.

If you want to transform your research, empower your teams and get insights on tap to get ahead of the competition, maybe it’s time to leverage Qualtrics CoreXM.

Qualtrics CoreXM provides a single platform for data collection and analysis across every part of your business — from customer feedback to product concept testing. What’s more, you can integrate it with your existing tools and services thanks to a flexible API.

Qualtrics CoreXM offers you as much or as little power and complexity as you need, so whether you’re running simple surveys or more advanced forms of research, it can deliver every time.

Causal Research: Definition, Design, Tips, Examples

Appinio Research · 21.02.2024 · 33 min read

Ever wondered why certain events lead to specific outcomes? Understanding causality—the relationship between cause and effect—is crucial for unraveling the mysteries of the world around us. In this guide on causal research, we delve into the methods, techniques, and principles behind identifying and establishing cause-and-effect relationships between variables. Whether you're a seasoned researcher or new to the field, this guide will equip you with the knowledge and tools to conduct rigorous causal research and draw meaningful conclusions that can inform decision-making and drive positive change.

What is Causal Research?

Causal research is a methodological approach used in scientific inquiry to investigate cause-and-effect relationships between variables. Unlike correlational or descriptive research, which merely examine associations or describe phenomena, causal research aims to determine whether changes in one variable cause changes in another variable.

Importance of Causal Research

Understanding the importance of causal research is crucial for appreciating its role in advancing knowledge and informing decision-making across various fields. Here are key reasons why causal research is significant:

  • Establishing Causality:  Causal research enables researchers to determine whether changes in one variable directly cause changes in another variable. This helps identify effective interventions, predict outcomes, and inform evidence-based practices.
  • Guiding Policy and Practice:  By identifying causal relationships, causal research provides empirical evidence to support policy decisions, program interventions, and business strategies. Decision-makers can use causal findings to allocate resources effectively and address societal challenges.
  • Informing Predictive Modeling:  Causal research contributes to the development of predictive models by elucidating causal mechanisms underlying observed phenomena. Predictive models based on causal relationships can accurately forecast future outcomes and trends.
  • Advancing Scientific Knowledge:  Causal research contributes to the cumulative body of scientific knowledge by testing hypotheses, refining theories, and uncovering underlying mechanisms of phenomena. It fosters a deeper understanding of complex systems and phenomena.
  • Mitigating Confounding Factors:  Understanding causal relationships allows researchers to control for confounding variables and reduce bias in their studies. By isolating the effects of specific variables, researchers can draw more valid and reliable conclusions.

Causal Research Distinction from Other Research

Understanding the distinctions between causal research and other types of research methodologies is essential for researchers to choose the most appropriate approach for their study objectives. Let's explore the differences and similarities between causal research and descriptive, exploratory, and correlational research methodologies.

Descriptive vs. Causal Research

Descriptive research focuses on describing characteristics, behaviors, or phenomena without manipulating variables or establishing causal relationships. It provides a snapshot of the current state of affairs but does not attempt to explain why certain phenomena occur.

Causal research, on the other hand, seeks to identify cause-and-effect relationships between variables by systematically manipulating independent variables and observing their effects on dependent variables. Unlike descriptive research, causal research aims to determine whether changes in one variable directly cause changes in another variable.

Similarities:

  • Both descriptive and causal research involve empirical observation and data collection.
  • Both types of research contribute to the scientific understanding of phenomena, albeit through different approaches.

Differences:

  • Descriptive research focuses on describing phenomena, while causal research aims to explain why phenomena occur by identifying causal relationships.
  • Descriptive research typically uses observational methods, while causal research often involves experimental designs or causal inference techniques to establish causality.

Exploratory vs. Causal Research

Exploratory research aims to explore new topics, generate hypotheses, or gain initial insights into phenomena. It is often conducted when little is known about a subject and seeks to generate ideas for further investigation.

Causal research, on the other hand, is concerned with testing hypotheses and establishing cause-and-effect relationships between variables. It builds on existing knowledge and seeks to confirm or refute causal hypotheses through systematic investigation.

Similarities:

  • Both exploratory and causal research contribute to the generation of knowledge and theory development.
  • Both types of research involve systematic inquiry and data analysis to answer research questions.

Differences:

  • Exploratory research focuses on generating hypotheses and exploring new areas of inquiry, while causal research aims to test hypotheses and establish causal relationships.
  • Exploratory research is more flexible and open-ended, while causal research follows a more structured and hypothesis-driven approach.

Correlational vs. Causal Research

Correlational research examines the relationship between variables without implying causation. It identifies patterns of association or co-occurrence between variables but does not establish the direction or causality of the relationship.

Causal research, on the other hand, seeks to establish cause-and-effect relationships between variables by systematically manipulating independent variables and observing their effects on dependent variables. It goes beyond mere association to determine whether changes in one variable directly cause changes in another variable.

Similarities:

  • Both correlational and causal research involve analyzing relationships between variables.
  • Both types of research contribute to understanding the nature of associations between variables.

Differences:

  • Correlational research focuses on identifying patterns of association, while causal research aims to establish causal relationships.
  • Correlational research does not manipulate variables, while causal research involves systematically manipulating independent variables to observe their effects on dependent variables.

How to Formulate Causal Research Hypotheses?

Crafting research questions and hypotheses is the foundational step in any research endeavor. Defining your variables clearly and articulating the causal relationship you aim to investigate is essential. Let's explore this process further.

1. Identify Variables

Identifying variables involves recognizing the key factors you will manipulate or measure in your study. These variables can be classified into independent, dependent, and confounding variables.

  • Independent Variable (IV):  This is the variable you manipulate or control in your study. It is the presumed cause that you want to test.
  • Dependent Variable (DV):  The dependent variable is the outcome or response you measure. It is affected by changes in the independent variable.
  • Confounding Variables:  These are extraneous factors that may influence the relationship between the independent and dependent variables, leading to spurious correlations or erroneous causal inferences. Identifying and controlling for confounding variables is crucial for establishing valid causal relationships.

2. Establish Causality

Establishing causality requires meeting specific criteria outlined by scientific methodology. While correlation between variables may suggest a relationship, it does not imply causation. To establish causality, researchers must demonstrate the following:

  • Temporal Precedence:  The cause must precede the effect in time. In other words, changes in the independent variable must occur before changes in the dependent variable.
  • Covariation of Cause and Effect:  Changes in the independent variable should be accompanied by corresponding changes in the dependent variable. This demonstrates a consistent pattern of association between the two variables.
  • Elimination of Alternative Explanations:  Researchers must rule out other possible explanations for the observed relationship between variables. This involves controlling for confounding variables and conducting rigorous experimental designs to isolate the effects of the independent variable.

3. Write Clear and Testable Hypotheses

Hypotheses serve as tentative explanations for the relationship between variables and provide a framework for empirical testing. A well-formulated hypothesis should be:

  • Specific:  Clearly state the expected relationship between the independent and dependent variables.
  • Testable:  The hypothesis should be capable of being empirically tested through observation or experimentation.
  • Falsifiable:  There should be a possibility of proving the hypothesis false through empirical evidence.

For example, a hypothesis in a study examining the effect of exercise on weight loss could be: "Increasing levels of physical activity (IV) will lead to greater weight loss (DV) among participants (compared to those with lower levels of physical activity)."

By formulating clear hypotheses and operationalizing variables, researchers can systematically investigate causal relationships and contribute to the advancement of scientific knowledge.

Causal Research Design

Designing your research study involves making critical decisions about how you will collect and analyze data to investigate causal relationships.

Experimental vs. Observational Designs

One of the first decisions you'll make when designing a study is whether to employ an experimental or observational design. Each approach has its strengths and limitations, and the choice depends on factors such as the research question, feasibility, and ethical considerations.

  • Experimental Design: In experimental designs, researchers manipulate the independent variable and observe its effects on the dependent variable while controlling for confounding variables. Random assignment to experimental conditions allows for causal inferences to be drawn. Example: A study testing the effectiveness of a new teaching method on student performance by randomly assigning students to either the experimental group (receiving the new teaching method) or the control group (receiving the traditional method).
  • Observational Design: Observational designs involve observing and measuring variables without intervention. Researchers may still examine relationships between variables but cannot establish causality as definitively as in experimental designs. Example: A study observing the association between socioeconomic status and health outcomes by collecting data on income, education level, and health indicators from a sample of participants.

Control and Randomization

Control and randomization are crucial aspects of experimental design that help ensure the validity of causal inferences.

  • Control: Controlling for extraneous variables involves holding constant factors that could influence the dependent variable, except for the independent variable under investigation. This helps isolate the effects of the independent variable. Example: In a medication trial, controlling for factors such as age, gender, and pre-existing health conditions ensures that any observed differences in outcomes can be attributed to the medication rather than other variables.
  • Randomization: Random assignment of participants to experimental conditions helps distribute potential confounders evenly across groups, reducing the likelihood of systematic biases and allowing for causal conclusions. Example: Randomly assigning patients to treatment and control groups in a clinical trial ensures that both groups are comparable in terms of baseline characteristics, minimizing the influence of extraneous variables on treatment outcomes.
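
Random assignment itself is mechanically simple. A minimal Python sketch, with hypothetical participant IDs and group sizes:

```python
import random

# Hypothetical participant IDs; random assignment distributes unmeasured
# confounders evenly across conditions in expectation
participants = [f"P{i:02d}" for i in range(1, 21)]

random.seed(42)  # fixed seed so this sketch is reproducible
shuffled = random.sample(participants, k=len(participants))
treatment = shuffled[:10]   # receives the intervention
control = shuffled[10:]     # receives placebo / usual practice

print("Treatment:", treatment)
print("Control:  ", control)
```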

Internal and External Validity

Two key concepts in research design are internal validity and external validity, which relate to the credibility and generalizability of study findings, respectively.

  • Internal Validity: Internal validity refers to the extent to which the observed effects can be attributed to the manipulation of the independent variable rather than confounding factors. Experimental designs typically have higher internal validity due to their control over extraneous variables. Example: A study examining the impact of a training program on employee productivity would have high internal validity if it could confidently attribute changes in productivity to the training intervention.
  • External Validity: External validity concerns the extent to which study findings can be generalized to other populations, settings, or contexts. While experimental designs prioritize internal validity, they may sacrifice external validity by using highly controlled conditions that do not reflect real-world scenarios. Example: Findings from a laboratory study on memory retention may have limited external validity if the experimental tasks and conditions differ significantly from real-life learning environments.

Types of Experimental Designs

Several types of experimental designs are commonly used in causal research, each with its own strengths and applications.

  • Randomized Controlled Trials (RCTs): RCTs are considered the gold standard for assessing causality in research. Participants are randomly assigned to experimental and control groups, allowing researchers to make causal inferences. Example: A pharmaceutical company testing a new drug's efficacy would use an RCT to compare outcomes between participants receiving the drug and those receiving a placebo.
  • Quasi-Experimental Designs: Quasi-experimental designs lack random assignment but still attempt to establish causality by controlling for confounding variables through design or statistical analysis. Example: A study evaluating the effectiveness of a smoking cessation program might compare outcomes between participants who voluntarily enroll in the program and a matched control group of non-enrollees.
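
To illustrate why randomization supports causal inference, the following toy simulation (all numbers hypothetical) plants a known treatment effect in simulated data and recovers it from a simple difference in group means:

```python
import random
import statistics

random.seed(7)  # fixed seed so the sketch is reproducible
true_effect = 5.0

# Simulate baseline outcomes for 200 hypothetical participants
baseline = [random.gauss(50, 10) for _ in range(200)]

# Random assignment: shuffle, then split into two groups of 100
random.shuffle(baseline)
control = baseline[:100]
treated = [x + true_effect for x in baseline[100:]]  # add the planted effect

# With random assignment, the difference in means estimates the effect
estimate = statistics.mean(treated) - statistics.mean(control)
print(f"Estimated effect: {estimate:.2f} (true effect: {true_effect})")
```

The estimate differs from the true effect only by sampling noise, which shrinks as the sample grows.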

By carefully selecting an appropriate research design and addressing considerations such as control, randomization, and validity, researchers can conduct studies that yield credible evidence of causal relationships and contribute valuable insights to their field of inquiry.

Causal Research Data Collection

Collecting data is a critical step in any research study, and the quality of the data directly impacts the validity and reliability of your findings.

Choosing Measurement Instruments

Selecting appropriate measurement instruments is essential for accurately capturing the variables of interest in your study. The choice of measurement instrument depends on factors such as the nature of the variables, the target population, and the research objectives.

  • Surveys:  Surveys are commonly used to collect self-reported data on attitudes, opinions, behaviors, and demographics. They can be administered through various methods, including paper-and-pencil surveys, online surveys, and telephone interviews.
  • Observations:  Observational methods involve systematically recording behaviors, events, or phenomena as they occur in natural settings. Observations can be structured (following a predetermined checklist) or unstructured (allowing for flexible data collection).
  • Psychological Tests:  Psychological tests are standardized instruments designed to measure specific psychological constructs, such as intelligence, personality traits, or emotional functioning. These tests often have established reliability and validity.
  • Physiological Measures:  Physiological measures, such as heart rate, blood pressure, or brain activity, provide objective data on bodily processes. They are commonly used in health-related research but require specialized equipment and expertise.
  • Existing Databases:  Researchers may also utilize existing datasets, such as government surveys, public health records, or organizational databases, to answer research questions. Secondary data analysis can be cost-effective and time-saving but may be limited by the availability and quality of data.

Ensuring accurate data collection is the cornerstone of any successful research endeavor. With the right tools in place, you can unlock invaluable insights to drive your causal research forward. From surveys to tests, each instrument offers a unique lens through which to explore your variables of interest.

At Appinio, we understand the importance of robust data collection methods in informing impactful decisions. Let us empower your research journey with our intuitive platform, where you can effortlessly gather real-time consumer insights to fuel your next breakthrough. Ready to take your research to the next level? Book a demo today and see how Appinio can revolutionize your approach to data collection!


Sampling Techniques

Sampling involves selecting a subset of individuals or units from a larger population to participate in the study. The goal of sampling is to obtain a representative sample that accurately reflects the characteristics of the population of interest.

  • Probability Sampling:  Probability sampling methods involve randomly selecting participants from the population, ensuring that each member of the population has an equal chance of being included in the sample. Common probability sampling techniques include simple random sampling, stratified sampling, and cluster sampling.
  • Non-Probability Sampling:  Non-probability sampling methods do not involve random selection and may introduce biases into the sample. Examples of non-probability sampling techniques include convenience sampling, purposive sampling, and snowball sampling.
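
Proportional stratified sampling can be sketched in a few lines of Python. In this hedged example, the population, the "region" stratum, and the sizes are all hypothetical:

```python
import random

# Hypothetical population of 1,000 units labeled by one stratum (region)
population = (
    [("north", i) for i in range(600)] +
    [("south", i) for i in range(400)]
)

def stratified_sample(pop, sample_size):
    # Group units by stratum
    strata = {}
    for unit in pop:
        strata.setdefault(unit[0], []).append(unit)
    # Draw from each stratum in proportion to its share of the population
    sample = []
    for name, units in strata.items():
        n = round(sample_size * len(units) / len(pop))
        sample.extend(random.sample(units, n))
    return sample

random.seed(0)
sample = stratified_sample(population, 100)  # 60 north + 40 south
print(len(sample))
```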

The choice of sampling technique depends on factors such as the research objectives, population characteristics, resources available, and practical constraints. Researchers should strive to minimize sampling bias and maximize the representativeness of the sample to enhance the generalizability of their findings.

Ethical Considerations

Ethical considerations are paramount in research and involve ensuring the rights, dignity, and well-being of research participants. Researchers must adhere to ethical principles and guidelines established by professional associations and institutional review boards (IRBs).

  • Informed Consent:  Participants should be fully informed about the nature and purpose of the study, potential risks and benefits, their rights as participants, and any confidentiality measures in place. Informed consent should be obtained voluntarily and without coercion.
  • Privacy and Confidentiality:  Researchers should take steps to protect the privacy and confidentiality of participants' personal information. This may involve anonymizing data, securing data storage, and limiting access to identifiable information.
  • Minimizing Harm:  Researchers should mitigate any potential physical, psychological, or social harm to participants. This may involve conducting risk assessments, providing appropriate support services, and debriefing participants after the study.
  • Respect for Participants:  Researchers should respect participants' autonomy, diversity, and cultural values. They should seek to foster a trusting and respectful relationship with participants throughout the research process.
  • Publication and Dissemination:  Researchers have a responsibility to accurately report their findings and acknowledge contributions from participants and collaborators. They should adhere to principles of academic integrity and transparency in disseminating research results.

By addressing ethical considerations in research design and conduct, researchers can uphold the integrity of their work, maintain trust with participants and the broader community, and contribute to the responsible advancement of knowledge in their field.

Causal Research Data Analysis

Once data is collected, it must be analyzed to draw meaningful conclusions and assess causal relationships.

Causal Inference Methods

Causal inference methods are statistical techniques used to identify and quantify causal relationships between variables in observational data. While experimental designs provide the most robust evidence for causality, observational studies often require more sophisticated methods to account for confounding factors.

  • Difference-in-Differences (DiD):  DiD compares changes in outcomes before and after an intervention between a treatment group and a control group, controlling for pre-existing trends. It estimates the average treatment effect by differencing the changes in outcomes between the two groups over time.
  • Instrumental Variables (IV):  IV analysis relies on instrumental variables—variables that affect the treatment variable but not the outcome—to estimate causal effects in the presence of endogeneity. IVs should be correlated with the treatment but uncorrelated with the error term in the outcome equation.
  • Regression Discontinuity (RD):  RD designs exploit naturally occurring thresholds or cutoff points to estimate causal effects near the threshold. Participants just above and below the threshold are compared, assuming that they are similar except for their proximity to the threshold.
  • Propensity Score Matching (PSM):  PSM matches individuals or units based on their propensity scores—the likelihood of receiving the treatment—creating comparable groups with similar observed characteristics. Matching reduces selection bias and allows for causal inference in observational studies.
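
As a minimal illustration of the DiD logic, the estimator reduces to a double subtraction of group means. The figures below are hypothetical, and the parallel-trends assumption is simply taken for granted in this sketch:

```python
# Hypothetical outcome means before/after an intervention
treat_before, treat_after = 50.0, 70.0   # treatment group
ctrl_before, ctrl_after = 48.0, 55.0     # control group

# DiD nets out the shared time trend: (treatment change) - (control change)
did = (treat_after - treat_before) - (ctrl_after - ctrl_before)
print(f"Estimated average treatment effect: {did:.1f}")  # (70-50) - (55-48) = 13.0
```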

Assessing Causality Strength

Assessing the strength of causality involves determining the magnitude and direction of causal effects between variables. While statistical significance indicates whether an observed relationship is unlikely to occur by chance, it does not necessarily imply a strong or meaningful effect.

  • Effect Size:  Effect size measures the magnitude of the relationship between variables, providing information about the practical significance of the results. Standard effect size measures include Cohen's d for mean differences and odds ratios for categorical outcomes.
  • Confidence Intervals:  Confidence intervals provide a range of values within which the actual effect size is likely to lie with a certain degree of certainty. Narrow confidence intervals indicate greater precision in estimating the true effect size.
  • Practical Significance:  Practical significance considers whether the observed effect is meaningful or relevant in real-world terms. Researchers should interpret results in the context of their field and the implications for stakeholders.
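
A minimal sketch of computing Cohen's d with a pooled standard deviation; the outcome scores below are hypothetical:

```python
import math
import statistics

# Hypothetical outcome scores for two groups
group_a = [12, 14, 15, 16, 18, 19, 21]
group_b = [10, 11, 12, 13, 13, 15, 16]

# Pooled standard deviation combines both groups' sample variances
n1, n2 = len(group_a), len(group_b)
s1, s2 = statistics.stdev(group_a), statistics.stdev(group_b)
pooled_sd = math.sqrt(((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2))

# Cohen's d: standardized difference in means
d = (statistics.mean(group_a) - statistics.mean(group_b)) / pooled_sd
print(f"Cohen's d: {d:.2f}")
```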

Handling Confounding Variables

Confounding variables are extraneous factors that may distort the observed relationship between the independent and dependent variables, leading to spurious or biased conclusions. Addressing confounding variables is essential for establishing valid causal inferences.

  • Statistical Control:  Statistical control involves including confounding variables as covariates in regression models to partial out their effects on the outcome variable. Controlling for confounders reduces bias and strengthens the validity of causal inferences.
  • Matching:  Matching participants or units based on observed characteristics helps create comparable groups with similar distributions of confounding variables. Matching reduces selection bias and mimics the randomization process in experimental designs.
  • Sensitivity Analysis:  Sensitivity analysis assesses the robustness of study findings to changes in model specifications or assumptions. By varying analytical choices and examining their impact on results, researchers can identify potential sources of bias and evaluate the stability of causal estimates.
  • Subgroup Analysis:  Subgroup analysis explores whether the relationship between variables differs across subgroups defined by specific characteristics. Identifying effect modifiers helps understand the conditions under which causal effects may vary.
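
The matching idea can be illustrated with exact matching on a single observed confounder. A minimal Python sketch with hypothetical records, where treated and untreated units are compared only within the same stratum:

```python
# Hypothetical records: (treated?, age_band, outcome)
records = [
    (True,  "young", 8), (False, "young", 5),
    (True,  "young", 9), (False, "young", 6),
    (True,  "old",   4), (False, "old",   3),
    (True,  "old",   5), (False, "old",   2),
]

# Compare treated vs. untreated means within each stratum of the confounder
effects = []
for band in {"young", "old"}:
    treated = [o for t, b, o in records if t and b == band]
    untreated = [o for t, b, o in records if not t and b == band]
    effects.append(sum(treated) / len(treated) - sum(untreated) / len(untreated))

# Average the within-stratum differences into one adjusted estimate
att = sum(effects) / len(effects)
print(f"Stratum-adjusted effect estimate: {att:.2f}")
```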

By employing rigorous causal inference methods, assessing the strength of causality, and addressing confounding variables, researchers can confidently draw valid conclusions about causal relationships in their studies, advancing scientific knowledge and informing evidence-based decision-making.

Causal Research Examples

Examples play a crucial role in understanding the application of causal research methods and their impact across various domains. Let's explore some detailed examples to illustrate how causal research is conducted and its real-world implications:

Example 1: Software as a Service (SaaS) User Retention Analysis

Suppose a SaaS company wants to understand the factors influencing user retention and engagement with their platform. The company conducts a longitudinal observational study, collecting data on user interactions, feature usage, and demographic information over several months.

  • Design:  The company employs an observational cohort study design, tracking cohorts of users over time to observe changes in retention and engagement metrics. They use analytics tools to collect data on user behavior, such as logins, feature usage, session duration, and customer support interactions.
  • Data Collection:  Data is collected from the company's platform logs, customer relationship management (CRM) system, and user surveys. Key metrics include user churn rates, active user counts, feature adoption rates, and Net Promoter Scores (NPS).
  • Analysis:  Using statistical techniques like survival analysis and regression modeling, the company identifies factors associated with user retention, such as feature usage patterns, onboarding experiences, customer support interactions, and subscription plan types.
  • Findings: The analysis reveals that users who engage with specific features early in their lifecycle have higher retention rates, while those who encounter usability issues or lack personalized onboarding experiences are more likely to churn. The company uses these insights to optimize product features, improve onboarding processes, and enhance customer support strategies to increase user retention and satisfaction.

Example 2: Business Impact of Digital Marketing Campaign

Consider a technology startup launching a digital marketing campaign to promote its new product offering. The company conducts an experimental study to evaluate the effectiveness of different marketing channels in driving website traffic, lead generation, and sales conversions.

  • Design:  The company implements an A/B testing design, randomly assigning website visitors to different marketing treatment conditions, such as Google Ads, social media ads, email campaigns, or content marketing efforts. They track user interactions and conversion events using web analytics tools and marketing automation platforms.
  • Data Collection:  Data is collected on website traffic, click-through rates, conversion rates, lead generation, and sales revenue. The company also gathers demographic information and user feedback through surveys and customer interviews to understand the impact of marketing messages and campaign creatives.
  • Analysis:  Utilizing statistical methods like hypothesis testing and multivariate analysis, the company compares key performance metrics across different marketing channels to assess their effectiveness in driving user engagement and conversion outcomes. They calculate return on investment (ROI) metrics to evaluate the cost-effectiveness of each marketing channel.
  • Findings:  The analysis reveals that social media ads outperform other marketing channels in generating website traffic and lead conversions, while email campaigns are more effective in nurturing leads and driving sales conversions. Armed with these insights, the company allocates marketing budgets strategically, focusing on channels that yield the highest ROI and adjusting messaging and targeting strategies to optimize campaign performance.

These examples demonstrate the diverse applications of causal research methods in addressing important questions, informing policy decisions, and improving outcomes in various fields. By carefully designing studies, collecting relevant data, employing appropriate analysis techniques, and interpreting findings rigorously, researchers can generate valuable insights into causal relationships and contribute to positive social change.

How to Interpret Causal Research Results?

Interpreting and reporting research findings is a crucial step in the scientific process, ensuring that results are accurately communicated and understood by stakeholders.

Interpreting Statistical Significance

Statistical significance indicates whether the observed results are unlikely to occur by chance alone, but it does not necessarily imply practical or substantive importance. Interpreting statistical significance involves understanding the meaning of p-values and confidence intervals and considering their implications for the research findings.

  • P-values:  A p-value represents the probability of obtaining the observed results (or more extreme results) if the null hypothesis is true. A p-value below a predetermined threshold (typically 0.05) suggests that the observed results are statistically significant, indicating that the null hypothesis can be rejected in favor of the alternative hypothesis.
  • Confidence Intervals:  Confidence intervals provide a range of values within which the true population parameter is likely to lie with a certain degree of confidence (e.g., 95%). If the confidence interval does not include the null value, it suggests that the observed effect is statistically significant at the specified confidence level.

Interpreting statistical significance requires considering factors such as sample size, effect size, and the practical relevance of the results rather than relying solely on p-values to draw conclusions.
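
One way to make the meaning of a p-value concrete is a permutation test, which estimates it directly as the share of random label shufflings that produce a difference at least as extreme as the observed one. A minimal sketch with hypothetical data:

```python
import random

# Hypothetical outcome scores for two small groups
treatment = [23, 25, 28, 30, 31]
control = [20, 21, 22, 24, 26]

# Observed difference in group means
observed = sum(treatment) / len(treatment) - sum(control) / len(control)

random.seed(1)
combined = treatment + control
count = 0
n_perm = 10_000
for _ in range(n_perm):
    # Shuffle the group labels and recompute the difference
    random.shuffle(combined)
    diff = sum(combined[:5]) / 5 - sum(combined[5:]) / 5
    if abs(diff) >= abs(observed):  # two-sided test
        count += 1

# p-value: proportion of shufflings at least as extreme as observed
p_value = count / n_perm
print(f"Observed difference: {observed:.1f}, p ≈ {p_value:.3f}")
```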

Discussing Practical Significance

While statistical significance indicates whether an effect exists, practical significance evaluates the magnitude and meaningfulness of the effect in real-world terms. Discussing practical significance involves considering the relevance of the results to stakeholders and assessing their impact on decision-making and practice.

  • Effect Size:  Effect size measures the magnitude of the observed effect, providing information about its practical importance. Researchers should interpret effect sizes in the context of their field and the scale of measurement (e.g., small, medium, or large effect sizes).
  • Contextual Relevance:  Consider the implications of the results for stakeholders, policymakers, and practitioners. Are the observed effects meaningful in the context of existing knowledge, theory, or practical applications? How do the findings contribute to addressing real-world problems or informing decision-making?

Discussing practical significance helps contextualize research findings and guide their interpretation and application in practice, beyond statistical significance alone.

Addressing Limitations and Assumptions

No study is without limitations, and researchers should transparently acknowledge and address potential biases, constraints, and uncertainties in their research design and findings.

  • Methodological Limitations:  Identify any limitations in study design, data collection, or analysis that may affect the validity or generalizability of the results. For example, sampling biases, measurement errors, or confounding variables.
  • Assumptions:  Discuss any assumptions made in the research process and their implications for the interpretation of results. Assumptions may relate to statistical models, causal inference methods, or theoretical frameworks underlying the study.
  • Alternative Explanations:  Consider alternative explanations for the observed results and discuss their potential impact on the validity of causal inferences. How robust are the findings to different interpretations or competing hypotheses?

Addressing limitations and assumptions demonstrates transparency and rigor in the research process, allowing readers to critically evaluate the validity and reliability of the findings.

Communicating Findings Clearly

Effectively communicating research findings is essential for disseminating knowledge, informing decision-making, and fostering collaboration and dialogue within the scientific community.

  • Clarity and Accessibility:  Present findings in a clear, concise, and accessible manner, using plain language and avoiding jargon or technical terminology. Organize information logically and use visual aids (e.g., tables, charts, graphs) to enhance understanding.
  • Contextualization:  Provide context for the results by summarizing key findings, highlighting their significance, and relating them to existing literature or theoretical frameworks. Discuss the implications of the findings for theory, practice, and future research directions.
  • Transparency:  Be transparent about the research process, including data collection procedures, analytical methods, and any limitations or uncertainties associated with the findings. Clearly state any conflicts of interest or funding sources that may influence interpretation.

By communicating findings clearly and transparently, researchers can facilitate knowledge exchange, foster trust and credibility, and contribute to evidence-based decision-making.

Causal Research Tips

When conducting causal research, it's essential to approach your study with careful planning, attention to detail, and methodological rigor. Here are some tips to help you navigate the complexities of causal research effectively:

  • Define Clear Research Questions:  Start by clearly defining your research questions and hypotheses. Articulate the causal relationship you aim to investigate and identify the variables involved.
  • Consider Alternative Explanations:  Be mindful of potential confounding variables and alternative explanations for the observed relationships. Take steps to control for confounders and address alternative hypotheses in your analysis.
  • Prioritize Internal Validity:  While external validity is important for generalizability, prioritize internal validity in your study design to ensure that observed effects can be attributed to the manipulation of the independent variable.
  • Use Randomization When Possible:  If feasible, employ randomization in experimental designs to distribute potential confounders evenly across experimental conditions and enhance the validity of causal inferences.
  • Be Transparent About Methods:  Provide detailed descriptions of your research methods, including data collection procedures, analytical techniques, and any assumptions or limitations associated with your study.
  • Utilize Multiple Methods:  Consider using a combination of experimental and observational methods to triangulate findings and strengthen the validity of causal inferences.
  • Be Mindful of Sample Size:  Ensure that your sample size is adequate to detect meaningful effects and minimize the risk of Type I and Type II errors. Conduct power analyses to determine the sample size needed to achieve sufficient statistical power.
  • Validate Measurement Instruments:  Validate your measurement instruments to ensure that they are reliable and valid for assessing the variables of interest in your study. Pilot test your instruments if necessary.
  • Seek Feedback from Peers:  Collaborate with colleagues or seek feedback from peer reviewers to solicit constructive criticism and improve the quality of your research design and analysis.
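
For the sample-size tip above, a rough power analysis needs no special software. The sketch below uses the common normal-approximation formula for a two-sample comparison of means; a dedicated statistics package will return a slightly larger, t-distribution-based answer:

```python
import math
from statistics import NormalDist

def sample_size_per_group(effect_size, alpha=0.05, power=0.80):
    """Approximate n per group for a two-sample comparison of means
    (normal approximation to the two-sided t-test)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)   # two-sided significance threshold
    z_beta = NormalDist().inv_cdf(power)            # desired statistical power
    return math.ceil(2 * ((z_alpha + z_beta) / effect_size) ** 2)

# A medium effect (d = 0.5) at alpha = 0.05 and 80% power needs roughly 63 per group
n_medium = sample_size_per_group(0.5)
```

Smaller expected effects drive the required sample size up quadratically, which is why pinning down a realistic effect size before data collection matters so much.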

Conclusion for Causal Research

Mastering causal research empowers researchers to unlock the secrets of cause and effect, shedding light on the intricate relationships between variables in diverse fields. By employing rigorous methods such as experimental designs, causal inference techniques, and careful data analysis, you can uncover causal mechanisms, predict outcomes, and inform evidence-based practices. Through the lens of causal research, complex phenomena become more understandable, and interventions become more effective in addressing societal challenges and driving progress. In a world where understanding the reasons behind events is paramount, causal research serves as a beacon of clarity and insight. Armed with the knowledge and techniques outlined in this guide, you can navigate the complexities of causality with confidence, advancing scientific knowledge, guiding policy decisions, and ultimately making meaningful contributions to our understanding of the world.

How to Conduct Causal Research in Minutes?

Introducing Appinio , your gateway to lightning-fast causal research. As a real-time market research platform, we're revolutionizing how companies gain consumer insights to drive data-driven decisions. With Appinio, conducting your own market research is not only easy but also thrilling. Experience the excitement of market research with Appinio, where fast, intuitive, and impactful insights are just a click away.

Here's why you'll love Appinio:

  • Instant Insights:  Say goodbye to waiting days for research results. With our platform, you'll go from questions to insights in minutes, empowering you to make decisions at the speed of business.
  • User-Friendly Interface:  No need for a research degree here! Our intuitive platform is designed for anyone to use, making complex research tasks simple and accessible.
  • Global Reach:  Reach your target audience wherever they are. With access to over 90 countries and the ability to define precise target groups from 1200+ characteristics, you'll gather comprehensive data to inform your decisions.


Causal Research (Explanatory research)

Causal research, also known as explanatory research, is conducted in order to identify the extent and nature of cause-and-effect relationships. It can be conducted in order to assess the impacts of specific changes on existing norms, various processes, etc.

Causal studies focus on analysing a situation or a specific problem to explain the patterns of relationships between variables. Experiments are the most popular primary data collection method in studies with a causal research design.

The presence of cause-and-effect relationships can be confirmed only if specific causal evidence exists. Causal evidence has three important components:

1. Temporal sequence . The cause must occur before the effect. For example, it would not be appropriate to credit the increase in sales to rebranding efforts if the increase had started before the rebranding.

2. Concomitant variation . The variation between the two variables must be systematic. For example, if a company doesn’t change its employee training and development practices, then changes in customer satisfaction cannot be attributed to employee training and development.

3. Nonspurious association . Any covariation between a cause and an effect must be genuine and not simply due to another variable. In other words, there should be no ‘third’ factor that relates to both the cause and the effect.
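
The second and third components can be checked numerically. Below is a minimal Python sketch: Pearson correlation for concomitant variation, and a first-order partial correlation for screening out a single candidate ‘third’ factor. The function names are illustrative, not part of any particular package:

```python
import statistics

def pearson_r(x, y):
    """Concomitant variation: do the two series move together systematically?"""
    mx, my = statistics.mean(x), statistics.mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

def partial_r(x, y, z):
    """Nonspurious association: correlation of x and y with a third variable z held constant."""
    rxy, rxz, ryz = pearson_r(x, y), pearson_r(x, z), pearson_r(y, z)
    return (rxy - rxz * ryz) / (((1 - rxz ** 2) * (1 - ryz ** 2)) ** 0.5)
```

If `partial_r` collapses toward zero while `pearson_r` was large, the apparent covariation between cause and effect was likely driven by the third variable.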

The table below compares the main characteristics of causal research to exploratory and descriptive research designs: [1]

Main characteristics of research designs

 Examples of Causal Research (Explanatory Research)

The following are examples of research objectives for causal research design:

  • To assess the impacts of foreign direct investment on the levels of economic growth in Taiwan
  • To analyse the effects of re-branding initiatives on the levels of customer loyalty
  • To identify the nature of the impact of work process re-engineering on the levels of employee motivation

Advantages of Causal Research (Explanatory Research)

  • Causal studies may play an instrumental role in identifying the reasons behind a wide range of processes, as well as in assessing the impacts of changes on existing norms, processes, etc.
  • Causal studies can usually be replicated if the need arises
  • This type of study is associated with greater levels of internal validity due to the systematic selection of subjects

Disadvantages of Causal Research (Explanatory Research)

  • Coincidences in events may be perceived as cause-and-effect relationships. For example, Punxsutawney Phil was able to forecast the duration of winter for five consecutive years; nevertheless, he is just a rodent without intellect or forecasting powers, i.e., it was a coincidence.
  • It can be difficult to reach appropriate conclusions on the basis of causal research findings, due to the impact of a wide range of factors and variables in the social environment. In other words, while causality can be inferred, it cannot be proved with a high level of certainty.
  • In certain cases, while correlation between two variables can be effectively established, identifying which variable is the cause and which is the effect can be a difficult task to accomplish.

My e-book,  The Ultimate Guide to Writing a Dissertation in Business Studies: a step by step assistance  contains discussions of theory and application of research designs. The e-book also explains all stages of the  research process  starting from the  selection of the research area  to writing personal reflection. Important elements of dissertations such as  research philosophy ,  research approach ,  methods of data collection ,  data analysis  and  sampling  are explained in this e-book in simple words.

John Dudovskiy


[1] Source: Zikmund, W.G., Babin, J., Carr, J. & Griffin, M. (2012) “Business Research Methods: with Qualtrics Printed Access Card” Cengage Learning

What is Causal Research? Definition + Key Elements

Moradeke Owa

Cause-and-effect relationships happen in all aspects of life, from business to medicine, to marketing, to education, and so much more. They are the invisible threads that connect both our actions and inactions to their outcomes. 

Causal research is the type of research that investigates cause-and-effect relationships. It goes further than descriptive research, which simply describes variables without explaining how one affects another.

Let’s take a closer look at how you can use causal research to gain insight into your research results and make more informed decisions.


Defining Causal Research

Causal research investigates why one variable (the independent variable) is causing things to change in another ( the dependent variable). 

For example, consider a causal research study on the cause-and-effect relationship between smoking and the prevalence of lung cancer. Smoking prevalence would be the independent variable, while lung cancer prevalence would be the dependent variable. 

You would establish that smoking causes lung cancer by modulating the independent variable (smoking) and observing the effects on the dependent variable (lung cancer).

What’s the Difference Between Correlation and Causation

Correlation simply means that two variables are related to each other. But it does not necessarily mean that one variable causes changes in the other. 

For example, let’s say there is a correlation between high coffee sales and low ice cream sales. This does not mean that people are buying less ice cream because they prefer coffee. 

Both of these variables correlate because they’re influenced by the same factor: cold weather.
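
This weather example is easy to reproduce with simulated data: both sales series below are driven by temperature alone, yet they correlate strongly with each other despite having no causal link. All numbers are invented for illustration:

```python
import random
import statistics

random.seed(0)  # reproducible illustration

# Hypothetical daily data: temperature drives coffee sales down and ice-cream sales up
temperature = [random.uniform(-5, 30) for _ in range(200)]
coffee = [100 - 2 * t + random.gauss(0, 5) for t in temperature]
ice_cream = [20 + 3 * t + random.gauss(0, 5) for t in temperature]

def corr(x, y):
    """Pearson correlation coefficient."""
    mx, my = statistics.mean(x), statistics.mean(y)
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = (sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y)) ** 0.5
    return num / den

# Strongly negative, even though neither sales figure causes the other
r = corr(coffee, ice_cream)
```

Conditioning on the shared driver (temperature) is what would reveal the correlation as spurious.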

The Need for Causal Research


The major reason for investigating causal relationships between variables is better decision-making, which leads to developing effective solutions to complex problems. Here’s a breakdown of how it works:

  • Decision-Making

Causal research enables us to figure out how variables relate to each other and how a change in one variable affects another. This helps us make better decisions about resource allocation, problem-solving, and achieving our goals.

In business, for example, customer satisfaction (independent variable) directly impacts sales (dependent variable). If customers are happy with your product or service, they’re more likely to keep returning and recommending it to their friends, which translates into more sales.

  • Developing Effective Solutions to Problems

Understanding the causes of a problem allows you to develop more effective solutions to address it. For example, medical causal research enables you to understand symptoms better, create new prevention strategies, and provide more effective treatment for illnesses.


Examples of Where Causal Relationships Are Critical

Here are a few ways you can leverage causal research:

  • Policy-making : Causal research informs policy decisions about issues such as education, healthcare, and the environment. Let’s say causal research shows that the availability of junk food in schools directly impacts the prevalence of obesity in teenagers. This would inform the decision to incorporate more healthy food options in schools.
  • Marketing strategies : Causal research studies allow you to identify factors that influence customer behavior to develop effective marketing strategies. For example, you can use causal research to reach and attract your target audience with the right content.
  • Product development : Causal research enables you to create successful products by understanding users’ pain points and providing products that meet these needs.


Key Elements of Causal Research

Let’s take a deep dive into what it takes to design and conduct a causal study:

  • Control and Experimental Groups

In a controlled study, the researchers randomly put people into one of two groups: the control group, who don’t get the treatment, or the experimental group, who do.

Having a control group allows you to compare the effects of the treatment to the effects of no treatment. It enables you to rule out the possibility that any changes in the dependent variable are due to factors other than the treatment.

  • Independent variable : The independent variable is the variable that affects the dependent variable. It is the variable that you alter to see the effect on the dependent variable.
  • Dependent variable : The dependent variable is the variable that is affected by the independent variable. This is what you measure to see the impact of the independent variable.

An Illustration of How Independent vs Dependent Variable Works in Causal Research

Here’s an illustration to help you understand how to differentiate and use variables in causal research:

Let’s say you want to investigate “ the effect of dieting on weight loss ”, dieting would be the independent variable, and weight loss would be the dependent variable. Next, you would vary the independent variable (dieting) by assigning some participants to a restricted diet and others to a control group. 

You will see the cause-and-effect relationship between dieting and weight loss by measuring the dependent variable (weight loss) in both groups.
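
Under the stated setup (hypothetical weight-loss numbers, equal group sizes), a simple permutation test sketches how you might check whether the diet/control difference could plausibly arise by chance:

```python
import random
import statistics

random.seed(42)

# Hypothetical kilograms lost over the study period
diet = [3.1, 2.8, 4.0, 3.5, 2.9, 3.7, 3.3, 4.1]       # restricted-diet group
control = [0.4, 1.0, 0.2, 0.8, 0.5, 1.1, 0.3, 0.9]    # control group

observed = statistics.mean(diet) - statistics.mean(control)

# Permutation test: how often does a random re-shuffling of group labels
# produce a mean difference at least as large as the one observed?
pooled = diet + control
trials, extreme = 5000, 0
for _ in range(trials):
    random.shuffle(pooled)
    gap = statistics.mean(pooled[:8]) - statistics.mean(pooled[8:])
    if gap >= observed:
        extreme += 1
p_value = extreme / trials   # one-sided
```

A tiny p-value says the observed gap is very unlikely under random assignment alone; it does not, by itself, rule out confounding if the groups were not randomized.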


Research Designs for Establishing Causality

There are several ways to investigate the relationship between variables, but here are the most common:

A. Experimental Design

Experimental designs are the gold standard for establishing causality. In an experimental design, the researcher randomly assigns participants to either a control group or an experimental group. The control group does not receive the treatment, while the experimental group does.

Pros of experimental designs :

  • Highly rigorous
  • Explicitly establishes causality
  • Strictly controls for extraneous variables

Cons of experimental designs :

  • Time-consuming and expensive
  • Difficult to implement in real-world settings
  • Not always ethical

B. Quasi-Experimental Design

A quasi-experimental design attempts to determine the causal relationship without fully randomizing the participant distribution into groups. The primary reason for this is ethical or practical considerations.

Different types of quasi-experimental designs

  • Time series design : This design involves collecting data over time on the same group of participants. You see the cause-and-effect relationship by identifying the changes in the dependent variable that coincide with changes in the independent variable.
  • Nonequivalent control group design : This design involves comparing an experimental group to a control group that is not randomly assigned. The differences between the two groups explain the cause-and-effect relationship.
  • Interrupted time series design : Unlike the time series that measures changes over time, this introduces treatment at a specific point in time. You figure out the relationship between treatment and the dependent variable by looking for any changes that occurred at the time the treatment was introduced.
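
As an illustration of the interrupted time series idea, the sketch below fits the pre-intervention trend, projects it forward as a counterfactual, and measures the level shift after the intervention. The series is synthetic, constructed for the example:

```python
import statistics

# Hypothetical weekly metric: intervention happens at t = 10
pre = [50 + 0.5 * t for t in range(10)]           # steady upward trend before
post = [57 + 0.5 * t for t in range(10, 20)]      # same trend, level shifted up

def linear_fit(ts, ys):
    """Ordinary least squares for y = intercept + slope * t."""
    ts = list(ts)
    mt, my = statistics.mean(ts), statistics.mean(ys)
    slope = sum((t - mt) * (y - my) for t, y in zip(ts, ys)) / \
            sum((t - mt) ** 2 for t in ts)
    return my - slope * mt, slope

intercept, slope = linear_fit(range(10), pre)
# Counterfactual: what the pre-intervention trend predicts for the post period
counterfactual = [intercept + slope * t for t in range(10, 20)]
level_shift = statistics.mean(y - c for y, c in zip(post, counterfactual))
```

A full segmented-regression analysis would also allow the slope itself to change at the intervention point; this sketch only estimates the jump in level.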

Pros of quasi-experimental designs :

  • Cost-effective
  • More feasible to implement in real-world settings
  • More ethical than experimental designs

Cons of quasi-experimental designs :

  • Not as thorough as experimental designs
  • May not accurately establish causality
  • More susceptible to bias

Establishing Causality without Experiments

Using experiments to determine the cause-and-effect relationship between each dependent variable and the independent variable can be time-consuming and expensive. As a result, the following are cost-effective methods for establishing a causal relationship:

  • Longitudinal Studies

Longitudinal studies are observational studies that follow the same participants or groups over a long period. This way, you can track changes in the variables you’re studying over time and establish a causal relationship between them.

For example, you can use a longitudinal study to determine the effect of a new education program on student performance. You then track students’ academic performance over the years to see if the program improved student performance.

Challenges of Longitudinal Studies

One of the biggest problems of longitudinal studies is confounding variables. These are factors that are related to both the independent variable and the dependent variable.

Confounding variables can make it hard to isolate the cause of an independent variable’s effect. Using the earlier example, if you’re looking at how new educational programs affect student success, you need to make sure you’re controlling for factors such as students’ socio-economic background and their prior academic performance.

  • Instrumental Variables (IV) Analysis

Instrumental variable analysis (IV) is a statistical approach that enables you to estimate causal effects in observational studies. An instrumental variable is a variable that is correlated with the independent variable but is not correlated with the dependent variable except through the independent variable.

For example, in research on how family income affects academic achievement, an instrumental variable could be the distance to the nearest college. Distance correlates with family income (the independent variable) but affects academic achievement only through that income.
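
Here is a minimal sketch of the IV logic using simulated data rather than the real-world example above: a hidden confounder biases the naive regression, while the instrument (which affects the outcome only through the independent variable) recovers the true effect via the ratio of two simple regressions — the Wald/2SLS estimator for a single instrument. All variables and coefficients are invented:

```python
import random
import statistics

random.seed(1)

def ols_slope(x, y):
    """Simple OLS regression slope of y on x."""
    mx, my = statistics.mean(x), statistics.mean(y)
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / \
           sum((a - mx) ** 2 for a in x)

n = 2000
confounder = [random.gauss(0, 1) for _ in range(n)]   # unobserved: raises both variables
instrument = [random.gauss(0, 1) for _ in range(n)]   # affects the outcome only via income
income = [-0.8 * z + c + random.gauss(0, 1) for z, c in zip(instrument, confounder)]
# True causal effect of income on the outcome is 0.5
outcome = [0.5 * i + c + random.gauss(0, 1) for i, c in zip(income, confounder)]

naive = ols_slope(income, outcome)   # biased upward by the confounder
# Wald/2SLS estimator with one instrument: reduced form over first stage
iv_estimate = ols_slope(instrument, outcome) / ols_slope(instrument, income)
```

The naive slope overstates the effect because the confounder moves income and the outcome together; dividing the reduced-form slope by the first-stage slope isolates the variation in income that comes only from the instrument.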

Challenges of Instrumental Variables (IV) Analysis

A primary limitation of IV analysis is that it can be challenging to find a good instrumental variable. IV analysis can also be very sensitive to the assumptions of the model.

Challenges and Pitfalls


Causal research is a powerful tool for solving problems, making better decisions, and advancing human knowledge. However, it is not without its challenges and pitfalls.

  • Confounding Variables

A confounding variable is a variable that correlates with both the independent and dependent variables, and it can make it difficult to isolate the causal effect of the independent variable. 

For example, let’s say you are interested in the causal effect of smoking on lung cancer. If you simply compare smokers to nonsmokers, you may find that smokers are more likely to get lung cancer. 

However, the relationship between smoking and lung cancer may be confounded by other factors, such as age, socioeconomic status, or exposure to secondhand smoke. These other factors may be responsible for the increased risk of lung cancer in smokers, rather than smoking itself.


Strategy to Control for Confounding Variables

Confounding variables can lead to misleading results and make it difficult to determine the cause-and-effect between variables. Here are some strategies that allow you to control for confounding variables and improve the reliability of causal research findings:

  • Randomized Controlled Trial (RCT)

In an RCT, participants are randomly assigned to either the treatment group or the control group. This ensures that the two groups are comparable on all confounding variables, except for the treatment itself.

  • Statistical Methods

Using statistical methods such as multivariate regression analysis allows you to control for multiple confounding variables simultaneously.
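
One way to sketch regression adjustment is the Frisch–Waugh idea: partial the confounder out of both the exposure and the outcome, then regress residual on residual. The smoking/age/risk numbers below are simulated with a known true effect of 2.0, so the bias in the unadjusted estimate is visible:

```python
import random
import statistics

random.seed(7)

def ols_slope(x, y):
    """Simple OLS regression slope of y on x."""
    mx, my = statistics.mean(x), statistics.mean(y)
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / \
           sum((a - mx) ** 2 for a in x)

def residuals(x, y):
    """What is left of y after removing its linear dependence on x."""
    slope = ols_slope(x, y)
    mx, my = statistics.mean(x), statistics.mean(y)
    return [b - (my + slope * (a - mx)) for a, b in zip(x, y)]

n = 2000
age = [random.gauss(50, 10) for _ in range(n)]              # confounder
exposure = [0.3 * a + random.gauss(0, 3) for a in age]      # smoking rises with age
# True causal effect of the exposure on the health-risk outcome is 2.0
outcome = [2.0 * x + 0.5 * a + random.gauss(0, 5) for x, a in zip(exposure, age)]

naive = ols_slope(exposure, outcome)   # mixes the age effect into the estimate
# Frisch-Waugh: partial age out of both variables, then regress residual on residual
adjusted = ols_slope(residuals(age, exposure), residuals(age, outcome))
```

The adjusted slope lands near the true 2.0, while the naive slope absorbs part of the age effect. In practice a multivariate regression package does this in one call; the residual-on-residual form just makes the mechanics explicit.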

Reverse Causation

Reverse causation occurs when the presumed direction of cause and effect is actually the other way around. 

For example, let’s say you find a correlation between education and income. You’d expect people with higher levels of education to earn more, right? 

Well, what if it’s the other way around? What if people with higher incomes are more college-educated simply because they can afford it, while lower-income people can’t?

Strategy to Control for Reverse Causation

Here are some ways to prevent and mitigate the effect of reverse causation:

  • Longitudinal study

A longitudinal study follows the same individuals or groups over time. This allows researchers to see how changes in one variable (e.g., education) are associated with changes in another variable (e.g., income) over time.
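
One simple way to probe the direction of causation in two-wave longitudinal data is to compare cross-lagged correlations: earlier education with later income versus earlier income with later education. The panel below is simulated so that only the first path is real:

```python
import random
import statistics

random.seed(3)

def corr(x, y):
    """Pearson correlation coefficient."""
    mx, my = statistics.mean(x), statistics.mean(y)
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = (sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y)) ** 0.5
    return num / den

n = 1000
# Simulated two-wave panel: wave-1 education drives wave-2 income,
# but wave-1 income has no effect on wave-2 education
edu_w1 = [random.gauss(12, 2) for _ in range(n)]
inc_w1 = [random.gauss(40, 8) for _ in range(n)]
edu_w2 = [e + random.gauss(0, 0.5) for e in edu_w1]
inc_w2 = [3 * e + random.gauss(0, 8) for e in edu_w1]

forward = corr(edu_w1, inc_w2)   # education -> later income: strong
reverse = corr(inc_w1, edu_w2)   # income -> later education: near zero
```

An asymmetry like this is suggestive of direction, not proof: a full cross-lagged panel model would also control for each variable's own earlier value.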

  • Instrumental Variables Analysis

Instrumental variables analysis is a statistical technique that estimates the causal effect of a variable when there is reverse causation.

Real-World Applications

Causal research allows us to identify the root causes of problems and develop solutions that work. Here are some examples of the real-world applications of causal research:

  • Healthcare Research:

Causal research enables healthcare professionals to figure out what causes diseases and how to treat them.

For example, medical researchers can use causal research to figure out whether a drug or treatment is effective for a specific condition. It also helps determine what causes certain diseases.

Randomized controlled trials (RCTs) are widely regarded as the standard for determining causal relationships in healthcare research. They have been used to determine the effects of multiple medical interventions, such as the effectiveness of new drugs and vaccines, surgery, as well as lifestyle changes on health.

  • Public Policy Impact

Causal research can also be used to inform public policy decisions. For example, a causal study showed that early childhood education for disadvantaged children improved their academic performance and reduced their likelihood of dropping out. This has been leveraged to support policies that increase early childhood education access.

You can also use causal research to see whether existing policies are working. For example, suppose a causal study shows that giving ex-offenders job training reduces their chances of reoffending. Governments would then be motivated to set up, fund, and mandate job training programs for ex-offenders.

Understanding causal effects helps us make informed decisions across different fields such as health, business, lifestyle, public policy, and more. But, this research method has its challenges and limitations.

Using the best practices and strategies in this guide can help you mitigate the limitations of causal research. Start your journey to seamlessly collecting valid data for your research with Formplus .



Explanatory Research | Definition, Guide, & Examples

Published on December 3, 2021 by Tegan George and Julia Merkus. Revised on November 20, 2023.

Explanatory research is a research method that explores why something occurs when limited information is available. It can help you increase your understanding of a given topic, ascertain how or why a particular phenomenon is occurring, and predict future occurrences.

Explanatory research can also be explained as a “cause and effect” model, investigating patterns and trends in existing data that haven’t been previously investigated. For this reason, it is often considered a type of causal research .

Table of contents

  • When to use explanatory research
  • Explanatory research questions
  • Explanatory research data collection
  • Explanatory research data analysis
  • Step-by-step example of explanatory research
  • Explanatory vs. exploratory research
  • Advantages and disadvantages of explanatory research
  • Other interesting articles
  • Frequently asked questions about explanatory research

Explanatory research is used to investigate how or why a phenomenon takes place. Therefore, this type of research is often one of the first stages in the research process, serving as a jumping-off point for future research. While there is often data available about your topic, it’s possible the particular causal relationship you are interested in has not been robustly studied.

Explanatory research helps you analyze these patterns, formulating hypotheses that can guide future endeavors. If you are seeking a more complete understanding of a relationship between variables, explanatory research is a great place to start. However, keep in mind that it will likely not yield conclusive results.

For example, you analyze students’ final grades and notice that the students who take your course in the first semester always obtain higher grades than students who take the same course in the second semester.


Explanatory research answers “why” and “how” questions, leading to an improved understanding of a previously unresolved problem or providing clarity for related future research initiatives.

Here are a few examples:

  • Why do undergraduate students obtain higher average grades in the first semester than in the second semester?
  • How does marital status affect labor market participation?
  • Why do multilingual individuals show more risky behavior during business negotiations than monolingual individuals?
  • How does a child’s ability to delay immediate gratification predict success later in life?
  • Why are teens more likely to litter in a highly littered area than in a clean area?

After choosing your research question, there is a variety of options for research and data collection methods to choose from.

A few of the most common research methods include:

  • Literature reviews
  • Interviews and focus groups
  • Pilot studies
  • Observations
  • Experiments

The method you choose depends on several factors, including your timeline, budget, and the structure of your question. If there is already a body of research on your topic, a literature review is a great place to start. If you are interested in opinions and behavior, consider an interview or focus group format. If you have more time or funding available, an experiment or pilot study may be a good fit for you.

In order to ensure you are conducting your explanatory research correctly, be sure your analysis is definitively causal in nature, and not just correlated.

Always remember the phrase “correlation doesn’t mean causation.” Correlated variables are merely associated with one another: when one variable changes, so does the other. However, this isn’t necessarily due to a direct or indirect causal link.

Causation means that changes in the independent variable bring about changes in the dependent variable. In other words, there is a direct cause-and-effect relationship between variables.

Causal evidence must meet three criteria:

  • Temporal : What you define as the “cause” must precede what you define as the “effect.”
  • Variation : Covariation between your independent variable and dependent variable must be systematic.
  • Non-spurious : Be careful that there are no mitigating factors or hidden third variables that confound your results.

Correlation doesn’t imply causation, but causation always implies correlation. In order to get conclusive causal results, you’ll need to conduct a full experimental design .


Your explanatory research design depends on the research method you choose to collect your data . In most cases, you’ll use an experiment to investigate potential causal relationships. We’ll walk you through the steps using an example.

Step 1: Develop the research question

The first step in conducting explanatory research is getting familiar with the topic you’re interested in, so that you can develop a research question .

Let’s say you’re interested in language retention rates in adults.

You are interested in finding out how the duration of exposure to language influences language retention ability later in life.

Step 2: Formulate a hypothesis

The next step is to address your expectations. In some cases, there is literature available on your subject or on a closely related topic that you can use as a foundation for your hypothesis . In other cases, the topic isn’t well studied, and you’ll have to develop your hypothesis based on your instincts or on existing literature on more distant topics.

You phrase your expectations in terms of a null hypothesis (H0) and an alternative hypothesis (H1):

  • H0: The duration of exposure to a language in infancy does not influence language retention in adults who were adopted from abroad as children.
  • H1: The duration of exposure to a language in infancy has a positive effect on language retention in adults who were adopted from abroad as children.

Step 3: Design your methodology and collect your data

Next, decide what data collection and data analysis methods you will use and write them up. After carefully designing your research, you can begin to collect your data.

You compare:

  • Adults who were adopted from Colombia between 0 and 6 months of age.
  • Adults who were adopted from Colombia between 6 and 12 months of age.
  • Adults who were adopted from Colombia between 12 and 18 months of age.
  • Monolingual adults who have not been exposed to a different language.

During the study, you test their Spanish language proficiency twice in a research design that has three stages:

  • Pre-test : You conduct several language proficiency tests to establish any differences between groups pre-intervention.
  • Intervention : You provide all groups with 8 hours of Spanish class.
  • Post-test : You again conduct several language proficiency tests to establish any differences between groups post-intervention.

You made sure to control for any confounding variables , such as age, gender, proficiency in other languages, etc.

Step 4: Analyze your data and report results

After data collection is complete, proceed to analyze your data and report the results.

You notice that:

  • The pre-exposed adults showed higher language proficiency in Spanish than those who had not been pre-exposed. The difference is even greater for the post-test.
  • The adults who were adopted between 12 and 18 months of age had a higher Spanish language proficiency level than those who were adopted between 0 and 6 months or 6 and 12 months of age, but there was no difference found between the latter two groups.

To determine whether these differences are significant, you conduct a mixed ANOVA. The ANOVA shows that the differences are not significant for the pre-test but are significant for the post-test.
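As a simplified illustration of this analysis step, the sketch below simulates post-test scores for the four groups (all numbers are invented for illustration) and runs a one-way ANOVA with SciPy. A full mixed ANOVA, with pre/post as the within-subjects factor, would require a dedicated routine such as pingouin's `mixed_anova`, not shown here.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
n = 30  # hypothetical number of participants per group

# Simulated post-test proficiency scores; the group means mimic the
# pattern reported above (12-18 months > 0-6 / 6-12 months > controls).
adopted_0_6 = rng.normal(60, 8, n)
adopted_6_12 = rng.normal(61, 8, n)
adopted_12_18 = rng.normal(70, 8, n)
monolingual = rng.normal(45, 8, n)

# One-way ANOVA across the four groups on the post-test scores.
f_stat, p_value = stats.f_oneway(adopted_0_6, adopted_6_12,
                                 adopted_12_18, monolingual)
print(f"F = {f_stat:.2f}, p = {p_value:.4g}")
```

With clearly separated group means like these, the test rejects the null hypothesis of equal means; the interesting question in the real study is which pairwise differences drive the result, which post-hoc tests would address.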

Step 5: Interpret your results and provide suggestions for future research

As you interpret the results, try to come up with explanations for the results that you did not expect. In most cases, you want to provide suggestions for future research.

In this example, the pre-exposed adults retained more Spanish than the non-exposed adults. However, this difference is only significant after the intervention (the Spanish class).

You decide the matter is worth researching further and propose a few additional research ideas:

  • Replicate the study with a larger sample
  • Replicate the study for other maternal languages (e.g. Korean, Lingala, Arabic)
  • Replicate the study for other language aspects, such as nativeness of the accent

Exploratory vs explanatory research

It can be easy to confuse explanatory research with exploratory research. If you’re in doubt about the relationship between exploratory and explanatory research, just remember that exploratory research lays the groundwork for later explanatory research.

Exploratory research questions often begin with “what”. They are designed to guide future research and do not usually have conclusive results. Exploratory research is often utilized as a first step in your research process, to help you focus your research question and fine-tune your hypotheses.

Explanatory research questions often start with “why” or “how”. They help you study why and how a previously studied phenomenon takes place.

Like any other research design , explanatory research has its trade-offs: while it provides a unique set of benefits, it also has significant downsides.

Advantages

  • It gives more meaning to previous research. It helps fill in the gaps in existing analyses and provides information on the reasons behind phenomena.
  • It is very flexible and often replicable , since the internal validity tends to be high when done correctly.
  • As you can often use secondary research, explanatory research is often very cost- and time-effective, allowing you to utilize pre-existing resources to guide your research prior to committing to heavier analyses.

Disadvantages

  • While explanatory research does help you solidify your theories and hypotheses, it usually lacks conclusive results.
  • Results can be biased or inadmissible to a larger body of work and are not generally externally valid . You will likely have to conduct more robust (often quantitative ) research later to bolster any possible findings gleaned from explanatory research.
  • Coincidences can be mistaken for causal relationships , and it can sometimes be challenging to ascertain which is the causal variable and which is the effect.

If you want to know more about statistics , methodology , or research bias , make sure to check out some of our other articles with explanations and examples.

  • Normal distribution
  • Degrees of freedom
  • Null hypothesis
  • Discourse analysis
  • Control groups
  • Mixed methods research
  • Non-probability sampling
  • Quantitative research
  • Ecological validity

Research bias

  • Rosenthal effect
  • Implicit bias
  • Cognitive bias
  • Selection bias
  • Negativity bias
  • Status quo bias

Explanatory research is a research method used to investigate how or why something occurs when only a small amount of information is available pertaining to that topic. It can help you increase your understanding of a given topic.

Exploratory research aims to explore the main aspects of an under-researched problem, while explanatory research aims to explain the causes and consequences of a well-defined problem.

Explanatory research is used to investigate how or why a phenomenon occurs. Therefore, this type of research is often one of the first stages in the research process , serving as a jumping-off point for future research.

Quantitative research deals with numbers and statistics, while qualitative research deals with words and meanings.

Quantitative methods allow you to systematically measure variables and test hypotheses . Qualitative methods allow you to explore concepts and experiences in more detail.

Cite this Scribbr article


George, T. & Merkus, J. (2023, November 20). Explanatory Research | Definition, Guide, & Examples. Scribbr. Retrieved April 8, 2024, from https://www.scribbr.com/methodology/explanatory-research/


Tegan George




  • Published: April 2010

Cause and effect

Nature Methods volume 7, page 243 (2010)


  • Biological techniques
  • Research management

The experimental tractability of biological systems makes it possible to explore the idea that causal relationships can be estimated from observational data.

“Happy is he who is able to know the causes of things.”— Virgil

The idea that one needs to do an experiment—a controlled perturbation of a single variable—to assign cause and effect is deeply embedded in traditional thinking about the way scientific knowledge is obtained, but it is largely absent from everyday life. One knows, without doing an experiment, that the street is wet on a rainy day because the rain has fallen. To be sure, this form of causal reasoning requires prior knowledge. One has seen the co-occurrence of rain and the wet street many times and been taught that rain causes wetness. And although such relationships are, in the strict sense, merely very good correlations, human beings routinely, and necessarily, use them to assign cause and effect.

As discussed on this page a year ago, this form of thinking, at least as a starting point for hypothesis-making, is in practice not uncommon in scientific research as well. Even before our data-driven age, a testable idea often began with an observation. When the structure of a voltage-gated potassium channel was first solved, for instance, the physical basis for potassium selectivity was suggested from observing the disposition of the residues known to allow potassium, but not sodium, ions to pass. In another example a century or so earlier, Ramón y Cajal famously predicted many features of the operation of the nervous system, including the directionality of neuronal signaling, based on his observations of the organization of neurons in the brain. Experiments had to be designed to test these ideas, but the hypotheses about cause and effect were generated at least in part by observation.

Many areas of contemporary biology seek to learn causal relationships from biological data. In systems biology, for instance, researchers use measurements of gene expression, cellular protein amounts or metabolite levels, among other types of data, to assign causal or regulatory relationships in models describing the cell. In the context of large-scale systems data, it is usually not possible to assign such relationships just by looking at the data by eye. Statistical and visualization tools are needed, when, for example, one is looking at lists of expression data of thousands of genes and trying to determine which genes regulate what other genes. The methods used to assign causal arrows typically involve perturbation experiments. When unperturbed data are used, additional information such as change over time or prior biological knowledge has been used to order the data.

A Correspondence by Maathuis and colleagues published in this issue (p. 247), in contrast, explores the notion that it might be possible to estimate causal relationships simply by observing random variation in unperturbed data, with no other information added. Making use of gene expression data obtained either from single gene knockouts in yeast—a classical perturbation experiment—or from parallel control measurements on wild-type yeast, an unperturbed system in which there is presumably only random variation, the authors report that, under some assumptions, statistical analysis can be used to predict the strongest causal effects from the control data alone.

The idea that such prediction is theoretically possible is not in itself new and has received some interest in, among others, the social scientific, economic and medical spheres. But it is an idea that is not easy to test in a real-world setting. In a sense, then, the study in this issue exploits the unique properties of biological systems—their complexity, the availability of good tools for precise and ethical system manipulation, and the well-developed technology for acquiring large-scale unbiased data—to test an idea that could have interest and value outside the biological realm as well.

It is worth noting that the assumptions made—in its current iteration, the approach by Maathuis and colleagues provides no allowance for feedback and does not incorporate change over time—could pose serious obstacles for understanding biological as well as other systems. What is more, statistical inference will clearly not replace perturbation experiments in systems that are amenable to manipulation.

Nonetheless, causal inference from purely observed data could have practical value in the prioritization and design of perturbation experiments. Perturbations can be impossible, for instance, if the tools available are not specific enough, unethical, for example in human studies, or simply unfeasible owing to cost or impracticality. Observational data could be used to identify candidate causal relationships, which could then be the basis for the design of targeted perturbations or for further analysis.


Cite this article.

Cause and effect. Nat Methods 7 , 243 (2010). https://doi.org/10.1038/nmeth0410-243




Definitions of Research Designs

Systematic reviews & meta-analyses.

  • A systematic review critically assesses and evaluates all research that addresses a particular research question and presents a synthesized summary of the literature. The researchers use a systematic methodology to search and screen the literature on a particular topic. Meta-analyses use statistical methods to combine the results of individual studies and synthesize the findings.
  • The quality of a systematic review is only as good as the quality of the studies that are included. When evaluating this type of study, you want to assess the methodology of the search strategy and screening of articles to include, and the assessment by the authors of the included studies.

Randomized Controlled Trials (RCT)

  • A study design that randomly assigns participants to an experimental group (which receives the intervention) or a control group (which receives either a placebo or no intervention). The only expected difference between the two groups is the variable being studied.
  • When critically appraising an RCT, you will evaluate elements of the study such as the allocation of participants, how similar the control group and the experimental group are, and the blinding of the participants and health care workers.

Cohort Studies

  • A study type in which people who currently have a certain condition or receive a certain treatment are followed over time and compared with a group of people who are not affected by the condition or treatment.
  • When critically appraising a cohort study, you will investigate how the cohort was recruited, if the exposure was accurately measured and bias was minimized, and if authors have identified and taken account of possible confounding factors.

Case Control Studies

  • An observational study of people with a disease (or other outcome variable) of interest and a control, comparison, or reference group of people without the disease. The two groups of people are compared to determine what can be attributed to the disease or outcome variable.
  • Much like a cohort study, when critically appraising a case control study you will examine whether the authors have minimized bias and properly addressed any potential confounding factors.

Some study types are not named on the EBM Pyramid, but are important study designs for answering research questions:

Diagnostic Studies

  • This type of research focuses on estimating the sensitivity and/or specificity of a particular diagnostic test, and compares the test to the standard diagnostic test.
  • When critically appraising this type of study you will want to determine if the new test was compared with an appropriate standard test, if all patients received both the new test and the standard test, and whether the health care workers administering the tests were properly blinded to the results of the standard test.

Economic Evaluation

  • This type of study compares the costs and outcomes of healthcare interventions.
  • When critically appraising an economic evaluation, you will determine if there is evidence that the new intervention or program is effective, if the effects of the intervention were measured appropriately, and were the costs valued in a credible manner.

Qualitative Studies

  • Qualitative research aims to identify what matters most to patients or populations and how their experience can be improved. In public health, this type of research allows researchers to explore social and behavioral issues and explore other social or human problems.
  • Critical appraisal of a qualitative study will determine if there was a clear aim for the research, if the qualitative methodology and research design were appropriate for the aims, and if the data analysis was sufficiently rigorous.

Evidence-Based Practice Copyright © by Various Authors - See Each Chapter Attribution is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License , except where otherwise noted.

Tell cause from effect: models and evaluation

  • Regular Paper
  • Published: 21 July 2017
  • Volume 4, pages 99–112 (2017)


  • Jing Song 1 ,
  • Satoshi Oyama 1 &
  • Masahito Kurihara 1  


Causal relationships differ from statistical relationships, and distinguishing cause from effect is a fundamental scientific problem that has attracted the interest of many researchers. Among causal discovery problems, discovering bivariate causal relationships is a special case. Causal relationships between two variables (“X causes Y” or “Y causes X”) belong to the same Markov equivalence class, and the well-known independence tests and conditional independence tests cannot distinguish directed acyclic graphs in the same Markov equivalence class. We empirically evaluated the performance of three state-of-the-art models for causal discovery in the bivariate case using both simulation and real-world data: the additive-noise model (ANM), the post-nonlinear (PNL) model, and the information geometric causal inference (IGCI) model. The performance metrics were accuracy, area under the ROC curve, and time to make a decision. The IGCI model was the fastest in terms of algorithm efficiency even when the dataset was large, while the PNL model took the most time to make a decision. In terms of decision accuracy, the IGCI model was susceptible to noise and thus performed well only under low-noise conditions. The PNL model was the most robust to noise. Simulation experiments showed that the IGCI model was the most susceptible to “confounding,” while the ANM and PNL models were able to avoid the effects of confounding to some degree.


1 Introduction

People are generally more concerned with causal relationships between variables than with statistical relationships between variables, and the concept of causality has been widely discussed [ 9 , 12 ]. The best way to demonstrate a causal relationship between variables is to conduct a controlled randomized experiment. However, real-world experiments are often expensive, unethical, or even impossible. Many researchers working in various fields (economics, sociology, machine learning, etc.) are thus using statistical methods to analyze causal relationships between variables [ 2 , 3 , 16 , 19 , 21 , 25 , 31 , 37 , 48 , 55 ].

Fig. 1 Possible relationships between X and Y. a Independent. b Feedback. c X causes Y. d Y causes X. e “common cause.” f “selection bias”

Directed acyclic graphs (DAGs) have been used to formalize the concept of causality [ 29 ]. Although a conditional independence test cannot tell the full story of a causal relationship, it can be used to exclude irrelevant relationships between variables [ 29 , 44 ]. However, a conditional independence test is impossible when there are only two variables. Several models have been proposed to solve this problem [ 20 , 38 , 46 , 47 , 49 ]. For two variables X and Y, there are at least six possible relationships between them (Fig. 1 ). The top two diagrams show the independent case and the feedback case, respectively. The middle two show the two possible causal relationships between X and Y: “X causes Y” and “Y causes X.” The bottom two show the “common cause” case and the “selection bias” case. Unobserved variables Z are “confounders” for causal discovery between X and Y. The existence of confounders creates a spurious correlation between X and Y. Distinguishing spurious correlation due to unobserved confounders from actual causality remains a challenging task in the field of causal discovery. Many models are based on the assumption that no unobserved confounders exist.

In the work reported here, we experimentally compared the performance of three state-of-the-art causal discovery models: the additive-noise model (ANM) [ 15 ], the post-nonlinear (PNL) model [ 54 ], and the information geometric causal inference (IGCI) model [ 22 ]. We used three metrics: accuracy, area under the ROC curve (AUC), and time to make a decision. This paper is an extended and revised version of our conference paper [ 23 ]. It includes new AUC results and an updated time to make a decision for the PNL model. It also describes typical examples of model failure and discusses the reasons for failure. Finally, it describes new experiments on the responses of the models to spurious correlation caused by confounders using simulation and real-world data.

In Sect.  2 , we discuss related work in the field of causal discovery. In Sect.  3 , we briefly describe the three models. In Sect.  4 , we describe the dataset we used and the implementations of the three models. In Sect.  5 , we present the results and give a detailed analysis of the performances of the three models. We conclude in Sect.  6 by summarizing the strengths and weaknesses of the three models and mentioning future tasks.

2 Related work

Temporal information is useful for causal discovery modeling [ 30 , 32 ]. Granger [ 7 ] proposed detecting the causal direction of time series data on the basis of the temporal ordering of the variables and used linear systems to make it more operational. He formulated the definition of causality in terms of conditional independence relations [ 8 ]. Chen et al. extended the linear stochastic systems he proposed [ 7 ] to work on nonlinear systems [ 1 ]. Shajarisales et al. [ 39 ] proposed using the spectral independence criterion (SIC) for causal inference from time series data, a mechanism different from that used in Granger causality, and compared the two methods. For Granger causality [ 7 ] and extended Granger causality [ 1 ], temporal information is needed. When discovering causal relationships from time series data, the data resolution might differ from the true causal frequency. Gong et al. [ 6 ] discussed this issue and showed that using the non-Gaussianity of the data can help identify the underlying model under some conditions.
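The core of Granger's idea can be sketched without any specialized library: test whether adding lagged values of one series improves an autoregressive prediction of the other. The simulated data and the single-lag model below are hypothetical simplifications (Granger's framework allows arbitrary lag orders).

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
T = 500
x = rng.normal(size=T)
y = np.zeros(T)
for t in range(1, T):  # y is driven by lagged x, so "x Granger-causes y"
    y[t] = 0.5 * y[t - 1] + 0.8 * x[t - 1] + 0.1 * rng.normal()

def granger_f_test(cause, effect):
    """F-test: do lags of `cause` improve a 1-lag AR model of `effect`?"""
    e, e1, c1 = effect[1:], effect[:-1], cause[:-1]
    ones = np.ones_like(e)
    Xr = np.column_stack([ones, e1])        # restricted: effect's own lag
    Xu = np.column_stack([ones, e1, c1])    # unrestricted: adds cause's lag
    rss_r = np.sum((e - Xr @ np.linalg.lstsq(Xr, e, rcond=None)[0]) ** 2)
    rss_u = np.sum((e - Xu @ np.linalg.lstsq(Xu, e, rcond=None)[0]) ** 2)
    df = len(e) - Xu.shape[1]
    F = (rss_r - rss_u) / (rss_u / df)      # one restriction, so q = 1
    return stats.f.sf(F, 1, df)             # p-value

print("p(x -> y):", granger_f_test(x, y))  # small: lagged x predicts y
print("p(y -> x):", granger_f_test(y, x))  # typically large: y does not predict x
```

Note that a small p-value here establishes only predictive precedence, not mechanism, which is why the extensions discussed above (nonlinear systems, SIC) exist.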

Shimizu et al. [ 41 ] proposed using a linear non-Gaussian acyclic model (LiNGAM for short) to detect the causal direction of variables without the need for temporal information. LiNGAM works when the causal relationship between variables is linear, the distributions of disturbance variables are non-Gaussian, and the network structure can be expressed using a DAG. Several extensions of LiNGAM have been proposed [ 14 , 18 , 40 , 42 ].

LiNGAM is based on the assumption of linear relationships between variables. Hoyer et al. [ 15 ] proposed using an additive-noise model (ANM) to deal with nonlinear relationships. If the regression function is linear, the ANM works in the same way as LiNGAM. Zhang et al. [ 52 , 53 , 54 , 56 ] proposed using a PNL model that takes into account the nonlinear effect of causes, inner additive noise, and external sensor distortion. The ANM and PNL model are briefly introduced in the following section.

While the above models are based on structural equation modeling (SEM), which requires structural constraints on the data generating process, another research direction is based on the assumption that independent mechanisms in nature generate causes and effects. The idea is that the shortest description of the joint distribution \(p(\mathrm {cause, effect})\) is given by the factorization \(p(\mathrm {cause})p(\mathrm {effect|cause})\) : compared with p (cause) p (effect|cause), the factorization p (effect) p (cause|effect) has higher total complexity. Although comparing total complexity is an intuitive idea, Kolmogorov complexity and algorithmic information could be used to measure it [ 19 ].

Janzing et al. [ 4 , 22 ] proposed IGCI to infer asymmetry between cause and effect through the complexity loss of distributions. The IGCI model is briefly introduced in the following section. Zhang et al. [ 57 ] proposed using a bootstrap-based approach to detect causal direction. It is based on the assumption that the parameters of the causes involved in the causality data generation process are exogenous to those of the cause to the effect. Stegle et al. [ 45 ] proposed using a probabilistic latent variable model (GPI) to distinguish between cause and effect using standard Bayesian model selection.

In addition to the above studies on the causal relationship between two variables, there have been several reviews. Spirtes et al. discussed the main concepts of causal discovery and introduced several models based on SEM [ 43 ]. Eberhardt [ 5 ] discussed the fundamental assumptions of causal discovery and gave an overview of related causal discovery methods. Several methods have been proposed for deciding the causal direction between two variables, and specific methods have been compared. However, as far as we know, there has been little discussion of how to fairly compare methods based on different assumptions. In the work described above, accuracy was usually used as the evaluation metric. Another commonly used metric for a binary classifier is the AUC. Compared with accuracy, the ROC curve can show the trade-off between the true-positive rate (TPR) and the false-positive rate (FPR) of a binary classifier. In our framework for causal discovery models, we use AUC as an evaluation metric. We have used it to obtain several new insights. We also used the time to make a decision as an evaluation metric since it may become a performance bottleneck when dealing with big data.

We used the ANM, the PNL model, and the IGCI model in the comparison experiments. The ANM and PNL model define how causality data are generated in nature through SEM. The assumption of additive noise is enlightening. The IGCI model finds the asymmetry between cause and effect through the complexity loss of distributions. The assumption of IGCI is intuitive and how well it works needs to be further researched.

3.1 ANM

The additive-noise model of Hoyer et al. [ 15 ] is based on two assumptions: (1) the observed effect (Y) can be expressed as a functional model of the cause (X) plus additive noise (N), \(Y = f(X) + N\) (Eq.  1 ); (2) the cause and additive noise are independent. If f () is a linear function and the noise has a non-Gaussian distribution, the ANM works in the same way as LiNGAM [ 41 ]. The model is learned by performing regression in both directions and testing the independence between the assumed cause and noise (residuals) for each direction. The decision rule is to choose the direction with the lesser dependence as the true causal direction. The ANM cannot handle the linear Gaussian case since the data can fit the model in both directions, so the asymmetry between cause and effect disappears. Gretton et al. improved the algorithm and extended the ANM to work even in the linear Gaussian case [ 50 ]. The improved model also works more efficiently in the multivariate case.
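A minimal sketch of the ANM decision rule, assuming polynomial regression in place of the Gaussian process regression used in the original work and using distance correlation as a stand-in for the HSIC independence test:

```python
import numpy as np

rng = np.random.default_rng(1)

def dist_corr(a, b):
    """Empirical distance correlation: a nonlinear dependence measure,
    used here as a simple proxy for the HSIC test."""
    A = np.abs(a[:, None] - a[None, :])
    B = np.abs(b[:, None] - b[None, :])
    A = A - A.mean(0) - A.mean(1)[:, None] + A.mean()  # double-centering
    B = B - B.mean(0) - B.mean(1)[:, None] + B.mean()
    dcov2 = (A * B).mean()
    return np.sqrt(max(dcov2, 0) / np.sqrt((A * A).mean() * (B * B).mean()))

def anm_direction(x, y, deg=5):
    """Regress each way and pick the direction with less dependent residuals."""
    rx = y - np.polyval(np.polyfit(x, y, deg), x)  # residuals of y ~ f(x)
    ry = x - np.polyval(np.polyfit(y, x, deg), y)  # residuals of x ~ g(y)
    return "X->Y" if dist_corr(x, rx) < dist_corr(y, ry) else "Y->X"

# Data generated according to the ANM: y = x^3 + independent noise.
x = rng.uniform(-1, 1, 400)
y = x ** 3 + rng.uniform(-0.1, 0.1, 400)
print(anm_direction(x, y))
```

In the causal direction the residuals reduce to the independent noise term, while in the anti-causal direction they remain dependent on the regressor, which is the asymmetry the decision rule exploits.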

3.2 PNL model

In the post-nonlinear model of Zhang et al. [ 53 , 54 ], effects are nonlinear transformations of causes with some inner additive noise, followed by external nonlinear distortion: \(Y = f(g(X) + N)\) (Eq.  2 ). From Eq.  2 , we obtain \(N={f}^{-1}(Y)-g(X)\) , where X and Y are the two observed variables representing cause and effect, respectively. To identify the cause and effect, a particular type of constrained nonlinear ICA [ 17 , 53 ] is performed to extract two components that are as independent as possible. The two extracted components are the assumed cause and the corresponding additive noise, respectively. The identification method of the model is described elsewhere ([ 53 ], Section 4). The identifiability of the causal direction inferred by the PNL model has been proven [ 54 ]. The PNL model can identify the causal direction of data generated in accordance with the model except for the five situations described in Table 1 in [ 54 ].
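The PNL generating process and the noise-recovery identity \(N={f}^{-1}(Y)-g(X)\) can be sketched as follows; the particular choices of g and f here are hypothetical illustrations, not those used in the original paper:

```python
import numpy as np

rng = np.random.default_rng(2)

# PNL generating process (Eq. 2): Y = f(g(X) + N), with f invertible.
g = lambda x: x ** 3 + x   # inner nonlinearity applied to the cause
f = np.tanh                # invertible outer "sensor distortion"

x = rng.uniform(-1, 1, 1000)
n = rng.normal(0, 0.1, 1000)  # inner additive noise, independent of x
y = f(g(x) + n)

# Because f is invertible, the noise is recoverable: N = f^{-1}(Y) - g(X).
n_recovered = np.arctanh(y) - g(x)
print(np.allclose(n, n_recovered))  # True
```

The constrained nonlinear ICA step in the actual model estimates \(f^{-1}\) and g from data (e.g., with multilayer perceptrons); here they are known, so the identity holds exactly.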

3.3 IGCI model

The IGCI model [ 4 , 22 ] is based on the hypothesis that if “X causes Y,” the marginal distribution p ( x ) and the conditional distribution p ( y | x ) are independent in a particular way. The IGCI model gives an information-theoretic view of additive noise and defines independence by using orthogonality. With the ANM [ 15 ], inference is impossible if there is no additive noise, while it is possible with the IGCI model.

The IGCI model determines the causal direction on the basis of complexity loss. According to IGCI, the choice of p ( x ) is independent of the choice of f for the relationship \(y=f(x)+n\) . Let \(\nu _{x}\) and \(\nu _{y}\) be the reference distributions for X and Y, respectively.

The quantity

\(D(P_{x} \left| \right| \nu _{x}) = \int p(x)\log \frac{p(x)}{\nu _{x}(x)}\,\mathrm {d}x\)

is the KL-distance between \(P _{x}\) and \(\nu _{x}\) . \(D(P_{x} \left| \right| \nu _{x})\) works as a feature of the complexity of the distribution. The complexity loss from X to Y is given by

\(V_{X\rightarrow Y} = D(P_{x} \left| \right| \nu _{x}) - D(P_{y} \left| \right| \nu _{y}).\)

The decision rule of the IGCI model is that if \(V_{X\rightarrow Y} < 0\) , infer “X causes Y,” and if \(V_{X\rightarrow Y} > 0\) , infer “Y causes X.” This rule is rather theoretical. An applicable and explicit form for the reference measure is entropy-based IGCI or slope-based IGCI.

Entropy-based IGCI:

\(\hat{V}_{X\rightarrow Y} = \hat{S}(Y) - \hat{S}(X)\) , with \(\hat{S}(X) = \psi (m) - \psi (1) + \frac{1}{m-1}\sum _{i=1}^{m-1}\log |x_{i+1}-x_{i}|\) computed over the values sorted in ascending order,

where \(\psi ()\) is the digamma function and m is the number of data points.

Slope-based IGCI:

\(\hat{V}_{X\rightarrow Y} = \frac{1}{m-1}\sum _{i=1}^{m-1}\log \left| \frac{y_{i+1}-y_{i}}{x_{i+1}-x_{i}}\right| ,\)

where the pairs \((x_{i}, y_{i})\) are sorted in ascending order of x.
These explicit forms are simpler, and we can see that the two calculation methods coincide. The calculation does not take much time even when dealing with a large amount of data. However, the IGCI model assumes that the causal process is noiseless and may perform poorly under high-noise conditions. We discuss the performance of the three models in Sect.  5 .
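A minimal sketch of slope-based IGCI under the uniform reference measure; the normalization, sorting, and handling of repeated values follow the description above, and the test mechanism (x uniform, y = x²) is an illustrative noiseless example:

```python
import numpy as np

rng = np.random.default_rng(3)

def igci_slope(x, y):
    """Slope-based IGCI score V_hat(X->Y) after normalizing to [0, 1]."""
    x = (x - x.min()) / (x.max() - x.min())
    y = (y - y.min()) / (y.max() - y.min())
    order = np.argsort(x)
    dx, dy = np.diff(x[order]), np.diff(y[order])
    keep = (dx != 0) & (dy != 0)  # skip repeated values (the log 0 case)
    return np.mean(np.log(np.abs(dy[keep] / dx[keep])))

# Noiseless nonlinear mechanism: x uniform, y = x^2, so X causes Y.
x = rng.uniform(0, 1, 1000)
y = x ** 2

v_xy, v_yx = igci_slope(x, y), igci_slope(y, x)
print("X->Y" if v_xy < v_yx else "Y->X")
```

For this mechanism the score in the causal direction is negative (the mechanism compresses a uniform input, lowering entropy), which matches the decision rule above.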

4 Experiments

Here we describe the dataset used in our experiments and the implementation of each model.

4.1 Dataset

We used the cause effect pairs (CEP) [ 27 ] dataset, which contains 97 pairs of real-world causal variables with the cause and effect labeled for each pair. The dataset is publicly available online [ 27 ]. Some of the data were collected from the UCI machine learning repository [ 24 ]. The data come from various fields, including geography, biology, physics, and economics. The dataset also contains time series data. Most of the data are noisy. An appendix in  [ 28 ] contains a detailed description of each pair of variables.

We used 91 of the pairs in our experiments since some of the data (e.g., pair0052) contain multi-dimensional variables. The 91 pairs are listed in Table 4 in “Appendix.” Some contain the same variables collected in different countries or at different times. The sample sizes range from 126 to 16,382 data points. The variety of data types in the CEP dataset makes causal analysis using real-world data challenging.

4.2 Implementation

We implemented the three models following the original work [ 15 , 22 , 53 ]. A brief introduction is given below.

ANM Using the reported experimental settings [ 15 ], we performed Gaussian process regression using the Gaussian Processes for Machine Learning (GPML) toolbox [ 33 , 34 , 35 ]. We then used the Hilbert–Schmidt Independence Criterion (HSIC) [ 10 ] to test the independence between the assumed cause and the residuals. The dataset had been labeled with the true causal direction for each pair, with no cases of independence or feedback. Using the decision rule of the ANM, we determined that the direction with the greater independence was the true causal direction.
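The ANM decision rule can be sketched as follows (a simplified illustration, not the actual implementation: kernel ridge regression stands in for the GPML fit, and a biased HSIC estimate with Gaussian kernels and median-heuristic bandwidths replaces the full HSIC test):

```python
import numpy as np

def rbf_gram(v):
    """Gaussian kernel Gram matrix with a median-heuristic bandwidth."""
    v = np.asarray(v, dtype=float).reshape(-1, 1)
    d2 = (v - v.T) ** 2
    gamma = 1.0 / np.median(d2[d2 > 0])
    return np.exp(-gamma * d2)

def hsic(a, b):
    """Biased HSIC estimate [10]: larger values mean stronger dependence."""
    n = len(a)
    H = np.eye(n) - np.ones((n, n)) / n
    return np.trace(H @ rbf_gram(a) @ H @ rbf_gram(b)) / n ** 2

def fit_residuals(x, y, lam=1e-2):
    """Kernel ridge regression of y on x (a cheap stand-in for the GP fit)
    and the resulting residuals."""
    K = rbf_gram(x)
    alpha = np.linalg.solve(K + lam * np.eye(len(x)), y)
    return y - K @ alpha

def anm_direction(x, y):
    """Fit both directions; the direction whose residuals are more
    independent of the assumed cause (smaller HSIC) is preferred."""
    h_xy = hsic(x, fit_residuals(x, y))   # assume "X causes Y"
    h_yx = hsic(y, fit_residuals(y, x))   # assume "Y causes X"
    return "X causes Y" if h_xy < h_yx else "Y causes X"

# Additive-noise ground truth: Y = X**3 + X + noise
rng = np.random.default_rng(1)
x = rng.uniform(-2.0, 2.0, 300)
y = x ** 3 + x + rng.normal(0.0, 1.0, 300)
print(anm_direction(x, y))
```

With a good regression fit, the forward residuals recover the additive noise and are nearly independent of the cause, whereas the backward residuals remain dependent on the assumed cause.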

PNL Model We used a particular type of constrained nonlinear ICA to extract the two components that would be the cause and noise if the model had been learned in the correct direction. The nonlinearities of g () and \(f^{-1}()\) in Eq.  2 were modeled using multilayer perceptrons. By minimizing the mutual information between the two output components, we made the output as independent as possible. After extracting two independent components, we tested their independence by using the HSIC [ 10 , 11 ]. Finally, in the same way as for the ANM, we determined that the direction with the greater independence was the correct one.

IGCI \(\mathrm {(entropy, uniform)}\) Compared with the first two models, the implementation of the IGCI (entropy, uniform) model was simpler. We used reported equations ( 3 , 4 ) to calculate \(\hat{V}_{X\rightarrow Y}\) and \(\hat{V}_{Y\rightarrow X}\) and determined that the direction in which entropy decreased was the correct direction. If \(\hat{V}_{X\rightarrow Y}<0\) , the inferred causal direction was “X causes Y”; otherwise, it was “Y causes X.” For the IGCI model, the data should be normalized before calculating \(\hat{V}_{X\rightarrow Y}\) and \(\hat{V}_{Y\rightarrow X}\) . In accordance with the reported experimental results, we used the uniform distribution as the reference distribution because of its good performance. For the repetitive data in the dataset, we set \(\log 0=0\) .

IGCI \(\mathrm {(slope, uniform)}\) The implementation of the IGCI (slope, uniform) model was similar to that of the IGCI (entropy, uniform) one. We used Eq.  5 to calculate \(\hat{V}_{X\rightarrow Y}\) and \(\hat{V}_{Y\rightarrow X}\) and determined that the direction with a negative value was the correct one. For the same reason as above, we normalized the data to [0,1] before calculating \(\hat{V}_{X\rightarrow Y}\) and \(\hat{V}_{Y\rightarrow X}\) . To make Eq.  5 meaningful, we filtered out the repetitive data.

Here, we first compare model accuracy at different decision rates. Footnote 9 We changed the threshold and calculated the corresponding decision rate and accuracy for each model. The accuracy of the models at different decision rates was compared in the original study [ 4 ]; compared with [ 4 ], we used more real-world data in our experiments and show how accuracy changed across decision rates. Our results are consistent with those of Mooij et al. [ 26 ]. The performance of the models at different decision rates is discussed in Sect.  5.1 .

Fig. 2: Accuracy of the three models for different decision rates. The decision rate changed when the threshold was changed: the larger the threshold, the smaller the decision rate. In an ideal case, the accuracy of each model should improve with a decrease in the decision rate

Since causal discovery models in the bivariate case make a decision between two choices, we can regard these models as binary classifiers and evaluate them using AUC. We previously divided the data into two groups (inferred as “X causes Y” and inferred as “Y causes X”) and evaluated the performance of each model for each group [ 23 ]. Here we give the results for the entire (undivided) dataset.

Finally, we compare model efficiency by using the average time needed to make a decision. This is described in Sect.  5.3 .

5.1 Accuracy for different decision rates

We calculated the accuracy of each model for different decision rates using Eqs.  6 and 7 . The results are plotted in Fig. 2 . The decision rate changed when the threshold was changed: the larger the threshold, the more stringent the decision rule. In an ideal situation, accuracy starts at 1.0 and decreases as the decision rate increases. However, the results with real-world data were not perfect because the data were noisy.
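The threshold sweep described above can be sketched in a few lines (a hypothetical toy example: `confidence` stands for the per-pair \(|\hat{V}_{X \rightarrow Y}-\hat{V}_{Y \rightarrow X}|\) values, and the scores and correctness flags are invented for illustration):

```python
import numpy as np

def accuracy_vs_decision_rate(confidence, correct, thresholds):
    """For each threshold, decide only the pairs whose confidence exceeds it.
    Decision rate = fraction of pairs decided; accuracy = fraction of the
    decided pairs whose inferred direction was correct."""
    confidence = np.asarray(confidence, dtype=float)
    correct = np.asarray(correct, dtype=bool)
    rates, accs = [], []
    for t in thresholds:
        decided = confidence > t
        rates.append(decided.mean())
        accs.append(correct[decided].mean() if decided.any() else np.nan)
    return np.array(rates), np.array(accs)

# Invented per-pair confidences and correctness flags for nine pairs
conf = np.array([0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1])
ok = np.array([1, 1, 1, 0, 1, 0, 1, 0, 1], dtype=bool)
rates, accs = accuracy_vs_decision_rate(conf, ok, thresholds=[0.85, 0.55, 0.0])
print(rates)   # fraction of pairs decided at each threshold
print(accs)    # accuracy among the decided pairs
```

Raising the threshold lowers the decision rate; in the ideal case the remaining high-confidence decisions are all correct, which is exactly the behavior Fig. 2 probes.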

As shown in Fig. 2 , the accuracy started from 1.0 for the ANM and IGCI and from 0.0 for the PNL model. This means that the PNL model made an incorrect decision when it had the highest confidence. Although the accuracies of the IGCI models started from 1.0, they dropped sharply when the decision rate was between 0 and 0.2. The reasons for this are discussed in detail in Sect.  5.4 . After reaching a minimum, the accuracies increased almost continuously and stabilized. The accuracy of the ANM was more stable than those of the other models. When all decisions had been made, the model accuracies were ranked IGCI > ANM > PNL.

5.2 Area under ROC curve (AUC)

Besides calculating the accuracy of the three models for different decision rates, we used the AUC to evaluate their performance. Some of the experimental results were presented in our conference paper [ 23 ], for which the dataset was divided into two groups: inferred as “X causes Y” and inferred as “Y causes X.” Here we present updated experimental results for the entire dataset.

The following steps were taken to get the experimental results:

1. Set X as the cause and Y as the effect in the input data.

2. Set \({V}_{X \rightarrow Y}\) and \({V}_{Y \rightarrow X}\) to be the outputs.

3. Calculate the absolute value of the difference between \({V}_{X \rightarrow Y}\) and \({V}_{Y \rightarrow X}\) (Eq.  8 ) and map \(V_{\mathrm {diff}}\) to [0,1].

4. Assign a positive label to the pairs inferred as “X causes Y” and a negative one to the pairs inferred as “Y causes X.”

5. Use \(V_{\mathrm {diff}}\) and the labels assigned in step 4 to calculate the true-positive rate (TPR) and false-positive rate (FPR) for different thresholds.

6. Plot the ROC curve and calculate the corresponding AUC value.

In step (1), instead of dividing the data into two groups as done previously [ 23 ], we set the input matrix so that the first column was the cause and the second column was the effect. Then, if the inferred causal direction for a pair was “X causes Y,” a positive label was assigned to that pair; otherwise, a negative label was assigned.

In step (3), we used the absolute value of the difference between \({V}_{X \rightarrow Y}\) and \({V}_{Y \rightarrow X}\) as the “confidence” of the model when making a decision: the larger the \(V_{\mathrm {diff}}\) , the greater the confidence. We did not use the ratio because, if one of \({V}_{X \rightarrow Y}\) and \({V}_{Y \rightarrow X}\) were very small, the ratio would be very large. We mapped \(V_{\mathrm {diff}}\) to [0,1] for convenience, so that \(V_{\mathrm {diff}}\) could be used in the same way as the output of a binary classifier. For causal discovery, the larger the \(V_{\mathrm {diff}}\) , the greater the confidence in the decision; correspondingly, an incorrect decision made with high confidence should be penalized more heavily.

In step (4), we labeled the data in accordance with the inferred causal direction. Since the correct label for all the pairs was “X causes Y,” if the inferred result for a pair was “Y causes X,” it was assigned a negative label.

In step (5), we used the normalized \(V_{\mathrm {diff}}\) and the label assigned in step (4) to calculate TPR and FPR for different thresholds. We plotted TPR and FPR to get the ROC curve and calculated the corresponding AUC value.
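Steps (2)–(6) can be sketched as follows (a minimal numpy sketch with invented model outputs, assuming the direction with the smaller \(V\) is the inferred one, as for IGCI; the AUC is computed via the equivalent rank statistic rather than by explicitly plotting the ROC curve):

```python
import numpy as np

def auc_from_scores(scores, labels):
    """AUC via the rank (Mann-Whitney) statistic: the probability that a
    randomly chosen positive pair outscores a randomly chosen negative one,
    with ties counted as half."""
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=bool)
    pos, neg = scores[labels], scores[~labels]
    wins = (pos[:, None] > neg[None, :]).sum() + 0.5 * (pos[:, None] == neg[None, :]).sum()
    return wins / (len(pos) * len(neg))

def evaluate_pairs(v_xy, v_yx):
    """Confidence is |V_xy - V_yx| mapped to [0, 1]; a pair is labeled
    positive when the inferred direction is "X causes Y" (the ground
    truth for every pair after step 1)."""
    v_xy, v_yx = np.asarray(v_xy, dtype=float), np.asarray(v_yx, dtype=float)
    v_diff = np.abs(v_xy - v_yx)
    v_diff = v_diff / v_diff.max()     # map to [0, 1]
    labels = v_xy < v_yx               # smaller V wins -> inferred "X causes Y"
    return auc_from_scores(v_diff, labels)

# Invented outputs for four pairs (antisymmetric, as for IGCI without repeats)
v_xy = [-0.9, -0.5, 0.2, -0.1]
v_yx = [0.9, 0.5, -0.2, 0.1]
print(evaluate_pairs(v_xy, v_yx))
```

Under this scoring, a high-confidence wrong decision contributes a high-scoring negative pair, which drags the AUC down sharply — the effect discussed for the IGCI models below.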

The ROC results are plotted in Fig. 3 , and the corresponding AUC values are shown in Table 1 . In contrast to the results shown in Fig. 2 , both IGCI models performed poorly in terms of AUC. The AUC values for IGCI were smaller than 0.5, meaning their performance was worse than that of a random classifier. However, as described in Sect.  5.1 , the IGCI models performed better when evaluated across different decision rates.

We checked the decisions made by the IGCI models and found that they made several incorrect decisions when the threshold was large. Such decisions with a large threshold are punished severely when using the AUC metric. As shown in Fig. 2 , although the accuracies of the IGCI models started from 1.0, they dropped sharply when the decision rate was between 0 and 0.2. An incorrect decision with a low decision rate was not punished much when evaluating accuracy for different decision rates. However, for the AUC, an incorrect decision when the threshold was large was punished more than when the threshold was small. For these reasons, the starting point of the ROC curve for the IGCI models in Fig.  3 was shifted to the right, making the AUC less than 0.5. In Sect.  5.4 , we will discuss why the IGCI models failed.

Fig. 3: ROC curves of the three models. Four graphs are shown because IGCI has two explicit forms

5.3 Algorithm efficiency

Besides comparing the accuracy and ROC of the three models, we also compared the average time taken by each algorithm to make a decision. Footnote 10 We performed the experiment on the MATLAB platform with an Intel Core i7-4770 3.40 GHz CPU and 8 GB memory. From Table 2 , we can see that the IGCI models were the most efficient, while the PNL model was the least efficient; the ANM was in between. The PNL model’s longer decision time was due to the procedure for estimating \({f}^{-1}\) and g in Eq.  2 .

5.4 Typical examples of model failure

5.4.1 Discretization

In Sect.  5.1 , we explained that the PNL model gives an incorrect decision when the threshold is set the highest, i.e., its accuracy curve for different decision rates starts from 0.0. We investigated the reason and found that it is the discretization of the data. A scatter plot for a pair of variables (pair0070) is shown in Fig. 4 . The pair has two variables. Variable \({x}_{1}\) is a value between 0 and 14 reflecting the features of an artificial face, used to decide whether the face is that of a man or a woman; a value of 0 means very female, and a value of 14 means very male. Variable \({x}_{2}\) is a value of 0 (female) or 1 (male) reflecting the gender decision. Since variable \({x}_{2}\) takes only two values, no matter what nonlinear transformation is applied to \({x}_{2}\) , the result is still two discretized values. According to the mechanism of the PNL model, \(x_{2}\) with its two discretized values is inferred to be the cause since the independence is greater when \({x}_{2}\) is the cause. In fact, all three models made incorrect decisions on this pair in our experiments. For the ANM, the discretization of the data makes regression analysis difficult, and the poor regression results negatively affect the HSIC test of independence between the assumed cause and the residuals. For IGCI, to make Eqs.  3 and 5 meaningful, the repetitive data have to be filtered out, which means that only a few data points are actually used in the final IGCI calculation.
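A toy reconstruction of such a pair (hypothetical values mimicking pair0070, not the actual CEP data) makes the problem concrete:

```python
import numpy as np

rng = np.random.default_rng(0)
m = 500
x1 = rng.integers(0, 15, m).astype(float)                 # rating from 0 (very female) to 14 (very male)
x2 = (x1 + rng.normal(0.0, 2.0, m) > 7.0).astype(float)   # binary gender decision

# Any nonlinear transformation of the binary x2 still takes only two values:
print(len(np.unique(np.tanh(3.0 * x2 + 1.0))))

# After filtering repeated values, few distinct points remain for IGCI:
print(len(np.unique(x1)), "of", m)
```

The binary variable stays binary under any pointwise transformation, and deduplication leaves at most 15 distinct values of \(x_1\) out of 500 samples.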

Fig. 4: Example of discretized data in the CEP dataset. Variable \({x}_{1}\) is between 0 (very female) and 14 (very male), reflecting features of an artificial face. Variable \({x}_{2}\) is 0 (female) or 1 (male), reflecting the gender decision

5.4.2 Noisiness

In Sect.  5.1 , we showed that IGCI had good performance in general. However, its accuracy dropped sharply when the decision rate was between 0 and 0.2. The reason is that it made incorrect decisions with high confidence when dealing with pair0056–pair0063. These eight pairs contain much noise, which degraded model performance. Moreover, these pairs contain outliers, which greatly affected the decision results. A scatter plot for one example pair (pair0056) is shown in Fig. 5 . It shows that the two variables have relatively small correlation and that there are outliers in the data. The calculation method used in IGCI is such that these outliers affect the inference result more than the other data points. The incorrect decisions that IGCI made on pair0056–pair0063 account for the small AUC values for the IGCI models given in Sect.  5.2 . The ANM also made incorrect inferences on these pairs because noise and outliers make overfitting more likely for these variables. For these noisy pairs, the PNL model had the best performance.

Fig. 5: Example scatter plot of noisy data in CEP. Variable \({x}_{1}\) is female life expectancy at birth for 2000–2005. Variable \({x}_{2}\) is the latitude of the birth country’s capital (data for China, Russia, and Canada were removed)

5.5 Response to spurious correlation caused by “confounding”

A causal relationship differs from a statistical one, and a statistical relationship is usually not enough to establish a causal one. Even if we observe that two variables are highly correlated, we cannot say that they have a causal relationship. As shown in Table 4 in “Appendix,” the CEP dataset was collected for evaluating causal discovery models, and the causal direction of most pairs is obvious from common sense. However, the relationships between variables that are of general interest in the real world are usually more controversial, e.g., smoking and lung cancer, for which the existence of confounding is usually a bone of contention. A good causal discovery model for two variables should be able to avoid the effect of “confounding” to some degree. To test how the ANM, the PNL model, and IGCI perform when dealing with spurious correlation caused by confounding, we first simulated the “common cause” case shown in Fig. 1 . We controlled the data-generating process to simulate different degrees of confounding. In addition to simulation, we used real-world data from the CEP to evaluate model performance.

Fig. 6: Scatter plots of generated data. (a) a/b = 0.1, (b) a/b = 1, (c) a/b = 10, (d) a/b = 100, (e) a/b = 1000

Estimated results of IGCI for generated data. Inference was “Y causes X” when estimated result was larger than 0

Test statistics for PNL model for generated data

5.5.1 Simulation

We conducted simulation experiments for the “common cause” confounding case shown in Fig. 1 . We generated data using two equations, \(x=a \times z^{3}+b \times n_{1}\) and \(y=a \times z+b \times n_{2}\) , where \(z,n_{1},n_{2} \sim U(0,1)\) . We used the quotient a  /  b to control the degree of confounding. There was no direct causal relationship between variables x and y; they were related only through the various degrees of confounding. Scatter plots of the data generated using five different quotients are shown in Fig. 6 .
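This data-generating process is straightforward to reproduce (a sketch of the generation step only; the seed and sample size are arbitrary choices for illustration):

```python
import numpy as np

def generate_confounded(m, a, b, seed=0):
    """x and y share only the confounder z ("common cause" in Fig. 1):
    x = a*z**3 + b*n1, y = a*z + b*n2 with z, n1, n2 ~ U(0, 1)."""
    rng = np.random.default_rng(seed)
    z, n1, n2 = rng.uniform(0.0, 1.0, (3, m))
    return a * z ** 3 + b * n1, a * z + b * n2

# The quotient a/b controls the degree of confounding: the spurious
# correlation between x and y grows with it even though neither causes the other.
for ratio in (0.1, 1, 10, 100, 1000):
    x, y = generate_confounded(500, a=ratio, b=1.0)
    print(f"a/b = {ratio:>6}: corr(x, y) = {np.corrcoef(x, y)[0, 1]:.3f}")
```
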

We used the generated data to test the performance of the three models. We performed ten random experiments for each quotient and calculated the average of the inferred results. The experimental results are shown in Figs. 7 , 8 , and Table  3 . For IGCI, when the degree of confounding was low, the mean of the estimated results Footnote 11 was almost zero; each estimated result was randomly positive or negative. As the degree of confounding increased, the estimated results tended to favor “Y causes X.” For the PNL model, the independence assumptions for both directions were accepted when \(a/b=0.1, 1, 10\) ( \(\alpha =0.01\) ), and the means of the test statistics (Equation 4 in [ 11 ]) were almost equal. When \(a/b=100\) , the PNL model rejected the independence assumption for the direction “X causes Y” and accepted “Y causes X.” When a/b was even larger, the independence assumptions were rejected for both directions, especially “X causes Y.” For the ANM, when the degree of confounding was low, the independence assumptions were accepted for both directions; when it was high, they were rejected for both directions, especially “X causes Y.” From these results, we can see that IGCI is the most susceptible to confounding of the three models tested, while the PNL model and the ANM can avoid its effect to a certain extent.

5.5.2 Real-world data

In addition to the simulation experiments described above, we conducted experiments using real-world data. The generation of real-world data can be very complex, which increases the difficulty of the causal discovery task. Here we describe the “common cause” and “selection bias” cases (Fig. 1 ).

Common cause case Although data for “common cause” are not included in the CEP dataset, some CEP pairs contain the same cause, as shown in “Appendix.” We combined data containing the same cause to obtain pairs of variables, such as “length and diameter.” Footnote 12 A scatter plot for the pair “length, diameter” is shown in Fig. 9 . For the ANM, the p value for the forward direction, “length causes diameter,” was \(8.28 \times 10^{-5}\) , while that for the backward direction was \(1.05 \times 10^{-3}\) . For the PNL model, the independence test statistic was \(1.11 \times 10^{-3}\) for the forward direction and \(9.50 \times 10^{-4}\) for the backward direction. For IGCI, the result estimated by calculating the entropy was 0.2197, while that estimated by calculating the slope was 0.056. For this pair, although the tendency was not strong, the three models tended to favor “diameter causes length.”

Fig. 9: Results for the “common cause” case with real-world data. Variable \(x_{1}\) : “length”; variable \(x_{2}\) : “diameter”

Selection bias case Although there is no causal relationship between X and Y in the selection bias case, independence does not hold between X and Y when conditioned on variable Z. This is the well-known Berkson paradox. Footnote 13 We used the variables “altitude and longitude” (Fig. 10 ) contained in CEP to test how the three models perform when dealing with the “selection bias” case. The variables were obtained by combining “cause: altitude, effect: temperature” and “cause: longitude, effect: temperature.” The data came from 349 weather stations in Germany [ 27 ]. A scatter plot of the results is shown in Fig. 10 . For the ANM, the p value for the forward direction, “altitude causes longitude,” was \(4.21 \times 10^{-2}\) while that for the backward direction was \(8.73 \times 10^{-2}\) . The independence assumptions were accepted for both directions although “longitude causes altitude” was favored. For the PNL model, the test statistic was \(2.46 \times 10^{-3}\) for the forward direction and \(3.30 \times 10^{-3}\) for the backward direction. The independence assumptions were accepted for both directions, and the independence test results were similar. For IGCI, the result estimated by calculating the entropy was 1.2742 and that estimated by calculating the slope was 1.9032. Both results were positive, and the estimated causal direction was “longitude causes altitude.” Although there should be no causal relationship between “altitude” and “longitude,” it was hard for the three models to determine that from the limited amount of observational data available.
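The selection-bias mechanism itself is easy to reproduce numerically (a hypothetical illustration unrelated to the weather-station data): two independent variables become negatively correlated once samples are selected on a common effect.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(0.0, 1.0, 20000)   # independent of y by construction
y = rng.normal(0.0, 1.0, 20000)
z = x + y                         # common effect of x and y

selected = z > 1.0                # selection bias: keep only large Z
print(np.corrcoef(x, y)[0, 1])                       # near 0
print(np.corrcoef(x[selected], y[selected])[0, 1])   # clearly negative
```

This is Berkson's paradox in miniature: conditioning on the common effect induces a dependence that no bivariate causal discovery model can distinguish from a genuine causal link using the selected sample alone.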

For most cases of causal discovery in the real world, only limited observational data can be obtained, and in some cases the data are incomplete. Moreover, for the case of two variables, the causal sufficiency assumption [ 36 ] is easily violated if there is an unobserved common cause. The limited amount of data and unobserved confounders make causal discovery in the bivariate case challenging.

Fig. 10: Results for the “selection bias” case with real-world data. Variable \(x_{1}\) : “altitude”; variable \(x_{2}\) : “longitude”

6 Conclusion

We compared three state-of-the-art models (ANM, PNL model, IGCI) for causal discovery in the bivariate case using simulation and real-world data. Testing at different decision rates showed that the three models had similar accuracies when all decisions were made. To check whether the decisions made were reasonable, we used a binary classification metric: the area under the ROC curve (AUC). The IGCI model had a small AUC value because it made several incorrect decisions when the threshold was high. Compared with those of the other models, the accuracy of the ANM was relatively stable. A comparison of the time needed to make a decision showed that IGCI was the fastest even when the dataset was large, while the PNL model took the most time.

Of the three models, IGCI had the best performance when there was little noise and the data were not heavily discretized. Improving the performance of the IGCI model under heavy noise and on discretized data remains future work. Although the performance of the ANM was relatively stable, overfitting should be avoided for the ANM because it negatively affects the subsequent independence test. Of the three models, the PNL model is the most general, as it takes into account the nonlinear effect of causes, inner additive noise, and external sensor distortion. However, estimating g () and \({f}^{-1}()\) is a lengthy procedure. Finally, testing the responses of the models to “confounding” showed that the ANM and the PNL model can avoid the effect of “confounding” to some degree, while IGCI is the most susceptible to it.

For the definition of “confounding,” please refer to [ 13 , 51 ].

The number of “confounders” is not limited to one.

There have been some efforts to deal with confounders. For example, Shimizu et al. [ 41 ] extended the linear non-Gaussian acyclic model to detect causal direction when there are “common causes” [ 14 , 40 ].

Reference distributions are used to measure the complexity of \(P_{x}\) and \(P_{y}\) . In [ 22 ], non-informative distributions like uniform and Gaussian ones are recommended.

https://en.wikipedia.org/wiki/Digamma_function .

The three models we evaluated cannot deal with multi-dimensional data.

Country and time information is not included in the table.

To avoid overfitting, we limited the size to 500 or less.

Since all three models have two outputs, e.g., \(\hat{V}_{X\rightarrow Y}\) and \(\hat{V}_{Y\rightarrow X}\) corresponding to the two possible causal directions, we set thresholds on the absolute difference between them, \(|\hat{V}_{X \rightarrow Y}-\hat{V}_{Y \rightarrow X}|\) , and used them to compute each decision rate and the corresponding model accuracy.

Compared to our previous report [ 23 ], we reduced the program output so that the PNL model works faster. We have updated the results for time to make a decision for the PNL model accordingly.

We used the difference between \(\hat{V}_{X\rightarrow Y}\) and \(\hat{V}_{Y\rightarrow X}\) (Eqs.  4 , 5 ) as the estimated result. For IGCI, \(\hat{V}_{X\rightarrow Y}\) is the negative of \(\hat{V}_{Y\rightarrow X}\) if there are no repetitive data. Thus, we can infer that the correct causal direction is \(X\rightarrow Y\) if the estimated result is negative and that \(Y\rightarrow X\) is correct if it is positive.

The pair “length, diameter” was created from “cause: rings (abalone), effect: length” and “cause: rings (abalone), effect: diameter” using data for abalone [ 24 ].

https://en.wikipedia.org/wiki/Berkson’s_paradox .

Chen, Y., Rangarajan, G., Feng, J., Ding, M.: Analyzing multiple nonlinear time series with extended granger causality. Phys. Lett. A 324 (1), 26–35 (2004)

Chen, Z., Zhang, K., Chan, L.: Causal discovery with scale-mixture model for spatiotemporal variance dependencies. In: Advances in Neural Information Processing Systems, pp. 1727–1735 (2012)

Chen, Z., Zhang, K., Chan, L., Schölkopf, B.: Causal discovery via reproducing kernel hilbert space embeddings. Neural Comput. 26 (7), 1484–1517 (2014)

Daniusis, P., Janzing, D., Mooij, J., Zscheischler, J., Steudel, B., Zhang, K., Schölkopf, B.: Inferring deterministic causal relations. arXiv preprint arXiv:1203.3475 (2012)

Eberhardt, F.: Introduction to the foundations of causal discovery. Int. J. Data Sci. Anal., 1–11 (2017)

Gong, M., Zhang, K., Schoelkopf, B., Tao, D., Geiger, P.: Discovering temporal causal relations from subsampled data. In: Proceedings of 32th International Conference on Machine Learning (ICML 2015) (2015)

Granger, C.W.: Investigating causal relations by econometric models and cross-spectral methods. Econom. J. Econom. Soc. 37 , 424–438 (1969)

Granger, C.W.: Testing for causality: a personal viewpoint. J. Econ. Dyn. Control 2 , 329–352 (1980)

Granger, C.W.: Some recent development in a concept of causality. J. Econom. 39 (1), 199–211 (1988)

Gretton, A., Herbrich, R., Smola, A., Bousquet, O., Schölkopf, B.: Kernel methods for measuring independence. J. Mach. Learn. Res. 6 (Dec), 2075–2129 (2005)

Gretton, A., Fukumizu, K., Teo, C.H., Song, L., Schölkopf, B., Smola, A.J., et al.: A kernel statistical test of independence. NIPS 20 , 585–592 (2008)

Halpern, J.Y.: A modification of the halpern-pearl definition of causality. arXiv preprint arXiv:1505.00162 (2015)

Howards, P.P., Schisterman, E.F., Poole, C., Kaufman, J.S., Weinberg, C.R.: “Toward a clearer definition of confounding” revisited with directed acyclic graphs. Am. J. Epidemiol. 176 (6), 506–511 (2012)

Hoyer, P.O., Shimizu, S., Kerminen, A.J., Palviainen, M.: Estimation of causal effects using linear non-gaussian causal models with hidden variables. Int. J. Approx. Reason. 49 (2), 362–378 (2008)

Hoyer, P.O., Janzing, D., Mooij, J.M., Peters, J., Schölkopf, B.: Nonlinear causal discovery with additive noise models. In: Advances in neural information processing systems, pp. 689–696 (2009)

Huang, B., Zhang, K., Schölkopf, B.: Identification of time-dependent causal model: A gaussian process treatment. In: The 24th International Joint Conference on Artificial Intelligence, Machine Learning Track, pp. 3561–3568. Buenos, Argentina (2015)

Hyvärinen, A., Oja, E.: Independent component analysis: algorithms and applications. Neural Netw. 13 (4), 411–430 (2000)

Hyvärinen, A., Zhang, K., Shimizu, S., Hoyer, P.O.: Estimation of a structural vector autoregression model using non-gaussianity. J. Mach. Learn. Res. 11 (May), 1709–1731 (2010)

Janzing, D., Scholkopf, B.: Causal inference using the algorithmic markov condition. IEEE Trans. Inf. Theory 56 (10), 5168–5194 (2010)

Janzing, D., Hoyer, P.O., Schölkopf, B.: Telling cause from effect based on high-dimensional observations. arXiv preprint arXiv:0909.4386 (2009)

Janzing, D., MPG, T., Schölkopf, B.: Causality: Objectives and assessment (2010)

Janzing, D., Mooij, J., Zhang, K., Lemeire, J., Zscheischler, J., Daniušis, P., Steudel, B., Schölkopf, B.: Information-geometric approach to inferring causal directions. Artif. Intell. 182 , 1–31 (2012)

Jing, S., Satoshi, O., Haruhiko, S., Masahito, K.: Evaluation of causal discovery models in bivariate case using real world data. In: Proceedings of the International MultiConference of Engineers and Computer Scientists 2016, pp. 291–296 (2016)

Lichman, M.: UCI machine learning repository. http://archive.ics.uci.edu/ml (2013)

Lopez-Paz, D., Muandet, K., Schölkopf, B., Tolstikhin, I.: Towards a learning theory of cause-effect inference. In: Proceedings of the 32nd International Conference on Machine Learning. JMLR: W&CP, Lille, France (2015)

Mooij, J.M., Janzing, D., Schölkopf, B.: Distinguishing between cause and effect. In: NIPS Causality: Objectives and Assessment, pp. 147–156 (2010)

Mooij, J.M., Janzing, D., Zscheischler, J., Schölkopf, B.: Cause effect pairs repository. https://webdav.tuebingen.mpg.de/cause-effect/ (2014a)

Mooij, J.M., Peters, J., Janzing, D., Zscheischler, J., Schölkopf, B.: Distinguishing cause from effect using observational data: methods and benchmarks. arXiv preprint arXiv:1412.3773 (2014b)

Pearl, J.: Causality: Models, Reasoning and Inference. Cambridge University Press, Cambridge (2000)

Peters, J., Janzing, D., Gretton, A., Schölkopf, B.: Detecting the direction of causal time series. In: Proceedings of the 26th Annual International Conference on Machine Learning, pp. 801–808. ACM, New York (2009)

Peters, J., Mooij, J., Janzing, D., Schölkopf, B.: Identifiability of causal graphs using functional models. arXiv preprint arXiv:1202.3757 (2012)

Peters, J., Janzing, D., Schölkopf, B.: Causal inference on time series using restricted structural equation models. In: Advances in Neural Information Processing Systems, pp. 154–162 (2013)

Rasmussen, C.E.: Gaussian Processes for Machine Learning. MIT press, Cambridge (2006)

Rasmussen, C.E., Nickisch, H.: Gaussian processes for machine learning (gpml) toolbox. J. Mach. Learn. Res. 11 (Nov), 3011–3015 (2010a)

Rasmussen, C.E., Nickisch, H.: GPML code. http://www.gaussianprocess.org/gpml/code/matlab/doc/ (2010b)

Scheines, R.: An introduction to causal inference (1997)

Schölkopf, B., Janzing, D., Peters, J., Sgouritsa, E., Zhang, K., Mooij, J.: On causal and anticausal learning. arXiv preprint arXiv:1206.6471 (2012)

Sgouritsa, E., Janzing, D., Hennig, P., Schölkopf, B.: Inference of cause and effect with unsupervised inverse regression. In: AISTATS (2015)

Shajarisales, N., Janzing, D., Schoelkopf, B., Besserve, M.: Telling cause from effect in deterministic linear dynamical systems. arXiv preprint arXiv:1503.01299 (2015)

Shimizu, S., Bollen, K.: Bayesian estimation of causal direction in acyclic structural equation models with individual-specific confounder variables and non-gaussian distributions. J. Mach. Learn. Res. 15 (1), 2629–2652 (2014)

Shimizu, S., Hoyer, P.O., Hyvärinen, A., Kerminen, A.: A linear non-gaussian acyclic model for causal discovery. J. Mach. Learn. Res. 7 (Oct), 2003–2030 (2006)

Shimizu, S., Inazumi, T., Sogawa, Y., Hyvärinen, A., Kawahara, Y., Washio, T., Hoyer, P.O., Bollen, K.: DirectLiNGAM: a direct method for learning a linear non-Gaussian structural equation model. J. Mach. Learn. Res. 12(Apr), 1225–1248 (2011)

Spirtes, P., Zhang, K.: Causal discovery and inference: concepts and recent methodological advances. Appl. Inform. 3, 3 (2016)

Spirtes, P., Glymour, C.N., Scheines, R.: Causation, Prediction, and Search. MIT Press, Cambridge (2000)

Stegle, O., Janzing, D., Zhang, K., Mooij, J.M., Schölkopf, B.: Probabilistic latent variable models for distinguishing between cause and effect. In: Advances in Neural Information Processing Systems, pp. 1687–1695 (2010)

Sun, X., Janzing, D., Schölkopf, B.: Causal inference by choosing graphs with most plausible Markov kernels. In: ISAIM (2006)

Sun, X., Janzing, D., Schölkopf, B.: Distinguishing between cause and effect via kernel-based complexity measures for conditional distributions. In: ESANN, pp. 441–446 (2007a)

Sun, X., Janzing, D., Schölkopf, B., Fukumizu, K.: A kernel-based causal learning algorithm. In: Proceedings of the 24th International Conference on Machine Learning, pp. 855–862. ACM, New York (2007b)

Sun, X., Janzing, D., Schölkopf, B.: Causal reasoning by evaluating the complexity of conditional densities with kernel methods. Neurocomputing 71(7), 1248–1256 (2008)

Tillman, R.E., Gretton, A., Spirtes, P.: Nonlinear directed acyclic structure learning with weakly additive noise models. In: Advances in Neural Information Processing Systems, pp. 1847–1855 (2009)

Weinberg, C.R.: Toward a clearer definition of confounding. Am. J. Epidemiol. 137(1), 1–8 (1993)

Zhang, K., Chan, L.W.: Extensions of ICA for causality discovery in the Hong Kong stock market. In: International Conference on Neural Information Processing, pp. 400–409. Springer, Berlin (2006)

Zhang, K., Hyvärinen, A.: Distinguishing causes from effects using nonlinear acyclic causal models. In: Journal of Machine Learning Research, Workshop and Conference Proceedings (NIPS 2008 Causality Workshop), vol. 6, pp. 157–164 (2008)

Zhang, K., Hyvärinen, A.: On the identifiability of the post-nonlinear causal model. In: Proceedings of the Twenty-Fifth Conference on Uncertainty in Artificial Intelligence, pp. 647–655. AUAI Press, Corvallis (2009)

Zhang, K., Schölkopf, B., Janzing, D.: Invariant Gaussian process latent variable models and application in causal discovery. arXiv preprint arXiv:1203.3534 (2012)

Zhang, K., Wang, Z., Schölkopf, B.: On estimation of functional causal models: post-nonlinear causal model as an example. In: 2013 IEEE 13th International Conference on Data Mining Workshops, pp. 139–146. IEEE (2013)

Zhang, K., Zhang, J., Schölkopf, B.: Distinguishing cause from effect based on exogeneity. arXiv preprint arXiv:1504.05651 (2015)

Acknowledgements

We thank the anonymous reviewers for their helpful comments to improve the paper. The work was supported in part by a Grant-in-Aid for Scientific Research (15K12148) from the Japan Society for the Promotion of Science.

Author information

Authors and Affiliations

Graduate School of Information Science and Technology, Hokkaido University, Sapporo, Japan

Jing Song, Satoshi Oyama & Masahito Kurihara

Corresponding author

Correspondence to Jing Song .

Additional information

This paper is a revised and extended version of our conference paper [23].

Appendix: Pairs of variables used in experiment

See Table 4.

About this article

Song, J., Oyama, S. & Kurihara, M. Tell cause from effect: models and evaluation. Int J Data Sci Anal 4, 99–112 (2017). https://doi.org/10.1007/s41060-017-0063-0

Download citation

Received : 20 July 2016

Accepted : 30 June 2017

Published : 21 July 2017

Issue Date : September 2017

DOI : https://doi.org/10.1007/s41060-017-0063-0

  • Causal discovery

From cause and effect to causes and effects

Affiliations.

  • 1 School of Medicine and Public Health, Faculty of Health and Medicine, University of Newcastle, Holgate, New South Wales, Australia.
  • 2 Foundation President, International Society for Systems and Complexity Sciences for Health, Waitsfield, Vermont, USA.
  • 3 Department of Philosophy, Baylor University, Waco, Texas, USA.
  • PMID: 36779244
  • DOI: 10.1111/jep.13814

It is now, at least loosely, acknowledged that most health and clinical outcomes are influenced by different interacting causes. Surprisingly, medical research studies are nearly universally designed to study, usually in a binary way, the effect of a single cause. Recent experiences during the coronavirus disease 2019 pandemic brought to the forefront that most of our challenges in medicine and healthcare deal with systemic, that is, interdependent and interconnected, problems. Understanding these problems defies simplistic dichotomous research methodologies. These insights demand a shift in our thinking from 'cause and effect' to 'causes and effects', since this transcends the classical way of Cartesian reductionist thinking. We require a shift to a 'causes and effects' frame so we can choose the research methodology that reflects the relationships between variables of interest: one-to-one, one-to-many, many-to-one or many-to-many. One-to-one (or cause and effect) relationships are amenable to the traditional randomized control trial design, while all others require systemic designs to understand 'causes and effects'. Researchers urgently need to re-evaluate their science models and embrace research designs that allow an exploration of the clinically obvious multiple 'causes and effects' on health and disease. Clinical examples highlight the application of various systemic research methodologies and demonstrate how 'causes and effects' explain the heterogeneity of clinical outcomes. This shift in scientific thinking will allow us to find the necessary personalized or precise clinical interventions that address the underlying reasons for the variability of clinical outcomes and will contribute to greater health equity.

Keywords: heterogeneity; philosophy of science; reductionism; reductionist thinking; systems thinking.

© 2023 The Authors. Journal of Evaluation in Clinical Practice published by John Wiley & Sons Ltd.

Explanatory Research – Types, Methods, Guide

Explanatory Research

Definition:

Explanatory research is a type of research that aims to uncover the underlying causes and relationships between different variables. It seeks to explain why a particular phenomenon occurs and how it relates to other factors.

This type of research is typically used to test hypotheses or theories and to establish cause-and-effect relationships. Explanatory research often involves collecting data through surveys, experiments, or other empirical methods, and then analyzing that data to identify patterns and correlations. The results of explanatory research can provide a better understanding of the factors that contribute to a particular phenomenon and can help inform future research or policy decisions.

Types of Explanatory Research

There are several types of explanatory research, each with its own approach and focus. Some common types include:

Experimental Research

This involves manipulating one or more variables to observe the effect on other variables. It allows researchers to establish a cause-and-effect relationship between variables and is often used in natural and social sciences.

Quasi-experimental Research

This type of research is similar to experimental research but lacks full control over the variables. It is often used in situations where it is difficult or impossible to manipulate certain variables.

Correlational Research

This type of research aims to identify relationships between variables without manipulating them. It involves measuring and analyzing the strength and direction of the relationship between variables.

Case Study Research

This involves an in-depth investigation of a specific case or situation. It is often used in social sciences and allows researchers to explore complex phenomena and contexts.

Historical Research

This involves the systematic study of past events and situations to understand their causes and effects. It is often used in fields such as history and sociology.

Survey Research

This involves collecting data from a sample of individuals through structured questionnaires or interviews. It allows researchers to investigate attitudes, behaviors, and opinions.

Explanatory Research Methods

There are several methods that can be used in explanatory research, depending on the research question and the type of data being collected. Some common methods include:

Experiments

In experimental research, researchers manipulate one or more variables to observe their effect on other variables. This allows them to establish a cause-and-effect relationship between the variables.

Surveys

Surveys are used to collect data from a sample of individuals through structured questionnaires or interviews. This method can be used to investigate attitudes, behaviors, and opinions.

Correlational studies

This method aims to identify relationships between variables without manipulating them. It involves measuring and analyzing the strength and direction of the relationship between variables.

Case studies

Case studies involve an in-depth investigation of a specific case or situation. This method is often used in social sciences and allows researchers to explore complex phenomena and contexts.

Secondary Data Analysis

This method involves analyzing data that has already been collected by other researchers or organizations. It can be useful when primary data collection is not feasible or when additional data is needed to support research findings.

Data Analysis Methods

Explanatory research data analysis methods are used to explore the relationships between variables and to explain how they interact with each other. Here are some common data analysis methods used in explanatory research:

Correlation Analysis

Correlation analysis is used to identify the strength and direction of the relationship between two or more variables. This method is particularly useful when exploring the relationship between quantitative variables.

Regression Analysis

Regression analysis is used to identify the relationship between a dependent variable and one or more independent variables. This method is particularly useful when exploring the relationship between a dependent variable and several predictor variables.

Path Analysis

Path analysis is a method used to examine the direct and indirect relationships between variables. It is particularly useful when exploring complex relationships between variables.

Structural Equation Modeling (SEM)

SEM is a statistical method used to test and validate theoretical models of the relationships between variables. It is particularly useful when exploring complex models with multiple variables and relationships.

Factor Analysis

Factor analysis is used to identify underlying factors that contribute to the variation in a set of variables. This method is particularly useful when exploring relationships between multiple variables.

Content Analysis

Content analysis is used to analyze qualitative data by identifying themes and patterns in text, images, or other forms of data. This method is particularly useful when exploring the meaning and context of data.
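A very reduced sketch of the counting step in content analysis can be written in plain Python. The responses and the keyword-to-theme codebook below are invented for illustration; a real study would use a validated codebook and human coders:

```python
from collections import Counter
import re

# Hypothetical open-ended survey responses
responses = [
    "The price was too high but the service was great",
    "Great service, though delivery was slow",
    "Slow delivery put me off",
]

# Assumed coding scheme mapping keywords to themes
codebook = {"price": "cost", "high": "cost",
            "service": "service", "great": "service",
            "delivery": "logistics", "slow": "logistics"}

# Count how often each theme is mentioned across all responses
theme_counts = Counter()
for text in responses:
    for word in re.findall(r"[a-z]+", text.lower()):
        if word in codebook:
            theme_counts[codebook[word]] += 1

print(theme_counts.most_common())
```

The resulting frequencies give a first quantitative view of which themes dominate the qualitative data, which can then be interpreted in context.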

Applications of Explanatory Research

The applications of explanatory research include:

  • Social sciences: Explanatory research is commonly used in social sciences to investigate the causes and effects of social phenomena, such as the relationship between poverty and crime, or the impact of social policies on individuals or communities.
  • Marketing: Explanatory research can be used in marketing to understand the reasons behind consumer behavior, such as why certain products are preferred over others or why customers choose to purchase from certain brands.
  • Healthcare: Explanatory research can be used in healthcare to identify the factors that contribute to disease or illness, as well as the effectiveness of different treatments and interventions.
  • Education: Explanatory research can be used in education to investigate the causes of academic achievement or failure, as well as the factors that influence teaching and learning processes.
  • Business: Explanatory research can be used in business to understand the factors that contribute to the success or failure of different strategies, as well as the impact of external factors, such as economic or political changes, on business operations.
  • Public policy: Explanatory research can be used in public policy to evaluate the effectiveness of policies and programs, as well as to identify the factors that contribute to social problems or inequalities.

Explanatory Research Question

An explanatory research question is a type of research question that seeks to explain the relationship between two or more variables, and to identify the underlying causes of that relationship. The goal of explanatory research is to test hypotheses or theories about the relationship between variables, and to gain a deeper understanding of complex phenomena.

Examples of explanatory research questions include:

  • What is the relationship between sleep quality and academic performance among college students, and what factors contribute to this relationship?
  • How do environmental factors, such as temperature and humidity, affect the spread of infectious diseases?
  • What are the factors that contribute to the success or failure of small businesses in a particular industry, and how do these factors interact with each other?
  • How do different teaching strategies impact student engagement and learning outcomes in the classroom?
  • What is the relationship between social support and mental health outcomes among individuals with chronic illnesses, and how does this relationship vary across different populations?

Examples of Explanatory Research

Here are a few real-world examples of explanatory research:

  • Exploring the factors influencing customer loyalty: A business might conduct explanatory research to determine which factors, such as product quality, customer service, or price, have the greatest impact on customer loyalty. This research could involve collecting data through surveys, interviews, or other means and analyzing it using methods such as correlation or regression analysis.
  • Understanding the causes of crime: Law enforcement agencies might conduct explanatory research to identify the factors that contribute to crime in a particular area. This research could involve collecting data on factors such as poverty, unemployment, drug use, and social inequality and analyzing it using methods such as regression analysis or structural equation modeling.
  • Investigating the effectiveness of a new medical treatment: Medical researchers might conduct explanatory research to determine whether a new medical treatment is effective and which variables, such as dosage or patient age, are associated with its effectiveness. This research could involve conducting clinical trials and analyzing data using methods such as path analysis or SEM.
  • Exploring the impact of social media on mental health: Researchers might conduct explanatory research to determine whether social media use has a positive or negative impact on mental health and which variables, such as frequency of use or type of social media, are associated with mental health outcomes. This research could involve collecting data through surveys or interviews and analyzing it using methods such as factor analysis or content analysis.

When to use Explanatory Research

Here are some situations where explanatory research might be appropriate:

  • When exploring a new or complex phenomenon: Explanatory research can be used to understand the mechanisms of a new or complex phenomenon and to identify the variables that are most strongly associated with it.
  • When testing a theoretical model: Explanatory research can be used to test a theoretical model of the relationships between variables and to validate or modify the model based on empirical data.
  • When identifying the causal relationships between variables: Explanatory research can be used to identify the causal relationships between variables and to determine which variables have the greatest impact on the outcome of interest.
  • When conducting program evaluation: Explanatory research can be used to evaluate the effectiveness of a program or intervention and to identify the factors that contribute to its success or failure.
  • When making informed decisions: Explanatory research can be used to provide a basis for informed decision-making in business, government, or other contexts by identifying the factors that contribute to a particular outcome.

How to Conduct Explanatory Research

Here are the steps to conduct explanatory research:

  • Identify the research problem: Clearly define the research question or problem you want to investigate. This should involve identifying the variables that you want to explore, and the potential relationships between them.
  • Conduct a literature review: Review existing research on the topic to gain a deeper understanding of the variables and relationships you plan to explore. This can help you develop a hypothesis or research questions to guide your study.
  • Develop a research design: Decide on the research design that best suits your study. This may involve collecting data through surveys, interviews, experiments, or observations.
  • Collect and analyze data: Collect data from your selected sample and analyze it using appropriate statistical methods to identify any significant relationships between variables.
  • Interpret findings: Interpret the results of your analysis in light of your research question or hypothesis. Identify any patterns or relationships between variables, and discuss the implications of your findings for the wider field of study.
  • Draw conclusions: Draw conclusions based on your analysis and identify any areas for further research. Make recommendations for future research or policy based on your findings.

Purpose of Explanatory Research

The purpose of explanatory research is to identify and explain the relationships between different variables, as well as to determine the causes of those relationships. This type of research is often used to test hypotheses or theories, and to explore complex phenomena that are not well understood.

Explanatory research can help to answer questions such as “why” and “how” by providing a deeper understanding of the underlying causes and mechanisms of a particular phenomenon. For example, explanatory research can be used to determine the factors that contribute to a particular health condition, or to identify the reasons why certain marketing strategies are more effective than others.

The main purpose of explanatory research is to gain a deeper understanding of a particular phenomenon, with the goal of developing more effective solutions or interventions to address the problem. By identifying the underlying causes and mechanisms of a phenomenon, explanatory research can help to inform decision-making, policy development, and best practices in a wide range of fields, including healthcare, social sciences, business, and education.

Advantages of Explanatory Research

Here are some advantages of explanatory research:

  • Provides a deeper understanding: Explanatory research aims to uncover the underlying causes and mechanisms of a particular phenomenon, providing a depth of understanding that is not possible with other research designs.
  • Tests hypotheses or theories: Explanatory research can be used to test hypotheses or theories by identifying the relationships between variables and determining the causes of those relationships.
  • Provides insights for decision-making: Explanatory research can provide insights that can inform decision-making in a wide range of fields, from healthcare to business.
  • Can lead to the development of effective solutions: By identifying the underlying causes of a problem, explanatory research can help to develop more effective solutions or interventions to address the problem.
  • Can improve the validity of research: By identifying and controlling for potential confounding variables, explanatory research can improve the validity and reliability of research findings.
  • Can be used in combination with other research designs: Explanatory research can be used in combination with other research designs, such as exploratory or descriptive research, to provide a more comprehensive understanding of a phenomenon.

Limitations of Explanatory Research

Here are some limitations of explanatory research:

  • Limited generalizability: Explanatory research typically involves studying a specific sample, which can limit the generalizability of findings to other populations or settings.
  • Time-consuming and resource-intensive: Explanatory research can be time-consuming and resource-intensive, particularly if it involves collecting and analyzing large amounts of data.
  • Limited scope: Explanatory research is typically focused on a narrow research question or hypothesis, which can limit its scope in comparison to other research designs such as exploratory or descriptive research.
  • Limited control over variables: Explanatory research can be limited by the researcher’s ability to control for all possible variables that may influence the relationship between variables of interest.
  • Potential for bias: Explanatory research can be subject to various types of bias, such as selection bias, measurement bias, and recall bias, which can influence the validity of research findings.
  • Ethical considerations: Explanatory research may involve the use of invasive or risky procedures, which can raise ethical concerns and require careful consideration of the potential risks and benefits of the study.

About the author

Muhammad Hassan

Researcher, Academic Writer, Web developer


Ch 2: Psychological Research Methods

Children sit in front of a bank of television screens. A sign on the wall says, “Some content may not be suitable for children.”

Have you ever wondered whether the violence you see on television affects your behavior? Are you more likely to behave aggressively in real life after watching people behave violently in dramatic situations on the screen? Or, could seeing fictional violence actually get aggression out of your system, causing you to be more peaceful? How are children influenced by the media they are exposed to? A psychologist interested in the relationship between behavior and exposure to violent images might ask these very questions.

The topic of violence in the media today is contentious. Since ancient times, humans have been concerned about the effects of new technologies on our behaviors and thinking processes. The Greek philosopher Socrates, for example, worried that writing—a new technology at that time—would diminish people’s ability to remember because they could rely on written records rather than committing information to memory. In our world of quickly changing technologies, questions about the effects of media continue to emerge. Is it okay to talk on a cell phone while driving? Are headphones good to use in a car? What impact does text messaging have on reaction time while driving? These are types of questions that psychologist David Strayer asks in his lab.

How can we go about finding answers that are supported not by mere opinion, but by evidence that we can all agree on? The findings of psychological research can help us navigate issues like this.

Introduction to the Scientific Method

Learning Objectives

  • Explain the steps of the scientific method
  • Describe why the scientific method is important to psychology
  • Summarize the processes of informed consent and debriefing
  • Explain how research involving humans or animals is regulated

Scientists are engaged in explaining and understanding how the world around them works, and they are able to do so by coming up with theories that generate hypotheses that are testable and falsifiable. Theories that stand up to their tests are retained and refined, while those that do not are discarded or modified. In this way, research enables scientists to separate fact from simple opinion. Having good information generated from research aids in making wise decisions both in public policy and in our personal lives. In this section, you’ll see how psychologists use the scientific method to study and understand behavior.

The Scientific Process

The goal of all scientists is to better understand the world around them. Psychologists focus their attention on understanding behavior, as well as the cognitive (mental) and physiological (body) processes that underlie behavior. In contrast to other methods that people use to understand the behavior of others, such as intuition and personal experience, the hallmark of scientific research is that there is evidence to support a claim. Scientific knowledge is empirical: It is grounded in objective, tangible evidence that can be observed time and time again, regardless of who is observing.

While behavior is observable, the mind is not. If someone is crying, we can see the behavior. However, the reason for the behavior is more difficult to determine. Is the person crying due to being sad, in pain, or happy? Sometimes we can learn the reason for someone’s behavior by simply asking a question, like “Why are you crying?” However, there are situations in which an individual is either uncomfortable or unwilling to answer the question honestly, or is incapable of answering. For example, infants would not be able to explain why they are crying. In such circumstances, the psychologist must be creative in finding ways to better understand behavior. This module explores how scientific knowledge is generated, and how important that knowledge is in forming decisions in our personal lives and in the public domain.

Process of Scientific Research

Flowchart of the scientific method: make an observation → ask a question → form a hypothesis that answers the question → make a prediction based on the hypothesis → do an experiment to test the prediction → analyze the results → determine whether the hypothesis is supported → report the results.

Scientific knowledge is advanced through a process known as the scientific method. Basically, ideas (in the form of theories and hypotheses) are tested against the real world (in the form of empirical observations), and those empirical observations lead to more ideas that are tested against the real world, and so on.

The basic steps in the scientific method are:

  • Observe a natural phenomenon and define a question about it
  • Make a hypothesis, or potential solution to the question
  • Test the hypothesis
  • If the hypothesis is true, find more evidence or find counter-evidence
  • If the hypothesis is false, create a new hypothesis or try again
  • Draw conclusions and repeat: the scientific method is never-ending, and no result is ever considered perfect

In order to ask an important question that may improve our understanding of the world, a researcher must first observe natural phenomena. By making observations, a researcher can define a useful question. After finding a question to answer, the researcher can then make a prediction (a hypothesis) about what he or she thinks the answer will be. This prediction is usually a statement about the relationship between two or more variables. After making a hypothesis, the researcher will then design an experiment to test his or her hypothesis and evaluate the data gathered. These data will either support or refute the hypothesis. Based on the conclusions drawn from the data, the researcher will then find more evidence to support the hypothesis, look for counter-evidence to further strengthen the hypothesis, revise the hypothesis and create a new experiment, or continue to incorporate the information gathered to answer the research question.

Basic Principles of the Scientific Method

Two key concepts in the scientific approach are theory and hypothesis. A theory is a well-developed set of ideas that propose an explanation for observed phenomena that can be used to make predictions about future observations. A hypothesis is a testable prediction that is arrived at logically from a theory. It is often worded as an if-then statement (e.g., if I study all night, I will get a passing grade on the test). The hypothesis is extremely important because it bridges the gap between the realm of ideas and the real world. As specific hypotheses are tested, theories are modified and refined to reflect and incorporate the result of these tests.

A diagram has four boxes: the top is labeled “theory,” the right is labeled “hypothesis,” the bottom is labeled “research,” and the left is labeled “observation.” Arrows flow in the direction from top to right to bottom to left and back to the top, clockwise. The top right arrow is labeled “use the hypothesis to form a theory,” the bottom right arrow is labeled “design a study to test the hypothesis,” the bottom left arrow is labeled “perform the research,” and the top left arrow is labeled “create or modify the theory.”

Other key components in following the scientific method include verifiability, predictability, falsifiability, and fairness. Verifiability means that an experiment must be replicable by another researcher. To achieve verifiability, researchers must make sure to document their methods and clearly explain how their experiment is structured and why it produces certain results.

Predictability in a scientific theory implies that the theory should enable us to make predictions about future events. The precision of these predictions is a measure of the strength of the theory.

Falsifiability refers to whether a hypothesis can be disproved. For a hypothesis to be falsifiable, it must be logically possible to make an observation or do a physical experiment that would show that there is no support for the hypothesis. Even when a hypothesis cannot be shown to be false, that does not necessarily mean it is not valid. Future testing may disprove the hypothesis. This does not mean that a hypothesis has to be shown to be false, just that it can be tested.

To determine whether a hypothesis is supported or not supported, psychological researchers must conduct hypothesis testing using statistics. Hypothesis testing is a type of statistics that determines the probability of a hypothesis being true or false. If hypothesis testing reveals that results were “statistically significant,” this means that there was support for the hypothesis and that the researchers can be reasonably confident that their result was not due to random chance. If the results are not statistically significant, this means that the researchers’ hypothesis was not supported.
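The logic of hypothesis testing can be sketched with a simple permutation test in Python. The treatment and control scores below are invented for illustration; the p-value estimates how often shuffling the group labels alone would produce a mean difference at least as large as the one actually observed:

```python
import random

# Hypothetical test scores for a treatment group and a control group
treatment = [78, 84, 81, 90, 86, 88]
control = [72, 75, 70, 79, 74, 77]

# Observed difference in group means
observed = sum(treatment) / len(treatment) - sum(control) / len(control)

# Permutation test: repeatedly shuffle the group labels and count how often
# chance alone produces a difference at least as large as the observed one
random.seed(0)  # for reproducibility
pooled = treatment + control
n = len(treatment)
extreme = 0
trials = 10_000
for _ in range(trials):
    random.shuffle(pooled)
    diff = sum(pooled[:n]) / n - sum(pooled[n:]) / n
    if diff >= observed:
        extreme += 1

p_value = extreme / trials
# A small p-value (conventionally below 0.05) is reported as
# "statistically significant": the observed difference is unlikely
# to be due to random chance alone
print(f"observed difference = {observed}, p = {p_value:.4f}")
```

With these data the observed difference is rarely matched by random relabelings, so the result would be called statistically significant; with noisier data the same procedure would yield a large p-value and the hypothesis would not be supported.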

Fairness implies that all data must be considered when evaluating a hypothesis. A researcher cannot pick and choose what data to keep and what to discard or focus specifically on data that support or do not support a particular hypothesis. All data must be accounted for, even if they invalidate the hypothesis.

Applying the Scientific Method

To see how this process works, let’s consider a specific theory and a hypothesis that might be generated from that theory. As you’ll learn in a later module, the James-Lange theory of emotion asserts that emotional experience relies on the physiological arousal associated with the emotional state. If you walked out of your home and discovered a very aggressive snake waiting on your doorstep, your heart would begin to race and your stomach churn. According to the James-Lange theory, these physiological changes would result in your feeling of fear. A hypothesis that could be derived from this theory might be that a person who is unaware of the physiological arousal that the sight of the snake elicits will not feel fear.

Remember that a good scientific hypothesis is falsifiable, or capable of being shown to be incorrect. Recall from the introductory module that Sigmund Freud had lots of interesting ideas to explain various human behaviors (Figure 5). However, a major criticism of Freud’s theories is that many of his ideas are not falsifiable; for example, it is impossible to imagine empirical observations that would disprove the existence of the id, the ego, and the superego—the three elements of personality described in Freud’s theories. Despite this, Freud’s theories are widely taught in introductory psychology texts because of their historical significance for personality psychology and psychotherapy, and these ideas influenced many later forms of therapy.

(a) A photograph shows Freud holding a cigar. (b) The mind's conscious and unconscious states are illustrated as an iceberg floating in water. Beneath the water's surface in the "unconscious" area are the id, ego, and superego. The area just below the water's surface is labeled "preconscious." The area above the water's surface is labeled "conscious."

In contrast, the James-Lange theory does generate falsifiable hypotheses, such as the one described above. Some individuals who suffer significant injuries to their spinal columns are unable to feel the bodily changes that often accompany emotional experiences. Therefore, we could test the hypothesis by determining how emotional experiences differ between individuals who have the ability to detect these changes in their physiological arousal and those who do not. In fact, this research has been conducted and while the emotional experiences of people deprived of an awareness of their physiological arousal may be less intense, they still experience emotion (Chwalisz, Diener, & Gallagher, 1988).

Link to Learning

Why the scientific method is important for psychology.

The use of the scientific method is one of the main features that separates modern psychology from earlier philosophical inquiries about the mind. Compared to chemistry, physics, and other “natural sciences,” psychology has long been considered one of the “social sciences” because of the subjective nature of the things it seeks to study. Many of the concepts that psychologists are interested in—such as aspects of the human mind, behavior, and emotions—are subjective and cannot be directly measured. Psychologists often rely instead on behavioral observations and self-reported data, which are considered by some to be illegitimate or lacking in methodological rigor. Applying the scientific method to psychology, therefore, helps to standardize the approach to understanding its very different types of information.

The scientific method allows psychological data to be replicated and confirmed in many instances, under different circumstances, and by a variety of researchers. Through replication of experiments, new generations of psychologists can reduce errors and broaden the applicability of theories. It also allows theories to be tested and validated instead of simply being conjectures that could never be verified or falsified. All of this allows psychologists to gain a stronger understanding of how the human mind works.

Scientific articles published in journals and psychology papers written in the style of the American Psychological Association (i.e., in "APA style") are structured around the scientific method. These papers include an Introduction, which introduces the background information and outlines the hypotheses; a Methods section, which outlines the specifics of how the experiment was conducted to test the hypothesis; a Results section, which includes the statistics that tested the hypothesis and states whether it was supported or not supported; and a Discussion and Conclusion, which state the implications of finding support, or no support, for the hypothesis. Writing articles and papers that adhere to the scientific method makes it easy for future researchers to repeat the study and attempt to replicate the results.

Ethics in Research

Today, scientists agree that good research is ethical in nature and is guided by a basic respect for human dignity and safety. However, as you will read in the Dig Deeper feature on the Tuskegee Syphilis Study, this has not always been the case. Modern researchers must demonstrate that the research they perform is ethically sound. This section presents how ethical considerations affect the design and implementation of research conducted today.

Research Involving Human Participants

Any experiment involving the participation of human subjects is governed by extensive, strict guidelines designed to ensure that the experiment does not result in harm. Any research institution that receives federal support for research involving human participants must have access to an institutional review board (IRB). The IRB is a committee of individuals often made up of members of the institution's administration, scientists, and community members (Figure 6). The purpose of the IRB is to review proposals for research that involves human participants. The IRB reviews these proposals with the principles mentioned above in mind, and generally, approval from the IRB is required in order for the experiment to proceed.

A photograph shows a group of people seated around tables in a meeting room.

An institution’s IRB requires several components in any experiment it approves. For one, each participant must sign an informed consent form before they can participate in the experiment. An informed consent  form provides a written description of what participants can expect during the experiment, including potential risks and implications of the research. It also lets participants know that their involvement is completely voluntary and can be discontinued without penalty at any time. Furthermore, the informed consent guarantees that any data collected in the experiment will remain completely confidential. In cases where research participants are under the age of 18, the parents or legal guardians are required to sign the informed consent form.

While the informed consent form should be as honest as possible in describing exactly what participants will be doing, sometimes deception is necessary to prevent participants’ knowledge of the exact research question from affecting the results of the study. Deception involves purposely misleading experiment participants in order to maintain the integrity of the experiment, but not to the point where the deception could be considered harmful. For example, if we are interested in how our opinion of someone is affected by their attire, we might use deception in describing the experiment to prevent that knowledge from affecting participants’ responses. In cases where deception is involved, participants must receive a full debriefing  upon conclusion of the study—complete, honest information about the purpose of the experiment, how the data collected will be used, the reasons why deception was necessary, and information about how to obtain additional information about the study.

Dig Deeper: Ethics and the Tuskegee Syphilis Study

Unfortunately, the ethical guidelines that exist for research today were not always applied in the past. In 1932, poor, rural, black, male sharecroppers from Tuskegee, Alabama, were recruited to participate in an experiment conducted by the U.S. Public Health Service, with the aim of studying syphilis in black men (Figure 7). In exchange for free medical care, meals, and burial insurance, 600 men agreed to participate in the study. A little more than half of the men tested positive for syphilis, and they served as the experimental group (given that the researchers could not randomly assign participants to groups, this represents a quasi-experiment). The remaining syphilis-free individuals served as the control group. However, those individuals that tested positive for syphilis were never informed that they had the disease.

While there was no treatment for syphilis when the study began, by 1947 penicillin was recognized as an effective treatment for the disease. Despite this, no penicillin was administered to the participants in this study, and the participants were not allowed to seek treatment at any other facilities if they continued in the study. Over the course of 40 years, many of the participants unknowingly spread syphilis to their wives (and subsequently their children born from their wives) and eventually died because they never received treatment for the disease. This study was discontinued in 1972 when the experiment was discovered by the national press (Tuskegee University, n.d.). The resulting outrage over the experiment led directly to the National Research Act of 1974 and the strict ethical guidelines for research on humans described in this chapter. Why is this study unethical? How were the men who participated and their families harmed as a function of this research?

A photograph shows a person administering an injection.

Learn more about the Tuskegee Syphilis Study on the CDC website.

Research Involving Animal Subjects

A photograph shows a rat.

Many psychological studies rely on animal subjects, often because a given research question would be impractical or unethical to pursue with human participants. This does not mean that animal researchers are immune to ethical concerns. Indeed, the humane and ethical treatment of animal research subjects is a critical aspect of this type of research. Researchers must design their experiments to minimize any pain or distress experienced by animals serving as research subjects.

Whereas IRBs review research proposals that involve human participants, animal experimental proposals are reviewed by an Institutional Animal Care and Use Committee (IACUC). An IACUC consists of institutional administrators, scientists, veterinarians, and community members. This committee is charged with ensuring that all experimental proposals provide for the humane treatment of animal research subjects. It also conducts semi-annual inspections of all animal facilities to ensure that the research protocols are being followed. No animal research project can proceed without the committee's approval.

Introduction to Approaches to Research

  • Differentiate between descriptive, correlational, and experimental research
  • Explain the strengths and weaknesses of case studies, naturalistic observation, and surveys
  • Describe the strengths and weaknesses of archival research
  • Compare longitudinal and cross-sectional approaches to research
  • Explain what a correlation coefficient tells us about the relationship between variables
  • Describe why correlation does not mean causation
  • Describe the experimental process, including ways to control for bias
  • Identify and differentiate between independent and dependent variables

Three researchers review data while talking around a microscope.

Psychologists use descriptive, experimental, and correlational methods to conduct research. Descriptive, or qualitative, methods include the case study, naturalistic observation, surveys, archival research, longitudinal research, and cross-sectional research.

Experiments are conducted in order to determine cause-and-effect relationships. In ideal experimental design, the only difference between the experimental and control groups is whether participants are exposed to the experimental manipulation. Each group goes through all phases of the experiment, but each group will experience a different level of the independent variable: the experimental group is exposed to the experimental manipulation, and the control group is not exposed to the experimental manipulation. The researcher then measures the changes that are produced in the dependent variable in each group. Once data is collected from both groups, it is analyzed statistically to determine if there are meaningful differences between the groups.
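The design described above can be sketched in code. The following Python simulation uses invented participants and an invented effect size; it is meant only to show random assignment, the two levels of the independent variable, and measurement of the dependent variable in each group.

```python
import random
from statistics import mean

random.seed(1)

# 40 hypothetical participants, identified by number.
participants = list(range(40))

# Random assignment: shuffle the pool, then split into two equal groups.
random.shuffle(participants)
experimental_group = participants[:20]  # exposed to the manipulation
control_group = participants[20:]       # not exposed

def measure_dependent_variable(exposed: bool) -> float:
    """Simulated outcome score; the manipulation adds a made-up boost of 5."""
    baseline = random.gauss(50, 10)
    return baseline + (5 if exposed else 0)

experimental_scores = [measure_dependent_variable(True) for _ in experimental_group]
control_scores = [measure_dependent_variable(False) for _ in control_group]

# The researcher then compares the dependent variable across groups;
# in a real study this difference would be tested statistically.
difference = mean(experimental_scores) - mean(control_scores)
print(f"Mean difference between groups: {difference:.2f}")
```

Because assignment is random, any pre-existing differences among participants should, on average, be spread evenly across the two conditions.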

When scientists passively observe and measure phenomena it is called correlational research. Here, psychologists do not intervene and change behavior, as they do in experiments. In correlational research, they identify patterns of relationships, but usually cannot infer what causes what. Importantly, a correlation describes the relationship between exactly two variables at a time; a study may measure many variables, but each correlation coefficient captures the association between a single pair of them.
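As a rough illustration, a correlation coefficient (discussed further in the learning objectives above) can be computed for one pair of variables. The data below, hours of sleep and a mood rating for ten hypothetical people, are invented for the example.

```python
from statistics import mean

# Hypothetical data: hours of sleep and a 1-10 mood rating for ten people.
sleep = [5, 6, 6, 7, 7, 7, 8, 8, 9, 9]
mood  = [4, 5, 6, 5, 7, 6, 8, 7, 8, 9]

def pearson_r(x, y):
    """Pearson correlation coefficient between two variables."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5

r = pearson_r(sleep, mood)
print(f"r = {r:.2f}")  # positive r: more sleep goes with higher mood ratings

# Even a strong r says nothing about which variable, if either,
# causes the other: correlation is not causation.
```

Here r comes out strongly positive, but the data alone cannot tell us whether sleep improves mood, mood improves sleep, or some third variable drives both.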

Watch It: More on Research

If you enjoy learning through lectures and want an interesting and comprehensive summary of this section, then click on the YouTube link to watch a lecture given by MIT Professor John Gabrieli. Start at the 30:45 mark and watch through the end to hear examples of actual psychological studies and how they were analyzed. Listen for references to independent and dependent variables, experimenter bias, and double-blind studies. In the lecture, you'll learn about breaking social norms, "WEIRD" research, why expectations matter, how a warm cup of coffee might make you nicer, why you should change your answer on a multiple choice test, and why praise for intelligence won't make you any smarter.

You can view the transcript for "Lec 2 | MIT 9.00SC Introduction to Psychology, Spring 2011" here.

Descriptive Research

There are many research methods available to psychologists in their efforts to understand, describe, and explain behavior and the cognitive and biological processes that underlie it. Some methods rely on observational techniques. Other approaches involve interactions between the researcher and the individuals being studied, ranging from a series of simple questions to extensive, in-depth interviews. Still others rely on well-controlled experiments.

The three main categories of psychological research are descriptive, correlational, and experimental research. Research studies that do not test specific relationships between variables are called descriptive, or qualitative, studies. These studies are used to describe general or specific behaviors and attributes that are observed and measured. In the early stages of research it might be difficult to form a hypothesis, especially when there is not any existing literature in the area. In these situations designing an experiment would be premature, as the question of interest is not yet clearly defined as a hypothesis. Often a researcher will begin with a non-experimental approach, such as a descriptive study, to gather more information about the topic before designing an experiment or correlational study to address a specific hypothesis. Descriptive research is distinct from correlational research, in which psychologists formally test whether a relationship exists between two or more variables. Experimental research goes a step further beyond descriptive and correlational research and randomly assigns people to different conditions, using hypothesis testing to make inferences about how these conditions affect behavior. It aims to determine if one variable directly impacts and causes another. Correlational and experimental research both typically use hypothesis testing, whereas descriptive research does not.

Each of these research methods has unique strengths and weaknesses, and each method may only be appropriate for certain types of research questions. For example, studies that rely primarily on observation produce incredible amounts of information, but the ability to apply this information to the larger population is somewhat limited because of small sample sizes. Survey research, on the other hand, allows researchers to easily collect data from relatively large samples. While this allows for results to be generalized to the larger population more easily, the information that can be collected on any given survey is somewhat limited and subject to problems associated with any type of self-reported data. Some researchers conduct archival research by using existing records. While this can be a fairly inexpensive way to collect data that can provide insight into a number of research questions, researchers using this approach have no control over how or what kind of data were collected.

Correlational research can find a relationship between two variables, but the only way a researcher can claim that the relationship between the variables is cause and effect is to perform an experiment. In experimental research, which will be discussed later in the text, there is a tremendous amount of control over variables of interest. While this is a powerful approach, experiments are often conducted in very artificial settings. This calls into question the validity of experimental findings with regard to how they would apply in real-world settings. In addition, many of the questions that psychologists would like to answer cannot be pursued through experimental research because of ethical concerns.

The three main types of descriptive studies are naturalistic observation, case studies, and surveys.

Naturalistic Observation

If you want to understand how behavior occurs, one of the best ways to gain information is to simply observe the behavior in its natural context. However, people might change their behavior in unexpected ways if they know they are being observed. How do researchers obtain accurate information when people tend to hide their natural behavior? As an example, imagine that your professor asks everyone in your class to raise their hand if they always wash their hands after using the restroom. Chances are that almost everyone in the classroom will raise their hand, but do you think hand washing after every trip to the restroom is really that universal?

This is very similar to the phenomenon mentioned earlier in this module: many individuals do not feel comfortable answering a question honestly. But if we are committed to finding out the facts about hand washing, we have other options available to us.

Suppose we send a classmate into the restroom to actually watch whether everyone washes their hands after using the restroom. Will our observer blend into the restroom environment by wearing a white lab coat, sitting with a clipboard, and staring at the sinks? We want our researcher to be inconspicuous—perhaps standing at one of the sinks pretending to put in contact lenses while secretly recording the relevant information. This type of observational study is called naturalistic observation: observing behavior in its natural setting. To better understand peer exclusion, Suzanne Fanger collaborated with colleagues at the University of Texas to observe the behavior of preschool children on a playground. How did the observers remain inconspicuous over the duration of the study? They equipped a few of the children with wireless microphones (which the children quickly forgot about) and observed while taking notes from a distance. Also, the children in that particular preschool (a "laboratory preschool") were accustomed to having observers on the playground (Fanger, Frankel, & Hazen, 2012).

A photograph shows two police cars driving, one with its lights flashing.

It is critical that the observer be as unobtrusive and as inconspicuous as possible: when people know they are being watched, they are less likely to behave naturally. If you have any doubt about this, ask yourself how your driving behavior might differ in two situations: In the first situation, you are driving down a deserted highway during the middle of the day; in the second situation, you are being followed by a police car down the same deserted highway (Figure 9).

It should be pointed out that naturalistic observation is not limited to research involving humans. Indeed, some of the best-known examples of naturalistic observation involve researchers going into the field to observe various kinds of animals in their own environments. As with human studies, the researchers maintain their distance and avoid interfering with the animal subjects so as not to influence their natural behaviors. Scientists have used this technique to study social hierarchies and interactions among animals ranging from ground squirrels to gorillas. The information provided by these studies is invaluable in understanding how those animals organize socially and communicate with one another. The primatologist Jane Goodall, for example, spent nearly five decades observing the behavior of chimpanzees in Africa (Figure 10). As an illustration of the types of concerns that a researcher might encounter in naturalistic observation, some scientists criticized Goodall for giving the chimps names instead of referring to them by numbers—using names was thought to undermine the emotional detachment required for the objectivity of the study (McKie, 2010).

(a) A photograph shows Jane Goodall speaking from a lectern. (b) A photograph shows a chimpanzee’s face.

The greatest benefit of naturalistic observation is the validity, or accuracy, of information collected unobtrusively in a natural setting. Having individuals behave as they normally would in a given situation means that we have a higher degree of ecological validity, or realism, than we might achieve with other research approaches. Therefore, our ability to generalize  the findings of the research to real-world situations is enhanced. If done correctly, we need not worry about people or animals modifying their behavior simply because they are being observed. Sometimes, people may assume that reality programs give us a glimpse into authentic human behavior. However, the principle of inconspicuous observation is violated as reality stars are followed by camera crews and are interviewed on camera for personal confessionals. Given that environment, we must doubt how natural and realistic their behaviors are.

The major downside of naturalistic observation is that such studies are often difficult to set up and control. In our restroom study, what if you stood in the restroom all day prepared to record people's hand washing behavior and no one came in? Or, what if you have been closely observing a troop of gorillas for weeks only to find that they migrated to a new place while you were sleeping in your tent? The benefit of realistic data comes at a cost. As a researcher you have no control of when (or if) you have behavior to observe. In addition, this type of observational research often requires significant investments of time, money, and a good dose of luck.

Sometimes studies involve structured observation. In these cases, people are observed while engaging in set, specific tasks. An excellent example of structured observation comes from the Strange Situation procedure developed by Mary Ainsworth (you will read more about this in the module on lifespan development). The Strange Situation is a procedure used to evaluate attachment styles that exist between an infant and caregiver. In this scenario, caregivers bring their infants into a room filled with toys. The Strange Situation involves a number of phases, including a stranger coming into the room, the caregiver leaving the room, and the caregiver's return to the room. The infant's behavior is closely monitored at each phase, but it is the behavior of the infant upon being reunited with the caregiver that is most telling in terms of characterizing the infant's attachment style with the caregiver.

Another potential problem in observational research is observer bias. Generally, people who act as observers are closely involved in the research project and may unconsciously skew their observations to fit their research goals or expectations. To protect against this type of bias, researchers should have clear criteria established for the types of behaviors recorded and how those behaviors should be classified. In addition, researchers often compare observations of the same event by multiple observers, in order to test inter-rater reliability: a measure of reliability that assesses the consistency of observations by different observers.
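For illustration, both simple percent agreement and Cohen's kappa (a common inter-rater reliability statistic that corrects for agreement expected by chance) can be computed with a few lines of Python. The observer codings below are invented.

```python
# Two hypothetical observers each classified the same 10 playground
# behaviors as "inclusion" (I) or "exclusion" (E).
rater_a = ["I", "I", "E", "I", "E", "E", "I", "I", "E", "I"]
rater_b = ["I", "I", "E", "E", "E", "E", "I", "I", "E", "I"]

# Simplest index: the proportion of items on which the raters agree.
matches = sum(a == b for a, b in zip(rater_a, rater_b))
percent_agreement = matches / len(rater_a)

def cohens_kappa(x, y):
    """Cohen's kappa: agreement corrected for chance agreement."""
    n = len(x)
    observed = sum(a == b for a, b in zip(x, y)) / n
    categories = set(x) | set(y)
    # Chance agreement: how often two raters with these category rates
    # would match if they coded independently at random.
    expected = sum((x.count(c) / n) * (y.count(c) / n) for c in categories)
    return (observed - expected) / (1 - expected)

kappa = cohens_kappa(rater_a, rater_b)
print(f"Agreement: {percent_agreement:.0%}, kappa: {kappa:.2f}")
# -> Agreement: 90%, kappa: 0.80
```

Kappa is lower than raw agreement because some matches would occur even if the observers were guessing; values near 1 indicate highly consistent coding.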

Case Studies

In 2011, the New York Times published a feature story on Krista and Tatiana Hogan, Canadian twin girls. These particular twins are unique because Krista and Tatiana are conjoined twins, connected at the head. There is evidence that the two girls are connected in a part of the brain called the thalamus, which is a major sensory relay center. Most incoming sensory information is sent through the thalamus before reaching higher regions of the cerebral cortex for processing.

The implications of this potential connection mean that it might be possible for one twin to experience the sensations of the other twin. For instance, if Krista is watching a particularly funny television program, Tatiana might smile or laugh even if she is not watching the program. This particular possibility has piqued the interest of many neuroscientists who seek to understand how the brain uses sensory information.

These twins represent an enormous resource in the study of the brain, and since their condition is very rare, it is likely that as long as their family agrees, scientists will follow these girls very closely throughout their lives to gain as much information as possible (Dominus, 2011).

In observational research, scientists are conducting a clinical or case study when they focus on one person or just a few individuals. Indeed, some scientists spend their entire careers studying just 10–20 individuals. Why would they do this? Obviously, when they focus their attention on a very small number of people, they can gain a tremendous amount of insight into those cases. The richness of information that is collected in clinical or case studies is unmatched by any other single research method. This allows the researcher to have a very deep understanding of the individuals and the particular phenomenon being studied.

If clinical or case studies provide so much information, why are they not more frequent among researchers? As it turns out, the major benefit of this particular approach is also a weakness. As mentioned earlier, this approach is often used when studying individuals who are interesting to researchers because they have a rare characteristic. Therefore, the individuals who serve as the focus of case studies are not like most other people. If scientists ultimately want to explain all behavior, focusing attention on such a special group of people can make it difficult to generalize any observations to the larger population as a whole. Generalizing refers to the ability to apply the findings of a particular research project to larger segments of society. Again, case studies provide enormous amounts of information, but since the cases are so specific, the potential to apply what’s learned to the average person may be very limited.

Surveys

Often, psychologists develop surveys as a means of gathering data. Surveys are lists of questions to be answered by research participants, and can be delivered as paper-and-pencil questionnaires, administered electronically, or conducted verbally (Figure 11). Generally, the survey itself can be completed in a short time, and the ease of administering a survey makes it easy to collect data from a large number of people.

Surveys allow researchers to gather data from larger samples than may be afforded by other research methods. A sample is a subset of individuals selected from a population, which is the overall group of individuals that the researchers are interested in. Researchers study the sample and seek to generalize their findings to the population.

A sample online survey reads, “Dear visitor, your opinion is important to us. We would like to invite you to participate in a short survey to gather your opinions and feedback on your news consumption habits. The survey will take approximately 10-15 minutes. Simply click the “Yes” button below to launch the survey. Would you like to participate?” Two buttons are labeled “yes” and “no.”

Surveys have both strengths and weaknesses in comparison to case studies. By using surveys, we can collect information from a larger sample of people. A larger sample is better able to reflect the actual diversity of the population, thus allowing better generalizability. Therefore, if our sample is sufficiently large and diverse, we can assume that the data we collect from the survey can be generalized to the larger population with more certainty than the information collected through a case study. However, given the greater number of people involved, we are not able to collect the same depth of information on each person that would be collected in a case study.
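The idea that larger samples generalize better can be demonstrated with a small simulation. The "population" below is invented; the point is only that estimates from larger random samples tend to land closer to the true population value.

```python
import random
from statistics import mean

random.seed(2)

# A made-up population of 100,000 people with some survey measure
# (say, hours of news consumed per week).
population = [random.gauss(10, 3) for _ in range(100_000)]
population_mean = mean(population)

# Draw progressively larger random samples and see how far each
# sample's estimate falls from the true population mean.
for n in (10, 100, 10_000):
    sample = random.sample(population, n)
    error = abs(mean(sample) - population_mean)
    print(f"n={n:>6}: sample mean off by {error:.3f}")
```

In repeated runs, the small samples bounce around the true value much more than the large one, which is why survey researchers prize large, diverse samples.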

Another potential weakness of surveys is something we touched on earlier in this chapter: people don’t always give accurate responses. They may lie, misremember, or answer questions in a way that they think makes them look good. For example, people may report drinking less alcohol than is actually the case.

Any number of research questions can be answered through the use of surveys. One real-world example is the research conducted by Jenkins, Ruppel, Kizer, Yehl, and Griffin (2012) about the backlash against the US Arab-American community following the terrorist attacks of September 11, 2001. Jenkins and colleagues wanted to determine to what extent these negative attitudes toward Arab-Americans still existed nearly a decade after the attacks occurred. In one study, 140 research participants filled out a survey with 10 questions, including questions asking directly about the participant’s overt prejudicial attitudes toward people of various ethnicities. The survey also asked indirect questions about how likely the participant would be to interact with a person of a given ethnicity in a variety of settings (such as, “How likely do you think it is that you would introduce yourself to a person of Arab-American descent?”). The results of the research suggested that participants were unwilling to report prejudicial attitudes toward any ethnic group. However, there were significant differences between their pattern of responses to questions about social interaction with Arab-Americans compared to other ethnic groups: they indicated less willingness for social interaction with Arab-Americans compared to the other ethnic groups. This suggested that the participants harbored subtle forms of prejudice against Arab-Americans, despite their assertions that this was not the case (Jenkins et al., 2012).

Archival Research

(a) A photograph shows stacks of paper files on shelves. (b) A photograph shows a computer.

In archival research, a researcher analyzes existing records, such as census data, court documents, or patient files, rather than collecting new data directly. In comparing archival research to other research methods, there are several important distinctions. For one, the researcher employing archival research never directly interacts with research participants. Therefore, the investment of time and money to collect data is considerably less with archival research. Additionally, researchers have no control over what information was originally collected. Therefore, research questions have to be tailored so they can be answered within the structure of the existing data sets. There is also no guarantee of consistency between the records from one source to another, which might make comparing and contrasting different data sets problematic.

Longitudinal and Cross-Sectional Research

Sometimes we want to see how people change over time, as in studies of human development and lifespan. When we test the same group of individuals repeatedly over an extended period of time, we are conducting longitudinal research. For example, we may survey a group of individuals about their dietary habits at age 20, retest them a decade later at age 30, and then again at age 40.

Another approach is cross-sectional research. In cross-sectional research, a researcher compares multiple segments of the population at the same time. Using the dietary habits example above, the researcher might directly compare different groups of people by age. Instead of observing a group of people for 20 years to see how their dietary habits changed from decade to decade, the researcher would study a group of 20-year-old individuals and compare them to a group of 30-year-old individuals and a group of 40-year-old individuals. While cross-sectional research requires a shorter-term investment, it is also limited by differences that exist between the different generations (or cohorts): differences that have nothing to do with age per se, but rather reflect the social and cultural experiences that make different generations of individuals different from one another.
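The contrast between the two designs also shows up in the shape of the data a researcher collects. The sketch below uses invented records: longitudinal data repeat the same participant IDs across measurement waves, while cross-sectional data contain different people, one cohort per age group, each measured once.

```python
# Longitudinal: the SAME participants measured at several ages.
# Values are hypothetical diet-quality scores.
longitudinal = {
    "p01": {20: 6.5, 30: 5.8, 40: 5.2},
    "p02": {20: 7.1, 30: 7.0, 40: 6.4},
}

# Cross-sectional: DIFFERENT participants, each measured once.
cross_sectional = [
    {"id": "a10", "age": 20, "score": 6.9},
    {"id": "b22", "age": 30, "score": 6.1},
    {"id": "c35", "age": 40, "score": 5.5},
]

# Longitudinal change is computed within the same person...
change_p01 = longitudinal["p01"][40] - longitudinal["p01"][20]
print(f"Participant p01 changed by {change_p01:.1f} points over 20 years")

# ...while a cross-sectional "change" compares different people, so any
# cohort differences are mixed in with a true age effect.
ages = {row["id"]: row["age"] for row in cross_sectional}
assert len(set(ages.values())) == 3  # three cohorts, one person each
```

Because the longitudinal comparison stays within one person, cohort effects drop out of it, which is exactly the advantage described above.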

To illustrate this concept, consider the following survey findings. In recent years there has been significant growth in the popular support of same-sex marriage. Many studies on this topic break down survey participants into different age groups. In general, younger people are more supportive of same-sex marriage than are those who are older (Jones, 2013). Does this mean that as we age we become less open to the idea of same-sex marriage, or does this mean that older individuals have different perspectives because of the social climates in which they grew up? Longitudinal research is a powerful approach because the same individuals are involved in the research project over time, which means that the researchers need to be less concerned with differences among cohorts affecting the results of their study.

Often longitudinal studies are employed when researching various diseases in an effort to understand particular risk factors. Such studies often involve tens of thousands of individuals who are followed for several decades. Given the enormous number of people involved in these studies, researchers can feel confident that their findings can be generalized to the larger population. The Cancer Prevention Study-3 (CPS-3) is one of a series of longitudinal studies sponsored by the American Cancer Society aimed at determining predictive risk factors associated with cancer. When participants enter the study, they complete a survey about their lives and family histories, providing information on factors that might cause or prevent the development of cancer. Then every few years the participants receive additional surveys to complete. In the end, hundreds of thousands of participants will be tracked over 20 years to determine which of them develop cancer and which do not.

Clearly, this type of research is important and potentially very informative. For instance, earlier longitudinal studies sponsored by the American Cancer Society provided some of the first scientific demonstrations of the now well-established links between increased rates of cancer and smoking (American Cancer Society, n.d.) (Figure 13).

A photograph shows pack of cigarettes and cigarettes in an ashtray. The pack of cigarettes reads, “Surgeon general’s warning: smoking causes lung cancer, heart disease, emphysema, and may complicate pregnancy.”

As with any research strategy, longitudinal research is not without limitations. For one, these studies require an incredible time investment by the researcher and research participants. Given that some longitudinal studies take years, if not decades, to complete, the results will not be known for a considerable period of time. In addition to the time demands, these studies also require a substantial financial investment. Many researchers are unable to commit the resources necessary to see a longitudinal project through to the end.

Research participants must also be willing to continue their participation for an extended period of time, and this can be problematic. People move, get married and take new names, get ill, and eventually die. Even without significant life changes, some people may simply choose to discontinue their participation in the project. As a result, attrition rates, or reductions in the number of research participants due to dropouts, are quite high in longitudinal studies and increase over the course of a project. For this reason, researchers using this approach typically recruit many participants, fully expecting that a substantial number will drop out before the end. As the study progresses, they continually check whether the sample still represents the larger population and make adjustments as necessary.

Correlational Research

Did you know that as ice cream sales increase, so does the overall rate of crime? Is it possible that indulging in your favorite flavor of ice cream could send you on a crime spree? Or, after committing a crime, do you think you might decide to treat yourself to a cone? There is no question that a relationship exists between ice cream and crime (e.g., Harper, 2013), but it would be pretty foolish to decide that one thing actually caused the other to occur.

It is much more likely that both ice cream sales and crime rates are related to the temperature outside. When the temperature is warm, there are lots of people out of their houses, interacting with each other, getting annoyed with one another, and sometimes committing crimes. Also, when it is warm outside, we are more likely to seek a cool treat like ice cream. How do we determine if there is indeed a relationship between two things? And when there is a relationship, how can we discern whether it is attributable to coincidence or causation?

Three scatterplots are shown. Scatterplot (a) is labeled “positive correlation” and shows scattered dots forming a rough line from the bottom left to the top right; the x-axis is labeled “weight” and the y-axis is labeled “height.” Scatterplot (b) is labeled “negative correlation” and shows scattered dots forming a rough line from the top left to the bottom right; the x-axis is labeled “tiredness” and the y-axis is labeled “hours of sleep.” Scatterplot (c) is labeled “no correlation” and shows scattered dots having no pattern; the x-axis is labeled “shoe size” and the y-axis is labeled “hours of sleep.”

Correlation Does Not Indicate Causation

Correlational research is useful because it allows us to discover the strength and direction of relationships that exist between two variables. However, correlation is limited because establishing the existence of a relationship tells us little about cause and effect. While variables are sometimes correlated because one does cause the other, it could also be that some other factor, a confounding variable, is actually causing the systematic movement in our variables of interest. In the ice cream/crime rate example mentioned earlier, temperature is a confounding variable that could account for the relationship between the two variables.
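The way a confounding variable can manufacture a correlation can be made concrete with a small simulation. In the sketch below (all numbers are hypothetical and chosen only for illustration), temperature drives both ice cream sales and crime; the two outcomes end up strongly correlated even though neither causes the other.

```python
import random

random.seed(42)

def pearson_r(x, y):
    """Pearson correlation coefficient, computed from scratch."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

# Hypothetical daily data: temperature drives BOTH outcomes.
temps = [random.uniform(0, 35) for _ in range(365)]           # degrees Celsius
ice_cream = [10 + 2 * t + random.gauss(0, 5) for t in temps]  # cones sold
crime = [5 + 0.5 * t + random.gauss(0, 3) for t in temps]     # incidents

# Ice cream and crime correlate strongly even though neither causes the
# other; the confounding variable (temperature) produces the relationship.
print(round(pearson_r(ice_cream, crime), 2))
```

Note that the code never links ice cream to crime directly; the strong correlation emerges purely because both variables depend on temperature.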

Even when we cannot point to clear confounding variables, we should not assume that a correlation between two variables implies that one variable causes changes in another. This can be frustrating when a cause-and-effect relationship seems clear and intuitive. Think back to our discussion of the research done by the American Cancer Society and how their research projects were some of the first demonstrations of the link between smoking and cancer. It seems reasonable to assume that smoking causes cancer, but if we were limited to correlational research, we would be overstepping our bounds by making this assumption.

A photograph shows a bowl of cereal.

Unfortunately, people mistakenly make claims of causation as a function of correlations all the time. Such claims are especially common in advertisements and news stories. For example, recent research found that people who eat cereal on a regular basis achieve healthier weights than those who rarely eat cereal (Frantzen, Treviño, Echon, Garcia-Dominic, & DiMarco, 2013; Barton et al., 2005). Guess how the cereal companies report this finding. Does eating cereal really cause an individual to maintain a healthy weight, or are there other possible explanations? For example, someone at a healthy weight may be more likely to eat a healthy breakfast regularly than someone who is obese or who skips meals in an attempt to diet (Figure 15). While correlational research is invaluable in identifying relationships among variables, a major limitation is the inability to establish causality. Psychologists want to make statements about cause and effect, but the only way to do that is to conduct an experiment to answer a research question. The next section describes how scientific experiments incorporate methods that eliminate, or control for, alternative explanations, which allow researchers to explore how changes in one variable cause changes in another variable.

Watch this clip from Freakonomics for an example of how correlation does not indicate causation.

You can view the transcript for “Correlation vs. Causality: Freakonomics Movie” here (opens in new window) .

Illusory Correlations

The temptation to make erroneous cause-and-effect statements based on correlational research is not the only way we tend to misinterpret data. We also tend to make the mistake of illusory correlations, especially with unsystematic observations. Illusory correlations, or false correlations, occur when people believe that relationships exist between two things when no such relationship exists. One well-known illusory correlation is the supposed effect that the moon’s phases have on human behavior. Many people passionately assert that human behavior is affected by the phase of the moon, and specifically, that people act strangely when the moon is full (Figure 16).

A photograph shows the moon.

There is no denying that the moon exerts a powerful influence on our planet. The ebb and flow of the ocean’s tides are tightly tied to the gravitational forces of the moon. Many people believe, therefore, that it is logical that we are affected by the moon as well. After all, our bodies are largely made up of water. A meta-analysis of nearly 40 studies consistently demonstrated, however, that the relationship between the moon and our behavior does not exist (Rotton & Kelly, 1985). While we may pay more attention to odd behavior during the full phase of the moon, the rates of odd behavior remain constant throughout the lunar cycle.

Why are we so apt to believe in illusory correlations like this? Often we read or hear about them and simply accept the information as valid. Or, we have a hunch about how something works and then look for evidence to support that hunch, ignoring evidence that would tell us our hunch is false; this is known as confirmation bias . Other times, we find illusory correlations based on the information that comes most easily to mind, even if that information is severely limited. And while we may feel confident that we can use these relationships to better understand and predict the world around us, illusory correlations can have significant drawbacks. For example, research suggests that illusory correlations—in which certain behaviors are inaccurately attributed to certain groups—are involved in the formation of prejudicial attitudes that can ultimately lead to discriminatory behavior (Fiedler, 2004).

We all have a tendency to make illusory correlations from time to time. Try to think of an illusory correlation that is held by you, a family member, or a close friend. How do you think this illusory correlation came about, and what can be done in the future to combat it?

Experiments


In order to conduct an experiment, a researcher must have a specific hypothesis to be tested. As you’ve learned, hypotheses can be formulated either through direct observation of the real world or after careful review of previous research. For example, if you think that children should not be allowed to watch violent programming on television because doing so would cause them to behave more violently, then you have basically formulated a hypothesis—namely, that watching violent television programs causes children to behave more violently. How might you have arrived at this particular hypothesis? You may have younger relatives who watch cartoons featuring characters using martial arts to save the world from evildoers, with an impressive array of punching, kicking, and defensive postures. You notice that after watching these programs for a while, your young relatives mimic the fighting behavior of the characters portrayed in the cartoon (Figure 17).

A photograph shows a child pointing a toy gun.

These sorts of personal observations are what often lead us to formulate a specific hypothesis, but we cannot use limited personal observations and anecdotal evidence to rigorously test our hypothesis. Instead, to find out if real-world data supports our hypothesis, we have to conduct an experiment.

Designing an Experiment

The most basic experimental design involves two groups: the experimental group and the control group. The two groups are designed to be the same except for one difference: the experimental manipulation. The experimental group gets the experimental manipulation—that is, the treatment or variable being tested (in this case, violent TV images)—and the control group does not. Since the experimental manipulation is the only difference between the experimental and control groups, we can be confident that any differences between the two are due to the experimental manipulation rather than chance.

In our example of how violent television programming might affect violent behavior in children, we have the experimental group view violent television programming for a specified time and then measure their violent behavior. We measure the violent behavior in our control group after they watch nonviolent television programming for the same amount of time. It is important for the control group to be treated similarly to the experimental group, with the exception that the control group does not receive the experimental manipulation. Therefore, we have the control group watch non-violent television programming for the same amount of time as the experimental group.

We also need to precisely define, or operationalize, what is considered violent and nonviolent. An operational definition is a description of how we will measure our variables, and it is important in allowing others to understand exactly how and what a researcher measures in a particular experiment. In operationalizing violent behavior, we might choose to count only physical acts like kicking or punching as instances of this behavior, or we may also choose to include angry verbal exchanges. Whatever we determine, it is important that we operationalize violent behavior in such a way that anyone who hears about our study for the first time knows exactly what we mean by violence. This aids people’s ability to interpret our data as well as their capacity to repeat our experiment should they choose to do so.

Once we have operationalized what is considered violent television programming and what is considered violent behavior from our experiment participants, we need to establish how we will run our experiment. In this case, we might have participants watch a 30-minute television program (either violent or nonviolent, depending on their group membership) before sending them out to a playground for an hour, where their behavior is observed and the number and type of violent acts are recorded.

Ideally, the people who observe and record the children’s behavior are unaware of who was assigned to the experimental or control group, in order to control for experimenter bias. Experimenter bias refers to the possibility that a researcher’s expectations might skew the results of the study. Remember, conducting an experiment requires a lot of planning, and the people involved in the research project have a vested interest in supporting their hypotheses. If the observers knew which child was in which group, it might influence how much attention they paid to each child’s behavior as well as how they interpreted that behavior. This is a single-blind study, meaning that one party (the participants) is unaware of their group assignment (experimental or control), while the researcher who developed the experiment knows which participants are in each group.

A photograph shows three glass bottles of pills labeled as placebos.

In a double-blind study, both the researchers and the participants are blind to group assignments. Why would a researcher want to run a study where no one knows who is in which group? Because by doing so, we can control for both experimenter and participant expectations. If you are familiar with the phrase placebo effect, you already have some idea as to why this is an important consideration. The placebo effect occurs when people’s expectations or beliefs influence or determine their experience in a given situation. In other words, simply expecting something to happen can actually make it happen.

The placebo effect is commonly described in terms of testing the effectiveness of a new medication. Imagine that you work in a pharmaceutical company, and you think you have a new drug that is effective in treating depression. To demonstrate that your medication is effective, you run an experiment with two groups: The experimental group receives the medication, and the control group does not. But you don’t want participants to know whether they received the drug or not.

Why is that? Imagine that you are a participant in this study, and you have just taken a pill that you think will improve your mood. Because you expect the pill to have an effect, you might feel better simply because you took the pill and not because of any drug actually contained in the pill—this is the placebo effect.

To make sure that any effects on mood are due to the drug and not due to expectations, the control group receives a placebo (in this case a sugar pill). Now everyone gets a pill, and once again neither the researcher nor the experimental participants know who got the drug and who got the sugar pill. Any differences in mood between the experimental and control groups can now be attributed to the drug itself rather than to experimenter bias or participant expectations (Figure 18).

Independent and Dependent Variables

In a research experiment, we strive to study whether changes in one thing cause changes in another. To achieve this, we must pay attention to two important variables, or things that can be changed, in any experimental study: the independent variable and the dependent variable. An independent variable is manipulated or controlled by the experimenter. In a well-designed experimental study, the independent variable is the only important difference between the experimental and control groups. In our example of how violent television programs affect children’s display of violent behavior, the independent variable is the type of program—violent or nonviolent—viewed by participants in the study (Figure 19). A dependent variable is what the researcher measures to see how much effect the independent variable had. In our example, the dependent variable is the number of violent acts displayed by the experimental participants.

A box labeled “independent variable: type of television programming viewed” contains a photograph of a person shooting an automatic weapon. An arrow labeled “influences change in the…” leads to a second box. The second box is labeled “dependent variable: violent behavior displayed” and has a photograph of a child pointing a toy gun.

We expect that the dependent variable will change as a function of the independent variable. In other words, the dependent variable depends on the independent variable. A good way to think about the relationship between the independent and dependent variables is with this question: What effect does the independent variable have on the dependent variable? Returning to our example, what effect does watching a half hour of violent television programming or nonviolent television programming have on the number of incidents of physical aggression displayed on the playground?

Selecting and Assigning Experimental Participants

Now that our study is designed, we need to obtain a sample of individuals to include in our experiment. Our study involves human participants, so we need to determine whom to include. Participants are the subjects of psychological research, and as the name implies, individuals who are involved in psychological research actively participate in the process. Often, psychological research projects rely on college students to serve as participants. In fact, the vast majority of research in psychology subfields has historically involved students as research participants (Sears, 1986; Arnett, 2008). But are college students truly representative of the general population? College students tend to be younger, more educated, more liberal, and less diverse than the general population. Although using students as test subjects is an accepted practice, relying on such a limited pool of research participants can be problematic because it is difficult to generalize findings to the larger population.

Our hypothetical experiment involves children, and we must first generate a sample of child participants. Samples are used because populations are usually too large to reasonably involve every member in our particular experiment (Figure 20). If possible, we should use a random sample (there are other types of samples, but for the purposes of this section, we will focus on random samples). A random sample is a subset of a larger population in which every member of the population has an equal chance of being selected. Random samples are preferred because if the sample is large enough we can be reasonably sure that the participating individuals are representative of the larger population. This means that the percentages of characteristics in the sample—sex, ethnicity, socioeconomic level, and any other characteristics that might affect the results—are close to those percentages in the larger population.
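The idea of a simple random sample can be sketched in a few lines of Python. The roster of student IDs below is hypothetical; the point is only that every member of the population has an equal chance of being drawn.

```python
import random

random.seed(1)

# Hypothetical roster: every fourth grader in the city, identified by ID.
population = [f"student_{i}" for i in range(5000)]

# random.sample draws without replacement, giving every member of the
# population an equal chance of being selected, with no one chosen twice.
sample = random.sample(population, k=200)

print(len(sample))  # 200 distinct participants
```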

In our example, let’s say we decide our population of interest is fourth graders. But all fourth graders is a very large population, so we need to be more specific; instead we might say our population of interest is all fourth graders in a particular city. We should include students from various income brackets, family situations, races, ethnicities, religions, and geographic areas of town. With this more manageable population, we can work with the local schools in selecting a random sample of around 200 fourth graders who we want to participate in our experiment.

In summary, because we cannot test all of the fourth graders in a city, we want to find a group of about 200 that reflects the composition of that city. With a representative group, we can generalize our findings to the larger population without fear of our sample being biased in some way.

(a) A photograph shows an aerial view of crowds on a street. (b) A photograph shows s small group of children.

Now that we have a sample, the next step of the experimental process is to split the participants into experimental and control groups through random assignment. With random assignment, all participants have an equal chance of being assigned to either group. There is statistical software that will randomly assign each of the fourth graders in the sample to either the experimental or the control group.
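Random assignment itself is straightforward to carry out. Here is a minimal sketch using a hypothetical roster of 200 participants: shuffle the list, then split it in half, so every child has an equal chance of landing in either group.

```python
import random

random.seed(7)

# Hypothetical sample of 200 fourth graders already recruited for the study.
sample = [f"student_{i}" for i in range(200)]

# Shuffle the roster, then split it in half: each child has an equal
# chance of being assigned to either group.
random.shuffle(sample)
experimental_group = sample[:100]  # will watch the violent program
control_group = sample[100:]       # will watch the nonviolent program

print(len(experimental_group), len(control_group))
```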

Random assignment is critical for sound experimental design. With sufficiently large samples, random assignment makes it unlikely that there are systematic differences between the groups. So, for instance, it would be very unlikely that we would get one group composed entirely of males, a given ethnic identity, or a given religious ideology. This is important because if the groups were systematically different before the experiment began, we would not know the origin of any differences we find between the groups: Were the differences preexisting, or were they caused by manipulation of the independent variable? Random assignment allows us to assume that any differences observed between experimental and control groups result from the manipulation of the independent variable.

Issues to Consider

While experiments allow scientists to make cause-and-effect claims, they are not without problems. True experiments require the experimenter to manipulate an independent variable, and that can complicate many questions that psychologists might want to address. For instance, imagine that you want to know what effect sex (the independent variable) has on spatial memory (the dependent variable). Although you can certainly look for differences between males and females on a task that taps into spatial memory, you cannot directly control a person’s sex. We categorize this type of research approach as quasi-experimental and recognize that we cannot make cause-and-effect claims in these circumstances.

Experimenters are also limited by ethical constraints. For instance, you would not be able to conduct an experiment designed to determine if experiencing abuse as a child leads to lower levels of self-esteem among adults. To conduct such an experiment, you would need to randomly assign some experimental participants to a group that receives abuse, and that experiment would be unethical.

Introduction to Statistical Thinking

Psychologists use statistics to assist them in analyzing data and to describe precisely whether their results are statistically significant. Analyzing data using statistics enables researchers to find patterns, make claims, and share their results with others. In this section, you’ll learn about some of the tools that psychologists use in statistical analysis.

  • Define reliability and validity
  • Describe the importance of distributional thinking and the role of p-values in statistical inference
  • Describe the role of random sampling and random assignment in drawing cause-and-effect conclusions
  • Describe the basic structure of a psychological research article

Interpreting Experimental Findings

Once data are collected from both the experimental and the control groups, a statistical analysis is conducted to find out if there are meaningful differences between the two groups. A statistical analysis determines how likely it is that any difference found is due to chance (and thus not meaningful). In psychology, group differences are considered meaningful, or significant, if the odds that these differences occurred by chance alone are 5 percent or less. Stated another way, if there were truly no difference between the groups, a difference this large would be expected to arise by chance fewer than 5 times out of 100.
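The logic of such a significance test can be illustrated with a permutation test, a simple chance model: shuffle the group labels many times and count how often chance alone produces a difference as large as the one observed. The playground counts below are hypothetical, and the code is only a sketch of the idea.

```python
import random

random.seed(0)

# Hypothetical playground data: number of violent acts per child.
experimental = [6, 8, 7, 9, 5, 8, 7, 10, 6, 9]  # watched violent program
control = [3, 5, 4, 2, 6, 4, 3, 5, 4, 2]        # watched nonviolent program

def mean(xs):
    return sum(xs) / len(xs)

observed = mean(experimental) - mean(control)

# Permutation test: if the labels were arbitrary, how often would a
# random relabeling produce a difference at least as large as observed?
pooled = experimental + control
n_exp = len(experimental)
extreme = 0
trials = 10_000
for _ in range(trials):
    random.shuffle(pooled)
    if mean(pooled[:n_exp]) - mean(pooled[n_exp:]) >= observed:
        extreme += 1

p_value = extreme / trials
print(p_value)  # far below .05, so the difference is deemed significant
```

A small p-value here means chance alone rarely reproduces a gap this large, which is exactly the 5-percent criterion described above.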

The greatest strength of experiments is the ability to assert that any significant differences in the findings are caused by the independent variable. This occurs because random selection, random assignment, and a design that limits the effects of both experimenter bias and participant expectancy should create groups that are similar in composition and treatment. Therefore, any difference between the groups is attributable to the independent variable, and now we can finally make a causal statement. If we find that watching a violent television program results in more violent behavior than watching a nonviolent program, we can safely say that watching violent television programs causes an increase in the display of violent behavior.

Reporting Research

When psychologists complete a research project, they generally want to share their findings with other scientists. The American Psychological Association (APA) publishes a manual detailing how to write a paper for submission to scientific journals. Unlike an article that might be published in a magazine like Psychology Today, which targets a general audience with an interest in psychology, scientific journals generally publish peer-reviewed journal articles aimed at an audience of professionals and scholars who are actively involved in research themselves.

A peer-reviewed journal article is read by several other scientists (generally anonymously) with expertise in the subject matter. These peer reviewers provide feedback—to both the author and the journal editor—regarding the quality of the draft. Peer reviewers look for a strong rationale for the research being described, a clear description of how the research was conducted, and evidence that the research was conducted in an ethical manner. They also look for flaws in the study’s design, methods, and statistical analyses. They check that the conclusions drawn by the authors seem reasonable given the observations made during the research. Peer reviewers also comment on how valuable the research is in advancing the discipline’s knowledge. This helps prevent unnecessary duplication of research findings in the scientific literature and, to some extent, ensures that each research article provides new information. Ultimately, the journal editor will compile all of the peer reviewer feedback and determine whether the article will be published in its current state (a rare occurrence), published with revisions, or not accepted for publication.

Peer review provides some degree of quality control for psychological research. Poorly conceived or executed studies can be weeded out, and even well-designed research can be improved by the revisions suggested. Peer review also ensures that the research is described clearly enough to allow other scientists to replicate it, meaning they can repeat the experiment using different samples to determine reliability. Sometimes replications involve additional measures that expand on the original finding. In any case, each replication serves to provide more evidence to support the original research findings. Successful replications of published research make scientists more apt to adopt those findings, while repeated failures tend to cast doubt on the legitimacy of the original article and lead scientists to look elsewhere. For example, it would be a major advancement in the medical field if a published study indicated that taking a new drug helped individuals achieve a healthy weight without changing their diet. But if other scientists could not replicate the results, the original study’s claims would be questioned.

Dig Deeper: The Vaccine-Autism Myth and the Retraction of Published Studies

Some scientists have claimed that routine childhood vaccines cause some children to develop autism, and, in fact, research making these claims was published in several peer-reviewed journals. Since the initial reports, large-scale epidemiological research has suggested that vaccinations are not responsible for causing autism and that it is much safer to have your child vaccinated than not. Furthermore, several of the original studies making this claim have since been retracted.

A published piece of work can be retracted when its data are called into question because of falsification, fabrication, or serious research design problems. Once a paper is retracted, the scientific community is informed that there are serious problems with the original publication. Retractions can be initiated by the researcher who led the study, by research collaborators, by the institution that employed the researcher, or by the editorial board of the journal in which the article was originally published. In the vaccine-autism case, the retraction was made because of a significant conflict of interest in which the leading researcher had a financial interest in establishing a link between childhood vaccines and autism (Offit, 2008). Unfortunately, the initial studies received so much media attention that many parents around the world became hesitant to have their children vaccinated (Figure 21). For more information about how the vaccine/autism story unfolded, as well as the repercussions of this story, take a look at Paul Offit’s book, Autism’s False Prophets: Bad Science, Risky Medicine, and the Search for a Cure.

A photograph shows a child being given an oral vaccine.

Reliability and Validity

Dig Deeper: Everyday Connection: How Valid Is the SAT?

Standardized tests like the SAT are supposed to measure an individual’s aptitude for a college education, but how reliable and valid are such tests? Research conducted by the College Board suggests that scores on the SAT have high predictive validity for first-year college students’ GPA (Kobrin, Patterson, Shaw, Mattern, & Barbuti, 2008). In this context, predictive validity refers to the test’s ability to effectively predict the GPA of college freshmen. Given that many institutions of higher education require the SAT for admission, this high degree of predictive validity might be comforting.

However, the emphasis placed on SAT scores in college admissions has generated some controversy on a number of fronts. For one, some researchers assert that the SAT is a biased test that places minority students at a disadvantage and unfairly reduces the likelihood of being admitted into a college (Santelices & Wilson, 2010). Additionally, some research has suggested that the predictive validity of the SAT is grossly exaggerated in how well it is able to predict the GPA of first-year college students. In fact, it has been suggested that the SAT’s predictive validity may be overestimated by as much as 150% (Rothstein, 2004). Many institutions of higher education are beginning to consider de-emphasizing the significance of SAT scores in making admission decisions (Rimer, 2008).

In 2014, College Board president David Coleman expressed his awareness of these problems, recognizing that college success is more accurately predicted by high school grades than by SAT scores. To address these concerns, he has called for significant changes to the SAT exam (Lewin, 2014).

Statistical Significance

Coffee cup with heart shaped cream inside.

Does drinking coffee actually increase your life expectancy? A recent study (Freedman, Park, Abnet, Hollenbeck, & Sinha, 2012) found that men who drank at least six cups of coffee a day had a 10% lower chance of dying (women’s chances were 15% lower) than those who drank none. Does this mean you should pick up or increase your own coffee habit? We will explore these results in more depth in the next section about drawing conclusions from statistics. Modern society has become awash in studies such as this; you can read about several such studies in the news every day.

Conducting such a study well, and interpreting its results, requires an understanding of the basic ideas of statistics, the science of gaining insight from data. The key components of a statistical investigation are:

  • Planning the study: Start by asking a testable research question and deciding how to collect data. For example, how long was the study period of the coffee study? How many people were recruited for the study, how were they recruited, and from where? How old were they? What other variables were recorded about the individuals? Were changes made to the participants’ coffee habits during the course of the study?
  • Examining the data: What are appropriate ways to examine the data? What graphs are relevant, and what do they reveal? What descriptive statistics can be calculated to summarize relevant aspects of the data, and what do they reveal? What patterns do you see in the data? Are there any individual observations that deviate from the overall pattern, and what do they reveal? For example, in the coffee study, did the proportions differ when we compared the smokers to the non-smokers?
  • Inferring from the data: What are valid statistical methods for drawing inferences “beyond” the data you collected? In the coffee study, is the 10%–15% reduction in risk of death something that could have happened just by chance?
  • Drawing conclusions: Based on what you learned from your data, what conclusions can you draw? Who do you think these conclusions apply to? (Were the people in the coffee study older? Healthy? Living in cities?) Can you draw a cause-and-effect conclusion about your treatments? (Are scientists now saying that the coffee drinking is the cause of the decreased risk of death?)

Notice that the numerical analysis (“crunching numbers” on the computer) comprises only a small part of overall statistical investigation. In this section, you will see how we can answer some of these questions and what questions you should be asking about any statistical investigation you read about.

Distributional Thinking

When data are collected to address a particular question, an important first step is to think of meaningful ways to organize and examine the data. Let’s take a look at an example.

Example 1 : Researchers investigated whether cancer pamphlets are written at an appropriate level to be read and understood by cancer patients (Short, Moriarty, & Cooley, 1995). Tests of reading ability were given to 63 patients. In addition, readability level was determined for a sample of 30 pamphlets, based on characteristics such as the lengths of words and sentences in the pamphlet. The results, reported in terms of grade levels, are displayed in Figure 23.

Table showing patients' reading levels and pamphlets' reading levels.

This example illustrates two basic principles of statistical thinking:

  • Data vary. More specifically, values of a variable (such as the reading level of a cancer patient or the readability level of a cancer pamphlet) vary.
  • Analyzing the pattern of variation, called the distribution of the variable, often reveals insights.

Addressing the research question of whether the cancer pamphlets are written at appropriate levels for the cancer patients requires comparing the two distributions. A naïve comparison might focus only on the centers of the distributions. Both medians turn out to be ninth grade, but considering only medians ignores the variability and the overall distributions of these data. A more illuminating approach is to compare the entire distributions, for example with a graph, as in Figure 24.

Bar graph showing that the reading level of pamphlets is typically higher than the reading level of the patients.

Figure 24 makes clear that the two distributions are not well aligned at all. The most glaring discrepancy is that many patients (17/63, or 27%, to be precise) have a reading level below that of the most readable pamphlet. These patients will need help to understand the information provided in the cancer pamphlets. Notice that this conclusion follows from considering the distributions as a whole, not simply measures of center or variability, and that the graph contrasts those distributions more immediately than the frequency tables.

Finding Significance in Data

Even when we find patterns in data, often there is still uncertainty in various aspects of the data. For example, there may be potential for measurement errors (even your own body temperature can fluctuate by almost 1°F over the course of the day). Or we may only have a “snapshot” of observations from a more long-term process or only a small subset of individuals from the population of interest. In such cases, how can we determine whether the patterns we see in our small set of data are convincing evidence of a systematic phenomenon in the larger process or population? Let’s take a look at another example.

Example 2 : In a study reported in the November 2007 issue of Nature , researchers investigated whether pre-verbal infants take into account an individual’s actions toward others in evaluating that individual as appealing or aversive (Hamlin, Wynn, & Bloom, 2007). In one component of the study, 10-month-old infants were shown a “climber” character (a piece of wood with “googly” eyes glued onto it) that could not make it up a hill in two tries. Then the infants were shown two scenarios for the climber’s next try, one where the climber was pushed to the top of the hill by another character (“helper”), and one where the climber was pushed back down the hill by another character (“hinderer”). The infant was alternately shown these two scenarios several times. Then the infant was presented with two pieces of wood (representing the helper and the hinderer characters) and asked to pick one to play with.

The researchers found that of the 16 infants who made a clear choice, 14 chose to play with the helper toy. One possible explanation for this clear majority result is that the helping behavior of the one toy increases the infants’ likelihood of choosing that toy. But are there other possible explanations? What about the color of the toy? Well, prior to collecting the data, the researchers arranged it so that each color and shape (red square and blue circle) would be seen by the same number of infants. Or maybe the infants had right-handed tendencies and so picked whichever toy was closer to their right hand?

Well, prior to collecting the data, the researchers arranged it so half the infants saw the helper toy on the right and half on the left. Or, maybe the shapes of these wooden characters (square, triangle, circle) had an effect? Perhaps, but again, the researchers controlled for this by rotating which shape was the helper toy, the hinderer toy, and the climber. When designing experiments, it is important to control for as many variables that might affect the responses as possible. It is beginning to appear that the researchers accounted for all the other plausible explanations. But there is one more important consideration that cannot be controlled—if we did the study again with these 16 infants, they might not make the same choices. In other words, there is some randomness inherent in their selection process.

Maybe each infant had no genuine preference at all, and it was simply “random luck” that led to 14 infants picking the helper toy. Although this random component cannot be controlled, we can apply a probability model to investigate the pattern of results that would occur in the long run if random chance were the only factor.

If the infants were equally likely to pick between the two toys, then each infant had a 50% chance of picking the helper toy. It’s like each infant tossed a coin, and if it landed heads, the infant picked the helper toy. So if we tossed a coin 16 times, could it land heads 14 times? Sure, it’s possible, but it turns out to be very unlikely. Getting 14 (or more) heads in 16 tosses is about as likely as tossing a coin and getting 9 heads in a row. This probability is referred to as a p-value. The p-value is the probability of obtaining results at least as extreme as those observed if random chance alone were at work. Within psychology, the most common standard for p-values is “p < .05”: if results this extreme would occur by chance alone less than 5% of the time, we conclude that chance is not a plausible explanation and that the results likely reflect a meaningful pattern in human psychology. We call this statistical significance.

So, in the study above, if we assume that each infant was choosing equally, then the probability that 14 or more out of 16 infants would choose the helper toy is found to be 0.0021. We have only two logical possibilities: either the infants have a genuine preference for the helper toy, or the infants have no preference (50/50) and an outcome that would occur only 2 times in 1,000 iterations happened in this study. Because this p-value of 0.0021 is quite small, we conclude that the study provides very strong evidence that these infants have a genuine preference for the helper toy.

If we compare the p-value to a cut-off value, such as 0.05, we see that the p-value is smaller. Because the p-value is smaller than the cut-off, we reject the hypothesis that only random chance was at play here. In this case, these researchers would conclude that significantly more than half of the infants in the study chose the helper toy, giving strong evidence of a genuine preference for the toy with the helping behavior.
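The exact p-value quoted above can be checked directly from the binomial distribution. Here is a minimal Python sketch (the variable names are illustrative, not from the study); it sums the probabilities of getting 14, 15, or 16 heads in 16 fair coin tosses:

```python
from math import comb

# P(X >= 14) for X ~ Binomial(n = 16, p = 0.5):
# count the ways to get 14, 15, or 16 heads, out of 2^16 equally
# likely sequences of tosses
n_infants = 16
p_value = sum(comb(n_infants, k) for k in range(14, n_infants + 1)) / 2 ** n_infants
print(round(p_value, 4))  # 0.0021, matching the value reported in the text
```

Note that this is the probability of 14 or more helper-toy choices, not exactly 14, which is why the tail is summed.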

Drawing Conclusions from Statistics

Generalizability

Photo of a diverse group of college-aged students.

One limitation to the study mentioned previously about the babies choosing the “helper” toy is that the conclusion only applies to the 16 infants in the study. We don’t know much about how those 16 infants were selected. Suppose we want to select a subset of individuals (a sample ) from a much larger group of individuals (the population ) in such a way that conclusions from the sample can be generalized to the larger population. This is the question faced by pollsters every day.

Example 3 : The General Social Survey (GSS) is a survey on societal trends conducted every other year in the United States. Based on a sample of about 2,000 adult Americans, researchers make claims about what percentage of the U.S. population consider themselves to be “liberal,” what percentage consider themselves “happy,” what percentage feel “rushed” in their daily lives, and many other issues. The key to making these claims about the larger population of all American adults lies in how the sample is selected. The goal is to select a sample that is representative of the population, and a common way to achieve this goal is to select a random sample that gives every member of the population an equal chance of being selected for the sample. In its simplest form, random sampling involves numbering every member of the population and then using a computer to randomly select the subset to be surveyed. Most polls don’t operate exactly like this, but they do use probability-based sampling methods to select individuals from nationally representative panels.

In 2004, the GSS reported that 817 of 977 respondents (or 83.6%) indicated that they always or sometimes feel rushed. This is a clear majority, but we again need to consider variation due to random sampling. Fortunately, we can use the same probability model we did in the previous example to investigate the probable size of this error. (Note, we can use the coin-tossing model when the actual population size is much, much larger than the sample size, as then we can still consider the probability to be the same for every individual in the sample.) This probability model predicts that the sample result will be within 3 percentage points of the population value (roughly 1 over the square root of the sample size), called the margin of error. A statistician would conclude, with 95% confidence, that between 80.6% and 86.6% of all adult Americans in 2004 would have responded that they sometimes or always feel rushed.
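The interval above follows from the one-over-the-square-root-of-n rule of thumb described in the text. A quick Python sketch (the text rounds the margin to 3 percentage points, giving 80.6%–86.6%; the raw rule gives about 3.2 points):

```python
from math import sqrt

respondents = 977
feel_rushed = 817  # "always or sometimes feel rushed"

sample_pct = 100 * feel_rushed / respondents  # about 83.6%
margin = 100 / sqrt(respondents)              # rule of thumb: about 3.2 points

# the resulting 95% confidence interval for the population percentage
print(round(sample_pct - margin, 1), round(sample_pct + margin, 1))
```

The rule of thumb is a simplification; a statistician would typically compute the margin from the observed proportion as well, but the square-root-of-n version is what the text uses.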

The key to the margin of error is that when we use a probability sampling method, we can make claims about how often (in the long run, with repeated random sampling) the sample result would fall within a certain distance from the unknown population value by chance (meaning by random sampling variation) alone. Conversely, non-random samples are often susceptible to bias, meaning the sampling method systematically over-represents some segments of the population and under-represents others. We also still need to consider other sources of bias, such as individuals not responding honestly. These sources of error are not measured by the margin of error.

Cause and Effect

In many research studies, the primary question of interest concerns differences between groups. Then the question becomes how were the groups formed (e.g., selecting people who already drink coffee vs. those who don’t). In some studies, the researchers actively form the groups themselves. But then we have a similar question—could any differences we observe in the groups be an artifact of that group-formation process? Or maybe the difference we observe in the groups is so large that we can discount a “fluke” in the group-formation process as a reasonable explanation for what we find?

Example 4 : A psychology study investigated whether people tend to display more creativity when they are thinking about intrinsic (internal) or extrinsic (external) motivations (Ramsey & Schafer, 2002, based on a study by Amabile, 1985). The subjects were 47 people with extensive experience with creative writing. Subjects began by answering survey questions about either intrinsic motivations for writing (such as the pleasure of self-expression) or extrinsic motivations (such as public recognition). Then all subjects were instructed to write a haiku, and those poems were evaluated for creativity by a panel of judges. The researchers conjectured beforehand that subjects who were thinking about intrinsic motivations would display more creativity than subjects who were thinking about extrinsic motivations. The creativity scores from the 47 subjects in this study are displayed in Figure 26, where higher scores indicate more creativity.

Image showing a dot for creativity scores, which vary between 5 and 27, and the types of motivation each person was given as a motivator, either extrinsic or intrinsic.

In this example, the key question is whether the type of motivation affects creativity scores. In particular, do subjects who were asked about intrinsic motivations tend to have higher creativity scores than subjects who were asked about extrinsic motivations?

Figure 26 reveals that both motivation groups saw considerable variability in creativity scores, and these scores have considerable overlap between the groups. In other words, it’s certainly not always the case that those with extrinsic motivations have higher creativity than those with intrinsic motivations, but there may still be a statistical tendency in this direction. (Psychologist Keith Stanovich (2013) refers to people’s difficulties with thinking about such probabilistic tendencies as “the Achilles heel of human cognition.”)

The mean creativity score is 19.88 for the intrinsic group, compared to 15.74 for the extrinsic group, which supports the researchers’ conjecture. Yet comparing only the means of the two groups fails to consider the variability of creativity scores in the groups. We can measure variability with statistics using, for instance, the standard deviation: 5.25 for the extrinsic group and 4.40 for the intrinsic group. The standard deviations tell us that most of the creativity scores are within about 5 points of the mean score in each group. We see that the mean score for the intrinsic group lies within one standard deviation of the mean score for the extrinsic group. So, although there is a tendency for the creativity scores to be higher in the intrinsic group, on average, the difference is not extremely large.
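The comparison above can be made concrete with the reported summary statistics (a sketch; only the published means and standard deviations are used, not the raw scores):

```python
mean_intrinsic, mean_extrinsic = 19.88, 15.74
sd_intrinsic, sd_extrinsic = 4.40, 5.25

diff = mean_intrinsic - mean_extrinsic  # observed difference in means
print(round(diff, 2))  # 4.14 points

# the gap between the group means is smaller than either group's
# standard deviation, so the two score distributions overlap substantially
print(diff < sd_intrinsic and diff < sd_extrinsic)  # True
```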

We again want to consider possible explanations for this difference. The study only involved individuals with extensive creative writing experience. Although this limits the population to which we can generalize, it does not explain why the mean creativity score was a bit larger for the intrinsic group than for the extrinsic group. Maybe women tend to receive higher creativity scores? Here is where we need to focus on how the individuals were assigned to the motivation groups. If only women were in the intrinsic motivation group and only men in the extrinsic group, then this would present a problem because we wouldn’t know if the intrinsic group did better because of the different type of motivation or because they were women. However, the researchers guarded against such a problem by randomly assigning the individuals to the motivation groups. Like flipping a coin, each individual was just as likely to be assigned to either type of motivation. Why is this helpful? Because this random assignment  tends to balance out all the variables related to creativity we can think of, and even those we don’t think of in advance, between the two groups. So we should have a similar male/female split between the two groups; we should have a similar age distribution between the two groups; we should have a similar distribution of educational background between the two groups; and so on. Random assignment should produce groups that are as similar as possible except for the type of motivation, which presumably eliminates all those other variables as possible explanations for the observed tendency for higher scores in the intrinsic group.

But does this always work? No; just by the “luck of the draw,” the groups may be a little different prior to answering the motivation survey. So then the question is: is it possible that an unlucky random assignment is responsible for the observed difference in creativity scores between the groups? In other words, suppose each individual’s poem was going to get the same creativity score no matter which group they were assigned to, that the type of motivation in no way impacted their score. Then how often would the random-assignment process alone lead to a difference in mean creativity scores as large as (or larger than) 19.88 – 15.74 = 4.14 points?

We again want to apply a probability model to approximate a p-value, but this time the model will be a bit different. Think of writing everyone’s creativity scores on an index card, shuffling up the index cards, and then dealing out 23 to the extrinsic motivation group and 24 to the intrinsic motivation group, and finding the difference in the group means. We (better yet, the computer) can repeat this process over and over to see how often, when the scores don’t change, random assignment leads to a difference in means at least as large as 4.14. Figure 27 shows the results from 1,000 such hypothetical random assignments for these scores.

Standard distribution in a typical bell curve.

Only 2 of the 1,000 simulated random assignments produced a difference in group means of 4.14 or larger. In other words, the approximate p-value is 2/1000 = 0.002. This small p-value indicates that it would be very surprising for the random assignment process alone to produce such a large difference in group means. Therefore, as with Example 2, we have strong evidence that focusing on intrinsic motivations tends to increase creativity scores, as compared to thinking about extrinsic motivations.
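The card-shuffling procedure described above is known as a permutation (or randomization) test. The individual creativity scores are not reproduced here, so the sketch below runs the procedure on placeholder data; the group sizes (23 extrinsic, 24 intrinsic) and the 1,000 repetitions match the study.

```python
import random

def permutation_p_value(group_a, group_b, reps=1000, seed=1):
    """Estimate how often random assignment alone yields a difference in
    group means (a minus b) at least as large as the observed one."""
    observed = sum(group_a) / len(group_a) - sum(group_b) / len(group_b)
    pooled = list(group_a) + list(group_b)
    rng = random.Random(seed)
    count = 0
    for _ in range(reps):
        rng.shuffle(pooled)                 # "shuffle the index cards"
        new_a = pooled[:len(group_a)]       # deal one pile to group a
        new_b = pooled[len(group_a):]       # the rest to group b
        diff = sum(new_a) / len(new_a) - sum(new_b) / len(new_b)
        if diff >= observed:
            count += 1
    return count / reps

# Placeholder scores (NOT the real data): 24 "intrinsic" and 23 "extrinsic"
# values drawn near the reported means and standard deviations.
data_rng = random.Random(2)
intrinsic = [data_rng.gauss(19.9, 4.4) for _ in range(24)]
extrinsic = [data_rng.gauss(15.7, 5.3) for _ in range(23)]
print(permutation_p_value(intrinsic, extrinsic))
```

Feeding in the study's actual 47 scores would reproduce the analysis in the text; the function itself encodes only the shuffle-and-deal logic, and with data like these the p-value will typically be small, echoing the study's 0.002.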

Notice that the previous statement implies a cause-and-effect relationship between motivation and creativity score; is such a strong conclusion justified? Yes, because of the random assignment used in the study. That should have balanced out any other variables between the two groups, so now that the small p-value convinces us that the higher mean in the intrinsic group wasn’t just a coincidence, the only reasonable explanation left is the difference in the type of motivation. Can we generalize this conclusion to everyone? Not necessarily—we could cautiously generalize this conclusion to individuals with extensive experience in creative writing similar to the individuals in this study, but we would still want to know more about how these individuals were selected to participate.

Close-up photo of mathematical equations.

Statistical thinking involves the careful design of a study to collect meaningful data to answer a focused research question, detailed analysis of patterns in the data, and drawing conclusions that go beyond the observed data. Random sampling is paramount to generalizing results from our sample to a larger population, and random assignment is key to drawing cause-and-effect conclusions. With both kinds of randomness, probability models help us assess how much random variation we can expect in our results, in order to determine whether our results could happen by chance alone and to estimate a margin of error.

So where does this leave us with regard to the coffee study mentioned previously (Freedman, Park, Abnet, Hollenbeck, & Sinha, 2012, which found that men who drank at least six cups of coffee a day had a 10% lower chance of dying, and women a 15% lower chance, than those who drank none)? We can answer many of the questions:

  • This was a 14-year study conducted by researchers at the National Cancer Institute.
  • The results were published in the June issue of the New England Journal of Medicine , a respected, peer-reviewed journal.
  • The study reviewed coffee habits of more than 402,000 people ages 50 to 71 from six states and two metropolitan areas. Those with cancer, heart disease, and stroke were excluded at the start of the study. Coffee consumption was assessed once at the start of the study.
  • About 52,000 people died during the course of the study.
  • People who drank between two and five cups of coffee daily showed a lower risk as well, but the amount of reduction increased for those drinking six or more cups.
  • The sample sizes were fairly large and so the p-values are quite small, even though percent reduction in risk was not extremely large (dropping from a 12% chance to about 10%–11%).
  • Whether coffee was caffeinated or decaffeinated did not appear to affect the results.
  • This was an observational study, so no cause-and-effect conclusions can be drawn between coffee drinking and increased longevity, contrary to the impression conveyed by many news headlines about this study. In particular, it’s possible that those with chronic diseases don’t tend to drink coffee.

This study needs to be reviewed in the larger context of similar studies and the consistency of results across studies, with the constant caution that this was not a randomized experiment. Although a statistical analysis can still “adjust” for other potential confounding variables, we are not yet convinced that researchers have identified them all or completely isolated why this decrease in death risk is evident. Researchers can now take the findings of this study and develop more focused studies that address new questions.

Explore these outside resources to learn more about applied statistics:

  • Video about p-values:  P-Value Extravaganza
  • Interactive web applets for teaching and learning statistics
  • Inter-university Consortium for Political and Social Research  where you can find and analyze data.
  • The Consortium for the Advancement of Undergraduate Statistics
  • Find a recent research article in your field and answer the following: What was the primary research question? How were individuals selected to participate in the study? Were summary results provided? How strong is the evidence presented in favor or against the research question? Was random assignment used? Summarize the main conclusions from the study, addressing the issues of statistical significance, statistical confidence, generalizability, and cause and effect. Do you agree with the conclusions drawn from this study, based on the study design and the results presented?
  • Is it reasonable to use a random sample of 1,000 individuals to draw conclusions about all U.S. adults? Explain why or why not.

How to Read Research

In this course and throughout your academic career, you’ll be reading journal articles (meaning they were published by experts in a peer-reviewed journal) and reports that explain psychological research. It’s important to understand the format of these articles so that you can read them strategically and understand the information presented. Scientific articles vary in content or structure, depending on the type of journal to which they will be submitted. Psychological articles and many papers in the social sciences follow the writing guidelines and format dictated by the American Psychological Association (APA). In general, the structure follows: abstract, introduction, methods, results, discussion, and references.

  • Abstract : the abstract is a concise summary of the article. It summarizes the most important features of the manuscript, giving the reader a global first impression of the article. It is generally just one paragraph that explains the experiment as well as a short synopsis of the results.
  • Introduction : this section provides background information about the origin and purpose of performing the experiment or study. It reviews previous research and presents existing theories on the topic.
  • Method : this section covers the methodologies used to investigate the research question, including the identification of participants and materials as well as a description of the actual procedure. It should be sufficiently detailed to allow for replication.
  • Results : the results section presents key findings of the research, including reference to indicators of statistical significance.
  • Discussion : this section provides an interpretation of the findings, states their significance for current research, and derives implications for theory and practice. Alternative interpretations of the findings are also provided, particularly when it is not possible to determine the directionality of the effects. In the discussion, authors also acknowledge the strengths and limitations/weaknesses of the study and offer concrete directions for future research.

Watch this 3-minute video for an explanation of how to read scholarly articles. Look closely at the example article shared just before the two-minute mark.

https://digitalcommons.coastal.edu/kimbel-library-instructional-videos/9/

Practice identifying these key components in the following experiment: Food-Induced Emotional Resonance Improves Emotion Recognition.

In this chapter, you learned to

  • define and apply the scientific method to psychology
  • describe the strengths and weaknesses of descriptive, experimental, and correlational research
  • define the basic elements of a statistical investigation

Putting It Together: Psychological Research

Psychologists use the scientific method to examine human behavior and mental processes. Some of the methods you learned about include descriptive, experimental, and correlational research designs.

Watch the CrashCourse video to review the material you learned, then read through the following examples and see if you can come up with your own design for each type of study.

You can view the transcript for “Psychological Research: Crash Course Psychology #2” here (opens in new window).

Case Study: a detailed analysis of a particular person, group, business, event, etc. This approach is commonly used to learn more about rare examples, with the goal of describing that particular thing.

  • Ted Bundy was one of America’s most notorious serial killers, who murdered at least 30 women and was executed in 1989. Dr. Al Carlisle evaluated Bundy when he was first arrested and conducted a psychological analysis of how Bundy’s sexual fantasies merged into reality (Ramsland, 2012). Carlisle believes that there was a gradual evolution of three processes that guided his actions: fantasy, dissociation, and compartmentalization (Ramsland, 2012). Read Imagining Ted Bundy (http://goo.gl/rGqcUv) for more information on this case study.

Naturalistic Observation : a researcher unobtrusively collects information without the participant’s awareness.

  • Drain and Engelhardt (2013) observed the evoked and spontaneous communicative acts of six nonverbal children with autism. Each of the children attended a school for children with autism and was in a different class. They were observed for 30 minutes of each school day. By observing these children without their knowledge, the researchers were able to see true communicative acts without any external influences.

Survey : participants are asked to provide information or responses to questions on a survey or structured assessment.

  • Educational psychologists can ask students to report their grade point average and what, if anything, they eat for breakfast on an average day. A healthy breakfast has been associated with better academic performance (DiGangi, 1999).
  • Anderson (1987) examined the relationship between uncomfortably hot temperatures and aggressive behavior in two studies of violent and nonviolent crime. Based on previous research by Anderson and Anderson (1984), it was predicted that violent crimes would be more prevalent during the hotter times of year and in years with hotter weather in general. The study confirmed this prediction.

Longitudinal Study: researchers recruit a sample of participants and track them for an extended period of time.

  • In a study of a representative sample of 856 children, Eron and his colleagues (1972) found that a boy’s exposure to media violence at age eight was significantly related to his aggressive behavior ten years later, after he graduated from high school.

Cross-Sectional Study:  researchers gather participants from different groups (commonly different ages) and look for differences between the groups.

  • In 1996, Russell surveyed people of varying age groups and found that people in their 20s tend to report being more lonely than people in their 70s.

Correlational Design:  two different variables are measured to determine whether there is a relationship between them.

  • Thornhill et al. (2003) had people rate how physically attractive they found other people to be. They then had them separately smell t-shirts those people had worn (without knowing which clothes belonged to whom) and rate how good or bad their body odor was. They found that the more attractive someone was, the more pleasant their body odor was rated to be.
  • Clinical psychologists can test a new pharmaceutical treatment for depression by giving some patients the new pill and others an already-tested one to see which is the more effective treatment.


CC licensed content, Original

  • Psychological Research Methods. Provided by : Karenna Malavanti. License : CC BY-SA: Attribution ShareAlike

CC licensed content, Shared previously

  • Psychological Research. Provided by : OpenStax College. License : CC BY: Attribution . License Terms : Download for free at https://openstax.org/books/psychology-2e/pages/1-introduction. Located at : https://openstax.org/books/psychology-2e/pages/2-introduction .
  • Why It Matters: Psychological Research. Provided by : Lumen Learning. License : CC BY: Attribution   Located at: https://pressbooks.online.ucf.edu/lumenpsychology/chapter/introduction-15/
  • Introduction to The Scientific Method. Provided by : Lumen Learning. License : CC BY: Attribution   Located at:   https://pressbooks.online.ucf.edu/lumenpsychology/chapter/outcome-the-scientific-method/
  • Research picture. Authored by : Mediterranean Center of Medical Sciences. Provided by : Flickr. License : CC BY: Attribution   Located at : https://www.flickr.com/photos/mcmscience/17664002728 .
  • The Scientific Process. Provided by : Lumen Learning. License : CC BY-SA: Attribution ShareAlike   Located at:  https://pressbooks.online.ucf.edu/lumenpsychology/chapter/reading-the-scientific-process/
  • Ethics in Research. Provided by : Lumen Learning. License : CC BY: Attribution   Located at:  https://pressbooks.online.ucf.edu/lumenpsychology/chapter/ethics/
  • Ethics. Authored by : OpenStax College. Located at : https://openstax.org/books/psychology-2e/pages/2-4-ethics . License : CC BY: Attribution . License Terms : Download for free at https://openstax.org/books/psychology-2e/pages/1-introduction .
  • Introduction to Approaches to Research. Provided by : Lumen Learning. License : CC BY-NC-SA: Attribution NonCommercial ShareAlike   Located at:   https://pressbooks.online.ucf.edu/lumenpsychology/chapter/outcome-approaches-to-research/
  • Lec 2 | MIT 9.00SC Introduction to Psychology, Spring 2011. Authored by : John Gabrieli. Provided by : MIT OpenCourseWare. License : CC BY-NC-SA: Attribution-NonCommercial-ShareAlike Located at : https://www.youtube.com/watch?v=syXplPKQb_o .
  • Paragraph on correlation. Authored by : Christie Napa Scollon. Provided by : Singapore Management University. License : CC BY-NC-SA: Attribution-NonCommercial-ShareAlike Located at : http://nobaproject.com/modules/research-designs?r=MTc0ODYsMjMzNjQ%3D . Project : The Noba Project.
  • Descriptive Research. Provided by : Lumen Learning. License : CC BY-SA: Attribution ShareAlike   Located at: https://pressbooks.online.ucf.edu/lumenpsychology/chapter/reading-clinical-or-case-studies/
  • Approaches to Research. Authored by : OpenStax College.  License : CC BY: Attribution . License Terms : Download for free at https://openstax.org/books/psychology-2e/pages/1-introduction. Located at : https://openstax.org/books/psychology-2e/pages/2-2-approaches-to-research
  • Analyzing Findings. Authored by : OpenStax College. Located at : https://openstax.org/books/psychology-2e/pages/2-3-analyzing-findings . License : CC BY: Attribution . License Terms : Download for free at https://openstax.org/books/psychology-2e/pages/1-introduction.
  • Experiments. Provided by : Lumen Learning. License : CC BY: Attribution   Located at:  https://pressbooks.online.ucf.edu/lumenpsychology/chapter/reading-conducting-experiments/
  • Research Review. Authored by : Jessica Traylor for Lumen Learning. License : CC BY: Attribution Located at:  https://pressbooks.online.ucf.edu/lumenpsychology/chapter/reading-conducting-experiments/
  • Introduction to Statistics. Provided by : Lumen Learning. License : CC BY: Attribution   Located at:  https://pressbooks.online.ucf.edu/lumenpsychology/chapter/outcome-statistical-thinking/
  • histogram. Authored by : Fisher’s Iris flower data set. Provided by : Wikipedia. License : CC BY-SA: Attribution-ShareAlike Located at : https://en.wikipedia.org/wiki/Wikipedia:Meetup/DC/Statistics_Edit-a-thon#/media/File:Fisher_iris_versicolor_sepalwidth.svg .
  • Statistical Thinking. Authored by : Beth Chance and Allan Rossman. Provided by : California Polytechnic State University, San Luis Obispo. License : CC BY-NC-SA: Attribution-NonCommercial-ShareAlike. License Terms : http://nobaproject.com/license-agreement Located at : http://nobaproject.com/modules/statistical-thinking . Project : The Noba Project.
  • Drawing Conclusions from Statistics. Authored by: Pat Carroll and Lumen Learning. Provided by : Lumen Learning. License : CC BY: Attribution   Located at: https://pressbooks.online.ucf.edu/lumenpsychology/chapter/reading-drawing-conclusions-from-statistics/
  • Statistical Thinking. Authored by : Beth Chance and Allan Rossman, California Polytechnic State University, San Luis Obispo. Provided by : Noba. License: CC BY-NC-SA: Attribution-NonCommercial-ShareAlike Located at : http://nobaproject.com/modules/statistical-thinking .
  • The Replication Crisis. Authored by : Colin Thomas William. Provided by : Ivy Tech Community College. License: CC BY: Attribution
  • How to Read Research. Provided by : Lumen Learning. License : CC BY: Attribution   Located at:  https://pressbooks.online.ucf.edu/lumenpsychology/chapter/how-to-read-research/
  • What is a Scholarly Article? Kimbel Library First Year Experience Instructional Videos. 9. Authored by:  Joshua Vossler, John Watts, and Tim Hodge.  Provided by : Coastal Carolina University  License :  CC BY NC ND:  Attribution-NonCommercial-NoDerivatives Located at :  https://digitalcommons.coastal.edu/kimbel-library-instructional-videos/9/
  • Putting It Together: Psychological Research. Provided by : Lumen Learning. License : CC BY: Attribution   Located at:  https://pressbooks.online.ucf.edu/lumenpsychology/chapter/putting-it-together-psychological-research/
  • Research. Provided by : Lumen Learning. License : CC BY: Attribution   Located at:

All rights reserved content

  • Understanding Driver Distraction. Provided by : American Psychological Association. License : Other. License Terms: Standard YouTube License Located at : https://www.youtube.com/watch?v=XToWVxS_9lA&list=PLxf85IzktYWJ9MrXwt5GGX3W-16XgrwPW&index=9 .
  • Correlation vs. Causality: Freakonomics Movie. License : Other. License Terms : Standard YouTube License Located at : https://www.youtube.com/watch?v=lbODqslc4Tg.
  • Psychological Research – Crash Course Psychology #2. Authored by : Hank Green. Provided by : Crash Course. License : Other. License Terms : Standard YouTube License Located at : https://www.youtube.com/watch?v=hFV71QPvX2I .

Public domain content

  • Researchers review documents. Authored by : National Cancer Institute. Provided by : Wikimedia. Located at : https://commons.wikimedia.org/wiki/File:Researchers_review_documents.jpg . License : Public Domain: No Known Copyright

grounded in objective, tangible evidence that can be observed time and time again, regardless of who is observing

well-developed set of ideas that propose an explanation for observed phenomena

(plural: hypotheses) tentative and testable statement about the relationship between two or more variables

an experiment must be replicable by another researcher

implies that a theory should enable us to make predictions about future events

able to be disproven by experimental results

implies that all data must be considered when evaluating a hypothesis

committee of administrators, scientists, and community members that reviews proposals for research involving human participants

process of informing a research participant about what to expect during an experiment, any risks involved, and the implications of the research, and then obtaining the person’s consent to participate

purposely misleading experiment participants in order to maintain the integrity of the experiment

when an experiment involved deception, participants are told complete and truthful information about the experiment at its conclusion

committee of administrators, scientists, veterinarians, and community members that reviews proposals for research involving non-human animals

research studies that do not test specific relationships between variables

research investigating the relationship between two or more variables

research method that uses hypothesis testing to make inferences about how one variable impacts and causes another

observation of behavior in its natural setting

inferring that the results for a sample apply to the larger population

when observations may be skewed to align with observer expectations

measure of agreement among observers on how they record and classify a particular event

observational research study focusing on one or a few people

list of questions to be answered by research participants—given as paper-and-pencil questionnaires, administered electronically, or conducted verbally—allowing researchers to collect data from a large number of people

subset of individuals selected from the larger population

overall group of individuals that the researchers are interested in

method of research using past records or data sets to answer various research questions, or to search for interesting patterns or relationships

studies in which the same group of individuals is surveyed or measured repeatedly over an extended period of time

compares multiple segments of a population at a single time

reduction in number of research participants as some drop out of the study over time

relationship between two or more variables; when two variables are correlated, one variable changes as the other does

number from -1 to +1, indicating the strength and direction of the relationship between variables, and usually represented by r

two variables change in the same direction, both becoming either larger or smaller

two variables change in different directions, with one becoming larger as the other becomes smaller; a negative correlation is not the same thing as no correlation

changes in one variable cause the changes in the other variable; can be determined only through an experimental research design

unanticipated outside factor that affects both variables of interest, often giving the false impression that changes in one variable causes changes in the other variable, when, in actuality, the outside factor causes changes in both variables

seeing relationships between two things when in reality no such relationship exists

tendency to ignore evidence that disproves ideas or beliefs

group designed to answer the research question; experimental manipulation is the only difference between the experimental and control groups, so any differences between the two are due to experimental manipulation rather than chance

serves as a basis for comparison and controls for chance factors that might influence the results of the study—by holding such factors constant across groups so that the experimental manipulation is the only difference between groups

description of what actions and operations will be used to measure the dependent variables and manipulate the independent variables

researcher expectations skew the results of the study

experiment in which the researcher knows which participants are in the experimental group and which are in the control group

experiment in which both the researchers and the participants are blind to group assignments

people's expectations or beliefs influencing or determining their experience in a given situation

variable that is influenced or controlled by the experimenter; in a sound experimental study, the independent variable is the only important difference between the experimental and control group

variable that the researcher measures to see how much effect the independent variable had

subjects of psychological research

subset of a larger population in which every member of the population has an equal chance of being selected

method of experimental group assignment in which all participants have an equal chance of being assigned to either group

consistency and reproducibility of a given result

accuracy of a given result in measuring what it is designed to measure

determines how likely any difference between experimental groups is due to chance

statistical probability that represents the likelihood that experimental results happened by chance
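The last two definitions (statistical analysis, and the probability that a result happened by chance) can be made concrete with a small simulation. The sketch below is purely illustrative (the scores are invented): it estimates a p-value with a permutation test, repeatedly shuffling the group labels to see how often a mean difference at least as large as the observed one arises by chance alone.

```python
import random

def permutation_p_value(experimental, control, n_permutations=10_000, seed=0):
    """Estimate how often a group difference at least as large as the
    observed one arises when group labels are shuffled at random."""
    rng = random.Random(seed)
    observed = abs(sum(experimental) / len(experimental)
                   - sum(control) / len(control))
    pooled = experimental + control
    n_exp = len(experimental)
    extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)
        diff = abs(sum(pooled[:n_exp]) / n_exp
                   - sum(pooled[n_exp:]) / (len(pooled) - n_exp))
        if diff >= observed:
            extreme += 1
    return extreme / n_permutations

# Invented scores for an experimental group and a control group
experimental = [7, 8, 9, 8, 7, 9]
control = [5, 6, 5, 6, 7, 5]
p = permutation_p_value(experimental, control)
print(p)  # a small p (e.g., below .05) suggests the difference is unlikely to be due to chance
```

If the two groups came from the same distribution, shuffled differences would often match or exceed the observed one, yielding a large p; a very small p means chance alone rarely produces a difference that big.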

Psychological Science is the scientific study of mind, brain, and behavior. We will explore what it means to be human in this class. It has never been more important for us to understand what makes people tick, how to evaluate information critically, and why the field’s history matters. Psychology can also help you in your future career; indeed, there are very few jobs out there that involve no human interaction!

Because psychology is a science, we analyze human behavior through the scientific method. There are several ways to investigate human phenomena, such as observation, experiments, and more. We will discuss the basics, pros, and cons of each! We will also dig deeper into the important ethical guidelines that psychologists must follow in order to do research. Lastly, we will briefly introduce ourselves to statistics, the language of scientific research. While reading the content in these chapters, try to find examples of material that fit with the themes of the course.

To get us started:

  • The study of the mind moved away from introspection toward reaction-time studies as we learned more about empiricism
  • Psychologists work in careers outside of the typical "clinician" role. We advise in human factors, education, policy, and more!
  • While completing an observational study, psychologists work to aggregate common themes to explain the behavior of the group (sample) as a whole. In doing so, we still allow for normal variation within the group!
  • The IRB and IACUC are important in ensuring ethics are maintained for both human and animal subjects

Psychological Science: Understanding Human Behavior Copyright © by Karenna Malavanti is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License , except where otherwise noted.

Share This Book

J Korean Med Sci. 2022 Apr 25; 37(16).


A Practical Guide to Writing Quantitative and Qualitative Research Questions and Hypotheses in Scholarly Articles

Edward Barroga

1 Department of General Education, Graduate School of Nursing Science, St. Luke’s International University, Tokyo, Japan.

Glafera Janet Matanguihan

2 Department of Biological Sciences, Messiah University, Mechanicsburg, PA, USA.

The development of research questions and the subsequent hypotheses are prerequisites to defining the main research purpose and specific objectives of a study. Consequently, these objectives determine the study design and research outcome. The development of research questions is a process based on knowledge of current trends, cutting-edge studies, and technological advances in the research field. Excellent research questions are focused and require a comprehensive literature search and in-depth understanding of the problem being investigated. Initially, research questions may be written as descriptive questions which could be developed into inferential questions. These questions must be specific and concise to provide a clear foundation for developing hypotheses. Hypotheses are more formal predictions about the research outcomes. These specify the possible results that may or may not be expected regarding the relationship between groups. Thus, research questions and hypotheses clarify the main purpose and specific objectives of the study, which in turn dictate the design of the study, its direction, and outcome. Studies developed from good research questions and hypotheses will have trustworthy outcomes with wide-ranging social and health implications.

INTRODUCTION

Scientific research is usually initiated by posing evidence-based research questions which are then explicitly restated as hypotheses. 1 , 2 The hypotheses provide directions to guide the study, solutions, explanations, and expected results. 3 , 4 Both research questions and hypotheses are essentially formulated based on conventional theories and real-world processes, which allow the inception of novel studies and the ethical testing of ideas. 5 , 6

It is crucial to have knowledge of both quantitative and qualitative research 2 as both types of research involve writing research questions and hypotheses. 7 However, these crucial elements of research are sometimes overlooked; if not overlooked, they are framed without the forethought and meticulous attention they need. Planning and careful consideration are needed when developing quantitative or qualitative research, particularly when conceptualizing research questions and hypotheses. 4

There is a continuing need to support researchers in the creation of innovative research questions and hypotheses, as well as for journal articles that carefully review these elements. 1 When research questions and hypotheses are not carefully thought of, unethical studies and poor outcomes usually ensue. Carefully formulated research questions and hypotheses define well-founded objectives, which in turn determine the appropriate design, course, and outcome of the study. This article then aims to discuss in detail the various aspects of crafting research questions and hypotheses, with the goal of guiding researchers as they develop their own. Examples from the authors and peer-reviewed scientific articles in the healthcare field are provided to illustrate key points.

DEFINITIONS AND RELATIONSHIP OF RESEARCH QUESTIONS AND HYPOTHESES

A research question is what a study aims to answer after data analysis and interpretation. The answer is written at length in the discussion section of the paper. Thus, the research question gives a preview of the different parts and variables of the study meant to address the problem posed in the research question. 1 An excellent research question clarifies the research writing while facilitating understanding of the research topic, objective, scope, and limitations of the study. 5

On the other hand, a research hypothesis is an educated statement of an expected outcome. This statement is based on background research and current knowledge. 8 , 9 The research hypothesis makes a specific prediction about a new phenomenon 10 or a formal statement on the expected relationship between an independent variable and a dependent variable. 3 , 11 It provides a tentative answer to the research question to be tested or explored. 4

Hypotheses employ reasoning to predict a theory-based outcome. 10 These can also be developed from theories by focusing on components of theories that have not yet been observed. 10 The validity of hypotheses is often based on the testability of the prediction made in a reproducible experiment. 8

Conversely, hypotheses can also be rephrased as research questions. Several hypotheses based on existing theories and knowledge may be needed to answer a research question. Developing ethical research questions and hypotheses creates a research design that has logical relationships among variables. These relationships serve as a solid foundation for the conduct of the study. 4 , 11 Haphazardly constructed research questions can result in poorly formulated hypotheses and improper study designs, leading to unreliable results. Thus, the formulations of relevant research questions and verifiable hypotheses are crucial when beginning research. 12

CHARACTERISTICS OF GOOD RESEARCH QUESTIONS AND HYPOTHESES

Excellent research questions are specific and focused. These integrate collective data and observations to confirm or refute the subsequent hypotheses. Well-constructed hypotheses are based on previous reports and verify the research context. These are realistic, in-depth, sufficiently complex, and reproducible. More importantly, these hypotheses can be addressed and tested. 13

There are several characteristics of well-developed hypotheses. Good hypotheses are 1) empirically testable 7 , 10 , 11 , 13 ; 2) backed by preliminary evidence 9 ; 3) testable by ethical research 7 , 9 ; 4) based on original ideas 9 ; 5) supported by evidence-based logical reasoning 10 ; and 6) predictive. 11 Good hypotheses can infer ethical and positive implications, indicating the presence of a relationship or effect relevant to the research theme. 7 , 11 These are initially developed from a general theory and branch into specific hypotheses by deductive reasoning. In the absence of a theory on which to base the hypotheses, inductive reasoning based on specific observations or findings forms more general hypotheses. 10

TYPES OF RESEARCH QUESTIONS AND HYPOTHESES

Research questions and hypotheses are developed according to the type of research, which can be broadly classified into quantitative and qualitative research. We provide a summary of the types of research questions and hypotheses under quantitative and qualitative research categories in Table 1 .

Research questions in quantitative research

In quantitative research, research questions inquire about the relationships among variables being investigated and are usually framed at the start of the study. These are precise and typically linked to the subject population, dependent and independent variables, and research design. 1 Research questions may also attempt to describe the behavior of a population in relation to one or more variables, or describe the characteristics of variables to be measured ( descriptive research questions ). 1 , 5 , 14 These questions may also aim to discover differences between groups within the context of an outcome variable ( comparative research questions ), 1 , 5 , 14 or elucidate trends and interactions among variables ( relationship research questions ). 1 , 5 We provide examples of descriptive, comparative, and relationship research questions in quantitative research in Table 2 .

Hypotheses in quantitative research

In quantitative research, hypotheses predict the expected relationships among variables. 15 Relationships among variables that can be predicted include 1) between a single dependent variable and a single independent variable ( simple hypothesis ) or 2) between two or more independent and dependent variables ( complex hypothesis ). 4 , 11 Hypotheses may also specify the expected direction to be followed and imply an intellectual commitment to a particular outcome ( directional hypothesis ) 4 . On the other hand, hypotheses may not predict the exact direction and are used in the absence of a theory, or when findings contradict previous studies ( non-directional hypothesis ). 4 In addition, hypotheses can 1) define interdependency between variables ( associative hypothesis ), 4 2) propose an effect on the dependent variable from manipulation of the independent variable ( causal hypothesis ), 4 3) state that there is no relationship between two variables ( null hypothesis ), 4 , 11 , 15 4) replace the working hypothesis if it is rejected ( alternative hypothesis ), 15 5) explain the relationship of phenomena to possibly generate a theory ( working hypothesis ), 11 6) involve quantifiable variables that can be tested statistically ( statistical hypothesis ), 11 or 7) express a relationship whose interlinks can be verified logically ( logical hypothesis ). 11 We provide examples of simple, complex, directional, non-directional, associative, causal, null, alternative, working, statistical, and logical hypotheses in quantitative research, as well as the definition of quantitative hypothesis-testing research in Table 3 .

Research questions in qualitative research

Unlike research questions in quantitative research, research questions in qualitative research are usually continuously reviewed and reformulated. The central question and associated subquestions are stated more than the hypotheses. 15 The central question broadly explores a complex set of factors surrounding the central phenomenon, aiming to present the varied perspectives of participants. 15

There are varied goals for which qualitative research questions are developed. These questions can function in several ways, such as to 1) identify and describe existing conditions ( contextual research questions ); 2) describe a phenomenon ( descriptive research questions ); 3) assess the effectiveness of existing methods, protocols, theories, or procedures ( evaluation research questions ); 4) examine a phenomenon or analyze the reasons or relationships between subjects or phenomena ( explanatory research questions ); or 5) focus on unknown aspects of a particular topic ( exploratory research questions ). 5 In addition, some qualitative research questions provide new ideas for the development of theories and actions ( generative research questions ) or advance specific ideologies of a position ( ideological research questions ). 1 Other qualitative research questions may build on a body of existing literature and become working guidelines ( ethnographic research questions ). Research questions may also be broadly stated without specific reference to the existing literature or a typology of questions ( phenomenological research questions ), may be directed towards generating a theory of some process ( grounded theory questions ), or may address a description of the case and the emerging themes ( qualitative case study questions ). 15 We provide examples of contextual, descriptive, evaluation, explanatory, exploratory, generative, ideological, ethnographic, phenomenological, grounded theory, and qualitative case study research questions in qualitative research in Table 4 , and the definition of qualitative hypothesis-generating research in Table 5 .

Qualitative studies usually pose at least one central research question and several subquestions starting with How or What . These research questions use exploratory verbs such as explore or describe . These also focus on one central phenomenon of interest, and may mention the participants and research site. 15

Hypotheses in qualitative research

Hypotheses in qualitative research are stated in the form of a clear statement concerning the problem to be investigated. Unlike in quantitative research where hypotheses are usually developed to be tested, qualitative research can lead to both hypothesis-testing and hypothesis-generating outcomes. 2 When studies require both quantitative and qualitative research questions, this suggests an integrative process between both research methods wherein a single mixed-methods research question can be developed. 1

FRAMEWORKS FOR DEVELOPING RESEARCH QUESTIONS AND HYPOTHESES

Research questions followed by hypotheses should be developed before the start of the study. 1 , 12 , 14 It is crucial to develop feasible research questions on a topic that is interesting to both the researcher and the scientific community. This can be achieved by a meticulous review of previous and current studies to establish a novel topic. Specific areas are subsequently focused on to generate ethical research questions. The relevance of the research questions is evaluated in terms of clarity of the resulting data, specificity of the methodology, objectivity of the outcome, depth of the research, and impact of the study. 1 , 5 These aspects constitute the FINER criteria (i.e., Feasible, Interesting, Novel, Ethical, and Relevant). 1 Clarity and effectiveness are achieved if research questions meet the FINER criteria. In addition to the FINER criteria, Ratan et al. described focus, complexity, novelty, feasibility, and measurability for evaluating the effectiveness of research questions. 14

The PICOT and PEO frameworks are also used when developing research questions.1 These frameworks address the following elements. PICOT: P-population/patients/problem, I-intervention or indicator being studied, C-comparison group, O-outcome of interest, and T-timeframe of the study. PEO: P-population being studied, E-exposure to preexisting conditions, and O-outcome of interest.1 Research questions are also considered good if they meet the “FINERMAPS” framework: Feasible, Interesting, Novel, Ethical, Relevant, Manageable, Appropriate, Potential value/publishable, and Systematic.14
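
As an illustrative sketch (not part of the frameworks themselves), the PICOT elements can be thought of as slots in a template that are filled in and assembled into a draft question. The class name, field names, and the example study below are hypothetical:

```python
from dataclasses import dataclass

@dataclass
class PICOT:
    """Hypothetical container for the five PICOT elements."""
    population: str    # P - population/patients/problem
    intervention: str  # I - intervention or indicator being studied
    comparison: str    # C - comparison group
    outcome: str       # O - outcome of interest
    timeframe: str     # T - timeframe of the study

    def draft_question(self) -> str:
        # Assemble the elements into one draft research question.
        return (f"In {self.population}, does {self.intervention}, "
                f"compared with {self.comparison}, affect {self.outcome} "
                f"over {self.timeframe}?")

# Made-up example study, for illustration only.
q = PICOT(
    population="adults with type 2 diabetes",
    intervention="a structured exercise program",
    comparison="usual care",
    outcome="HbA1c levels",
    timeframe="6 months",
)
print(q.draft_question())
```

Filling every slot forces the question to name its population, comparison group, outcome, and timeframe explicitly, which is the point of the framework.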

As we indicated earlier, research questions and hypotheses that are not carefully formulated result in unethical studies or poor outcomes. To illustrate this, we provide some examples of ambiguous research questions and hypotheses that result in unclear and weak research objectives in quantitative research (Table 6)16 and qualitative research (Table 7),17 and show how to transform these ambiguous research questions and hypotheses into clear and good statements.

a These statements were composed for comparison and illustrative purposes only.

b These statements are direct quotes from Higashihara and Horiuchi. 16

a This statement is a direct quote from Shimoda et al. 17

The other statements were composed for comparison and illustrative purposes only.

CONSTRUCTING RESEARCH QUESTIONS AND HYPOTHESES

To construct effective research questions and hypotheses, it is very important to 1) clarify the background and 2) identify the research problem at the outset of the research, within a specific timeframe.9 Then, 3) review or conduct preliminary research to collect all available knowledge about the possible research questions by studying theories and previous studies.18 Afterwards, 4) construct research questions to investigate the research problem, identify the variables to be assessed from the research questions,4 and make operational definitions of constructs from the research problem and questions. Thereafter, 5) construct specific deductive or inductive predictions in the form of hypotheses.4 Finally, 6) state the study aims. This general flow for constructing effective research questions and hypotheses prior to conducting research is shown in Fig. 1.


Research questions are used more frequently in qualitative research than objectives or hypotheses. 3 These questions seek to discover, understand, explore or describe experiences by asking “What” or “How.” The questions are open-ended to elicit a description rather than to relate variables or compare groups. The questions are continually reviewed, reformulated, and changed during the qualitative study. 3 Research questions are also used more frequently in survey projects than hypotheses in experiments in quantitative research to compare variables and their relationships.

Hypotheses are constructed from the variables identified, as an if-then statement following the template ‘If a specific action is taken, then a certain outcome is expected.’ At this stage, some ideas regarding expectations from the research to be conducted must be drawn.18 Then, the variables to be manipulated (independent) and influenced (dependent) are defined.4 Thereafter, the hypothesis is stated and refined, and reproducible data tailored to the hypothesis are identified, collected, and analyzed.4 Hypotheses must be testable and specific,18 and should describe the variables and their relationships, the specific group being studied, and the predicted research outcome.18 Hypothesis construction involves a testable proposition deduced from theory, with independent and dependent variables separated and measured separately.3 Therefore, good hypotheses must be based on good research questions constructed at the start of a study or trial.12
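
A minimal sketch, with entirely made-up data, of what "testable" means in practice: the independent variable is group membership (intervention vs. control), the dependent variable is an outcome score, and a test statistic quantifies the observed difference that the if-then hypothesis predicts. Welch's t-statistic is used here purely as an illustration:

```python
from statistics import mean, variance

def welch_t(sample_a, sample_b):
    """Welch's t-statistic for two independent samples (unequal variances)."""
    na, nb = len(sample_a), len(sample_b)
    va, vb = variance(sample_a), variance(sample_b)  # sample variances (n-1)
    return (mean(sample_a) - mean(sample_b)) / (va / na + vb / nb) ** 0.5

# Hypothetical data: hypothesis "if the intervention is applied (independent
# variable), then the outcome score (dependent variable) will be higher."
control = [72, 75, 70, 74, 73, 71]   # outcome scores without intervention
treated = [78, 80, 77, 82, 79, 81]   # outcome scores with intervention

t = welch_t(treated, control)
print(f"t = {t:.2f}")  # a large positive t favors the predicted direction
```

The hypothesis is specific (which groups, which variable, which direction) and testable (the statistic can be computed from reproducible data), matching the criteria in the paragraph above.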

In summary, research questions are constructed after establishing the background of the study. Hypotheses are then developed based on the research questions. Thus, it is crucial to have excellent research questions to generate superior hypotheses. In turn, these would determine the research objectives and the design of the study, and ultimately, the outcome of the research. 12 Algorithms for building research questions and hypotheses are shown in Fig. 2 for quantitative research and in Fig. 3 for qualitative research.


EXAMPLES OF RESEARCH QUESTIONS FROM PUBLISHED ARTICLES

  • EXAMPLE 1. Descriptive research question (quantitative research)
  • - Presents research variables to be assessed (distinct phenotypes and subphenotypes)
  • “BACKGROUND: Since COVID-19 was identified, its clinical and biological heterogeneity has been recognized. Identifying COVID-19 phenotypes might help guide basic, clinical, and translational research efforts.
  • RESEARCH QUESTION: Does the clinical spectrum of patients with COVID-19 contain distinct phenotypes and subphenotypes? ” 19
  • EXAMPLE 2. Relationship research question (quantitative research)
  • - Shows interactions between dependent variable (static postural control) and independent variable (peripheral visual field loss)
  • “Background: Integration of visual, vestibular, and proprioceptive sensations contributes to postural control. People with peripheral visual field loss have serious postural instability. However, the directional specificity of postural stability and sensory reweighting caused by gradual peripheral visual field loss remain unclear.
  • Research question: What are the effects of peripheral visual field loss on static postural control ?” 20
  • EXAMPLE 3. Comparative research question (quantitative research)
  • - Clarifies the difference among groups with an outcome variable (patients enrolled in COMPERA with moderate PH or severe PH in COPD) and another group without the outcome variable (patients with idiopathic pulmonary arterial hypertension (IPAH))
  • “BACKGROUND: Pulmonary hypertension (PH) in COPD is a poorly investigated clinical condition.
  • RESEARCH QUESTION: Which factors determine the outcome of PH in COPD?
  • STUDY DESIGN AND METHODS: We analyzed the characteristics and outcome of patients enrolled in the Comparative, Prospective Registry of Newly Initiated Therapies for Pulmonary Hypertension (COMPERA) with moderate or severe PH in COPD as defined during the 6th PH World Symposium who received medical therapy for PH and compared them with patients with idiopathic pulmonary arterial hypertension (IPAH) .” 21
  • EXAMPLE 4. Exploratory research question (qualitative research)
  • - Explores areas that have not been fully investigated (perspectives of families and children who receive care in clinic-based child obesity treatment) to have a deeper understanding of the research problem
  • “Problem: Interventions for children with obesity lead to only modest improvements in BMI and long-term outcomes, and data are limited on the perspectives of families of children with obesity in clinic-based treatment. This scoping review seeks to answer the question: What is known about the perspectives of families and children who receive care in clinic-based child obesity treatment? This review aims to explore the scope of perspectives reported by families of children with obesity who have received individualized outpatient clinic-based obesity treatment.” 22
  • EXAMPLE 5. Relationship research question (quantitative research)
  • - Defines interactions between dependent variable (use of ankle strategies) and independent variable (changes in muscle tone)
  • “Background: To maintain an upright standing posture against external disturbances, the human body mainly employs two types of postural control strategies: “ankle strategy” and “hip strategy.” While it has been reported that the magnitude of the disturbance alters the use of postural control strategies, it has not been elucidated how the level of muscle tone, one of the crucial parameters of bodily function, determines the use of each strategy. We have previously confirmed using forward dynamics simulations of human musculoskeletal models that an increased muscle tone promotes the use of ankle strategies. The objective of the present study was to experimentally evaluate a hypothesis: an increased muscle tone promotes the use of ankle strategies. Research question: Do changes in the muscle tone affect the use of ankle strategies ?” 23

EXAMPLES OF HYPOTHESES IN PUBLISHED ARTICLES

  • EXAMPLE 1. Working hypothesis (quantitative research)
  • - A hypothesis that is initially accepted for further research to produce a feasible theory
  • “As fever may have benefit in shortening the duration of viral illness, it is plausible to hypothesize that the antipyretic efficacy of ibuprofen may be hindering the benefits of a fever response when taken during the early stages of COVID-19 illness .” 24
  • “In conclusion, it is plausible to hypothesize that the antipyretic efficacy of ibuprofen may be hindering the benefits of a fever response . The difference in perceived safety of these agents in COVID-19 illness could be related to the more potent efficacy to reduce fever with ibuprofen compared to acetaminophen. Compelling data on the benefit of fever warrant further research and review to determine when to treat or withhold ibuprofen for early stage fever for COVID-19 and other related viral illnesses .” 24
  • EXAMPLE 2. Exploratory hypothesis (qualitative research)
  • - Explores particular areas deeper to clarify subjective experience and develop a formal hypothesis potentially testable in a future quantitative approach
  • “We hypothesized that when thinking about a past experience of help-seeking, a self distancing prompt would cause increased help-seeking intentions and more favorable help-seeking outcome expectations .” 25
  • “Conclusion
  • Although a priori hypotheses were not supported, further research is warranted as results indicate the potential for using self-distancing approaches to increasing help-seeking among some people with depressive symptomatology.” 25
  • EXAMPLE 3. Hypothesis-generating research to establish a framework for hypothesis testing (qualitative research)
  • “We hypothesize that compassionate care is beneficial for patients (better outcomes), healthcare systems and payers (lower costs), and healthcare providers (lower burnout). ” 26
  • “Compassionomics is the branch of knowledge and scientific study of the effects of compassionate healthcare. Our main hypotheses are that compassionate healthcare is beneficial for (1) patients, by improving clinical outcomes, (2) healthcare systems and payers, by supporting financial sustainability, and (3) HCPs, by lowering burnout and promoting resilience and well-being. The purpose of this paper is to establish a scientific framework for testing the hypotheses above. If these hypotheses are confirmed through rigorous research, compassionomics will belong in the science of evidence-based medicine, with major implications for all healthcare domains.” 26
  • EXAMPLE 4. Statistical hypothesis (quantitative research)
  • - An assumption is made about the relationship among several population characteristics ( gender differences in sociodemographic and clinical characteristics of adults with ADHD ). Validity is tested by statistical experiment or analysis ( chi-square test, Students t-test, and logistic regression analysis)
  • “Our research investigated gender differences in sociodemographic and clinical characteristics of adults with ADHD in a Japanese clinical sample. Due to unique Japanese cultural ideals and expectations of women's behavior that are in opposition to ADHD symptoms, we hypothesized that women with ADHD experience more difficulties and present more dysfunctions than men . We tested the following hypotheses: first, women with ADHD have more comorbidities than men with ADHD; second, women with ADHD experience more social hardships than men, such as having less full-time employment and being more likely to be divorced.” 27
  • “Statistical Analysis
  • ( text omitted ) Between-gender comparisons were made using the chi-squared test for categorical variables and Students t-test for continuous variables…( text omitted ). A logistic regression analysis was performed for employment status, marital status, and comorbidity to evaluate the independent effects of gender on these dependent variables.” 27
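
The between-group comparison described in Example 4 can be illustrated with a chi-squared statistic for a contingency table. The counts below are made up for illustration only (they are not from the cited study); the computation itself is the standard observed-vs-expected formula:

```python
def chi_square(table):
    """Pearson chi-squared statistic for a contingency table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under the null hypothesis of no association.
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    return stat

# Hypothetical 2x2 table: gender (rows) x full-time employment (columns).
#                employed  not employed
observed = [[30, 20],   # men
            [18, 32]]   # women

stat = chi_square(observed)
print(f"chi2 = {stat:.2f}")  # compared against a critical value to test the hypothesis
```

A large statistic relative to the chi-squared critical value (for a 2x2 table, 3.84 at the 5% level with 1 degree of freedom) would lead to rejecting the null hypothesis of no gender difference.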

EXAMPLES OF HYPOTHESIS AS WRITTEN IN PUBLISHED ARTICLES IN RELATION TO OTHER PARTS

  • EXAMPLE 1. Background, hypotheses, and aims are provided
  • “Pregnant women need skilled care during pregnancy and childbirth, but that skilled care is often delayed in some countries …( text omitted ). The focused antenatal care (FANC) model of WHO recommends that nurses provide information or counseling to all pregnant women …( text omitted ). Job aids are visual support materials that provide the right kind of information using graphics and words in a simple and yet effective manner. When nurses are not highly trained or have many work details to attend to, these job aids can serve as a content reminder for the nurses and can be used for educating their patients (Jennings, Yebadokpo, Affo, & Agbogbe, 2010) ( text omitted ). Importantly, additional evidence is needed to confirm how job aids can further improve the quality of ANC counseling by health workers in maternal care …( text omitted )” 28
  • “ This has led us to hypothesize that the quality of ANC counseling would be better if supported by job aids. Consequently, a better quality of ANC counseling is expected to produce higher levels of awareness concerning the danger signs of pregnancy and a more favorable impression of the caring behavior of nurses .” 28
  • “This study aimed to examine the differences in the responses of pregnant women to a job aid-supported intervention during ANC visit in terms of 1) their understanding of the danger signs of pregnancy and 2) their impression of the caring behaviors of nurses to pregnant women in rural Tanzania.” 28
  • EXAMPLE 2. Background, hypotheses, and aims are provided
  • “We conducted a two-arm randomized controlled trial (RCT) to evaluate and compare changes in salivary cortisol and oxytocin levels of first-time pregnant women between experimental and control groups. The women in the experimental group touched and held an infant for 30 min (experimental intervention protocol), whereas those in the control group watched a DVD movie of an infant (control intervention protocol). The primary outcome was salivary cortisol level and the secondary outcome was salivary oxytocin level.” 29
  • “ We hypothesize that at 30 min after touching and holding an infant, the salivary cortisol level will significantly decrease and the salivary oxytocin level will increase in the experimental group compared with the control group .” 29
  • EXAMPLE 3. Background, aim, and hypothesis are provided
  • “In countries where the maternal mortality ratio remains high, antenatal education to increase Birth Preparedness and Complication Readiness (BPCR) is considered one of the top priorities [1]. BPCR includes birth plans during the antenatal period, such as the birthplace, birth attendant, transportation, health facility for complications, expenses, and birth materials, as well as family coordination to achieve such birth plans. In Tanzania, although increasing, only about half of all pregnant women attend an antenatal clinic more than four times [4]. Moreover, the information provided during antenatal care (ANC) is insufficient. In the resource-poor settings, antenatal group education is a potential approach because of the limited time for individual counseling at antenatal clinics.” 30
  • “This study aimed to evaluate an antenatal group education program among pregnant women and their families with respect to birth-preparedness and maternal and infant outcomes in rural villages of Tanzania.” 30
  • “ The study hypothesis was if Tanzanian pregnant women and their families received a family-oriented antenatal group education, they would (1) have a higher level of BPCR, (2) attend antenatal clinic four or more times, (3) give birth in a health facility, (4) have less complications of women at birth, and (5) have less complications and deaths of infants than those who did not receive the education .” 30

Research questions and hypotheses are crucial components to any type of research, whether quantitative or qualitative. These questions should be developed at the very beginning of the study. Excellent research questions lead to superior hypotheses, which, like a compass, set the direction of research, and can often determine the successful conduct of the study. Many research studies have floundered because the development of research questions and subsequent hypotheses was not given the thought and meticulous attention needed. The development of research questions and hypotheses is an iterative process based on extensive knowledge of the literature and insightful grasp of the knowledge gap. Focused, concise, and specific research questions provide a strong foundation for constructing hypotheses which serve as formal predictions about the research outcomes. Research questions and hypotheses are crucial elements of research that should not be overlooked. They should be carefully thought of and constructed when planning research. This avoids unethical studies and poor outcomes by defining well-founded objectives that determine the design, course, and outcome of the study.

Disclosure: The authors have no potential conflicts of interest to disclose.

Author Contributions:

  • Conceptualization: Barroga E, Matanguihan GJ.
  • Methodology: Barroga E, Matanguihan GJ.
  • Writing - original draft: Barroga E, Matanguihan GJ.
  • Writing - review & editing: Barroga E, Matanguihan GJ.

Jyväskylän yliopiston Koppa

  • Cause and Effect

Humanistic research may aim to determine and establish cause and effect relations or causality between phenomena. Indicating cause and effect requires an experimental research setting. Relationships between variables can also be studied by investigating the co-variation or correlation between the cause and the effect and by measuring the strength of the effect between variables. This does not constitute proof of causality.
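
A minimal sketch of what measuring co-variation looks like, with made-up data chosen to make the caveat concrete: the two series below are strongly correlated because both are driven by a third factor (season), yet neither causes the other.

```python
from statistics import mean

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two paired samples."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

# Hypothetical monthly figures: ice-cream sales and drowning incidents
# both rise in summer. The correlation is near 1, but the relationship
# is driven by a confounding variable (warm weather), not causation.
ice_cream = [20, 25, 40, 60, 80, 75]
drownings = [2, 3, 5, 8, 10, 9]

r = pearson_r(ice_cream, drownings)
print(f"r = {r:.2f}")
```

The coefficient quantifies the strength of the co-variation, which is exactly what correlation-based designs can deliver; ruling out the confounding variable is what requires an experimental setting.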

Research aiming to explore causal relationships between phenomena based on concrete observations and measurements can be defined as empirical research. You need a quantitative research strategy to show explicit cause-and-effect relationships. Qualitative research that explores causes and effects usually focuses on describing the connections between phenomena rather than on an analytical indication of causality.

Research into explicit cause and effect relationships can be based on several strategies, for example:

Longitudinal research enables you to explore causality over a long period of time.

Experimental research enables you to observe and indicate causality.

  • Data Collection

You can study cause and effect between phenomena through different types of data collected by a variety of methods. You can use either data collected for previous research by another researcher (existing concrete materials) or collect or produce your own data during the research process, for example through an experiment or a survey. Several research strategies are available:

  • Population research is suitable when the quantity of available data on a phenomenon is small.
  • Sampling is suitable when the quantity of available data on a phenomenon is too large for you to analyse all of it.
  • Random sampling enables you to select a small subset without bias.
  • Purposive sampling (goal-directed sampling) enables you to select samples that match the aim of the study.
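
An illustrative sketch of the two sampling approaches mentioned above, using a hypothetical list of survey respondents (all names and values are made up):

```python
import random

# Hypothetical population: 1000 survey respondents with an age attribute.
population = [{"id": i, "age": 20 + i % 50} for i in range(1000)]

# Simple random sampling: every unit has an equal chance of selection.
random.seed(42)  # fixed seed so the example is reproducible
random_sample = random.sample(population, k=30)

# Purposive (goal-directed) sampling: select only units that match the
# study aim -- here, respondents aged 60 or over.
purposive_sample = [p for p in population if p["age"] >= 60][:30]

print(len(random_sample), len(purposive_sample))
```

Random sampling avoids selection bias; purposive sampling trades that guarantee for samples tailored to the research question.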

  • Data Analysis

In the strict sense, only quantitative analysis is suitable for analysing cause and effect.

Philosophy in Science

Quantitative analysis methods are based on positivism, which stresses the production of knowledge through exact measurement and the use of numerical variables. Views emphasizing the exactness and correctness of measured knowledge are based on realism, which regards knowledge as objective.


