## Hypothesis Testing

## What is a Hypothesis?

A hypothesis is an educated guess about something in the world around you. It should be testable, either by experiment or observation. For example:

- A new medicine you think might work.
- A way of teaching you think might be better.
- A possible location of new species.
- A fairer way to administer standardized tests.

It can really be anything at all as long as you can put it to the test.

## What is a Hypothesis Statement?

If you are going to propose a hypothesis, it’s customary to write a statement. Your statement will look like this: “If I… (do this to an independent variable)… then (this will happen to the dependent variable).” For example:

- If I (decrease the amount of water given to herbs) then (the herbs will increase in size).
- If I (give patients counseling in addition to medication) then (their overall depression scale will decrease).
- If I (give exams at noon instead of 7) then (student test scores will improve).
- If I (look in this certain location) then (I am more likely to find new species).

A good hypothesis statement should:

- Include an “if” and “then” statement (according to the University of California).
- Include both the independent and dependent variables.
- Be testable by experiment, survey or other scientifically sound technique.
- Be based on information in prior research (either yours or someone else’s).
- Have design criteria (for engineering or programming projects).

## What is Hypothesis Testing?

Hypothesis testing can be one of the most confusing topics for students, mostly because before you can even perform a test, you have to know what your null hypothesis is. The word problems you are given can be difficult to decipher, but the process is easier than you might think. All you need to do is:

- Figure out your null hypothesis,
- State your null hypothesis,
- Choose what kind of test you need to perform,
- Either reject or fail to reject the null hypothesis.

## What is the Null Hypothesis?

Historically in science, the null hypothesis has usually been the currently accepted fact. Simple examples of null hypotheses that are generally accepted as being true are:

- DNA is shaped like a double helix.
- There are 8 planets in the solar system (excluding Pluto).
- Taking Vioxx can increase your risk of heart problems (a drug now taken off the market).

## How do I State the Null Hypothesis?

You won’t be required to actually perform a real experiment or survey in elementary statistics (or even disprove a fact like “Pluto is a planet”!), so you’ll be given word problems from real-life situations. You’ll need to figure out what your hypothesis is from the problem. This can be a little trickier than just figuring out what the accepted fact is. With word problems, you are looking to find a fact that is nullifiable (i.e. something you can reject).

## Hypothesis Testing Examples #1: Basic Example

A researcher thinks that if knee surgery patients go to physical therapy twice a week (instead of 3 times), their recovery period will be longer. The average recovery time for knee surgery patients is 8.2 weeks.

The hypothesis statement in this question is that the researcher believes the average recovery time is more than 8.2 weeks. It can be written in mathematical terms as: H1: μ > 8.2

Next, you’ll need to state the null hypothesis. That’s what will happen if the researcher is wrong. In the above example, if the researcher is wrong then the recovery time is less than or equal to 8.2 weeks. In math, that’s: H0: μ ≤ 8.2

## Rejecting the null hypothesis

Before 2006, the accepted view was that there were 9 planets in the solar system. Pluto was reclassified as a dwarf planet in 2006, and the null hypothesis of “Pluto is a planet” was replaced by “Pluto is not a planet.” Of course, rejecting the null hypothesis isn’t always that easy; the hard part is usually figuring out what your null hypothesis is in the first place.

## Hypothesis Testing Examples (One Sample Z Test)

The one sample z test isn’t used very often (because we rarely know the actual population standard deviation). However, it’s a good idea to understand how it works, as it’s one of the simplest tests you can perform in hypothesis testing. In English class you had to learn the basics (like grammar and spelling) before you could write a story; think of one sample z tests as the foundation for understanding more complex hypothesis testing. This page contains two hypothesis testing examples for one sample z-tests.

## One Sample Hypothesis Testing Example: One Tailed Z Test

A principal at a certain school claims that the students in his school are of above-average intelligence. A random sample of thirty students’ IQ scores has a mean of 112.5. Is there sufficient evidence to support the principal’s claim? The mean population IQ is 100 with a standard deviation of 15.

Step 1: State the null hypothesis. The accepted fact is that the population mean is 100, so: H0: μ = 100.

Step 2: State the alternate hypothesis. The claim is that the students have above average IQ scores, so: H1: μ > 100. The fact that we are looking for scores “greater than” a certain point means that this is a one-tailed test.

Step 3: State the alpha level. If you aren’t given an alpha level, use 5% (0.05).

Step 4: Find the rejection region area (given by your alpha level above) from the z-table. An area of .05 in the right tail corresponds to a z-score of 1.645.

Step 5: Find the test statistic using the formula z = (x̄ – μ0) / (σ / √n). For this set of data: z = (112.5 – 100) / (15 / √30) = 4.56.

Step 6: If Step 5 is greater than Step 4, reject the null hypothesis. If it’s less than Step 4, you cannot reject the null hypothesis. In this case, it is greater (4.56 > 1.645), so you can reject the null.
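The one-tailed example above can be checked numerically. The following is a minimal sketch using only the Python standard library; the function name `one_sample_z` is made up for illustration and is not part of any library.

```python
from math import sqrt
from statistics import NormalDist


def one_sample_z(sample_mean, mu0, sigma, n):
    """Return the z statistic for a one-sample z test."""
    return (sample_mean - mu0) / (sigma / sqrt(n))


alpha = 0.05
# Values from the principal's IQ example: sample mean 112.5, n = 30,
# population mean 100, population standard deviation 15.
z = one_sample_z(112.5, 100, 15, 30)

# Right-tailed critical value: the z-score with area alpha to its right.
z_crit = NormalDist().inv_cdf(1 - alpha)

reject_null = z > z_crit
print(f"z = {z:.2f}, critical value = {z_crit:.3f}, reject H0: {reject_null}")
```

Here `NormalDist().inv_cdf` plays the role of the z-table lookup.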

## One Sample Hypothesis Testing Example: Two Tailed Z Test

Blood glucose levels for obese patients have a mean of 100 with a standard deviation of 15. A researcher thinks that a diet high in raw cornstarch will have a positive or negative effect on blood glucose levels. A sample of 30 patients who have tried the raw cornstarch diet has a mean glucose level of 140. Test the hypothesis that the raw cornstarch had an effect.

- State the null hypothesis: H0: μ = 100
- State the alternate hypothesis: H1: μ ≠ 100
- State your alpha level. We’ll use 0.05 for this example. As this is a two-tailed test, split the alpha into two: 0.05/2 = 0.025.
- Find the z-score associated with your alpha level. You’re looking for the area in one tail only. The z-score for an area of 0.975 (1 - 0.025 = 0.975) is 1.96. As this is a two-tailed test, you also need to consider the left tail (z = -1.96).
- Find the test statistic: z = (140 – 100) / (15 / √30) = 14.61.
- If the test statistic is less than -1.96 or greater than 1.96, reject the null hypothesis. In this case, it is greater (14.61 > 1.96), so you can reject the null.
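As a rough check of the two-tailed example, the sketch below recomputes the test statistic and the critical value with the Python standard library. The variable names are illustrative only.

```python
from math import sqrt
from statistics import NormalDist

alpha = 0.05

# Values from the cornstarch example: sample mean 140, n = 30,
# population mean 100, population standard deviation 15.
z = (140 - 100) / (15 / sqrt(30))

# Two-tailed test: put alpha / 2 in each tail, so look up the
# z-score with cumulative area 1 - alpha / 2 = 0.975.
z_crit = NormalDist().inv_cdf(1 - alpha / 2)

# Reject when the statistic lands in either tail.
reject_null = abs(z) > z_crit
print(f"z = {z:.2f}, critical values = ±{z_crit:.2f}, reject H0: {reject_null}")
```

Comparing `abs(z)` with a single positive critical value covers both tails at once.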

This process is made much easier if you use a TI-83 or Excel to find the z-score for your alpha level (the “critical value”).

## Hypothesis Testing Examples: Mean (Using TI 83)

You can use the TI 83 calculator for hypothesis testing, but the calculator won’t figure out the null and alternate hypotheses; it’s up to you to read the question and enter them into the calculator.

Example problem: A sample of 200 people has a mean age of 21 with a population standard deviation (σ) of 5. Test the hypothesis that the population mean is 18.9 at α = 0.05.

Step 1: State the null hypothesis. In this case, the null hypothesis is that the population mean is 18.9, so we write: H0: μ = 18.9

Step 2: State the alternative hypothesis. We want to know if our sample, which has a mean of 21 instead of 18.9, really is different from the population, so our alternate hypothesis is: H1: μ ≠ 18.9

Step 3: Press Stat then press the right arrow twice to select TESTS.

Step 4: Press 1 to select 1:Z-Test…. Press ENTER.

Step 5: Use the right arrow to select Stats.

Step 6: Enter the data from the problem:

- μ0: 18.9
- σ: 5
- x̄: 21
- n: 200
- μ: ≠μ0

Step 7: Arrow down to Calculate and press ENTER. The calculator shows the p-value: p = 2.87 × 10⁻⁹

This is smaller than our alpha value of .05. That means we should reject the null hypothesis.
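If you don't have a TI 83 at hand, the same kind of two-sided p-value can be approximated in code. This is a minimal sketch, not the calculator's own algorithm; the function name `two_sided_p_value` is hypothetical, and `math.erfc` is used because it stays accurate far out in the tail.

```python
from math import erfc, sqrt


def two_sided_p_value(sample_mean, mu0, sigma, n):
    """Return (z, p) for a two-tailed one-sample z test."""
    z = (sample_mean - mu0) / (sigma / sqrt(n))
    # For a standard normal Z, P(|Z| >= |z|) equals erfc(|z| / sqrt(2)).
    p = erfc(abs(z) / sqrt(2))
    return z, p


# Values from the example problem: x̄ = 21, μ0 = 18.9, σ = 5, n = 200.
z, p = two_sided_p_value(21, 18.9, 5, 200)
alpha = 0.05
print(f"z = {z:.3f}, p = {p:.3g}, reject H0: {p < alpha}")
```

The result is on the order of 10⁻⁹, far below α = 0.05, which leads to the same decision as the calculator output.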

## Bayesian Hypothesis Testing: What is it?

Bayesian hypothesis testing helps to answer the question: Can the results from a test or survey be repeated? Why do we care if a test can be repeated? Let’s say twenty people in the same village came down with leukemia. A group of researchers finds that cell-phone towers are to blame. However, a second study finds that cell-phone towers had nothing to do with the cancer cluster in the village. In fact, it finds that the cancers were completely random. If that sounds impossible, it actually can happen! Clusters of cancer can happen simply by chance. There could be many reasons why the first study was faulty. One of the main reasons could be that the researchers just didn’t take into account that sometimes things happen randomly and we just don’t know why.

It’s good science to let people know if your study results are solid, or if they could have happened by chance. The usual way of doing this is to test your results with a p-value. A p-value is a number that you get by running a hypothesis test on your data. A p-value of 0.05 (5%) or less is conventionally treated as enough to call a result statistically significant, although it does not guarantee that the result will be repeated. However, there’s another way to test the validity of your results: Bayesian hypothesis testing. This type of testing gives you another way to test the strength of your results.

Traditional testing (the type you probably came across in elementary stats or AP stats) is called non-Bayesian, or frequentist. It is based on how often an outcome happens over repeated runs of the experiment. It’s an objective view of whether an experiment is repeatable. Bayesian hypothesis testing is a subjective view of the same thing. It takes into account how much faith you have in your results. In other words, would you wager money on the outcome of your experiment?

## Differences Between Traditional and Bayesian Hypothesis Testing.

Traditional (non-Bayesian) testing is framed in terms of what would happen if sampling were repeated over and over, while Bayesian testing is not. The main difference between the two is in the first step of testing: stating a probability model. In Bayesian testing you add prior knowledge to this step. It also requires use of a posterior probability, which is the conditional probability assigned to a hypothesis or event after all the evidence is considered.
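To make the roles of the prior and the posterior concrete, here is a small sketch of a Bayesian update. Everything in it is a made-up illustration rather than part of the article: two candidate hypotheses about a coin ("fair" with P(heads) = 0.5, "biased" with P(heads) = 0.7), equal prior belief in each, and an assumed observation of 7 heads in 10 flips.

```python
from math import comb


def binomial_likelihood(heads, flips, p_heads):
    """Probability of seeing `heads` heads in `flips` flips."""
    return comb(flips, heads) * p_heads**heads * (1 - p_heads) ** (flips - heads)


# Prior knowledge: how much belief each hypothesis gets before seeing data.
prior = {"fair": 0.5, "biased": 0.5}
p_heads = {"fair": 0.5, "biased": 0.7}

heads, flips = 7, 10  # hypothetical observed data

# Bayes' theorem: posterior is proportional to prior times likelihood.
unnormalized = {
    h: prior[h] * binomial_likelihood(heads, flips, p_heads[h]) for h in prior
}
evidence = sum(unnormalized.values())
posterior = {h: value / evidence for h, value in unnormalized.items()}

for h, prob in posterior.items():
    print(f"P({h} | data) = {prob:.3f}")
```

Changing the values in `prior` shows how the conclusion depends on what you believed before collecting the data, which is the subjective element described above.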

## Arguments for Bayesian Testing.

Many researchers think that it is a better alternative to traditional testing, because it:

- Includes prior knowledge about the data.
- Takes into account personal beliefs about the results.

## Arguments against.

Critics of Bayesian testing argue that:

- Including prior data or knowledge isn’t always justifiable.
- It is difficult to calculate compared to non-Bayesian testing.

Lesson 10 of 24 By Avijeet Biswal

In today’s data-driven world, decisions are based on data all the time. Hypotheses play a crucial role in that process, whether in business decisions, the health sector, academia, or quality improvement. Without hypotheses and hypothesis tests, you risk drawing the wrong conclusions and making bad decisions. In this tutorial, you will look at hypothesis testing in statistics.

## What Is Hypothesis Testing in Statistics?

Hypothesis testing is a type of statistical analysis in which you put your assumptions about a population parameter to the test. It can also be used to assess the relationship between two statistical variables.

Let's discuss a few examples of statistical hypotheses from real life:

- A teacher assumes that 60% of his college's students come from lower-middle-class families.
- A doctor believes that 3D (Diet, Dose, and Discipline) is 90% effective for diabetic patients.

Now that you know what hypothesis testing is, look at the formula for the z test statistic and how the test works.

## Hypothesis Testing Formula

Z = (x̅ – μ0) / (σ / √n)

- Here, x̅ is the sample mean,
- μ0 is the hypothesized population mean,
- σ is the population standard deviation,
- n is the sample size.

## How Does Hypothesis Testing Work?

An analyst performs hypothesis testing on a statistical sample to assess the plausibility of the null hypothesis. Measurements and analyses are conducted on a random sample of the population to test two hypotheses: the null and the alternative.

The null hypothesis is typically an equality hypothesis between population parameters; for example, a null hypothesis may claim that the population mean return equals zero. The alternate hypothesis is essentially the opposite of the null hypothesis (e.g., the population mean return is not equal to zero). As a result, they are mutually exclusive: only one can be correct, and one of the two will always be correct.

## Null Hypothesis and Alternate Hypothesis

The null hypothesis is the default assumption that there is no effect or that the event will not occur. It is retained unless the data provide enough evidence to reject it.

H0 is the symbol for it, and it is pronounced H-naught.

The alternate hypothesis is the logical opposite of the null hypothesis. It is accepted only when the null hypothesis is rejected. H1 is the symbol for it.

Let's understand this with an example.

A sanitizer manufacturer claims that its product kills 95 percent of germs on average.

To put this company's claim to the test, create a null and alternate hypothesis.

H0 (Null Hypothesis): The average is 95%.

H1 (Alternative Hypothesis): The average is less than 95%.

Another straightforward example to understand this concept is determining whether or not a coin is fair. The null hypothesis states that the probability of heads is equal to the probability of tails. In contrast, the alternate hypothesis states that the probabilities of heads and tails are not equal.
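The coin example can be turned into an actual test. The sketch below runs an exact two-sided binomial test with the Python standard library; the observed count (60 heads in 100 flips) is a made-up illustration, and `exact_binomial_p_value` is a hypothetical helper name.

```python
from math import comb


def exact_binomial_p_value(heads, flips, p_null=0.5):
    """Two-sided exact binomial p-value under H0: P(heads) = p_null.

    Sums the probabilities of every outcome that is no more likely
    than the observed one.
    """
    probs = [
        comb(flips, k) * p_null**k * (1 - p_null) ** (flips - k)
        for k in range(flips + 1)
    ]
    observed = probs[heads]
    # The small relative tolerance guards against floating-point ties.
    return sum(p for p in probs if p <= observed * (1 + 1e-9))


alpha = 0.05
p_value = exact_binomial_p_value(60, 100)  # hypothetical data
print(f"p-value = {p_value:.4f}, reject H0 (fair coin): {p_value <= alpha}")
```

With these made-up counts the p-value lands just above 0.05, so the data would not be enough to reject the hypothesis that the coin is fair at the 5% level.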

## Hypothesis Testing Calculation With Examples

Let's consider a hypothesis test for the average height of women in the United States. Suppose our null hypothesis is that the average height is 5'4" (64 inches). We gather a sample of 100 women and determine that their average height is 5'5" (65 inches). The population standard deviation is 2 inches.

To calculate the z-score, we would use the following formula:

z = (x̅ – μ0) / (σ / √n)

z = (65 – 64) / (2 / √100)

z = 1 / 0.2 = 5

We will reject the null hypothesis, as a z-score of 5 is very large, and conclude that there is evidence to suggest that the average height of women in the US is greater than 5'4".
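A quick way to check the arithmetic in this example is to convert both heights to inches and evaluate the formula step by step. The sketch below does that; the variable names are illustrative.

```python
from math import sqrt

# Convert feet and inches to inches: 5'5" = 65 in, 5'4" = 64 in.
sample_mean_in = 5 * 12 + 5
null_mean_in = 5 * 12 + 4
sigma_in = 2  # population standard deviation, in inches
n = 100       # sample size

# Standard error of the mean: sigma / sqrt(n).
standard_error = sigma_in / sqrt(n)

# z = (sample mean - hypothesized mean) / standard error.
z = (sample_mean_in - null_mean_in) / standard_error
print(f"standard error = {standard_error}, z = {z}")
```

Keeping every quantity in the same unit (inches) before dividing is what makes the numerator and the standard error comparable.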

## Steps of Hypothesis Testing

Hypothesis testing is a statistical method to determine if there is enough evidence in a sample of data to infer that a certain condition is true for the entire population. Here’s a breakdown of the typical steps involved in hypothesis testing:

## Formulate Hypotheses

- Null Hypothesis (H0): This hypothesis states that there is no effect or difference, and it is the hypothesis you attempt to reject with your test.
- Alternative Hypothesis (H1 or Ha): This hypothesis is what you might believe to be true or hope to support with evidence. It is usually considered the opposite of the null hypothesis.

## Choose the Significance Level (α)

The significance level, often denoted by alpha (α), is the probability of rejecting the null hypothesis when it is true. Common choices for α are 0.05 (5%), 0.01 (1%), and 0.10 (10%).

## Select the Appropriate Test

Choose a statistical test based on the type of data and the hypothesis. Common tests include t-tests, chi-square tests, ANOVA, and regression analysis. The selection depends on data type, distribution, sample size, and whether the hypothesis is one-tailed or two-tailed.

## Collect Data

Gather the data that will be analyzed in the test. This data should be representative of the population to infer conclusions accurately.

## Calculate the Test Statistic

Based on the collected data and the chosen test, calculate a test statistic that reflects how much the observed data deviates from the null hypothesis.

## Determine the p-value

The p-value is the probability of observing test results at least as extreme as the results observed, assuming the null hypothesis is correct. It helps determine the strength of the evidence against the null hypothesis.

## Make a Decision

Compare the p-value to the chosen significance level:

- If the p-value ≤ α: Reject the null hypothesis, suggesting sufficient evidence in the data supports the alternative hypothesis.
- If the p-value > α: Do not reject the null hypothesis, suggesting insufficient evidence to support the alternative hypothesis.

## Report the Results

Present the findings from the hypothesis test, including the test statistic, p-value, and the conclusion about the hypotheses.

## Perform Post-hoc Analysis (if necessary)

Depending on the results and the study design, further analysis may be needed to explore the data more deeply or to address multiple comparisons if several hypotheses were tested simultaneously.

## Types of Hypothesis Testing

## Z Test

A z-test is used to determine whether a discovery or relationship is statistically significant. It usually checks whether two means are the same (the null hypothesis). A z-test can be applied only when the population standard deviation is known and the sample size is 30 data points or more.

## T Test

A t-test is a statistical test used to compare the means of two groups. It is frequently used in hypothesis testing to determine whether two groups differ or whether a procedure or treatment affects the population of interest.

## Chi-Square

You utilize a Chi-square test for hypothesis testing concerning whether your data is as predicted. To determine if the expected and observed results are well-fitted, the Chi-square test analyzes the differences between categorical variables from a random sample. The test's fundamental premise is that the observed values in your data should be compared to the predicted values that would be present if the null hypothesis were true.
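The comparison of observed and expected values can be sketched in a few lines. The counts below are invented for illustration, and the critical value 5.991 is the standard chi-square table entry for 2 degrees of freedom at α = 0.05; the helper name `chi_square_statistic` is hypothetical.

```python
def chi_square_statistic(observed, expected):
    """Sum of (observed - expected)^2 / expected over all categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Hypothetical counts for three categories (100 observations in total).
observed = [50, 30, 20]
expected = [40, 40, 20]  # counts predicted if the null hypothesis were true

chi2 = chi_square_statistic(observed, expected)

# Degrees of freedom = number of categories - 1 = 2.
CRITICAL_DF2_ALPHA_05 = 5.991  # from a chi-square table
reject_null = chi2 > CRITICAL_DF2_ALPHA_05
print(f"chi-square = {chi2}, reject H0: {reject_null}")
```

In this made-up case the statistic stays below the critical value, so the observed counts are not far enough from the expected ones to reject the null hypothesis.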

## Hypothesis Testing and Confidence Intervals

Both confidence intervals and hypothesis tests are inferential techniques that depend on an approximation of the sampling distribution. Confidence intervals use data from a sample to estimate a population parameter. Hypothesis tests use data from a sample to examine a given hypothesis, so we must have a hypothesized parameter value to conduct one.

Bootstrap distributions and randomization distributions are created using comparable simulation techniques. The observed sample statistic is the focal point of a bootstrap distribution, whereas the null hypothesis value is the focal point of a randomization distribution.

A confidence interval contains a range of plausible estimates of the population parameter. In this lesson, we created just two-tailed confidence intervals. There is a direct connection between two-tailed confidence intervals and two-tailed hypothesis tests: they typically lead to the same conclusion. In other words, a hypothesis test at the 0.05 level will virtually always fail to reject the null hypothesis if the 95% confidence interval contains the hypothesized value, and it will nearly certainly reject the null hypothesis if the 95% confidence interval does not contain the hypothesized value.
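The link between a 95% confidence interval and a two-tailed test at the 0.05 level can be seen in a small sketch for the z-based case with a known population standard deviation. The sample values (mean 21, σ = 5, n = 200) and the candidate null values are chosen for illustration, and both function names are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def z_interval(sample_mean, sigma, n, confidence=0.95):
    """Two-sided z confidence interval for a population mean."""
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z_star * sigma / sqrt(n)
    return sample_mean - margin, sample_mean + margin


def two_tailed_z_rejects(sample_mean, mu0, sigma, n, alpha=0.05):
    """True if a two-tailed z test rejects H0: mu = mu0."""
    z = (sample_mean - mu0) / (sigma / sqrt(n))
    return abs(z) > NormalDist().inv_cdf(1 - alpha / 2)


low, high = z_interval(21, 5, 200)
print(f"95% CI: ({low:.3f}, {high:.3f})")

results = []
for mu0 in (18.9, 20.5, 21.5):
    inside = low <= mu0 <= high
    rejects = two_tailed_z_rejects(21, mu0, 5, 200)
    results.append((mu0, inside, rejects))
    print(f"mu0 = {mu0}: inside CI = {inside}, test rejects = {rejects}")
```

For each hypothesized value, the test rejects exactly when the value falls outside the interval, because both calculations use the same standard error and the same critical z-score.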

## Simple and Composite Hypothesis Testing

Depending on how precisely it specifies the population parameter, a statistical hypothesis can be classified into two types.

Simple Hypothesis: A simple hypothesis specifies an exact value for the parameter.

Composite Hypothesis: A composite hypothesis specifies a range of values.

A company is claiming that their average sales for this quarter are 1000 units. This is an example of a simple hypothesis.

Suppose the company claims that the sales are in the range of 900 to 1000 units. Then this is a case of a composite hypothesis.

## One-Tailed and Two-Tailed Hypothesis Testing

The one-tailed test, also called a directional test, considers a critical region on one side of the distribution. If the test statistic falls into it, the null hypothesis is rejected in favor of the alternate hypothesis.

In a one-tailed test, the critical region is one-sided, meaning the test checks whether the sample is either greater than or less than a specific value, but not both.

In a two-tailed test, the critical region is two-sided: the test checks whether the sample is either greater than or less than a range of values.

If the test statistic falls outside this range, the null hypothesis is rejected and the alternate hypothesis is accepted.

## Right Tailed Hypothesis Testing

If the greater than (>) sign appears in your alternate hypothesis, you are using a right-tailed test, also known as an upper-tailed test. To put it another way, the critical region is in the right tail. For instance, you can compare battery life before and after a change in production. If you want to know whether the battery life is longer than the original (let's say 90 hours), your hypothesis statements can be the following:

- The null hypothesis is no change or a decrease (H0: μ ≤ 90).
- The alternate hypothesis is that battery life has risen (H1: μ > 90).

The crucial point in this situation is that the alternate hypothesis (H1), not the null hypothesis, decides whether you get a right-tailed test.

## Left Tailed Hypothesis Testing

Alternative hypotheses that assert the true value of a parameter is lower than the value in the null hypothesis are tested with a left-tailed test; they are indicated by the less than sign "<".

Suppose H0: mean = 50 and H1: mean not equal to 50

According to H1, the mean can be greater than or less than 50. This is an example of a two-tailed test.

In a similar manner, if H0: mean >= 50, then H1: mean < 50

Here H1 states that the mean is less than 50. This is a one-tailed (specifically, left-tailed) test.
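The direction of the alternate hypothesis determines which tail area counts as the p-value. The sketch below shows this for a z statistic; the function name `p_value` and the tail labels `"left"`, `"right"`, and `"two"` are illustrative choices, not a standard API.

```python
from statistics import NormalDist


def p_value(z, tail):
    """p-value of a z statistic for a left-, right-, or two-tailed test."""
    std_normal = NormalDist()
    if tail == "left":   # H1 uses "<": area to the left of z
        return std_normal.cdf(z)
    if tail == "right":  # H1 uses ">": area to the right of z
        return 1 - std_normal.cdf(z)
    if tail == "two":    # H1 uses "not equal": area in both tails
        return 2 * (1 - std_normal.cdf(abs(z)))
    raise ValueError("tail must be 'left', 'right', or 'two'")


for z, tail in [(-1.645, "left"), (1.645, "right"), (1.96, "two")]:
    print(f"z = {z:+.3f}, {tail}-tailed p = {p_value(z, tail):.3f}")
```

Each of the three example statistics sits right at the 0.05 boundary for its own kind of test, which is why 1.645 and 1.96 are the familiar one-tailed and two-tailed critical values.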

## Type 1 and Type 2 Error

A hypothesis test can result in two types of errors.

Type 1 Error: A Type I error occurs when the sample results lead you to reject the null hypothesis even though it is true.

Type 2 Error: A Type II error occurs when the null hypothesis is not rejected even though it is false.

Suppose a teacher evaluates the examination paper to decide whether a student passes or fails.

H0: Student has passed

H1: Student has failed

Type I error will be the teacher failing the student [rejects H0] although the student scored the passing marks [H0 was true].

Type II error will be the case where the teacher passes the student [do not reject H0] although the student did not score the passing marks [H1 is true].

## Level of Significance

The alpha value is a criterion for determining whether a test statistic is statistically significant. In a statistical test, Alpha represents an acceptable probability of a Type I error. Because alpha is a probability, it can be anywhere between 0 and 1. In practice, the most commonly used alpha values are 0.01, 0.05, and 0.1, which represent a 1%, 5%, and 10% chance of a Type I error, respectively (i.e. rejecting the null hypothesis when it is in fact correct).
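The statement that alpha is the probability of a Type I error can be illustrated with a simulation. The sketch below repeatedly draws samples from a population where the null hypothesis really is true and counts how often a two-tailed z test rejects it anyway. The population values (mean 100, standard deviation 15), sample size, number of trials, and seed are all arbitrary assumptions, and `simulated_type_i_rate` is a hypothetical helper.

```python
import random
from math import sqrt
from statistics import NormalDist, fmean


def simulated_type_i_rate(alpha=0.05, mu0=100, sigma=15, n=30,
                          trials=4000, seed=0):
    """Fraction of trials in which a true null hypothesis is rejected."""
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    rejections = 0
    for _ in range(trials):
        # The sample comes from a population whose mean really is mu0,
        # so every rejection here is a Type I error.
        sample = [rng.gauss(mu0, sigma) for _ in range(n)]
        z = (fmean(sample) - mu0) / (sigma / sqrt(n))
        if abs(z) > z_crit:
            rejections += 1
    return rejections / trials


rate = simulated_type_i_rate()
print(f"simulated Type I error rate: {rate:.3f} (alpha = 0.05)")
```

The simulated rate should land close to the chosen alpha, with some run-to-run variation that shrinks as the number of trials grows.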

A p-value is the probability of observing a difference at least as large as the one in your sample if the null hypothesis were true. As the p-value decreases, the statistical significance of the observed difference increases. If the p-value is below your chosen alpha, you reject the null hypothesis.

Suppose you are trying to test whether a new advertising campaign has changed the product's sales. The null hypothesis states that there is no change in sales due to the new advertising campaign. The p-value is the probability of seeing a change in sales at least as large as the one observed if that null hypothesis were true; it is not the probability that the null hypothesis is true. If the p-value is 0.30, then a change this large would show up about 30% of the time even if the campaign had no effect. If the p-value is 0.03, it would show up only about 3% of the time. As you can see, the lower the p-value, the stronger the evidence against the null hypothesis and in favor of the alternate hypothesis that the new advertising campaign causes an increase or decrease in sales.

## Why Is Hypothesis Testing Important in Research Methodology?

Hypothesis testing is crucial in research methodology for several reasons:

- Provides evidence-based conclusions: It allows researchers to make objective conclusions based on empirical data, providing evidence to support or refute their research hypotheses.
- Supports decision-making: It helps make informed decisions, such as accepting or rejecting a new treatment, implementing policy changes, or adopting new practices.
- Adds rigor and validity: It adds scientific rigor to research using statistical methods to analyze data, ensuring that conclusions are based on sound statistical evidence.
- Contributes to the advancement of knowledge: By testing hypotheses, researchers contribute to the growth of knowledge in their respective fields by confirming existing theories or discovering new patterns and relationships.

## When Did Hypothesis Testing Begin?

Hypothesis testing as a formalized process began in the early 20th century, primarily through the work of statisticians such as Ronald A. Fisher, Jerzy Neyman, and Egon Pearson. The development of hypothesis testing is closely tied to the evolution of statistical methods during this period.

- Ronald A. Fisher (1920s): Fisher was one of the key figures in developing the foundation for modern statistical science. In his book "Statistical Methods for Research Workers" (1925), he developed significance testing to examine the likelihood of observing the collected data if the null hypothesis were true, and he popularized p-values as a measure of the significance of the observed results. He later coined the term "null hypothesis" in "The Design of Experiments" (1935).
- Neyman-Pearson Framework (1930s): Jerzy Neyman and Egon Pearson built on Fisher’s work and formalized the process of hypothesis testing even further. In the 1930s, they introduced the concepts of Type I and Type II errors and developed a decision-making framework widely used in hypothesis testing today. Their approach emphasized the balance between these errors and introduced the concepts of the power of a test and the alternative hypothesis.

The dialogue between Fisher's and Neyman-Pearson's approaches shaped the methods and philosophy of statistical hypothesis testing used today. Fisher emphasized the evidential interpretation of the p-value. At the same time, Neyman and Pearson advocated for a decision-theoretical approach in which hypotheses are either accepted or rejected based on pre-determined significance levels and power considerations.

The application and methodology of hypothesis testing have since become a cornerstone of statistical analysis across various scientific disciplines, marking a significant statistical development.

## Limitations of Hypothesis Testing

Hypothesis testing has some limitations that researchers should be aware of:

- It cannot prove or establish the truth: Hypothesis testing provides evidence to support or reject a hypothesis, but it cannot confirm the absolute truth of the research question.
- Results are sample-specific: Hypothesis testing is based on analyzing a sample from a population, and the conclusions drawn are specific to that particular sample.
- Possible errors: During hypothesis testing, there is a chance of committing type I error (rejecting a true null hypothesis) or type II error (failing to reject a false null hypothesis).
- Assumptions and requirements: Different tests have specific assumptions and requirements that must be met to accurately interpret results.

## Learn All The Tricks Of The BI Trade

After reading this tutorial, you would have a much better understanding of hypothesis testing, one of the most important concepts in the field of Data Science . The majority of hypotheses are based on speculation about observed behavior, natural phenomena, or established theories.


## 1. What is hypothesis testing in statistics with example?

Hypothesis testing is a statistical method used to determine whether sample data provide enough evidence to draw conclusions about a population. It involves formulating two competing hypotheses, the null hypothesis (H0) and the alternative hypothesis (Ha), and then collecting data to assess the evidence. An example: testing if a new drug improves patient recovery (Ha) compared to the standard treatment (H0) based on collected patient data.

## 2. What is H0 and H1 in statistics?

In statistics, H0 and H1 represent the null and alternative hypotheses. The null hypothesis, H0, is the default assumption that no effect or difference exists between groups or conditions. The alternative hypothesis, H1, is the competing claim suggesting an effect or a difference. Statistical tests determine whether to reject the null hypothesis in favor of the alternative hypothesis based on the data.

## 3. What is a simple hypothesis with an example?

A simple hypothesis is a specific statement predicting a single relationship between two variables. It posits a direct and uncomplicated outcome. For example, a simple hypothesis might state, "Increased sunlight exposure increases the growth rate of sunflowers." Here, the hypothesis suggests a direct relationship between the amount of sunlight (independent variable) and the growth rate of sunflowers (dependent variable), with no additional variables considered.

## 4. What are the 2 types of hypothesis testing?

- One-tailed (or one-sided) test: Tests for the significance of an effect in only one direction, either positive or negative.
- Two-tailed (or two-sided) test: Tests for the significance of an effect in both directions, allowing for the possibility of a positive or negative effect.

The choice between one-tailed and two-tailed tests depends on the specific research question and the directionality of the expected effect.
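As a minimal sketch of the difference, the snippet below converts one z statistic into one-tailed and two-tailed p-values using the standard normal distribution. The function name `p_values` and the test statistic 1.8 are illustrative assumptions, not values from this tutorial.

```python
from statistics import NormalDist

def p_values(z):
    """Return (upper-tail, lower-tail, two-tailed) p-values for a z statistic."""
    std_normal = NormalDist()            # mean 0, standard deviation 1
    upper = 1.0 - std_normal.cdf(z)      # alternative: effect is positive
    lower = std_normal.cdf(z)            # alternative: effect is negative
    two_sided = 2.0 * min(upper, lower)  # alternative: effect in either direction
    return upper, lower, two_sided

# 1.8 is a made-up test statistic used only for illustration.
upper, lower, two_sided = p_values(1.8)
print(f"upper-tail p = {upper:.4f}, two-tailed p = {two_sided:.4f}")
```

With this made-up statistic, the one-tailed p-value falls below 0.05 while the two-tailed p-value does not, which is why the direction of the test should be chosen before looking at the data.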

## 5. What are the 3 major types of hypothesis?

The three major types of hypotheses are:

- Null Hypothesis (H0): Represents the default assumption, stating that there is no significant effect or relationship in the data.
- Alternative Hypothesis (Ha): Contradicts the null hypothesis and proposes a specific effect or relationship that researchers want to investigate.
- Nondirectional Hypothesis: An alternative hypothesis that doesn't specify the direction of the effect, leaving it open for both positive and negative possibilities.


## About the Author

Avijeet is a Senior Research Analyst at Simplilearn. Passionate about Data Analytics, Machine Learning, and Deep Learning, Avijeet is also interested in politics, cricket, and football.


## 10.1 - Setting the Hypotheses: Examples

A significance test examines whether the null hypothesis provides a plausible explanation of the data. The null hypothesis itself does not involve the data. It is a statement about a parameter (a numerical characteristic of the population). These population values might be proportions or means or differences between means or proportions or correlations or odds ratios or any other numerical summary of the population. The alternative hypothesis is typically the research hypothesis of interest. Here are some examples.

## Example 10.2: Hypotheses with One Sample of One Categorical Variable

About 10% of the human population is left-handed. Suppose a researcher at Penn State speculates that students in the College of Arts and Architecture are more likely to be left-handed than people found in the general population. We only have one sample since we will be comparing a population proportion based on a sample value to a known population value.

- Research Question : Are artists more likely to be left-handed than people found in the general population?
- Response Variable : Classification of the student as either right-handed or left-handed

## State Null and Alternative Hypotheses

- Null Hypothesis : Students in the College of Arts and Architecture are no more likely to be left-handed than people in the general population (population percent of left-handed students in the College of Arts and Architecture = 10% or p = .10).
- Alternative Hypothesis : Students in the College of Arts and Architecture are more likely to be left-handed than people in the general population (population percent of left-handed students in the College of Arts and Architecture > 10% or p > .10). This is a one-sided alternative hypothesis.
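Once data are collected, hypotheses like these could be checked with a one-sample z test for a proportion. The sketch below assumes a hypothetical sample of 200 students, 28 of whom are left-handed. These counts and the function name `one_proportion_z_test` are invented for illustration and do not come from the example.

```python
from math import sqrt
from statistics import NormalDist

def one_proportion_z_test(successes, n, p0):
    """Upper-tailed z test of H0: p = p0 against Ha: p > p0."""
    p_hat = successes / n
    # The standard error is computed under the null hypothesis (p = p0).
    standard_error = sqrt(p0 * (1 - p0) / n)
    z = (p_hat - p0) / standard_error
    p_value = 1 - NormalDist().cdf(z)  # area in the upper tail
    return z, p_value

# Hypothetical data: 28 left-handed students in a sample of 200.
z, p_value = one_proportion_z_test(successes=28, n=200, p0=0.10)
print(f"z = {z:.3f}, one-sided p-value = {p_value:.4f}")
```

The normal approximation used here is usually considered reasonable when both n·p0 and n·(1 − p0) are at least about 10.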

## Example 10.3: Hypotheses with One Sample of One Measurement Variable

A generic brand of the anti-histamine Diphenhydramine markets a capsule with a 50 milligram dose. The manufacturer is worried that the machine that fills the capsules has come out of calibration and is no longer creating capsules with the appropriate dosage.

- Research Question : Does the data suggest that the population mean dosage of this brand is different than 50 mg?
- Response Variable : dosage of the active ingredient found by a chemical assay.
- Null Hypothesis : On the average, the dosage sold under this brand is 50 mg (population mean dosage = 50 mg).
- Alternative Hypothesis : On the average, the dosage sold under this brand is not 50 mg (population mean dosage ≠ 50 mg). This is a two-sided alternative hypothesis.
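A two-sided test of a single mean is commonly carried out with a one-sample t statistic. The sketch below uses ten invented assay values and compares the statistic with the tabulated two-sided 5% critical value for 9 degrees of freedom (2.262). The data and the function name `one_sample_t_statistic` are assumptions for illustration only.

```python
from math import sqrt
from statistics import mean, stdev

def one_sample_t_statistic(sample, mu0):
    """t statistic for H0: population mean = mu0."""
    n = len(sample)
    standard_error = stdev(sample) / sqrt(n)  # stdev uses the n - 1 denominator
    return (mean(sample) - mu0) / standard_error

# Hypothetical assay results (mg) for ten capsules.
dosages = [50.2, 49.8, 50.5, 49.6, 50.1, 50.4, 49.9, 50.3, 49.7, 50.0]
t = one_sample_t_statistic(dosages, mu0=50.0)

# Tabulated critical value of Student's t for a two-sided test,
# alpha = 0.05, with n - 1 = 9 degrees of freedom.
T_CRITICAL = 2.262
reject_null = abs(t) > T_CRITICAL
print(f"t = {t:.3f}, reject H0: {reject_null}")
```

Because the alternative is two-sided, the absolute value of t is compared with the critical value, so a mean that is too high or too low can both lead to rejection.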

## Example 10.4: Hypotheses with Two Samples of One Categorical Variable

Many people are starting to prefer vegetarian meals on a regular basis. Specifically, a researcher believes that females are more likely than males to eat vegetarian meals on a regular basis.

- Research Question : Does the data suggest that females are more likely than males to eat vegetarian meals on a regular basis?
- Response Variable : Classification of whether or not a person eats vegetarian meals on a regular basis
- Explanatory (Grouping) Variable: Sex
- Null Hypothesis : There is no sex effect regarding those who eat vegetarian meals on a regular basis (population percent of females who eat vegetarian meals on a regular basis = population percent of males who eat vegetarian meals on a regular basis or p females = p males ).
- Alternative Hypothesis : Females are more likely than males to eat vegetarian meals on a regular basis (population percent of females who eat vegetarian meals on a regular basis > population percent of males who eat vegetarian meals on a regular basis or p females > p males ). This is a one-sided alternative hypothesis.
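Two population proportions are often compared with a pooled two-proportion z test. The following sketch assumes hypothetical survey counts (45 of 300 females and 30 of 300 males eat vegetarian meals regularly). The counts and the function name `two_proportion_z_test` are illustrative assumptions.

```python
from math import sqrt
from statistics import NormalDist

def two_proportion_z_test(x1, n1, x2, n2):
    """Upper-tailed z test of H0: p1 = p2 against Ha: p1 > p2."""
    p1, p2 = x1 / n1, x2 / n2
    # Under H0 both groups share one proportion, so the counts are pooled.
    pooled = (x1 + x2) / (n1 + n2)
    standard_error = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / standard_error
    p_value = 1 - NormalDist().cdf(z)
    return z, p_value

# Hypothetical counts: group 1 = females, group 2 = males.
z, p_value = two_proportion_z_test(45, 300, 30, 300)
print(f"z = {z:.3f}, one-sided p-value = {p_value:.4f}")
```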

## Example 10.5: Hypotheses with Two Samples of One Measurement Variable

Obesity is a major health problem today. Research is starting to show that people may be able to lose more weight on a low carbohydrate diet than on a low fat diet.

- Research Question : Does the data suggest that, on the average, people are able to lose more weight on a low carbohydrate diet than on a low fat diet?
- Response Variable : Weight loss (pounds)
- Explanatory (Grouping) Variable : Type of diet
- Null Hypothesis : There is no difference in the mean amount of weight loss when comparing a low carbohydrate diet with a low fat diet (population mean weight loss on a low carbohydrate diet = population mean weight loss on a low fat diet).
- Alternative Hypothesis : The mean weight loss should be greater for those on a low carbohydrate diet when compared with those on a low fat diet (population mean weight loss on a low carbohydrate diet > population mean weight loss on a low fat diet). This is a one-sided alternative hypothesis.
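One way to test a one-sided difference in means without distribution tables is a permutation test. This is a substitute technique chosen for the sketch, not the method the example prescribes, which would more typically be a two-sample t test. The weight-loss values and the function name `permutation_test_greater` are invented for illustration.

```python
import random
from statistics import mean

def permutation_test_greater(group_a, group_b, n_resamples=10_000, seed=0):
    """One-sided permutation test of H0: equal means vs Ha: mean(a) > mean(b)."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    observed = mean(group_a) - mean(group_b)
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    at_least_as_extreme = 0
    for _ in range(n_resamples):
        # Under H0 the group labels are exchangeable, so shuffle them.
        rng.shuffle(pooled)
        diff = mean(pooled[:n_a]) - mean(pooled[n_a:])
        if diff >= observed:
            at_least_as_extreme += 1
    return observed, at_least_as_extreme / n_resamples

# Hypothetical weight loss in pounds.
low_carb = [12, 9, 14, 8, 11, 15, 10, 13]
low_fat = [7, 10, 6, 9, 8, 11, 5, 9]
observed, p_value = permutation_test_greater(low_carb, low_fat)
print(f"observed difference = {observed:.3f} lb, p-value = {p_value:.4f}")
```

The p-value is the share of shuffled data sets whose difference in means is at least as large as the observed one.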

## Example 10.6: Hypotheses about the relationship between Two Categorical Variables

- Research Question : Do the odds of having a stroke increase if you inhale second-hand smoke? In a case-control study, non-smoking stroke patients and controls of the same age and occupation are asked if someone in their household smokes.
- Variables : There are two different categorical variables (Stroke patient vs control and whether the subject lives in the same household as a smoker). Living with a smoker (or not) is the natural explanatory variable and having a stroke (or not) is the natural response variable in this situation.
- Null Hypothesis : There is no relationship between whether or not a person has a stroke and whether or not a person lives with a smoker (odds ratio between stroke and second-hand smoke situation is = 1).
- Alternative Hypothesis : There is a relationship between whether or not a person has a stroke and whether or not a person lives with a smoker (odds ratio between stroke and second-hand smoke situation is > 1). This is a one-tailed alternative.

This research question might also be addressed like Example 10.4 by making the hypotheses about comparing the proportion of stroke patients that live with smokers to the proportion of controls that live with smokers.
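A common way to test whether an odds ratio exceeds 1 is a large-sample z test on the log odds ratio. The sketch below assumes an invented 2x2 table (60 of 200 stroke patients and 40 of 200 controls live with a smoker). The counts and the function name `odds_ratio_test` are hypothetical.

```python
from math import log, sqrt
from statistics import NormalDist

def odds_ratio_test(a, b, c, d):
    """Upper-tailed test of H0: odds ratio = 1 against Ha: odds ratio > 1.

    a: cases exposed,    b: cases unexposed,
    c: controls exposed, d: controls unexposed.
    """
    odds_ratio = (a * d) / (b * c)
    # Large-sample standard error of the log odds ratio.
    se_log_or = sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    z = log(odds_ratio) / se_log_or
    p_value = 1 - NormalDist().cdf(z)
    return odds_ratio, z, p_value

# Hypothetical case-control counts.
odds_ratio, z, p_value = odds_ratio_test(a=60, b=140, c=40, d=160)
print(f"odds ratio = {odds_ratio:.3f}, z = {z:.3f}, p-value = {p_value:.4f}")
```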

## Example 10.7: Hypotheses about the relationship between Two Measurement Variables

- Research Question : A financial analyst believes there might be a positive association between the change in a stock's price and the amount of the stock purchased by non-management employees the previous day (stock trading by management being under "insider-trading" regulatory restrictions).
- Variables : Daily price change information (the response variable) and previous day stock purchases by non-management employees (explanatory variable). These are two different measurement variables.
- Null Hypothesis : The correlation between the daily stock price change (\$) and the daily stock purchases by non-management employees (\$) = 0.
- Alternative Hypothesis : The correlation between the daily stock price change (\$) and the daily stock purchases by non-management employees (\$) > 0. This is a one-sided alternative hypothesis.
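A correlation hypothesis of this kind is usually tested by converting the sample correlation r into a t statistic with n − 2 degrees of freedom. The sketch below uses ten invented trading days and the tabulated one-sided 5% critical value for 8 degrees of freedom (1.860). All data and function names are illustrative assumptions.

```python
from math import sqrt
from statistics import mean

def pearson_r(x, y):
    """Sample Pearson correlation coefficient."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def correlation_t_statistic(r, n):
    """t statistic for H0: correlation = 0, with n - 2 degrees of freedom."""
    return r * sqrt(n - 2) / sqrt(1 - r ** 2)

# Hypothetical data: previous-day employee purchases (thousand $)
# and the next day's price change ($).
purchases = [10, 20, 30, 40, 50, 60, 70, 80, 90, 100]
price_change = [-0.5, 0.1, -0.2, 0.4, 0.3, 0.2, 0.8, 0.5, 0.9, 0.7]

r = pearson_r(purchases, price_change)
t = correlation_t_statistic(r, len(purchases))

# Tabulated one-sided critical value, alpha = 0.05, 8 degrees of freedom.
T_CRITICAL = 1.860
print(f"r = {r:.3f}, t = {t:.2f}, reject H0: {t > T_CRITICAL}")
```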

## Example 10.8: Hypotheses about comparing the relationship between Two Measurement Variables in Two Samples

- Research Question : Is there a linear relationship between the amount of the bill (\$) at a restaurant and the tip (\$) that was left? Is the strength of this association different for family restaurants than for fine dining restaurants?
- Variables : There are two different measurement variables. The size of the tip would depend on the size of the bill so the amount of the bill would be the explanatory variable and the size of the tip would be the response variable.
- Null Hypothesis : The correlation between the amount of the bill (\$) at a restaurant and the tip (\$) that was left is the same at family restaurants as it is at fine dining restaurants.
- Alternative Hypothesis : The correlation between the amount of the bill (\$) at a restaurant and the tip (\$) that was left is different at family restaurants than it is at fine dining restaurants. This is a two-sided alternative hypothesis.
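Two independent correlations are often compared using Fisher's z transformation. The sketch below assumes invented sample correlations (0.80 from 50 family restaurants and 0.60 from 60 fine dining restaurants). The numbers and the function name `compare_correlations` are hypothetical.

```python
from math import atanh, sqrt
from statistics import NormalDist

def compare_correlations(r1, n1, r2, n2):
    """Two-sided test of H0: rho1 = rho2 using Fisher's z transformation."""
    # atanh(r) is approximately normal with variance 1 / (n - 3).
    z1, z2 = atanh(r1), atanh(r2)
    standard_error = sqrt(1 / (n1 - 3) + 1 / (n2 - 3))
    z = (z1 - z2) / standard_error
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value

# Hypothetical sample correlations and sample sizes.
z, p_value = compare_correlations(r1=0.80, n1=50, r2=0.60, n2=60)
print(f"z = {z:.3f}, two-sided p-value = {p_value:.4f}")
```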


## An Introduction to Statistics: Understanding Hypothesis Testing and Statistical Errors

Priya Ranganathan

1 Department of Anesthesiology, Critical Care and Pain, Tata Memorial Hospital, Mumbai, Maharashtra, India

2 Department of Surgical Oncology, Tata Memorial Centre, Mumbai, Maharashtra, India

The second article in this series on biostatistics covers the concepts of sample, population, research hypotheses and statistical errors.

## How to cite this article

Ranganathan P, Pramesh CS. An Introduction to Statistics: Understanding Hypothesis Testing and Statistical Errors. Indian J Crit Care Med 2019;23(Suppl 3):S230–S231.

Two papers quoted in this issue of the Indian Journal of Critical Care Medicine report the results of studies that aim to prove that a new intervention is better than (superior to) an existing treatment. In the ABLE study, the investigators wanted to show that transfusion of fresh red blood cells would be superior to standard-issue red cells in reducing 90-day mortality in ICU patients. 1 The PROPPR study was designed to prove that transfusion of a lower ratio of plasma and platelets to red cells would be superior to a higher ratio in decreasing 24-hour and 30-day mortality in critically ill patients. 2 These studies are known as superiority studies (as opposed to noninferiority or equivalence studies which will be discussed in a subsequent article).

## SAMPLE VERSUS POPULATION

A sample represents a group of participants selected from the entire population. Since studies cannot be carried out on entire populations, researchers choose samples, which are representative of the population. This is similar to walking into a grocery store and examining a few grains of rice or wheat before purchasing an entire bag; we assume that the few grains that we select (the sample) are representative of the entire sack of grains (the population).

The results of the study are then extrapolated to generate inferences about the population. We do this using a process known as hypothesis testing. This means that the results of the study may not always be identical to the results we would expect to find in the population; i.e., there is the possibility that the study results may be erroneous.

## HYPOTHESIS TESTING

A clinical trial begins with an assumption or belief, and then proceeds to either prove or disprove this assumption. In statistical terms, this belief or assumption is known as a hypothesis. Counterintuitively, what the researcher believes in (or is trying to prove) is called the “alternate” hypothesis, and the opposite is called the “null” hypothesis; every study has a null hypothesis and an alternate hypothesis. For superiority studies, the alternate hypothesis states that one treatment (usually the new or experimental treatment) is superior to the other; the null hypothesis states that there is no difference between the treatments (the treatments are equal). For example, in the ABLE study, we start by stating the null hypothesis—there is no difference in mortality between groups receiving fresh RBCs and standard-issue RBCs. We then state the alternate hypothesis—There is a difference between groups receiving fresh RBCs and standard-issue RBCs. It is important to note that we have stated that the groups are different, without specifying which group will be better than the other. This is known as a two-tailed hypothesis and it allows us to test for superiority on either side (using a two-sided test). This is because, when we start a study, we are not 100% certain that the new treatment can only be better than the standard treatment—it could be worse, and if it is so, the study should pick it up as well. One tailed hypothesis and one-sided statistical testing is done for non-inferiority studies, which will be discussed in a subsequent paper in this series.

## STATISTICAL ERRORS

There are two possibilities to consider when interpreting the results of a superiority study. The first possibility is that there is truly no difference between the treatments but the study finds that they are different. This is called a Type-1 error or false-positive error or alpha error. This means falsely rejecting the null hypothesis.

The second possibility is that there is a difference between the treatments and the study does not pick up this difference. This is called a Type 2 error or false-negative error or beta error. This means falsely accepting the null hypothesis.

The power of the study is the ability to detect a difference between groups and is the converse of the beta error; i.e., power = 1-beta error. Alpha and beta errors are finalized when the protocol is written and form the basis for sample size calculation for the study. In an ideal world, we would not like any error in the results of our study; however, we would need to do the study in the entire population (infinite sample size) to be able to get a 0% alpha and beta error. These two errors enable us to do studies with realistic sample sizes, with the compromise that there is a small possibility that the results may not always reflect the truth. The basis for this will be discussed in a subsequent paper in this series dealing with sample size calculation.

Conventionally, type 1 or alpha error is set at 5%. This means that if there is truly no difference between groups, we allow only a 5% probability that the study will find a difference by chance (false positive). Type 2 or beta error is usually set between 10% and 20%; therefore, the power of the study is 90% or 80%. This means that if there is truly a difference between groups, there is an 80% (or 90%) probability that the study will detect that difference. For example, in the ABLE study, sample size was calculated with a type 1 error of 5% (two-sided) and power of 90% (type 2 error of 10%) (1).
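To show how alpha and power feed into a sample size calculation, here is a minimal sketch using a common normal-approximation formula for comparing two proportions. The mortality rates (25% versus 20%) are invented and are not the ABLE study's figures. The function name `sample_size_two_proportions` is also an assumption.

```python
from math import ceil
from statistics import NormalDist

def sample_size_two_proportions(p1, p2, alpha=0.05, power=0.90):
    """Approximate sample size per group for a two-sided test of p1 vs p2."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided type 1 error
    z_beta = NormalDist().inv_cdf(power)           # power = 1 - type 2 error
    variance_sum = p1 * (1 - p1) + p2 * (1 - p2)
    n = (z_alpha + z_beta) ** 2 * variance_sum / (p1 - p2) ** 2
    return ceil(n)

# Hypothetical mortality of 25% with standard care vs 20% with the new treatment.
n90 = sample_size_two_proportions(0.25, 0.20, power=0.90)
n80 = sample_size_two_proportions(0.25, 0.20, power=0.80)
print(f"per-group sample size: {n90} (90% power), {n80} (80% power)")
```

Accepting a larger beta error (lower power) reduces the required sample size, which is the compromise the paragraph above describes.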

Table 1 gives a summary of the two types of statistical errors with an example

Table 1: Statistical errors

(a) Types of statistical errors

| Null hypothesis is actually | Study finds null hypothesis is true | Study finds null hypothesis is false |
|---|---|---|
| True | Correct results! | Falsely rejecting null hypothesis - Type I error |
| False | Falsely accepting null hypothesis - Type II error | Correct results! |

(b) Possible statistical errors in the ABLE trial

| Truth | Study finds there is no difference in mortality between groups receiving fresh RBCs and standard-issue RBCs | Study finds there is a difference in mortality between groups receiving fresh RBCs and standard-issue RBCs |
|---|---|---|
| There is no difference in mortality between groups receiving fresh RBCs and standard-issue RBCs | Correct results! | Falsely rejecting null hypothesis - Type I error |
| There is a difference in mortality between groups receiving fresh RBCs and standard-issue RBCs | Falsely accepting null hypothesis - Type II error | Correct results! |

In the next article in this series, we will look at the meaning and interpretation of ‘ p ’ value and confidence intervals for hypothesis testing.

Source of support: Nil

Conflict of interest: None


Hypothesis testing is a formal way of checking if a hypothesis about a population is true or not.

## Hypothesis Testing

A hypothesis is a claim about a population parameter .

A hypothesis test is a formal procedure to check if a hypothesis is true or not.

Examples of claims that can be checked:

- The average height of people in Denmark is more than 170 cm.
- The share of left-handed people in Australia is not 10%.
- The average income of dentists is less than the average income of lawyers.

## The Null and Alternative Hypothesis

Hypothesis testing is based on making two different claims about a population parameter.

The null hypothesis (\(H_{0} \)) and the alternative hypothesis (\(H_{1}\)) are the claims.

The two claims need to be mutually exclusive, meaning only one of them can be true.

The alternative hypothesis is typically what we are trying to prove.

For example, we want to check the following claim:

"The average height of people in Denmark is more than 170 cm."

In this case, the parameter is the average height of people in Denmark (\(\mu\)).

The null and alternative hypothesis would be:

Null hypothesis : The average height of people in Denmark is 170 cm.

Alternative hypothesis : The average height of people in Denmark is more than 170 cm.

The claims are often expressed with symbols like this:

\(H_{0}\): \(\mu = 170 \: cm \)

\(H_{1}\): \(\mu > 170 \: cm \)

If the data supports the alternative hypothesis, we reject the null hypothesis and accept the alternative hypothesis.

If the data does not support the alternative hypothesis, we keep the null hypothesis.

Note: The alternative hypothesis is also referred to as (\(H_{A} \)).

## The Significance Level

The significance level (\(\alpha\)) is the uncertainty we accept when rejecting the null hypothesis in the hypothesis test.

The significance level is a percentage probability of accidentally making the wrong conclusion.

Typical significance levels are:

- \(\alpha = 0.1\) (10%)
- \(\alpha = 0.05\) (5%)
- \(\alpha = 0.01\) (1%)

A lower significance level means that the evidence in the data needs to be stronger to reject the null hypothesis.

There is no "correct" significance level - it only states the uncertainty of the conclusion.

Note: A 5% significance level means that when the null hypothesis is actually true:

We expect to wrongly reject it 5 out of 100 times.
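The meaning of the significance level can be checked with a small simulation. The sketch below repeatedly draws samples from a population where the null hypothesis is true and counts how often a two-sided z test at \(\alpha = 0.05\) rejects it. The sample size, number of trials, seed and function name are arbitrary assumptions.

```python
import random
from statistics import NormalDist, mean

def false_positive_rate(alpha=0.05, n=30, trials=2000, seed=1):
    """Share of samples that reject a TRUE null hypothesis (mu = 0)."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    critical_value = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided test
    rejections = 0
    for _ in range(trials):
        # The population really has mean 0 and standard deviation 1.
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
        z = mean(sample) / (1.0 / n ** 0.5)  # z statistic with known sigma = 1
        if abs(z) > critical_value:
            rejections += 1
    return rejections / trials

rate = false_positive_rate()
print(f"share of true null hypotheses rejected: {rate:.3f}")
```

The printed share should land close to 0.05, with some random variation from run to run if the seed is changed.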


## The Test Statistic

The test statistic is used to decide the outcome of the hypothesis test.

The test statistic is a standardized value calculated from the sample.

Standardization means converting a statistic to a well known probability distribution .

The type of probability distribution depends on the type of test.

Common examples are:

- Standard Normal Distribution (Z): used for Testing Population Proportions
- Student's T-Distribution (T): used for Testing Population Means

Note: You will learn how to calculate the test statistic for each type of test in the following chapters.

## The Critical Value and P-Value Approach

There are two main approaches used for hypothesis tests:

- The critical value approach compares the test statistic with the critical value of the significance level.
- The p-value approach compares the p-value of the test statistic with the significance level.

## The Critical Value Approach

The critical value approach checks if the test statistic is in the rejection region .

The rejection region is an area of probability in the tails of the distribution.

The size of the rejection region is decided by the significance level (\(\alpha\)).

The value that separates the rejection region from the rest is called the critical value .

If the test statistic is inside this rejection region, the null hypothesis is rejected .

For example, if the test statistic is 2.3 and the critical value is 2 for a significance level (\(\alpha = 0.05\)):

We reject the null hypothesis (\(H_{0} \)) at 0.05 significance level (\(\alpha\))

## The P-Value Approach

The p-value approach checks if the p-value of the test statistic is smaller than the significance level (\(\alpha\)).

The p-value of the test statistic is the area of probability in the tails of the distribution from the value of the test statistic.

If the p-value is smaller than the significance level, the null hypothesis is rejected .

The p-value directly tells us the lowest significance level where we can reject the null hypothesis.

For example, if the p-value is 0.03:

We reject the null hypothesis (\(H_{0} \)) at a 0.05 significance level (\(\alpha\))

We keep the null hypothesis (\(H_{0}\)) at a 0.01 significance level (\(\alpha\))

Note: The two approaches are only different in how they present the conclusion.
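As a minimal sketch of why the two approaches agree, the snippet below applies both to the same upper-tailed z test. It assumes the test statistic follows the standard normal distribution, so the critical values it computes (about 1.645 at 5%) differ from the rounded value of 2 used in the example above. The function name `decide_upper_tailed` is illustrative.

```python
from statistics import NormalDist

def decide_upper_tailed(z, alpha):
    """Apply both approaches to an upper-tailed z test."""
    std_normal = NormalDist()
    critical_value = std_normal.inv_cdf(1 - alpha)  # start of the rejection region
    p_value = 1 - std_normal.cdf(z)                 # tail area beyond the statistic
    reject_by_critical_value = z > critical_value
    reject_by_p_value = p_value < alpha
    return critical_value, p_value, reject_by_critical_value, reject_by_p_value

for alpha in (0.05, 0.01):
    crit, p, by_crit, by_p = decide_upper_tailed(2.3, alpha)
    print(f"alpha={alpha}: critical value={crit:.3f}, p-value={p:.4f}, "
          f"reject by critical value={by_crit}, reject by p-value={by_p}")
```

For a test statistic of 2.3, both approaches reject the null hypothesis at the 0.05 level and both keep it at the 0.01 level.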

## Steps for a Hypothesis Test

The following steps are used for a hypothesis test:

- Check the conditions
- Define the claims
- Decide the significance level
- Calculate the test statistic

One condition is that the sample is randomly selected from the population.

The other conditions depend on what type of parameter you are testing the hypothesis for.

Common parameters to test hypotheses are:

- Proportions (for qualitative data)
- Mean values (for numerical data)

You will learn the steps for both types in the following pages.

## Null & Alternative Hypotheses | Definitions, Templates & Examples

Published on May 6, 2022 by Shaun Turney. Revised on June 22, 2023.

The null and alternative hypotheses are two competing claims that researchers weigh evidence for and against using a statistical test :

- Null hypothesis ( H 0 ): There’s no effect in the population .
- Alternative hypothesis ( H a or H 1 ) : There’s an effect in the population.

## Table of contents

- Answering your research question with hypotheses
- What is a null hypothesis?
- What is an alternative hypothesis?
- Similarities and differences between null and alternative hypotheses
- How to write null and alternative hypotheses
- Other interesting articles
- Frequently asked questions

## Answering your research question with hypotheses

The null and alternative hypotheses offer competing answers to your research question. When the research question asks “Does the independent variable affect the dependent variable?”:

- The null hypothesis ( H 0 ) answers “No, there’s no effect in the population.”
- The alternative hypothesis ( H a ) answers “Yes, there is an effect in the population.”

The null and alternative are always claims about the population. That’s because the goal of hypothesis testing is to make inferences about a population based on a sample . Often, we infer whether there’s an effect in the population by looking at differences between groups or relationships between variables in the sample. It’s critical for your research to write strong hypotheses .

You can use a statistical test to decide whether the evidence favors the null or alternative hypothesis. Each type of statistical test comes with a specific way of phrasing the null and alternative hypothesis. However, the hypotheses can also be phrased in a general way that applies to any test.


## What is a null hypothesis?

The null hypothesis is the claim that there’s no effect in the population.

If the sample provides enough evidence against the claim that there’s no effect in the population ( p ≤ α), then we can reject the null hypothesis . Otherwise, we fail to reject the null hypothesis.

Although “fail to reject” may sound awkward, it’s the only wording that statisticians accept . Be careful not to say you “prove” or “accept” the null hypothesis.

Null hypotheses often include phrases such as “no effect,” “no difference,” or “no relationship.” When written in mathematical terms, they always include an equality (usually =, but sometimes ≥ or ≤).

You can never know with complete certainty whether there is an effect in the population. Some percentage of the time, your inference about the population will be incorrect. When you incorrectly reject the null hypothesis, it’s called a type I error . When you incorrectly fail to reject it, it’s a type II error.

## Examples of null hypotheses

The table below gives examples of research questions and null hypotheses. There’s always more than one way to answer a research question, but these null hypotheses can help you get started.

| Research question | Null hypothesis (H0) | Test-specific null hypothesis (H0) |
|---|---|---|
| Does tooth flossing affect the number of cavities? | Tooth flossing has no effect on the number of cavities. | Two-sample t test: The mean number of cavities per person does not differ between the flossing group (µ1) and the non-flossing group (µ2) in the population; µ1 = µ2. |
| Does the amount of text highlighted in the textbook affect exam scores? | The amount of text highlighted in the textbook has no effect on exam scores. | Linear regression: There is no relationship between the amount of text highlighted and exam scores in the population; β = 0. |
| Does daily meditation decrease the incidence of depression? | Daily meditation does not decrease the incidence of depression.* | Two-proportions z test: The proportion of people with depression in the daily-meditation group (p1) is greater than or equal to the no-meditation group (p2) in the population; p1 ≥ p2. |

*Note that some researchers prefer to always write the null hypothesis in terms of “no effect” and “=”. It would be fine to say that daily meditation has no effect on the incidence of depression and p 1 = p 2 .

## What is an alternative hypothesis?

The alternative hypothesis (Ha) is the other answer to your research question. It claims that there’s an effect in the population.

Often, your alternative hypothesis is the same as your research hypothesis. In other words, it’s the claim that you expect or hope will be true.

The alternative hypothesis is the complement to the null hypothesis. Null and alternative hypotheses are exhaustive, meaning that together they cover every possible outcome. They are also mutually exclusive, meaning that only one can be true at a time.

Alternative hypotheses often include phrases such as “an effect,” “a difference,” or “a relationship.” When alternative hypotheses are written in mathematical terms, they always include an inequality (usually ≠, but sometimes < or >). As with null hypotheses, there are many acceptable ways to phrase an alternative hypothesis.

## Examples of alternative hypotheses

The table below gives examples of research questions and alternative hypotheses to help you get started with formulating your own.

| Research question | Alternative hypothesis (Ha) | Test-specific alternative hypothesis (Ha) |
|---|---|---|
| Does tooth flossing affect the number of cavities? | Tooth flossing has an effect on the number of cavities. | Two-sample t test: The mean number of cavities per person differs between the flossing group (µ1) and the non-flossing group (µ2) in the population; µ1 ≠ µ2. |
| Does the amount of text highlighted in a textbook affect exam scores? | The amount of text highlighted in the textbook has an effect on exam scores. | Linear regression: There is a relationship between the amount of text highlighted and exam scores in the population; β ≠ 0. |
| Does daily meditation decrease the incidence of depression? | Daily meditation decreases the incidence of depression. | Two-proportions z test: The proportion of people with depression in the daily-meditation group (p1) is less than the no-meditation group (p2) in the population; p1 < p2. |

## Similarities and differences between null and alternative hypotheses

Null and alternative hypotheses are similar in some ways:

- They’re both answers to the research question.
- They both make claims about the population.
- They’re both evaluated by statistical tests.

However, there are important differences between the two types of hypotheses, summarized in the following table.

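| | Null hypothesis (H0) | Alternative hypothesis (Ha) |
|---|---|---|
| Definition | A claim that there is no effect in the population. | A claim that there is an effect in the population. |
| Symbols used | Equality symbol (=, ≥, or ≤) | Inequality symbol (≠, <, or >) |
| When p ≤ α | Rejected | Supported |
| When p > α | Failed to reject | Not supported |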

To help you write your hypotheses, you can use the template sentences below. If you know which statistical test you’re going to use, you can use the test-specific template sentences. Otherwise, you can use the general template sentences.

## General template sentences

The only things you need to know to use these general template sentences are your dependent and independent variables. To write your research question, null hypothesis, and alternative hypothesis, fill in the following sentences with your variables:

Does independent variable affect dependent variable?

- Null hypothesis (H₀): Independent variable does not affect dependent variable.
- Alternative hypothesis (Hₐ): Independent variable affects dependent variable.
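As a minimal sketch, the general templates above can be filled in programmatically. The function name `general_hypotheses` and the example variables are hypothetical and chosen only for illustration.

```python
def general_hypotheses(independent: str, dependent: str) -> dict:
    """Fill the general template sentences with the two variable names."""
    if not independent or not dependent:
        raise ValueError("both variable names must be non-empty")
    # Capitalize only the first letter so the rest of the phrase is untouched.
    subject = independent[0].upper() + independent[1:]
    return {
        "research_question": f"Does {independent} affect {dependent}?",
        "null": f"H0: {subject} does not affect {dependent}.",
        "alternative": f"Ha: {subject} affects {dependent}.",
    }


# Illustrative variables only.
for sentence in general_hypotheses("tooth flossing", "the number of cavities").values():
    print(sentence)
```

The output is only as good as the variable names you pass in; the templates do not check that the hypothesis is testable.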

## Test-specific template sentences

Once you know the statistical test you’ll be using, you can write your hypotheses in a more precise and mathematical way specific to the test you chose. The table below provides template sentences for common statistical tests.

| Statistical test | Null hypothesis (H₀) | Alternative hypothesis (Hₐ) |
|---|---|---|
| t test with two groups | The mean dependent variable does not differ between group 1 (µ₁) and group 2 (µ₂) in the population; µ₁ = µ₂. | The mean dependent variable differs between group 1 (µ₁) and group 2 (µ₂) in the population; µ₁ ≠ µ₂. |
| ANOVA with three groups | The mean dependent variable does not differ between group 1 (µ₁), group 2 (µ₂), and group 3 (µ₃) in the population; µ₁ = µ₂ = µ₃. | The mean dependent variables of group 1 (µ₁), group 2 (µ₂), and group 3 (µ₃) are not all equal in the population. |
| Correlation | There is no correlation between independent variable and dependent variable in the population; ρ = 0. | There is a correlation between independent variable and dependent variable in the population; ρ ≠ 0. |
| Regression | There is no relationship between independent variable and dependent variable in the population; β = 0. | There is a relationship between independent variable and dependent variable in the population; β ≠ 0. |
| Two-proportions test | The dependent variable expressed as a proportion does not differ between group 1 (p₁) and group 2 (p₂) in the population; p₁ = p₂. | The dependent variable expressed as a proportion differs between group 1 (p₁) and group 2 (p₂) in the population; p₁ ≠ p₂. |

Note: The template sentences above assume that you’re performing two-tailed tests, which is why the alternative hypotheses use ≠. Two-tailed tests are appropriate for most studies.


Hypothesis testing is a formal procedure for investigating our ideas about the world using statistics. It is used by scientists to test specific predictions, called hypotheses, by calculating how likely it is that a pattern or relationship between variables could have arisen by chance.

Null and alternative hypotheses are used in statistical hypothesis testing. The null hypothesis of a test always predicts no effect or no relationship between variables, while the alternative hypothesis states your research prediction of an effect or relationship.

The null hypothesis is often abbreviated as H₀. When the null hypothesis is written using mathematical symbols, it always includes an equality symbol (usually =, but sometimes ≥ or ≤).

The alternative hypothesis is often abbreviated as Hₐ or H₁. When the alternative hypothesis is written using mathematical symbols, it always includes an inequality symbol (usually ≠, but sometimes < or >).

A research hypothesis is your proposed answer to your research question. The research hypothesis usually includes an explanation (“x affects y because …”).

A statistical hypothesis, on the other hand, is a mathematical statement about a population parameter. Statistical hypotheses always come in pairs: the null and alternative hypotheses. In a well-designed study, the statistical hypotheses correspond logically to the research hypothesis.

## Cite this Scribbr article


Turney, S. (2023, June 22). Null & Alternative Hypotheses | Definitions, Templates & Examples. Scribbr. Retrieved June 24, 2024, from https://www.scribbr.com/statistics/null-and-alternative-hypotheses/


## How to Write a Great Hypothesis

Hypothesis Definition, Format, Examples, and Tips

Kendra Cherry, MS, is a psychosocial rehabilitation specialist, psychology educator, and author of the "Everything Psychology Book."

Amy Morin, LCSW, is a psychotherapist and international bestselling author. Her books, including "13 Things Mentally Strong People Don't Do," have been translated into more than 40 languages. Her TEDx talk, "The Secret of Becoming Mentally Strong," is one of the most viewed talks of all time.


A hypothesis is a tentative statement about the relationship between two or more variables. It is a specific, testable prediction about what you expect to happen in a study. It is a preliminary answer to your question that helps guide the research process.

Consider a study designed to examine the relationship between sleep deprivation and test performance. The hypothesis might be: "This study is designed to assess the hypothesis that sleep-deprived people will perform worse on a test than individuals who are not sleep-deprived."

## At a Glance

A hypothesis is crucial to scientific research because it offers a clear direction for what the researchers are looking to find. This allows them to design experiments to test their predictions and add to our scientific knowledge about the world. This article explores how a hypothesis is used in psychology research, how to write a good hypothesis, and the different types of hypotheses you might use.

## The Hypothesis in the Scientific Method

In the scientific method, whether it involves research in psychology, biology, or some other area, a hypothesis represents what the researchers think will happen in an experiment. The scientific method involves the following steps:

- Forming a question
- Performing background research
- Creating a hypothesis
- Designing an experiment
- Collecting data
- Analyzing the results
- Drawing conclusions
- Communicating the results

The hypothesis is a prediction, but it involves more than a guess. Most of the time, the hypothesis begins with a question which is then explored through background research. At this point, researchers then begin to develop a testable hypothesis.

Unless you are creating an exploratory study, your hypothesis should always explain what you expect to happen.

In a study exploring the effects of a particular drug, the hypothesis might be that researchers expect the drug to have some type of effect on the symptoms of a specific illness. In psychology, the hypothesis might focus on how a certain aspect of the environment might influence a particular behavior.

Remember, a hypothesis does not have to be correct. While the hypothesis predicts what the researchers expect to see, the goal of the research is to determine whether this guess is right or wrong. When conducting an experiment, researchers might explore numerous factors to determine which ones might contribute to the ultimate outcome.

In many cases, researchers may find that the results of an experiment do not support the original hypothesis. When writing up these results, the researchers might suggest other options that should be explored in future studies.

In many cases, researchers might draw a hypothesis from a specific theory or build on previous research. For example, prior research has shown that stress can impact the immune system. So a researcher might hypothesize: "People with high-stress levels will be more likely to contract a common cold after being exposed to the virus than people who have low-stress levels."

In other instances, researchers might look at commonly held beliefs or folk wisdom. "Birds of a feather flock together" is one example of a folk adage that a psychologist might try to investigate. The researcher might pose a specific hypothesis that "People tend to select romantic partners who are similar to them in interests and educational level."

## Elements of a Good Hypothesis

So how do you write a good hypothesis? When trying to come up with a hypothesis for your research or experiments, ask yourself the following questions:

- Is your hypothesis based on your research on a topic?
- Can your hypothesis be tested?
- Does your hypothesis include independent and dependent variables?

Before you come up with a specific hypothesis, spend some time doing background research. Once you have completed a literature review, start thinking about potential questions you still have. Pay attention to the discussion section in the journal articles you read. Many authors will suggest questions that still need to be explored.

## How to Formulate a Good Hypothesis

To form a hypothesis, you should take these steps:

- Collect as many observations about a topic or problem as you can.
- Evaluate these observations and look for possible causes of the problem.
- Create a list of possible explanations that you might want to explore.
- After you have developed some possible hypotheses, think of ways that you could confirm or disprove each hypothesis through experimentation. This is known as falsifiability.

In the scientific method, falsifiability is an important part of any valid hypothesis. In order to test a claim scientifically, it must be possible that the claim could be proven false.

Students sometimes assume that a falsifiable claim is a false claim, which is not the case. Falsifiability means that if the claim were false, it would be possible to demonstrate that it is false.

One of the hallmarks of pseudoscience is that it makes claims that cannot be refuted or proven false.

## The Importance of Operational Definitions

A variable is a factor or element that can be changed and manipulated in ways that are observable and measurable. However, the researcher must also define how the variable will be manipulated and measured in the study.

Operational definitions are specific definitions for all relevant factors in a study. This process helps make vague or ambiguous concepts detailed and measurable.

For example, a researcher might operationally define the variable "test anxiety" as the results of a self-report measure of anxiety experienced during an exam. A "study habits" variable might be defined by the amount of studying that actually occurs as measured by time.

These precise descriptions are important because many things can be measured in various ways. Clearly defining these variables and how they are measured helps ensure that other researchers can replicate your results.

## Replicability

One of the basic principles of any type of scientific research is that the results must be replicable.

Replication means repeating an experiment in the same way to produce the same results. By clearly detailing the specifics of how the variables were measured and manipulated, other researchers can better understand the results and repeat the study if needed.

Some variables are more difficult than others to define. For example, how would you operationally define a variable such as aggression? For obvious ethical reasons, researchers cannot create a situation in which a person behaves aggressively toward others.

To measure this variable, the researcher must devise a measurement that assesses aggressive behavior without harming others. The researcher might utilize a simulated task to measure aggressiveness in this situation.

## Hypothesis Checklist

- Does your hypothesis focus on something that you can actually test?
- Does your hypothesis include both an independent and dependent variable?
- Can you manipulate the variables?
- Can your hypothesis be tested without violating ethical standards?

The hypothesis you use will depend on what you are investigating and hoping to find. Some of the main types of hypotheses that you might use include:

- Simple hypothesis: This type of hypothesis suggests there is a relationship between one independent variable and one dependent variable.
- Complex hypothesis: This type suggests a relationship between three or more variables, such as two independent variables and one dependent variable.
- Null hypothesis: This hypothesis suggests no relationship exists between two or more variables.
- Alternative hypothesis: This hypothesis states the opposite of the null hypothesis.
- Statistical hypothesis: This hypothesis uses statistical analysis to evaluate a representative population sample and then generalizes the findings to the larger group.
- Logical hypothesis: This hypothesis assumes a relationship between variables without collecting data or evidence.

A hypothesis often follows a basic format of "If {this happens} then {this will happen}." One way to structure your hypothesis is to describe what will happen to the dependent variable if you change the independent variable.

The basic format might be: "If {these changes are made to a certain independent variable}, then we will observe {a change in a specific dependent variable}."

## A few examples of simple hypotheses:

- "Students who eat breakfast will perform better on a math exam than students who do not eat breakfast."
- "Students who experience test anxiety before an English exam will get lower scores than students who do not experience test anxiety."
- "Motorists who talk on the phone while driving will be more likely to make errors on a driving course than those who do not talk on the phone."
- "Children who receive a new reading intervention will have higher reading scores than students who do not receive the intervention."

## Examples of a complex hypothesis include:

- "People with high-sugar diets and sedentary activity levels are more likely to develop depression."
- "Younger people who are regularly exposed to green, outdoor areas have better subjective well-being than older adults who have limited exposure to green spaces."

## Examples of a null hypothesis include:

- "There is no difference in anxiety levels between people who take St. John's wort supplements and those who do not."
- "There is no difference in scores on a memory recall task between children and adults."
- "There is no difference in aggression levels between children who play first-person shooter games and those who do not."

## Examples of an alternative hypothesis:

- "People who take St. John's wort supplements will have less anxiety than those who do not."
- "Adults will perform better on a memory task than children."
- "Children who play first-person shooter games will show higher levels of aggression than children who do not."

## Collecting Data on Your Hypothesis

Once a researcher has formed a testable hypothesis, the next step is to select a research design and start collecting data. The research method depends largely on exactly what they are studying. There are two basic types of research methods: descriptive research and experimental research.

## Descriptive Research Methods

Descriptive research methods such as case studies, naturalistic observations, and surveys are often used when conducting an experiment is difficult or impossible. These methods are best used to describe different aspects of a behavior or psychological phenomenon.

Once a researcher has collected data using descriptive methods, a correlational study can examine how the variables are related. This research method might be used to investigate a hypothesis that is difficult to test experimentally.

## Experimental Research Methods

Experimental methods are used to demonstrate causal relationships between variables. In an experiment, the researcher systematically manipulates a variable of interest (known as the independent variable) and measures the effect on another variable (known as the dependent variable).

Unlike correlational studies, which can only be used to determine if there is a relationship between two variables, experimental methods can be used to determine the actual nature of the relationship—whether changes in one variable actually cause another to change.

The hypothesis is a critical part of any scientific exploration. It represents what researchers expect to find in a study or experiment. In situations where the hypothesis is unsupported by the research, the research still has value. Such research helps us better understand how different aspects of the natural world relate to one another. It also helps us develop new hypotheses that can then be tested in the future.



Statistics By Jim

Making statistics intuitive

## Null Hypothesis: Definition, Rejecting & Examples

By Jim Frost

## What is a Null Hypothesis?

The null hypothesis in statistics states that there is no difference between groups or no relationship between variables. It is one of two mutually exclusive hypotheses about a population in a hypothesis test.

- Null Hypothesis H₀: No effect exists in the population.
- Alternative Hypothesis Hₐ: The effect exists in the population.

In every study or experiment, researchers assess an effect or relationship. This effect can be the effectiveness of a new drug, building material, or other intervention that has benefits. There is a benefit or connection that the researchers hope to identify. Unfortunately, no effect may exist. In statistics, we call this lack of an effect the null hypothesis. Researchers assume that this notion of no effect is correct until they have enough evidence to suggest otherwise, similar to how a trial presumes innocence.

In this context, the analysts don’t necessarily believe the null hypothesis is correct. In fact, they typically want to reject it because that leads to more exciting finds about an effect or relationship. The new vaccine works!

You can think of it as the default theory that requires sufficiently strong evidence to reject. Like a prosecutor, researchers must collect sufficient evidence to overturn the presumption of no effect. Investigators must work hard to set up a study and a data collection system to obtain evidence that can reject the null hypothesis.


## Null Hypothesis Examples

Null hypotheses start as research questions that the investigator rephrases as a statement indicating there is no effect or relationship.

| Research question | Null hypothesis |
|---|---|
| Does the vaccine prevent infections? | The vaccine does not affect the infection rate. |
| Does the new additive increase product strength? | The additive does not affect mean product strength. |
| Does the exercise intervention increase bone mineral density? | The intervention does not affect bone mineral density. |
| As screen time increases, does test performance decrease? | There is no relationship between screen time and test performance. |

After reading these examples, you might think they’re a bit boring and pointless. However, the key is to remember that the null hypothesis defines the condition that the researchers need to discredit before suggesting an effect exists.

Let’s see how you reject the null hypothesis and get to those more exciting findings!

## When to Reject the Null Hypothesis

So, you want to reject the null hypothesis, but how and when can you do that? To start, you’ll need to perform a statistical test on your data. The following is an overview of performing a study that uses a hypothesis test.

The first step is to devise a research question and the appropriate null hypothesis. After that, the investigators need to formulate an experimental design and data collection procedures that will allow them to gather data that can answer the research question. Then they collect the data. For more information about designing a scientific study that uses statistics, read my post 5 Steps for Conducting Studies with Statistics .

After data collection is complete, statistics and hypothesis testing enter the picture. Hypothesis testing takes your sample data and evaluates how consistent they are with the null hypothesis. The p-value is a crucial part of the statistical results because it quantifies how strongly the sample data contradict the null hypothesis.

When the sample data provide sufficient evidence, you can reject the null hypothesis. In a hypothesis test, this process involves comparing the p-value to your significance level .

## Rejecting the Null Hypothesis

Reject the null hypothesis when the p-value is less than or equal to your significance level. Your sample data favor the alternative hypothesis, which suggests that the effect exists in the population. For a mnemonic device, remember—when the p-value is low, the null must go!

When you can reject the null hypothesis, your results are statistically significant. Learn more about Statistical Significance: Definition & Meaning .

## Failing to Reject the Null Hypothesis

Conversely, when the p-value is greater than your significance level, you fail to reject the null hypothesis. The sample data provide insufficient evidence to conclude that the effect exists in the population. When the p-value is high, the null must fly!

Note that failing to reject the null is not the same as proving it. For more information about the difference, read my post about Failing to Reject the Null .
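The decision rule described in the last two sections can be sketched in a few lines. This is a minimal illustration: the function name `decide` and the default significance level of 0.05 are assumptions made for the example, not something the procedure prescribes.

```python
def decide(p_value: float, alpha: float = 0.05) -> str:
    """Compare a p-value to the significance level and state the conclusion."""
    if not 0.0 <= p_value <= 1.0:
        raise ValueError("p_value must be between 0 and 1")
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must be strictly between 0 and 1")
    # Reject when the p-value is less than or equal to the significance level.
    if p_value <= alpha:
        return "reject the null hypothesis"
    # Otherwise the evidence is insufficient; this does not prove the null.
    return "fail to reject the null hypothesis"


print(decide(0.03))  # reject the null hypothesis
print(decide(0.20))  # fail to reject the null hypothesis
```

The wording of the second return value is deliberate: the function never reports that the null hypothesis was accepted or proven.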

That’s a very general look at the process. But I hope you can see how the path to more exciting findings depends on being able to rule out the less exciting null hypothesis that states there’s nothing to see here!

Let’s move on to learning how to write the null hypothesis for different types of effects, relationships, and tests.


## How to Write a Null Hypothesis

The null hypothesis varies by the type of statistic and hypothesis test. Remember that inferential statistics use samples to draw conclusions about populations. Consequently, when you write a null hypothesis, it must make a claim about the relevant population parameter . Further, that claim usually indicates that the effect does not exist in the population. Below are typical examples of writing a null hypothesis for various parameters and hypothesis tests.


## Group Means

T-tests and ANOVA assess the differences between group means. For these tests, the null hypothesis states that there is no difference between group means in the population. In other words, the experimental conditions that define the groups do not affect the mean outcome. Mu (µ) is the population parameter for the mean, and you’ll need to include it in the statement for this type of study.

For example, an experiment compares the mean bone density changes for a new osteoporosis medication. The control group does not receive the medicine, while the treatment group does. The null states that the mean bone density changes for the control and treatment groups are equal.

- Null Hypothesis H₀: Group means are equal in the population: µ₁ = µ₂, or µ₁ – µ₂ = 0.
- Alternative Hypothesis Hₐ: Group means are not equal in the population: µ₁ ≠ µ₂, or µ₁ – µ₂ ≠ 0.
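To make the pair of hypotheses about group means concrete, here is a rough sketch that tests H₀: µ₁ = µ₂ with a two-sided permutation test. Note the substitution: the text discusses t-tests, but a permutation test is used here because it needs only the Python standard library. The function name and the bone density numbers are made up for illustration.

```python
import random


def permutation_test_means(group1, group2, n_resamples=5000, seed=0):
    """Two-sided permutation test of H0: the two group means are equal."""
    def mean(values):
        return sum(values) / len(values)

    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    observed = abs(mean(group1) - mean(group2))
    pooled = list(group1) + list(group2)
    n1 = len(group1)
    extreme = 0
    for _ in range(n_resamples):
        # Under H0 the group labels are exchangeable, so shuffle them.
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:n1]) - mean(pooled[n1:]))
        if diff >= observed - 1e-12:  # small tolerance for float rounding
            extreme += 1
    # Add 1 to both counts so the estimated p-value is never exactly 0.
    return (extreme + 1) / (n_resamples + 1)


# Hypothetical bone density changes (illustrative numbers only).
control = [0.1, -0.2, 0.0, 0.3, -0.1, 0.2, 0.1, -0.3]
treatment = [1.1, 0.9, 1.4, 1.2, 0.8, 1.3, 1.0, 1.5]
p_value = permutation_test_means(control, treatment)
print(f"p-value: {p_value:.4f}")
```

A small p-value here means that a difference at least as large as the observed one rarely appears when group labels are shuffled at random, which is evidence against the null hypothesis of equal means.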

## Group Proportions

Proportions tests assess the differences between group proportions. For these tests, the null hypothesis states that there is no difference between group proportions. Again, the experimental conditions did not affect the proportion of events in the groups. The lowercase letter p is the population proportion parameter that you’ll need to include.

For example, a vaccine experiment compares the infection rate in the treatment group to the control group. The treatment group receives the vaccine, while the control group does not. The null states that the infection rates for the control and treatment groups are equal.

- Null Hypothesis H₀: Group proportions are equal in the population: p₁ = p₂.
- Alternative Hypothesis Hₐ: Group proportions are not equal in the population: p₁ ≠ p₂.
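The hypotheses about proportions can be tested with a two-sided, two-proportion z test. The sketch below uses the pooled proportion for the standard error and relies on a normal approximation, which assumes reasonably large samples. The function name and the infection counts are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_z_test(events1, n1, events2, n2):
    """Two-sided z test of H0: p1 = p2 using the pooled proportion."""
    p1_hat = events1 / n1
    p2_hat = events2 / n2
    # Under H0 both groups share one proportion, so pool the counts.
    pooled = (events1 + events2) / (n1 + n2)
    se = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    if se == 0:
        raise ValueError("standard error is zero; the test is undefined")
    z = (p1_hat - p2_hat) / se
    # Two-sided p-value from the standard normal distribution.
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


# Hypothetical counts: 10 infections among 200 vaccinated participants,
# 30 infections among 200 participants in the control group.
z, p_value = two_proportion_z_test(10, 200, 30, 200)
print(f"z = {z:.2f}, p-value = {p_value:.5f}")
```

With these made-up counts the sample proportions are 0.05 and 0.15, and the test statistic is about -3.33, so the p-value falls well below common significance levels.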

## Correlation and Regression Coefficients

Some studies assess the relationship between two continuous variables rather than differences between groups.

In these studies, analysts often use either correlation or regression analysis . For these tests, the null states that there is no relationship between the variables. Specifically, it says that the correlation or regression coefficient is zero. As one variable increases, there is no tendency for the other variable to increase or decrease. Rho (ρ) is the population correlation parameter and beta (β) is the regression coefficient parameter.

For example, a study assesses the relationship between screen time and test performance. The null states that there is no correlation between this pair of variables. As screen time increases, test performance does not tend to increase or decrease.

- Null Hypothesis H₀: The correlation in the population is zero: ρ = 0.
- Alternative Hypothesis Hₐ: The correlation in the population is not zero: ρ ≠ 0.
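As an illustration of testing H₀: ρ = 0, the sketch below computes the sample Pearson correlation and an approximate two-sided p-value using the Fisher z-transformation. This is a large-sample approximation chosen because it needs only the standard library; it is not the exact t-based test most software reports. The function names and the screen time data are invented for the example.

```python
from math import atanh, sqrt
from statistics import NormalDist


def pearson_r(x, y):
    """Sample Pearson correlation coefficient."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def correlation_test(x, y):
    """Approximate two-sided test of H0: rho = 0 via the Fisher z-transformation."""
    n = len(x)
    if n != len(y) or n < 4:
        raise ValueError("need two equal-length samples with at least 4 points")
    r = pearson_r(x, y)
    if abs(r) >= 1.0:
        return r, 0.0  # perfect correlation; the transformation is undefined
    # atanh(r) is approximately normal with standard error 1 / sqrt(n - 3).
    z = atanh(r) * sqrt(n - 3)
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return r, p_value


# Hypothetical daily screen time (hours) and test scores.
screen_time = [1, 2, 3, 4, 5, 6, 7, 8]
scores = [90, 85, 88, 80, 78, 75, 70, 72]
r, p_value = correlation_test(screen_time, scores)
print(f"r = {r:.3f}, p-value = {p_value:.5f}")
```

With only eight observations the approximation is crude, so treat the p-value as indicative rather than exact.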

For all these cases, the analysts define the hypotheses before the study. After collecting the data, they perform a hypothesis test to determine whether they can reject the null hypothesis.

The preceding examples are all for two-tailed hypothesis tests. To learn about one-tailed tests and how to write a null hypothesis for them, read my post One-Tailed vs. Two-Tailed Tests .


## Reader Interactions

January 11, 2024 at 2:57 pm

Thanks for the reply.

January 10, 2024 at 1:23 pm

Hi Jim, In your comment you state that equivalence test null and alternate hypotheses are reversed. For hypothesis tests of data fits to a probability distribution, the null hypothesis is that the probability distribution fits the data. Is this correct?

January 10, 2024 at 2:15 pm

Those are two separate things, equivalence testing and normality tests. But, yes, you’re correct for both.

Hypotheses are switched for equivalence testing. You need to “work” (i.e., collect a large sample of good quality data) to be able to reject the null that the groups are different to be able to conclude they’re the same.

With typical hypothesis tests, if you have low quality data and a low sample size, you’ll fail to reject the null that they’re the same, concluding they’re equivalent. But that’s more a statement about the low quality and small sample size than anything to do with the groups being equal.

So, equivalence testing makes you work to obtain a finding that the groups are the same (at least within some amount you define as a trivial difference).

For normality testing, and other distribution tests, the null states that the data follow the distribution (normal or whatever). If you reject the null, you have sufficient evidence to conclude that your sample data don’t follow the probability distribution. That’s a rare case where you hope to fail to reject the null. And it suffers from the problem I describe above where you might fail to reject the null simply because you have a small sample size. In that case, you’d conclude the data follow the probability distribution but it’s more that you don’t have enough data for the test to register the deviation. In this scenario, if you had a larger sample size, you’d reject the null and conclude it doesn’t follow that distribution.

I don’t know of any equivalence testing type approach for distribution fit tests where you’d need to work to show the data follow a distribution, although I haven’t looked for one either!

February 20, 2022 at 9:26 pm

Is a null hypothesis regularly (always) stated in the negative? “there is no” or “does not”

February 23, 2022 at 9:21 pm

Typically, the null hypothesis includes an equal sign. The null hypothesis states that the population parameter equals a particular value. That value is usually one that represents no effect. In the case of a one-sided hypothesis test, the null still contains an equal sign but it’s “greater than or equal to” or “less than or equal to.” If you wanted to translate the null hypothesis from its native mathematical expression, you could use the expression “there is no effect.” But the mathematical form more specifically states what it’s testing.

It’s the alternative hypothesis that typically contains does not equal.

There are some exceptions. For example, in an equivalence test where the researchers want to show that two things are equal, the null hypothesis states that they’re not equal.

In short, the null hypothesis states the condition that the researchers hope to reject. They need to work hard to set up an experiment and data collection that’ll gather enough evidence to be able to reject the null condition.


## Stats and R

Hypothesis test by hand.


## Descriptive versus inferential statistics


Remember that descriptive statistics is the branch of statistics aiming at describing and summarizing a set of data in the best possible manner, that is, by reducing it down to a few meaningful key measures and visualizations—with as little loss of information as possible. In other words, the branch of descriptive statistics helps to have a better understanding and a clear image about a set of observations thanks to summary statistics and graphics. With descriptive statistics, there is no uncertainty because we describe only the group of observations that we decided to work on and no attempt is made to generalize the observed characteristics to another or to a larger group of observations.

Inferential statistics, on the other hand, is the branch of statistics that uses a random sample of data taken from a population to make inferences, i.e., to draw conclusions about the population of interest (see the difference between population and sample if you need a refresher on the two concepts). In other words, information from the sample is used to make generalizations about the parameter of interest in the population.

The two most important tools used in the domain of inferential statistics are:

- hypothesis test (which is the main subject of the present article), and
- confidence interval (which is briefly discussed in this section )

Via my teaching tasks, I realized that many students (especially in introductory statistics classes) struggle to perform hypothesis tests and interpret the results. It seems to me that these students often encounter difficulties mainly because hypothesis testing is rather unclear and abstract to them.

One of the reasons it looks abstract to them is that they do not understand the final goal of hypothesis testing, that is, the “why” behind this tool. They often do inferential statistics without understanding the reasoning behind it, as if they were following a cooking recipe which does not require any thinking. However, as soon as they understand the principle underlying hypothesis testing, it is much easier for them to apply the concepts and solve the exercises.

For this reason, I thought it would be useful to write an article on the goal of hypothesis tests (the “why?”), in which context they should be used (the “when?”), how they work (the “how?”) and how to interpret the results (the “so what?”). Like anything else in statistics, it becomes much easier to apply a concept in practice when we understand what we are testing or what we are trying to demonstrate beforehand.

In this article, I present, as comprehensibly as possible, the different steps required to perform and conclude a hypothesis test by hand.

These steps are illustrated with a basic example. This will build the theoretical foundations of hypothesis testing, which will in turn be of great help for the understanding of most statistical tests.

Hypothesis tests come in many forms and can be used for many parameters or research questions. Unfortunately, the steps I present in this article are not applicable to all hypothesis tests.

They are, however, appropriate for at least the most common hypothesis tests, namely the tests on:

- One mean: \(\mu\)
- Two means:
  - independent samples: \(\mu_1\) and \(\mu_2\)
  - paired samples: \(\mu_D\)
- One proportion: \(p\)
- Two proportions: \(p_1\) and \(p_2\)
- One variance: \(\sigma^2\)
- Two variances: \(\sigma^2_1\) and \(\sigma^2_2\)

The good news is that the principles behind these 6 statistical tests (and many more) are exactly the same. So if you understand the intuition and the process for one of them, all others pretty much follow.

Unlike descriptive statistics where we only describe the data at hand, hypothesis tests use a subset of observations, referred to as a sample, to draw conclusions about a population.

One may wonder why we would try to “guess” or make inferences about a parameter of a population based on a sample, instead of simply collecting data for the entire population, computing the statistics we are interested in and making decisions based upon that.

The main reason we actually use a sample instead of the entire population is that, most of the time, collecting data on the entire population is practically impossible, too complex, too expensive, too time-consuming, or a combination of any of these. 1

So the overall objective of a hypothesis test is to draw conclusions in order to confirm or refute a belief about a population, based on a smaller group of observations.

In practice, we take some measurements of the variable of interest—representing the sample(s)—and we check whether our measurements are likely or not given our assumption (our belief). Based on the probability of observing the sample(s) we have, we decide whether we can trust our belief or not.

Hypothesis tests have many practical applications.

Here are different situations illustrating when the 6 tests mentioned above would be appropriate:

- One mean: suppose that a health professional would like to test whether the mean weight of Belgian adults is different than 80 kg (176.4 lbs).
- Two means (independent samples): suppose that a physiotherapist would like to test the effectiveness of a new treatment by measuring the mean response time (in seconds) for patients in a control group and patients in a treatment group, where patients in the two groups are different.
- Two means (paired samples): suppose that a physiotherapist would like to test the effectiveness of a new treatment by measuring the mean response time (in seconds) before and after a treatment. Patients are measured twice (before and after treatment), so patients are the same in the 2 samples.
- One proportion: suppose that a political pundit would like to test whether the proportion of citizens who are going to vote for a specific candidate is smaller than 30%.
- Two proportions: suppose that a doctor would like to test whether the proportion of smokers is different between professional and amateur athletes.
- One variance: suppose that an engineer would like to test whether a voltmeter has a lower variability than what is imposed by the safety standards.
- Two variances: suppose that, in a factory, two production lines work independently from each other. The financial manager would like to test whether the costs of the weekly maintenance of these two lines have the same variance. Note that a test on two variances is also often performed to verify the assumption of equal variances, which is required for several other statistical tests, such as the Student’s t-test for instance.

Of course, this is a non-exhaustive list of potential applications and many research questions can be answered thanks to a hypothesis test.

One important point to remember is that in hypothesis testing we are always interested in the population and not in the sample. The sample is used for the aim of drawing conclusions about the population, so we always test in terms of the population.

Usually, hypothesis tests are used to answer research questions in confirmatory analyses. Confirmatory analyses refer to statistical analyses where hypotheses, deduced from theory, are defined beforehand (preferably before data collection). In this approach, the researcher has a specific idea about the variables under consideration and she is trying to see if her idea, specified as hypotheses, is supported by data.

On the other hand, hypothesis tests are rarely used in exploratory analyses. 2 Exploratory analyses aim to uncover possible relationships between the variables under investigation. In this approach, the researcher does not have any clear theory-driven assumptions or ideas in mind before data collection. This is the reason exploratory analyses are sometimes referred to as hypothesis-generating analyses: they are used to create some hypotheses, which in turn may be tested via confirmatory analyses at a later stage.

There are, to my knowledge, 3 different methods to perform a hypothesis test:

- Method A: Comparing the test statistic with the critical value
- Method B: Comparing the p-value with the significance level \(\alpha\)
- Method C: Comparing the target parameter with the confidence interval

Although the process for these 3 approaches may slightly differ, they all lead to the exact same conclusions. Using one method or another is, therefore, more often than not a matter of personal choice or a matter of context. See this section to know which method I use depending on the context.

I present the 3 methods in the following sections, starting with, in my opinion, the most comprehensive one when it comes to doing it by hand: comparing the test statistic with the critical value.

For the three methods, I will explain the required steps to perform a hypothesis test from a general point of view and illustrate them with the following situation: 3

Suppose a health professional would like to test whether the mean weight of Belgian adults is different than 80 kg.

Note that, as for most hypothesis tests, the test we are going to use as example below requires some assumptions. Since the aim of the present article is to explain a hypothesis test, we assume that all assumptions are met. For the interested reader, see the assumptions (and how to verify them) for this type of hypothesis test in the article presenting the one-sample t-test .

Method A, which consists in comparing the test statistic with the critical value, boils down to the following 4 steps:

- Stating the null and alternative hypothesis
- Computing the test statistic
- Finding the critical value
- Concluding and interpreting the results

Each step is detailed below.

As discussed before, a hypothesis test first requires an idea, that is, an assumption about a phenomenon. This assumption, referred to as a hypothesis, is derived from the theory and/or the research question.

Since a hypothesis test is used to confirm or refute a prior belief, we need to formulate our belief so that there is a null and an alternative hypothesis. Those hypotheses must be mutually exclusive, which means that they cannot be true at the same time. This is step #1.

In the context of our scenario, the null and alternative hypotheses are thus:

- Null hypothesis \(H_0: \mu = 80\)
- Alternative hypothesis \(H_1: \mu \ne 80\)

When stating the null and alternative hypothesis, bear in mind the following three points:

- We are always interested in the population and not in the sample. This is the reason \(H_0\) and \(H_1\) will always be written in terms of the population and not in terms of the sample (in this case, \(\mu\) and not \(\bar{x}\) ).
- The assumption we would like to test is often the alternative hypothesis. If the researcher wanted to test whether the mean weight of Belgian adults was less than 80 kg, she would have stated \(H_0: \mu = 80\) (or equivalently, \(H_0: \mu \ge 80\)) and \(H_1: \mu < 80\). 4 Do not mix the null with the alternative hypothesis, or the conclusions will be diametrically opposed!
- The null hypothesis is often the status quo. For instance, suppose that a doctor wants to test whether the new treatment A is more efficient than the old treatment B. The status quo is that the new and old treatments are equally efficient. Assuming a larger value is better, she will then write \(H_0: \mu_A = \mu_B\) (or equivalently, \(H_0: \mu_A - \mu_B = 0\)) and \(H_1: \mu_A > \mu_B\) (or equivalently, \(H_1: \mu_A - \mu_B > 0\)). Conversely, if a lower value is better, she would have written \(H_0: \mu_A = \mu_B\) (or equivalently, \(H_0: \mu_A - \mu_B = 0\)) and \(H_1: \mu_A < \mu_B\) (or equivalently, \(H_1: \mu_A - \mu_B < 0\)).

The test statistic (often called t-stat) is, in some sense, a metric indicating how extreme the observations are compared to the null hypothesis. The higher the t-stat (in absolute value), the more extreme the observations are.

There are several formulas to compute the t-stat, with one formula for each type of hypothesis test—one or two means, one or two proportions, one or two variances. This means that there is a formula to compute the t-stat for a hypothesis test on one mean, another formula for a test on two means, another for a test on one proportion, etc. 5

The only difficulty in this second step is to choose the appropriate formula. As soon as you know which formula to use based on the type of test, you simply have to apply it to the data. For the interested reader, see the different formulas to compute the t-stat for the most common tests in this Shiny app .

Luckily, formulas for hypothesis tests on one and two means, and one and two proportions follow the same structure.

Computing the test statistic for these tests is similar to scaling a random variable (a process also known as “standardization” or “normalization”), which consists in subtracting the mean from that random variable and dividing the result by the standard deviation:

\[Z = \frac{X - \mu}{\sigma}\]

For these 4 hypothesis tests (one/two means and one/two proportions), computing the test statistic is like scaling the estimator (computed from the sample) corresponding to the parameter of interest (in the population). So we basically subtract the target parameter from the point estimator and then divide the result by the standard error (which is equivalent to the standard deviation but for an estimator).

If this is unclear, here is how the test statistic (denoted \(t_{obs}\) ) is computed in our scenario (assuming that the variance of the population is unknown):

\[t_{obs} = \frac{\bar{x} - \mu}{\frac{s}{\sqrt{n}}}\]

where:

- \(\bar{x}\) is the sample mean (i.e., the estimator)
- \(\mu\) is the mean under the null hypothesis (i.e., the target parameter)
- \(s\) is the sample standard deviation
- \(n\) is the sample size
- ( \(\frac{s}{\sqrt{n}}\) is the standard error)

Notice the similarity between the formula of this test statistic and the formula used to standardize a random variable. This structure is the same for a test on two means, one proportion and two proportions, except that the estimator, the parameter and the standard error are, of course, slightly different for each type of test.

Suppose that in our case we have a sample mean of 71 kg ( \(\bar{x}\) = 71), a sample standard deviation of 13 kg ( \(s\) = 13) and a sample size of 10 adults ( \(n\) = 10). Remember that the population mean (the mean under the null hypothesis) is 80 kg ( \(\mu\) = 80).

The t-stat is thus:

\[t_{obs} = \frac{\bar{x} - \mu}{\frac{s}{\sqrt{n}}} = \frac{71 - 80}{\frac{13}{\sqrt{10}}} = -2.189\]
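The same computation can be sketched in R. This is only an illustration based on the summary statistics of our example; the variable names (`x_bar`, `mu_0`, `s`, `n`, `se`, `t_obs`) are chosen here for readability and are assumptions, not part of any package.

```r
# Summary statistics from the example
x_bar <- 71   # sample mean (kg)
mu_0  <- 80   # mean under the null hypothesis (kg)
s     <- 13   # sample standard deviation (kg)
n     <- 10   # sample size

# Standard error of the sample mean
se <- s / sqrt(n)

# Observed test statistic: (estimator - target parameter) / standard error
t_obs <- (x_bar - mu_0) / se
round(t_obs, 3)
```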

Although formulas are different depending on which parameter you are testing, the value found for the test statistic gives us an indication on how extreme our observations are.

We keep this value of -2.189 in mind because it will be used again in step #4.

Although the t-stat gives us an indication of how extreme our observations are, we cannot tell whether this “score of extremity” is too extreme or not based on its value only.

So, at this point, we cannot yet tell whether our data are too extreme or not. For this, we need to compare our t-stat with a threshold, referred to as the critical value, given by the probability distribution tables (and which can, of course, also be found with R).

In the same way that the formula to compute the t-stat is different for each parameter of interest, the underlying probability distribution—and thus the statistical table—on which the critical value is based is also different for each target parameter. This means that, in addition to choosing the appropriate formula to compute the t-stat, we also need to select the appropriate probability distribution depending on the parameter we are testing.

Luckily, there are only 4 different probability distributions for the 6 hypothesis tests covered in this article (one/two means, one/two proportions and one/two variances):

- Standard Normal distribution:
  - test on one and two means with known population variance(s)
  - test on two paired samples where the variance of the difference between the 2 samples \(\sigma^2_D\) is known
  - test on one and two proportions (given that some assumptions are met)
- Student distribution:
  - test on one and two means with unknown population variance(s)
  - test on two paired samples where the variance of the difference between the 2 samples \(\sigma^2_D\) is unknown
- Chi-square distribution:
  - test on one variance
- Fisher distribution:
  - test on two variances

Each probability distribution also has its own parameters (up to two parameters for the 4 distributions considered here), defining its shape and/or location. The parameter(s) of a probability distribution can be seen as its DNA, meaning that the distribution is entirely defined by its parameter(s).

Let us take our initial scenario as an example: a health professional who would like to test whether the mean weight of Belgian adults is different than 80 kg.

The underlying probability distribution of a test on one mean is either the standard Normal or the Student distribution, depending on whether the variance of the population (not sample variance!) is known or unknown: 6

- If the population variance is known \(\rightarrow\) the standard Normal distribution is used
- If the population variance is unknown \(\rightarrow\) the Student distribution is used

If no population variance is explicitly given, you can assume that it is unknown since you cannot compute it based on a sample. If you could compute it, that would mean you have access to the entire population and there is, in this case, no point in performing a hypothesis test (you could simply use some descriptive statistics to confirm or refute your belief).

In our example, no population variance is specified so it is assumed to be unknown. We therefore use the Student distribution.

The Student distribution has one parameter which defines it: the number of degrees of freedom. The number of degrees of freedom depends on the type of hypothesis test. For instance, the number of degrees of freedom for a test on one mean is equal to the number of observations minus one (\(n\) - 1). Without going too far into the details, the - 1 comes from the fact that there is one quantity which is estimated (i.e., the mean). 7 The sample size being equal to 10 in our example, the number of degrees of freedom is equal to \(n\) - 1 = 10 - 1 = 9.

There is only one last element missing to find the critical value: the significance level. The significance level, denoted \(\alpha\), is the probability of wrongly rejecting the null hypothesis, that is, the probability of rejecting the null hypothesis although it is in reality true. In this sense, it is an error (type I error, as opposed to the type II error 8 ) that we accept to deal with, in order to be able to draw conclusions about a population based on a subset of it.

As you may have read in many statistical textbooks, the significance level is very often set to 5%. 9 In some fields (such as medicine or engineering, among others), the significance level is also sometimes set to 1% to decrease the error rate.

It is best to specify the significance level before performing a hypothesis test to avoid the temptation to set the significance level according to the results (the temptation is even bigger when the results are on the edge of being significant). As I always tell my students, you can neither “guess” nor compute the significance level. Therefore, if it is not explicitly specified, you can safely assume it is 5%. In our case, we did not indicate it, so we take \(\alpha\) = 5% = 0.05.

Furthermore, in our example, we want to test whether the mean weight of Belgian adults is different than 80 kg. Since we do not specify the direction of the test, it is a two-sided test. If we wanted to test that the mean weight was less than 80 kg (\(H_1: \mu <\) 80) or greater than 80 kg (\(H_1: \mu >\) 80), we would have done a one-sided test.

Make sure that you perform the correct test (two-sided or one-sided) because it has an impact on how to find the critical value (see more in the following paragraphs).

So now that we know the appropriate distribution (Student distribution), its parameter (degrees of freedom (df) = 9), the significance level (\(\alpha\) = 0.05) and the direction (two-sided), we have all we need to find the critical value in the statistical tables:

By looking at the row df = 9 and the column \(t_{.025}\) in the Student’s distribution table, we find a critical value of:

\[t_{n-1; \alpha / 2} = t_{9; 0.025} = 2.262\]

One may wonder why we take \(t_{\alpha/2} = t_{.025}\) and not \(t_{\alpha} = t_{.05}\) since the significance level is 0.05. The reason is that we are doing a two-sided test (\(H_1: \mu \ne\) 80), so the error rate of 0.05 must be divided in 2 to find the critical value to the right of the distribution. Since the Student’s distribution is symmetric, the critical value to the left of the distribution is simply: -2.262.

Visually, the error rate of 0.05 is partitioned into two parts:

- 0.025 to the left of -2.262 and
- 0.025 to the right of 2.262

We keep in mind these critical values of -2.262 and 2.262 for the fourth and last step.

Note that the red shaded areas in the previous plot are also known as the rejection regions. More on that in the following section.

These critical values can also be found in R, thanks to the qt() function:
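A minimal sketch of such a call, using the values of our example (the variable names `alpha`, `df`, `t_lower` and `t_upper` are only illustrative):

```r
alpha <- 0.05  # significance level
df    <- 9     # degrees of freedom (n - 1)

# Upper critical value: quantile leaving alpha / 2 in the right tail
t_upper <- qt(alpha / 2, df = df, lower.tail = FALSE)

# Lower critical value: quantile leaving alpha / 2 in the left tail
t_lower <- qt(alpha / 2, df = df, lower.tail = TRUE)

round(c(t_lower, t_upper), 3)
```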

The qt() function is used for the Student’s distribution ( q stands for quantile and t for Student). There are other functions accompanying the different distributions:

- qnorm() for the Normal distribution
- qchisq() for the Chi-square distribution
- qf() for the Fisher distribution
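As a rough illustration of these other quantile functions, the calls below return upper critical values at a 5% level. The degrees of freedom used for the Chi-square and Fisher distributions are arbitrary values picked for this sketch, not values taken from our example.

```r
# Standard Normal: two-sided critical value at alpha = 0.05
z_crit <- qnorm(0.05 / 2, lower.tail = FALSE)

# Chi-square with 9 degrees of freedom: upper 5% quantile
chi_crit <- qchisq(0.05, df = 9, lower.tail = FALSE)

# Fisher with 5 and 10 degrees of freedom: upper 5% quantile
f_crit <- qf(0.05, df1 = 5, df2 = 10, lower.tail = FALSE)

round(c(z_crit, chi_crit, f_crit), 3)
```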

In this fourth and last step, all we have to do is to compare the test statistic (computed in step #2) with the critical values (found in step #3) in order to conclude the hypothesis test.

The only two possibilities when concluding a hypothesis test are:

- Rejection of the null hypothesis
- Non-rejection of the null hypothesis

In our example of adult weight, remember that:

- the t-stat is -2.189
- the critical values are -2.262 and 2.262

Also remember that:

- the t-stat gives an indication on how extreme our sample is compared to the null hypothesis
- the critical values are the threshold from which the t-stat is considered as too extreme

To compare the t-stat with the critical values, I always recommend plotting them:

These two critical values form the rejection regions (the red shaded areas):

- from \(- \infty\) to -2.262, and
- from 2.262 to \(\infty\)

If the t-stat lies within one of the rejection regions, we reject the null hypothesis. On the contrary, if the t-stat does not lie within any of the rejection regions, we do not reject the null hypothesis.

As we can see from the above plot, the t-stat is less extreme than the critical value and therefore does not lie within any of the rejection regions. In conclusion, we do not reject the null hypothesis that \(\mu = 80\).
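For a two-sided test, this comparison amounts to checking whether the absolute value of the t-stat exceeds the positive critical value. A small sketch with the rounded values of our example (the names `t_obs`, `t_crit` and `reject_h0` are illustrative):

```r
t_obs  <- -2.189  # test statistic from step #2
t_crit <- 2.262   # positive critical value from step #3

# The t-stat lies in a rejection region if it is below -t_crit or above t_crit
reject_h0 <- abs(t_obs) > t_crit
reject_h0  # FALSE: the t-stat is not in a rejection region
```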

This is the conclusion in statistical terms, but it is meaningless without proper interpretation. So it is good practice to also interpret the result in the context of the problem:

At the 5% significance level, we do not reject the hypothesis that the mean weight of Belgian adults is 80 kg.

From a more philosophical (but still very important) perspective, note that we wrote “we do not reject the null hypothesis” and “we do not reject the hypothesis that the mean weight of Belgian adults is equal to 80 kg”. We did not write “we accept the null hypothesis” nor “the mean weight of Belgian adults is 80 kg”.

The reason is that, in hypothesis testing, we conclude something about the population based on a sample. There is, therefore, always some uncertainty and we cannot be 100% sure that our conclusion is correct.

Perhaps it is the case that the mean weight of Belgian adults is in reality different than 80 kg, but we failed to prove it based on the data at hand. It may be the case that if we had more observations, we would have rejected the null hypothesis (since all else being equal, a larger sample size implies a more extreme t-stat). Or, it may be the case that even with more observations, we would not have rejected the null hypothesis because the mean weight of Belgian adults is in reality close to 80 kg. We cannot distinguish between the two.
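To illustrate the effect of the sample size, the sketch below recomputes the t-stat and the critical value while keeping the same sample mean and standard deviation. The sample size of 30 is a purely hypothetical value chosen for this illustration; only n = 10 comes from our example.

```r
x_bar <- 71; mu_0 <- 80; s <- 13

# t-stat and two-sided critical value as functions of the sample size
t_stat <- function(n) (x_bar - mu_0) / (s / sqrt(n))
t_crit <- function(n, alpha = 0.05) qt(alpha / 2, df = n - 1, lower.tail = FALSE)

sizes <- c(10, 30)  # 30 is hypothetical
comparison <- data.frame(
  n      = sizes,
  t_obs  = sapply(sizes, t_stat),
  t_crit = sapply(sizes, t_crit)
)
comparison$reject_h0 <- abs(comparison$t_obs) > comparison$t_crit
comparison
```

With the same mean and standard deviation, the larger sample gives a more extreme t-stat and a smaller critical value, so the null hypothesis would be rejected in that hypothetical case.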

So we can just say that we did not find enough evidence against the hypothesis that the mean weight of Belgian adults is 80 kg, but we do not conclude that the mean is equal to 80 kg.

If the difference is still not clear to you, the following example may help. Suppose a person is suspected of having committed a crime. This person is either innocent (the null hypothesis) or guilty (the alternative hypothesis). In the attempt to know if the suspect committed the crime, the police collect as much information and proof as possible. This is similar to the researcher collecting data to form a sample. And then the judge, based on the collected evidence, decides whether the suspect is considered as innocent or guilty. If there is enough evidence that the suspect committed the crime, the judge will conclude that the suspect is guilty. In other words, she will reject the null hypothesis of the suspect being innocent because there is enough evidence that the suspect committed the crime.

This is similar to the t-stat being more extreme than the critical value: we have enough information (based on the sample) to say that the null hypothesis is unlikely because our data would be too extreme if the null hypothesis were true. Since the sample cannot be “wrong” (it corresponds to the collected data), the only remaining possibility is that the null hypothesis is in fact wrong. This is the reason we write “we reject the null hypothesis”.

On the other hand, if there is not enough evidence that the suspect committed the crime (or no evidence at all), the judge will conclude that the suspect is considered as not guilty. In other words, she will not reject the null hypothesis of the suspect being innocent. But even if she concludes that the suspect is considered as not guilty, she will never be 100% sure that he is really innocent.

It may be the case that:

- the suspect did not commit the crime, or
- the suspect committed the crime but the police were not able to collect enough information against the suspect.

In the former case the suspect is really innocent, whereas in the latter case the suspect is guilty but the police and the judge failed to prove it because they failed to find enough evidence against him. Similar to hypothesis testing, the judge has to conclude the case by considering the suspect not guilty, without being able to distinguish between the two.

This is the main reason we write “we do not reject the null hypothesis” or “we fail to reject the null hypothesis” (you may even read in some textbooks conclusions such as “there is no sufficient evidence in the data to reject the null hypothesis”), and we do not write “we accept the null hypothesis”.

I hope this metaphor helped you to understand the reason why we say that we do not reject the null hypothesis instead of saying that we accept it.

In the following sections, we present two other methods used in hypothesis testing.

These methods will result in the exact same conclusion: non-rejection of the null hypothesis, that is, we do not reject the hypothesis that the mean weight of Belgian adults is 80 kg. They are thus presented only in case you prefer to use these methods over the first one.

Method B, which consists in computing the p-value and comparing this p-value with the significance level \(\alpha\), boils down to the following 4 steps:

- Stating the null and alternative hypothesis
- Computing the test statistic
- Computing the p-value
- Concluding and interpreting the results

In this second method, which uses the p-value, the first and second steps are the same as in the first method.

The null and alternative hypotheses remain the same:

- \(H_0: \mu = 80\)
- \(H_1: \mu \ne 80\)

Remember that the formula for the t-stat is different depending on the type of hypothesis test (one or two means, one or two proportions, one or two variances). In our case of one mean with unknown variance, we have:

\[t_{obs} = \frac{\bar{x} - \mu}{\frac{s}{\sqrt{n}}} = \frac{71 - 80}{\frac{13}{\sqrt{10}}} = -2.189\]

The p-value is the probability (so it goes from 0 to 1) of observing a sample at least as extreme as the one we observed if the null hypothesis were true. In some sense, it gives you an indication of how compatible your data are with the null hypothesis (it is not the probability that the null hypothesis is true). It is also defined as the smallest level of significance for which the data indicate rejection of the null hypothesis.

For more information about the p-value, I recommend reading this note about the p-value and the significance level \(\alpha\).

Formally, the p-value is the area beyond the test statistic. Since we are doing a two-sided test, the p-value is thus the sum of the area above 2.189 and below -2.189.

Visually, the p-value is the sum of the two blue shaded areas in the following plot:

The p-value can be computed with precision in R with the pt() function:
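A minimal sketch of this computation for our two-sided test, using the rounded t-stat of -2.189 and 9 degrees of freedom (the variable names are illustrative):

```r
t_obs <- -2.189  # test statistic from step #2
df    <- 9       # degrees of freedom (n - 1)

# Two-sided p-value: twice the area to the right of |t_obs|
p_value <- 2 * pt(abs(t_obs), df = df, lower.tail = FALSE)
round(p_value, 4)
```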

The p-value is 0.0563, which indicates that there is a 5.63% chance of observing a sample at least as extreme as the one observed if the null hypothesis were true. This already gives us a hint on whether our t-stat is too extreme or not (and thus whether our null hypothesis is likely or not), but we formally conclude in step #4.

Like the qt() function to find the critical value, we use pt() to find the p-value because the underlying distribution is the Student’s distribution.

Use pnorm(), pchisq() and pf() for the Normal, Chi-square and Fisher distribution, respectively. See also this Shiny app to compute the p-value given a certain t-stat for most probability distributions.

If you do not have access to a computer (during exams for example) you will not be able to compute the p-value precisely, but you can bound it using the statistical table referring to your test.

In our case, we use the Student distribution and we look at the row df = 9 (since df = n - 1):

- The test statistic is -2.189
- We take the absolute value, which gives 2.189
- The value 2.189 is between 1.833 and 2.262 (highlighted in blue in the above table)
- the area to the right of 1.833 is 0.05
- the area to the right of 2.262 is 0.025
- So we know that the area to the right of 2.189 must be between 0.025 and 0.05
- Since the Student distribution is symmetric, we know that the area to the left of -2.189 must also be between 0.025 and 0.05
- Therefore, the sum of the two areas must be between 0.05 and 0.10
- In other words, the p-value is between 0.05 and 0.10 (i.e., 0.05 < p-value < 0.10)
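The bounds read from the table can be checked with pt(). The sketch below is only a verification of the reasoning above; the variable names are illustrative.

```r
df <- 9

# Right-tail areas beyond the two tabulated values surrounding 2.189
area_1833 <- pt(1.833, df = df, lower.tail = FALSE)  # close to 0.05
area_2262 <- pt(2.262, df = df, lower.tail = FALSE)  # close to 0.025

# Right-tail area beyond the absolute value of the test statistic
area_tobs <- pt(2.189, df = df, lower.tail = FALSE)

# The two-sided p-value is bounded by twice the tabulated areas
p_lower <- 2 * area_2262
p_upper <- 2 * area_1833
round(c(p_lower, 2 * area_tobs, p_upper), 4)
```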

Although we could not compute it precisely, it is enough to conclude our hypothesis test in the last step.

The final step is now to simply compare the p-value (computed in step #3) with the significance level \(\alpha\). As for all statistical tests:

- If the p-value is smaller than \(\alpha\) (p-value < 0.05) \(\rightarrow H_0\) is unlikely \(\rightarrow\) we reject the null hypothesis
- If the p-value is greater than or equal to \(\alpha\) (p-value \(\ge\) 0.05) \(\rightarrow H_0\) is likely \(\rightarrow\) we do not reject the null hypothesis

No matter if we take into consideration the exact p-value (i.e., 0.0563) or the bounded one (0.05 < p-value < 0.10), it is larger than 0.05, so we do not reject the null hypothesis. 10 In the context of the problem, we do not reject the null hypothesis that the mean weight of Belgian adults is 80 kg.

Remember that rejecting (or not rejecting) a null hypothesis at the significance level \(\alpha\) using the critical value method (method A) is equivalent to rejecting (or not rejecting) the null hypothesis when the p-value is lower (equal or greater) than \(\alpha\) (method B).

This is the reason we find the exact same conclusion as with method A, and why you should too if you use both methods on the same data and with the same significance level.
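The agreement between the two methods can be checked with a small helper. The function `one_mean_test()` below is a hypothetical helper written for this illustration (it is not part of base R or of any package); it assumes a two-sided test on one mean with unknown population variance.

```r
# Hypothetical helper: applies methods A and B to a two-sided test on one mean
one_mean_test <- function(x_bar, mu_0, s, n, alpha = 0.05) {
  t_obs   <- (x_bar - mu_0) / (s / sqrt(n))
  t_crit  <- qt(alpha / 2, df = n - 1, lower.tail = FALSE)
  p_value <- 2 * pt(abs(t_obs), df = n - 1, lower.tail = FALSE)
  list(
    t_obs           = t_obs,
    t_crit          = t_crit,
    p_value         = p_value,
    reject_method_a = abs(t_obs) > t_crit,  # critical value method
    reject_method_b = p_value < alpha       # p-value method
  )
}

res <- one_mean_test(x_bar = 71, mu_0 = 80, s = 13, n = 10)
res$reject_method_a == res$reject_method_b
```

Changing `alpha` (for instance to 0.10) changes the decision of both methods in the same way.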

Method C, which consists in computing the confidence interval and comparing this confidence interval with the target parameter (the parameter under the null hypothesis), boils down to the following 3 steps:

- Stating the null and alternative hypothesis
- Computing the confidence interval
- Concluding and interpreting the results

In this last method, which uses the confidence interval, the first step is the same as in the first two methods.

Like hypothesis testing, confidence intervals are a well-known tool in inferential statistics.

A confidence interval is the result of an estimation procedure that produces an interval (i.e., a range of values) containing the true parameter with a certain, usually high, probability.

In the same way that there is a formula for each type of hypothesis test when computing the test statistic, there exists a formula for each type of confidence interval. Formulas for the different types of confidence intervals can be found in this Shiny app.

Here is the formula for a confidence interval on one mean \(\mu\) (with unknown population variance):

\[ 100(1-\alpha)\text{% CI for } \mu = \bar{x} \pm t_{\alpha/2, n - 1} \frac{s}{\sqrt{n}} \]

where \(t_{\alpha/2, n - 1}\) is found in the Student distribution table (and is similar to the critical value found in step #3 of method A).

Given our data and with \(\alpha\) = 0.05, we have:

\[ \begin{aligned} 95\text{% CI for } \mu &= \bar{x} \pm t_{\alpha/2, n - 1} \frac{s}{\sqrt{n}} \\ &= 71 \pm 2.262 \frac{13}{\sqrt{10}} \\ &= [61.70; 80.30] \end{aligned} \]
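As a quick check of the arithmetic, here is a minimal Python sketch of the same computation. The function name is hypothetical, and the critical value 2.262 is taken from the Student table as above rather than computed.

```python
import math

def t_confidence_interval(mean: float, sd: float, n: int, t_crit: float):
    """Confidence interval for one mean with unknown population variance."""
    margin = t_crit * sd / math.sqrt(n)  # t * s / sqrt(n)
    return mean - margin, mean + margin

lower, upper = t_confidence_interval(mean=71, sd=13, n=10, t_crit=2.262)
print(f"95% CI: [{lower:.2f}; {upper:.2f}]")  # prints: 95% CI: [61.70; 80.30]
```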

The 95% confidence interval for \(\mu\) is [61.70; 80.30] kg. But what does a 95% confidence interval mean?

We know that this estimation procedure has a 95% probability of producing an interval containing the true mean \(\mu\). In other words, if we construct many confidence intervals (with different samples of the same size), 95% of them will, on average, include the mean of the population (the true parameter). So on average, 5% of these confidence intervals will not cover the true mean.
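This interpretation can be illustrated with a small simulation. The sketch below makes assumptions that are not part of the example: it supposes a Normal population with a true mean of 80 and a standard deviation of 13, draws many samples of size 10, and counts how often the interval built from each sample covers the true mean. The function name and the seed are arbitrary.

```python
import math
import random
import statistics

def coverage_rate(true_mean, true_sd, n, t_crit, n_sims, seed=42):
    """Share of simulated confidence intervals that contain the true mean."""
    rng = random.Random(seed)
    covered = 0
    for _ in range(n_sims):
        # Hypothetical population: Normal(true_mean, true_sd)
        sample = [rng.gauss(true_mean, true_sd) for _ in range(n)]
        x_bar = statistics.mean(sample)
        s = statistics.stdev(sample)  # sample standard deviation (n - 1)
        margin = t_crit * s / math.sqrt(n)
        if x_bar - margin <= true_mean <= x_bar + margin:
            covered += 1
    return covered / n_sims

rate = coverage_rate(true_mean=80, true_sd=13, n=10, t_crit=2.262, n_sims=5000)
print(f"Share of intervals covering the true mean: {rate:.3f}")
```

The share should be close to 0.95, with small deviations due to the finite number of simulated samples.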

If you wish to decrease this last percentage, you can decrease the significance level (set \(\alpha\) = 0.01 or 0.02, for instance). All else being equal, this will widen the confidence interval and thus increase the probability that it includes the true parameter.
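To see the effect of \(\alpha\) on the width of the interval, the sketch below reuses the sample values of the example (\(s\) = 13, \(n\) = 10) with critical values read from a standard Student table for 9 degrees of freedom. The dictionary and function names are hypothetical.

```python
import math

# Two-sided critical values t_{alpha/2, 9} from a standard Student table
T_CRIT_DF9 = {0.10: 1.833, 0.05: 2.262, 0.01: 3.250}

def ci_width(sd: float, n: int, t_crit: float) -> float:
    """Total width of the confidence interval: 2 * t * s / sqrt(n)."""
    return 2 * t_crit * sd / math.sqrt(n)

widths = {alpha: ci_width(13, 10, t) for alpha, t in T_CRIT_DF9.items()}
for alpha, width in widths.items():
    print(f"alpha = {alpha:.2f}: width = {width:.2f} kg")
```

The smaller the significance level, the larger the critical value and therefore the wider the interval.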

The final step is simply to compare the confidence interval (constructed in step #2) with the value of the target parameter (the value under the null hypothesis, mentioned in step #1):

- If the confidence interval does not include the hypothesized value \(\rightarrow H_0\) is unlikely \(\rightarrow\) we reject the null hypothesis
- If the confidence interval includes the hypothesized value \(\rightarrow H_0\) is likely \(\rightarrow\) we do not reject the null hypothesis

In our example:

- the hypothesized value is 80 (since \(H_0: \mu\) = 80)
- 80 is included in the 95% confidence interval since it goes from 61.70 to 80.30 kg
- So we do not reject the null hypothesis

In the terms of the problem, we do not reject the hypothesis that the mean weight of Belgian adults is 80 kg.

As you can see, the conclusion is the same as with the critical value method (method A) and the p-value method (method B). Again, this must be the case since we use the same data and the same significance level \(\alpha\) for all three methods.
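The agreement between the three methods can be checked side by side. This is a minimal sketch that reuses the numbers of the example: the critical value comes from the Student table and the p-value is the exact one reported above, so neither is computed here. The variable names are hypothetical.

```python
import math

x_bar, s, n, mu_0 = 71, 13, 10, 80
alpha = 0.05
t_crit = 2.262    # t_{alpha/2, n - 1}, read from the Student table
p_value = 0.0563  # exact p-value reported in the text

t_stat = (x_bar - mu_0) / (s / math.sqrt(n))
margin = t_crit * s / math.sqrt(n)

# Method A: is the t-stat inside one of the two rejection regions?
reject_a = abs(t_stat) > t_crit
# Method B: is the p-value smaller than alpha?
reject_b = p_value < alpha
# Method C: is the hypothesized value outside the confidence interval?
reject_c = not (x_bar - margin <= mu_0 <= x_bar + margin)

print(reject_a, reject_b, reject_c)  # prints: False False False
```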

All three methods give the same conclusion. However, each method has its own advantage so I usually select the most convenient one depending on the situation:

- Method A (critical value): it is, in my opinion, the easiest and most straightforward method of the three when I do not have access to R.
- Method B (p-value): in addition to knowing whether the null hypothesis is rejected or not, computing the exact p-value can be very convenient, so I tend to use this method if I have access to R.
- Method C (confidence interval): if I need to test several hypothesized values, I tend to choose this method because I can construct one single confidence interval and compare it to as many values as I want. For example, with our 95% confidence interval [61.70; 80.30], I know that any value below 61.70 kg or above 80.30 kg will be rejected, without testing each value separately.
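Testing several hypothesized values against one confidence interval can be sketched as follows. The candidate values are made up for illustration; only the interval comes from the example.

```python
ci_lower, ci_upper = 61.70, 80.30  # 95% confidence interval from the example

# Hypothetical values of mu under the null hypothesis
candidates = [60, 70, 80, 85]

# A value is rejected at the 5% level when it lies outside the interval
decisions = {mu_0: not (ci_lower <= mu_0 <= ci_upper) for mu_0 in candidates}

for mu_0, rejected in decisions.items():
    verdict = "reject H0" if rejected else "do not reject H0"
    print(f"H0: mu = {mu_0} -> {verdict}")
```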

In this article, we reviewed the goals of hypothesis testing and when it is used. We then showed how to do a hypothesis test by hand through three different methods (A. critical value, B. p-value and C. confidence interval). We also showed how to interpret the results in the context of the initial problem.

Although all three methods give the exact same conclusion when using the same data and the same significance level (otherwise there is a mistake somewhere), I also presented my personal preferences when it comes to choosing one method over the other two.

Thanks for reading.

I hope this article helped you to understand the structure of a hypothesis test by hand. I remind you that, at least for the 6 hypothesis tests covered in this article, the formulas are different, but the structure and the reasoning behind it remain the same. So you basically have to know which formulas to use, and simply follow the steps mentioned in this article.

For the interested reader, I created two accompanying Shiny apps:

- Hypothesis testing and confidence intervals : after entering your data, the app illustrates all the steps in order to conclude the test and compute a confidence interval. See more information in this article .
- How to read statistical tables : the app helps you to compute the p -value given a t-stat for most probability distributions. See more information in this article .

As always, if you have a question or a suggestion related to the topic covered in this article, please add it as a comment so other readers can benefit from the discussion.

Suppose a researcher wants to test whether Belgian women are taller than French women. Suppose a health professional would like to know whether the proportion of smokers is different among athletes and non-athletes. It would take way too long to measure the height of all Belgian and French women and to ask all athletes and non-athletes their smoking habits. So most of the time, decisions are based on a representative sample of the population and not on the whole population. If we could measure the entire population in a reasonable time frame, we would not do any inferential statistics.

Don’t get me wrong, this does not mean that hypothesis tests are never used in exploratory analyses. It is just much less frequent in exploratory research than in confirmatory research.

You may see more or fewer steps in other articles or textbooks, depending on whether these steps are detailed or concise. Hypothesis testing should, however, follow the same process regardless of the number of steps.

For one-sided tests, writing \(H_0: \mu = 80\) or \(H_0: \mu \ge 80\) are both correct. The point is that the null and alternative hypotheses must be mutually exclusive since you are testing one hypothesis against the other, so both cannot be true at the same time.

To be complete, there are even different formulas within each type of test, depending on whether some assumptions are met or not. For the interested reader, see all the different scenarios and thus the different formulas for a test on one mean and on two means.

There is more uncertainty if the population variance is unknown than if it is known, and this greater uncertainty is taken into account by using the Student distribution instead of the standard Normal distribution. Also note that as the sample size increases, the degrees of freedom of the Student distribution increase and the two distributions become more and more similar. For large sample sizes (usually \(n >\) 30), the Student distribution becomes so close to the standard Normal distribution that, even if the population variance is unknown, the standard Normal distribution can be used.

For a test on two independent samples, the degrees of freedom are \(n_1 + n_2 - 2\), where \(n_1\) and \(n_2\) are the sizes of the first and second sample, respectively. Note the - 2 due to the fact that in this case, two quantities are estimated.

A type II error occurs when we do not reject the null hypothesis although it is in reality false.

Whether this is a good or a bad standard is a question that comes up often and is debatable. This is, however, beyond the scope of the article.

Again, p-values found via a statistical table or via R must be consistent.

