Introduction
Null hypothesis testing is a fundamental concept in statistics that helps us make informed decisions based on empirical evidence.
It may sound complex, but in this article, we'll break it down into simple terms, starting with the basics and gradually moving on to key concepts like p-values, significance levels, and the distinction between null and alternative hypotheses.
To make it more relatable, we'll illustrate the process with a real-world example: checking if a coin is fair or not.
The Fundamentals and Intuition
Imagine you have a coin, and you're curious if it's a fair coin, meaning it has an equal chance of landing heads or tails when flipped. Null hypothesis testing helps you evaluate this hypothesis.
1. Null Hypothesis (H0):
You begin with the null hypothesis, which asserts that there's no significant difference or effect. In this case, it could be: "The coin is fair, with an equal chance of heads and tails."
We assume this statement is true and then ask whether the data give us enough evidence to doubt it. (Think of a courtroom: the accused is presumed innocent, and the prosecutor's goal is to present enough evidence to overturn that presumption.)
2. Collecting Data:
To test your hypothesis, you flip the coin 100 times and record the results. After counting, you find that it landed on heads 60 times and tails 40 times.
We also construct something called the null distribution. When we conduct a hypothesis test, we are essentially asking, "What would our data look like if the null hypothesis were true?"
To answer this question, we use the sampling distribution of the outcome under the null hypothesis.
For example, if the coin were fair, 50 heads in 100 flips would be the single most likely outcome, and head counts further from 50 would become less and less likely.
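To make the null distribution concrete, here is a minimal sketch that computes it for 100 flips of a fair coin. It assumes the flips are independent, so the head count follows a binomial distribution; the function name null_pmf is just an illustrative choice.

```python
from math import comb

def null_pmf(k: int, n: int = 100, p: float = 0.5) -> float:
    """Probability of exactly k heads in n flips if P(heads) = p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

# Null distribution: one probability for every possible head count (0..100)
distribution = [null_pmf(k) for k in range(101)]

# The head count with the highest probability under the null hypothesis
most_likely = max(range(101), key=lambda k: distribution[k])

print("most likely head count:", most_likely)
print("P(exactly 50 heads):", round(distribution[50], 4))
print("P(exactly 60 heads):", round(distribution[60], 4))
```

Even the most likely outcome, exactly 50 heads, happens only about 8% of the time, which is why we look at ranges of outcomes rather than single values when judging how surprising a result is.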
3. Calculating a Test Statistic:
A test statistic quantifies how different your observed data is from what you'd expect under the null hypothesis.
For example, we would expect about 50 tails if a fair coin were tossed 100 times. If we get 55 tails, the outcome is not that different from what we expected. But if we get 70 tails, that is far more extreme.
If the null hypothesis were true, an outcome this extreme would be very unlikely. In this scenario, you might use the chi-squared test statistic to measure the deviation from a fair coin.
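As a rough sketch of how that measurement works, the chi-squared statistic adds up (observed - expected)^2 / expected over the two outcomes, heads and tails. The helper name below is illustrative, not part of any library.

```python
def chi_squared_statistic(observed, expected):
    """Sum of (observed - expected)^2 / expected over all categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

flips = 100
expected = [flips * 0.5, flips * 0.5]  # fair coin: 50 heads, 50 tails

for tails in (55, 70):
    observed = [flips - tails, tails]  # [heads, tails]
    print(tails, "tails ->", chi_squared_statistic(observed, expected))
# 55 tails -> 1.0
# 70 tails -> 16.0
```

The further the counts drift from 50/50, the faster the statistic grows, because the deviations are squared.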
4. Alternative Hypothesis (Ha):
The alternative hypothesis represents what you're trying to show. Here, it might be: "The coin is not fair; it has a bias towards heads or tails."
5. Calculating the p-value:
The p-value is a central concept in null hypothesis testing. It tells you the probability of obtaining your observed results (or more extreme results) if the null hypothesis were true.
A low p-value indicates that your data would be surprising if the null hypothesis were true, which counts as evidence against it.
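Here is one way this could be computed for 60 heads in 100 flips, using an exact binomial calculation. This is a sketch under stated assumptions: the function names are illustrative, and "more extreme" is interpreted in two ways, either "at least this many heads" (one-sided) or "at least this far from 50 in either direction" (two-sided).

```python
from math import comb

def binomial_pmf(k: int, n: int, p: float = 0.5) -> float:
    """Probability of exactly k heads in n flips when P(heads) = p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

def one_sided_p_value(heads: int, n: int) -> float:
    """P(at least `heads` heads) if the coin is fair."""
    return sum(binomial_pmf(k, n) for k in range(heads, n + 1))

def two_sided_p_value(heads: int, n: int) -> float:
    """P(head count at least as far from n/2 as observed) if the coin is fair."""
    distance = abs(heads - n / 2)
    return sum(
        binomial_pmf(k, n) for k in range(n + 1) if abs(k - n / 2) >= distance
    )

n_flips, observed_heads = 100, 60
print(f"one-sided: {one_sided_p_value(observed_heads, n_flips):.4f}")
print(f"two-sided: {two_sided_p_value(observed_heads, n_flips):.4f}")
```

For 60 heads, the one-sided figure comes out close to the 0.03 used later in this article, while the two-sided figure is about twice as large. It is worth deciding which definition of "more extreme" matches your alternative hypothesis before looking at the data.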
P-Value and Significance Level
Now, let's dive deeper into p-values and significance levels.
P-Value:
Think of the p-value as a measure of evidence against the null hypothesis. It ranges from 0 to 1.
A low p-value (typically less than 0.05) suggests that your observed data would be unlikely to occur by random chance alone if the null hypothesis were true.
(If the null hypothesis were true, obtaining 80 tails would be much rarer than obtaining 55 tails. Therefore, the former has a smaller p-value.)
In our coin example, a p-value of 0.03 means that, if the coin were truly fair, there would be only a 3% chance of getting results at least as extreme as ours.
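To see the 55-tails versus 80-tails comparison in numbers, the sketch below turns a chi-squared statistic into an approximate p-value. It assumes the chi-squared distribution with 1 degree of freedom, which is a large-sample approximation; for 1 degree of freedom the tail probability can be written with the standard library's erfc. The function name is illustrative.

```python
from math import erfc, sqrt

def chi_squared_p_value(tails: int, flips: int = 100) -> float:
    """Approximate two-sided p-value for a fair-coin test (chi-squared, 1 df)."""
    expected = flips / 2
    heads = flips - tails
    statistic = (
        (heads - expected) ** 2 / expected + (tails - expected) ** 2 / expected
    )
    # For 1 degree of freedom: P(chi-squared > x) = erfc(sqrt(x / 2))
    return erfc(sqrt(statistic / 2))

for tails in (55, 80):
    print(f"{tails} tails -> p = {chi_squared_p_value(tails):.3g}")
```

With 55 tails the p-value is large (roughly 0.3), so the result is entirely unremarkable for a fair coin. With 80 tails it is vanishingly small.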
Significance Level (α):
The significance level, denoted as α, is a predetermined threshold you choose to make decisions about the null hypothesis.
Common values for α are 0.05 and 0.01. If your calculated p-value is less than or equal to α, you reject the null hypothesis in favor of the alternative. If it's greater, you fail to reject the null hypothesis.
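The decision rule itself is a single comparison. A minimal sketch, with an illustrative function name and the article's example p-value of 0.03 as input:

```python
def decide(p_value: float, alpha: float = 0.05) -> str:
    """Apply the decision rule: reject H0 when p_value <= alpha."""
    if not 0.0 <= p_value <= 1.0:
        raise ValueError("p_value must be between 0 and 1")
    if p_value <= alpha:
        return "reject H0"
    return "fail to reject H0"

print(decide(0.03, alpha=0.05))  # reject H0
print(decide(0.03, alpha=0.01))  # fail to reject H0
```

The same p-value leads to different decisions under different thresholds, which is why α should be chosen before the data is collected.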
Is the Coin Fair?
Applying null hypothesis testing to our coin example:
Null Hypothesis (H0): The coin is fair, with an equal chance of heads and tails.
Alternative Hypothesis (Ha): The coin is not fair; it has a bias towards heads or tails.
After flipping the coin 100 times and calculating a p-value of 0.03, using a significance level of α = 0.05, you would reject the null hypothesis.
This suggests that the coin may not be fair, and there might be a bias towards either heads or tails.
Conclusion
Null hypothesis testing is a powerful tool for making data-driven decisions based on evidence.
By understanding its fundamentals, as well as concepts like p-values, significance levels, and the distinction between null and alternative hypotheses, you can better interpret statistical results.