A quality control manager at a factory wants to ensure that the average weight of a product is at least 500 grams. They take a random sample of 30 products and find the sample mean weight to be 495 grams with a standard deviation of 10 grams. The manager wants to test if the average weight of the products is significantly less than 500 grams at a 5% significance level.
Null Hypothesis (\(H_0\)): The average weight of the products is at least 500 grams (\(\mu \geq 500\)).
Alternative Hypothesis (\(H_1\)): The average weight of the products is less than 500 grams (\(\mu < 500\)).
Examples Highlighting the Need for Hypothesis Testing
Medical Research:
Scenario: A pharmaceutical company develops a new drug intended to lower blood pressure.
Hypothesis Testing: The null hypothesis (\(H_0\)) might state that the new drug has no effect on blood pressure, while the alternative hypothesis (\(H_1\)) states that the drug does lower blood pressure. Hypothesis testing helps determine if the observed effects in clinical trials are statistically significant or if they could have occurred by random chance.
Quality Control:
Scenario: A factory produces light bulbs, and the quality control team wants to ensure that the average lifespan of the bulbs is 1000 hours.
Hypothesis Testing: The null hypothesis (\(H_0\)) could be that the mean lifespan of the bulbs is 1000 hours. The alternative hypothesis (\(H_1\)) might be that the mean lifespan is not 1000 hours. Hypothesis testing helps the team decide whether to accept the production process or take corrective actions.
Marketing:
Scenario: A company launches a new advertising campaign and wants to know if it has increased sales.
Hypothesis Testing: The null hypothesis (\(H_0\)) might state that the advertising campaign has no effect on sales, while the alternative hypothesis (\(H_1\)) states that the campaign has increased sales. Hypothesis testing helps the company determine if the increase in sales is statistically significant.
Education:
Scenario: An educator wants to test if a new teaching method is more effective than the traditional method.
Hypothesis Testing: The null hypothesis (\(H_0\)) could be that there is no difference in effectiveness between the new and traditional methods. The alternative hypothesis (\(H_1\)) might be that the new method is more effective. Hypothesis testing helps in making data-driven decisions about adopting new teaching strategies.
Environmental Science:
Scenario: Researchers want to determine if a new policy has reduced pollution levels in a city.
Hypothesis Testing: The null hypothesis (\(H_0\)) might state that the policy has no effect on pollution levels, while the alternative hypothesis (\(H_1\)) states that the policy has reduced pollution levels. Hypothesis testing helps in evaluating the effectiveness of environmental policies.
These examples illustrate how hypothesis testing is a crucial tool in various fields for making informed decisions based on data.
Important Terminology
Null Hypothesis (\(H_0\)): The hypothesis that there is no effect or no difference. It is the default assumption that any observed effect is due to random chance. It is the hypothesis that researchers aim to test against.
Alternative Hypothesis (\(H_1\) or \(H_a\)): The hypothesis that there is an effect or a difference. It is what researchers want to prove.
Test Statistic: A standardized value that is calculated from sample data during a hypothesis test. It is used to decide whether to reject the null hypothesis.
P-value: The probability of obtaining test results at least as extreme as the observed results, assuming that the null hypothesis is true. A smaller p-value indicates stronger evidence against the null hypothesis.
Significance Level (\(\alpha\)): A threshold set by the researcher which the p-value must be below in order to reject the null hypothesis. Common significance levels are 0.05, 0.01, and 0.10.
Critical Value: The value that the test statistic must exceed in order to reject the null hypothesis. It is determined based on the significance level and the distribution of the test statistic.
Power of a Test: The probability that the test correctly rejects a false null hypothesis (i.e., it does not make a type II error). Higher power indicates a greater ability to detect an effect when there is one.
Type I Error: The error made when the null hypothesis is true, but is incorrectly rejected. The probability of making a type I error is denoted by \(\alpha\).
Type II Error: The error made when the null hypothesis is false, but is incorrectly accepted. The probability of making a type II error is denoted by \(\beta\).
Confidence Interval: A range of values derived from the sample data that is likely to contain the population parameter. It provides an estimate of the parameter with a certain level of confidence (e.g., 95%).
One-tailed Test: A hypothesis test in which the region of rejection is on only one side of the sampling distribution. It tests for the possibility of the relationship in one direction.
Two-tailed Test: A hypothesis test in which the region of rejection is on both sides of the sampling distribution. It tests for the possibility of the relationship in both directions.
Introduction
A statistical hypothesis is typically a statement regarding a set of parameters of a population distribution.
It is termed a hypothesis because it is not known whether the statement is true.
The main challenge is to devise a method to determine whether the values of a random sample from this population align with the hypothesis.
Consider a population with distribution \(F_\theta\), where \(\theta\) is unknown.
We aim to test a specific hypothesis about \(\theta\).
This hypothesis is denoted by \(H_0\) and is referred to as the null hypothesis.
For instance, if \(F_\theta\) is a normal distribution function with mean \(\theta\) and variance equal to 1, three possible null hypotheses about \(\theta\) are:
\[
H_{0}: \theta = 1
\]
\[
H_{0}: \theta > 1
\]
\[
H_{0}: \theta \leq 1
\]
Note that the null hypothesis in the first case, when true, fully specifies the population distribution, whereas the null hypotheses in the second and third cases do not.
Simple and Composite Hypotheses
A hypothesis that fully specifies the population distribution when true is known as a simple hypothesis. (E.g., \(H_{0}: \theta = 1\))
A hypothesis that does not fully specify the population distribution is referred to as a composite hypothesis. (E.g., \(H_{0}: \theta > 1\), \(H_{0}: \theta \leq 1\))
Testing a Null Hypothesis
To test a specific null hypothesis \(H_0\), we take a sample of size \(n\) from the population, say \(X_1, X_2, \ldots, X_n\).
Based on these \(n\) values, we decide whether to accept or reject \(H_0\).
We define a region \(C\) in the \(n\)-dimensional space. This region is called the critical region.
If the sample \(X_1, X_2, \ldots, X_n\) falls within the critical region \(C\), we reject \(H_0\). Otherwise, we accept \(H_0\).
In simple terms, the critical region \(C\) helps us determine the outcome of the statistical test.
When developing a procedure for testing a given null hypothesis \(H_0\), it is crucial to recognize that two different types of errors can occur.
A type I error occurs if the test incorrectly rejects \(H_0\) when it is actually true.
A type II error occurs if the test incorrectly accepts \(H_0\) when it is actually false.
Note
The goal of a statistical test for \(H_0\) is not to definitively determine its truth but to assess if the data is consistent with \(H_0\).
Significance Level and Classical Approach
\(H_0\) should be rejected only if the observed data is highly unlikely under \(H_0\).
The classical method involves specifying a value \(\alpha\), known as the level of significance.
The test is designed so that the probability of rejecting \(H_0\) when it is true does not exceed \(\alpha\).
Common choices for \(\alpha\) are 0.1, 0.05, and 0.005.
This approach ensures that the probability of a type I error (incorrectly rejecting \(H_0\)) is controlled and does not exceed the chosen \(\alpha\).
Example
For instance, consider testing the hypothesis that the mean of a normal distribution with parameters \((\theta, 1)\) is equal to 1.
The test rejects the null hypothesis if the point estimate of \(\theta\) (i.e., the sample mean) deviates by more than \(\frac{1.96}{\sqrt{n}}\) from 1.
As we will discuss in the next section, the value \(\frac{1.96}{\sqrt{n}}\) is selected to achieve a significance level of \(\alpha = 0.05\).
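The rejection rule in this example can be sketched in Python. The snippet is only an illustration under assumptions that are not in the text: the sample size \(n = 25\), the number of simulated samples, and the random seed are arbitrary choices, and the helper name `rejects_h0` is hypothetical. It recovers the 1.96 cutoff from the standard normal quantile and then estimates, by simulation with \(\theta = 1\), how often the rule rejects a true null hypothesis.

```python
import math
import random
from statistics import NormalDist, fmean

ALPHA = 0.05
# Two-sided cutoff z_{alpha/2}; approximately 1.96 for alpha = 0.05
Z_CRIT = NormalDist().inv_cdf(1 - ALPHA / 2)


def rejects_h0(sample, theta0=1.0):
    """Reject H0: theta = theta0 when the sample mean is more than
    z_{alpha/2} / sqrt(n) away from theta0 (unit variance assumed)."""
    n = len(sample)
    return abs(fmean(sample) - theta0) > Z_CRIT / math.sqrt(n)


# Simulate samples for which H0 is true (theta = 1, variance 1)
rng = random.Random(0)   # fixed seed, arbitrary choice
n, trials = 25, 4000     # hypothetical sizes, for illustration only
rejections = sum(
    rejects_h0([rng.gauss(1.0, 1.0) for _ in range(n)])
    for _ in range(trials)
)
type_one_rate = rejections / trials

print(f"cutoff z: {Z_CRIT:.3f}")
print(f"estimated type I error rate: {type_one_rate:.3f}")
```

The estimated rate should land close to 0.05, which is what a significance level of \(\alpha = 0.05\) means for this rule.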
4 Hypothesis Tests Concerning the mean of a normal population
4.1 With known Variance (Z-test)
Let \(X_1, X_2, \ldots, X_n\) be a sample of size \(n\) from a normal distribution with an unknown mean \(\mu\) and a known variance \(\sigma^2\).
We are interested in testing the null hypothesis:
\[
H_0 : \mu = \mu_0
\]
Against the alternative hypothesis:
\[
H_1 : \mu \neq \mu_0
\]
Where \(\mu_0\) is a specified constant.
Since \(\bar{X} = \frac{1}{n} \sum_{i=1}^{n} X_i\) is a natural point estimator of \(\mu\), it is reasonable to accept \(H_0\) if \(\bar{X}\) is not too far from \(\mu_0\).
Thus, the critical region of the test would be of the form:
\[
C = \left\{ X_1, X_2, \ldots, X_n : |\bar{X} - \mu_0| > c \right\}
\]
To ensure that the test has a significance level \(\alpha\), we must determine the critical value \(c\) in the above equation such that the probability of a type I error is equal to \(\alpha\). This means \(c\) must satisfy:
\[
P_{\mu_0} \left( |\bar{X} - \mu_0| > c \right) = \alpha
\]
where \(P_{\mu_0}\) denotes that the probability is computed under the assumption that the population mean is \(\mu = \mu_0\).
When \(\mu = \mu_0\), \(\bar{X}\) follows a normal distribution with mean \(\mu_0\) and variance \(\frac{\sigma^2}{n}\). Therefore, the standardized variable \(Z\) defined by:
\[
Z = \frac{\bar{X} - \mu_0}{\sigma / \sqrt{n}}
\]
will have a standard normal distribution.
The probability of a type I error is given by:
\[
P \left( |\bar{X} - \mu_0| > c \right) = \alpha
\]
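The step from this condition to an explicit critical value is not spelled out here. A sketch of it, writing \(z_{\alpha/2}\) for the point with \(P(Z > z_{\alpha/2}) = \alpha/2\) (the notation used in the summary table later in this section), would be:
\[
P \left( |\bar{X} - \mu_0| > c \right) = P \left( |Z| > \frac{c\sqrt{n}}{\sigma} \right) = 2P \left( Z > \frac{c\sqrt{n}}{\sigma} \right) = \alpha
\]
\[
\frac{c\sqrt{n}}{\sigma} = z_{\alpha/2} \quad \Longrightarrow \quad c = \frac{z_{\alpha/2}\, \sigma}{\sqrt{n}}
\]
So the test rejects \(H_0\) when \(\frac{\sqrt{n}}{\sigma} |\bar{X} - \mu_0| > z_{\alpha/2}\) and accepts it otherwise.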
If a signal of value \(\mu\) is sent from location A, then the value received at location B is normally distributed with mean \(\mu\) and standard deviation 2. That is, the random noise added to the signal is an \(N(0, 4)\) random variable. There is reason for the people at location B to suspect that the signal value \(\mu = 8\) will be sent today. Test this hypothesis if the same signal value is independently sent five times and the average value received at location B is \(\bar{X} = 9.5\).
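One possible working of this problem is sketched below. It assumes a two-sided test of \(H_0: \mu = 8\) at the 5% level (the problem statement does not fix \(\alpha\)) and uses `statistics.NormalDist` from the Python standard library for the normal distribution function.

```python
import math
from statistics import NormalDist

mu0 = 8.0      # hypothesised signal value
sigma = 2.0    # known standard deviation of the noise
n = 5          # number of independent transmissions
x_bar = 9.5    # average value received at location B
alpha = 0.05   # assumed significance level (not given in the problem)

# Test statistic sqrt(n) * (x_bar - mu0) / sigma
z = math.sqrt(n) * (x_bar - mu0) / sigma

# Two-sided p-value 2 * P(Z >= |z|)
p_value = 2 * (1 - NormalDist().cdf(abs(z)))

print(f"z = {z:.3f}")
print(f"p-value = {p_value:.3f}")
print("reject H0" if p_value <= alpha else "do not reject H0")
```

With these numbers the statistic is about 1.68, which is below 1.96, so under the assumed 5% level the hypothesis \(\mu = 8\) is not rejected.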
4.1.1 Choosing the Significance Level
The appropriate significance level \(\alpha\) depends on the specific context and consequences of the hypothesis test.
If rejecting the null hypothesis \(H_0\) would lead to significant costs or consequences, a more conservative significance level (e.g., 0.05 or 0.01) should be chosen.
If there is a strong initial belief that \(H_0\) is true, strict evidence is required to reject \(H_0\), implying a lower significance level.
The test can be described as follows: For an observed value of the test statistic \(\sqrt{n} \frac{|\bar{X} - \mu_0|}{\sigma}\), denoted as \(v\), reject \(H_0\) if the probability of the test statistic being at least as large as \(v\) under \(H_0\) is less than or equal to \(\alpha\).
This probability is known as the p-value of the test. \(H_0\) is accepted if \(\alpha\) is less than the p-value and rejected if \(\alpha\) is greater than or equal to the p-value.
In practice, the significance level is sometimes not set in advance. Instead, the p-value is calculated from the data, and decisions are made based on the p-value.
If the p-value is much larger than any reasonable significance level, \(H_0\) is accepted. Conversely, if the p-value is very small, \(H_0\) is rejected.
4.1.2 Hypothesis Testing Summary: Z-Test
Summary of hypothesis testing for a sample from a \(N(\mu, \sigma^2)\) population with known \(\sigma^2\).

| Sample and Population | Details |
|---|---|
| Sample | \(\{X_1, X_2, \ldots, X_n\}\) |
| Population | \(N(\mu, \sigma^2)\) |
| Known Parameter | \(\sigma^2\) |
| Sample Mean | \(\bar{X} = \frac{1}{n} \sum_{i=1}^{n} X_i\) |
| Significance Level | \(\alpha\) |

| Hypothesis | Test Statistic (TS) | Reject if | p-Value if TS = t |
|---|---|---|---|
| \(H_0: \mu = \mu_0\) vs \(H_1: \mu \neq \mu_0\) | \(\sqrt{n}(\bar{X} - \mu_0)/\sigma\) | \(\lvert TS \rvert > z_{\alpha/2}\) | \(2P\{Z \geq \lvert t \rvert\}\) |
| \(H_0: \mu \leq \mu_0\) vs \(H_1: \mu > \mu_0\) | \(\sqrt{n}(\bar{X} - \mu_0)/\sigma\) | \(TS > z_{\alpha}\) | \(P\{Z \geq t\}\) |
| \(H_0: \mu \geq \mu_0\) vs \(H_1: \mu < \mu_0\) | \(\sqrt{n}(\bar{X} - \mu_0)/\sigma\) | \(TS < -z_{\alpha}\) | \(P\{Z \leq t\}\) |
Problem
Imagine you’re the quality control manager at a company that prides itself on the precision of its product weights. The company claims that the average weight of their product is exactly 100 grams. But, as a diligent manager, you decide to put this claim to the test. You randomly select a sample of 30 products and measure their weights. To your surprise, the average weight of your sample is 110 grams! Now, you need to determine if this difference is statistically significant or just a fluke. Assume the population standard deviation as 15 grams.
import numpy as np
from scipy import stats

# Given data. These figures match the opening example (H0: mu >= 500 vs
# H1: mu < 500), with the standard deviation of 10 g treated as the known sigma.
sample_mean = 495
population_mean = 500
std_dev = 10
sample_size = 30
alpha = 0.05

# Calculate the Z-score
z_score = (sample_mean - population_mean) / (std_dev / np.sqrt(sample_size))

# Calculate the p-value (lower-tailed test: P(Z <= z_score))
p_value = stats.norm.cdf(z_score)

# Determine if we reject the null hypothesis
reject_null = p_value < alpha

# Output the results
print(f"Z-score: {z_score}")
print(f"P-value: {p_value}")
print(f"Reject the null hypothesis: {reject_null}")
Z-score: -2.7386127875258306
P-value: 0.00308494966027208
Reject the null hypothesis: True
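The code above works with the figures from the opening example (sample mean 495 against 500), which is a lower-tailed test. For the 100-gram problem as stated, the claim of "exactly 100 grams" suggests a two-sided alternative. The sketch below is one way to cover all three cases of the summary table. The function name `z_test` and its `alternative` argument are hypothetical, and the 5% level is an assumption, since the problem does not state one. It uses `statistics.NormalDist` from the standard library rather than SciPy.

```python
import math
from statistics import NormalDist


def z_test(x_bar, mu0, sigma, n, alternative="two-sided"):
    """Return (test statistic, p-value) for a one-sample z-test.

    alternative: "two-sided" (H1: mu != mu0),
                 "greater"   (H1: mu > mu0),
                 "less"      (H1: mu < mu0).
    """
    ts = math.sqrt(n) * (x_bar - mu0) / sigma
    std_normal = NormalDist()
    if alternative == "two-sided":
        p = 2 * (1 - std_normal.cdf(abs(ts)))
    elif alternative == "greater":
        p = 1 - std_normal.cdf(ts)
    elif alternative == "less":
        p = std_normal.cdf(ts)
    else:
        raise ValueError(f"unknown alternative: {alternative!r}")
    return ts, p


alpha = 0.05  # assumed significance level
ts, p_value = z_test(x_bar=110, mu0=100, sigma=15, n=30)
print(f"Z-score: {ts:.3f}")
print(f"P-value: {p_value:.5f}")
print(f"Reject the null hypothesis: {p_value < alpha}")
```

Calling `z_test(495, 500, 10, 30, alternative="less")` reproduces the lower-tailed calculation shown above.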
4.2 With unknown Variance (T-test)
Let \(X_1, X_2, \ldots, X_n\) be a sample of size \(n\) from a normal distribution with an unknown mean \(\mu\) and an unknown variance \(\sigma^2\).
Say, we are interested in testing the null hypothesis:
\[
H_0 : \mu = \mu_0
\]
Against the alternative hypothesis:
\[
H_1 : \mu \neq \mu_0
\]
Where \(\mu_0\) is a specified constant.
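In the previous case (with known variance), for a significance level \(\alpha\) we accepted the null hypothesis if:
\[
\left| \frac{\sqrt{n}(\bar{X} - \mu_0)}{\sigma} \right| \leq z_{\alpha/2}
\]
When \(\sigma^2\) is unknown, this rule cannot be applied directly, since it depends on \(\sigma\).
\(T_{n-1}\) denotes a t-random variable with \(n - 1\) degrees of freedom, and \(t_{\alpha, n-1}\) is defined by \(P(T_{n-1} > t_{\alpha, n-1}) = \alpha\).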
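The usual way to complete this argument is sketched here, with \(S^2\) denoting the sample variance (a symbol not introduced in the text above). The idea is to replace \(\sigma\) by \(S\) in the test statistic:
\[
S^2 = \frac{1}{n-1} \sum_{i=1}^{n} (X_i - \bar{X})^2, \qquad T = \frac{\sqrt{n}\,(\bar{X} - \mu_0)}{S}
\]
Under \(H_0 : \mu = \mu_0\), \(T\) has a t-distribution with \(n - 1\) degrees of freedom, so a test with significance level \(\alpha\) is:
\[
\text{reject } H_0 \text{ if } |T| > t_{\alpha/2,\, n-1}, \qquad \text{accept } H_0 \text{ otherwise}
\]
If the observed value of \(T\) is \(t\), the corresponding p-value is \(2P(T_{n-1} \geq |t|)\).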
Problem
A public health official claims that the mean home water use is at most 350 gallons a day. To verify this claim, a study of 20 randomly selected homes was instigated with the result that the average daily water uses of these 20 homes were as follows:
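Summary of hypothesis tests for the variance \(\sigma^2\) of a normal population, where \(S^2\) is the sample variance:

| Hypothesis | Test Statistic (TS) | Reject if | p-Value if TS = t |
|---|---|---|---|
| \(H_0: \sigma^2 \leq \sigma^2_0\) vs \(H_1: \sigma^2 > \sigma^2_0\) | \(\frac{(n-1)S^2}{\sigma^2_0}\) | \(TS > \chi^2_{\alpha,\, n-1}\) | \(P\left(\chi^2_{n-1} \geq t\right)\) |
| \(H_0: \sigma^2 \geq \sigma^2_0\) vs \(H_1: \sigma^2 < \sigma^2_0\) | \(\frac{(n-1)S^2}{\sigma^2_0}\) | \(TS < \chi^2_{1-\alpha,\, n-1}\) | \(P\left(\chi^2_{n-1} \leq t\right)\) |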
Problem
A machine that automatically controls the amount of ribbon on a tape has recently been installed. This machine will be judged to be effective if the standard deviation \(\sigma\) of the amount of ribbon on a tape is less than 0.15 cm. If a sample of 20 tapes yields a sample variance of \(S^2 = 0.025\) cm\(^2\), are we justified in concluding that the machine is ineffective? Assume the level of significance as 0.05.
from scipy.stats import chi2

# Given data
sample_variance = 0.025
sample_size = 20
population_variance = 0.15**2  # sigma_0^2 under H0
alpha = 0.05
df = sample_size - 1

# Test H0: sigma^2 <= sigma_0^2 vs H1: sigma^2 > sigma_0^2 (upper-tailed),
# so that rejecting H0 supports the conclusion that the machine is ineffective.
# Calculate the test statistic (n - 1) * S^2 / sigma_0^2
test_statistic = df * sample_variance / population_variance

# Calculate the upper-tail critical value
chi2_critical = chi2.ppf(1 - alpha, df=df)

# Calculate the p-value P(chi2_{n-1} >= test statistic)
p_value = chi2.sf(test_statistic, df=df)

# Determine if we reject the null hypothesis
reject_null = test_statistic > chi2_critical

# Output the results
print(f"Test Statistic: {test_statistic:.3f}")
print(f"Chi-square Critical Value: {chi2_critical:.3f}")
print(f"P-value: {p_value:.3f}")
print(f"Reject the null hypothesis: {reject_null}")
Test Statistic: 21.111
Chi-square Critical Value: 30.144
P-value: 0.331
Reject the null hypothesis: False
Additional Topic
Equality of Means of two normal populations
Comparing the means of two different normal populations is common in hypothesis testing.
Example scenarios include comparing average test scores of students from two schools or average lifespans of two brands of light bulbs.
Use a two-sample z-test to test the equality of means of two normal populations whose variances are known (they need not be equal).
Null Hypothesis (\(H_0\)): The means of the two populations are equal.
Alternative Hypothesis (\(H_1\)): The means of the two populations are not equal.
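Test statistic:
\[
Z = \frac{\bar{X} - \bar{Y}}{\sqrt{\frac{\sigma_X^2}{n_X} + \frac{\sigma_Y^2}{n_Y}}}
\]
\(\bar{X}\) and \(\bar{Y}\): Sample means of the two groups
\(\sigma_X^2\) and \(\sigma_Y^2\): Population variances
\(n_X\) and \(n_Y\): Sample sizes of the two groups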
Compare the absolute value of the calculated z-value to the critical value \(z_{\alpha/2}\) from the standard normal distribution table for the chosen significance level (\(\alpha\)).
If the absolute value of the calculated z-value is greater than the critical value, reject the null hypothesis and conclude that there is a significant difference between the means of the two populations.
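The comparison described above can be sketched as follows. The summary figures for the two groups are made up purely for illustration, the function name `two_sample_z_test` is hypothetical, and the population variances are assumed to be known, as the z-test requires.

```python
import math
from statistics import NormalDist


def two_sample_z_test(x_bar, y_bar, var_x, var_y, n_x, n_y):
    """Two-sided z-test of H0: mu_X = mu_Y with known population variances.

    Returns (z statistic, p-value).
    """
    standard_error = math.sqrt(var_x / n_x + var_y / n_y)
    z = (x_bar - y_bar) / standard_error
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


# Hypothetical summary data, e.g. test scores from two schools
alpha = 0.05
z, p_value = two_sample_z_test(
    x_bar=78.0, y_bar=74.0, var_x=36.0, var_y=49.0, n_x=40, n_y=50
)

# Critical value z_{alpha/2} for the two-sided test
z_critical = NormalDist().inv_cdf(1 - alpha / 2)
reject_null = abs(z) > z_critical

print(f"z = {z:.3f}, critical value = {z_critical:.3f}")
print(f"p-value = {p_value:.4f}")
print(f"Reject the null hypothesis: {reject_null}")
```

Comparing \(|z|\) with \(z_{\alpha/2}\) and comparing the p-value with \(\alpha\) lead to the same decision; the code reports both.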
Important
Additionally, refer to section 8.4, titled “TESTING THE EQUALITY OF MEANS OF TWO NORMAL POPULATIONS,” in “INTRODUCTION TO PROBABILITY AND STATISTICS FOR ENGINEERS AND SCIENTISTS” by Sheldon M. Ross.