3  Hypothesis Testing

Author

Abhijith M S

Published

February 26, 2025

Problem

A quality control manager at a factory wants to ensure that the average weight of a product is at least 500 grams. They take a random sample of 30 products and find the sample mean weight to be 495 grams with a standard deviation of 10 grams. The manager wants to test if the average weight of the products is significantly less than 500 grams at a 5% significance level.

  • Null Hypothesis (\(H_0\)): The average weight of the products is at least 500 grams (\(\mu\) \(\geq\) 500).

  • Alternative Hypothesis (\(H_1\)): The average weight of the products is less than 500 grams (\(\mu\) < 500).

Examples Highlighting the Need for Hypothesis Testing
  1. Medical Research:
    • Scenario: A pharmaceutical company develops a new drug intended to lower blood pressure.
    • Hypothesis Testing: The null hypothesis (\(H_0\)) might state that the new drug has no effect on blood pressure, while the alternative hypothesis (\(H_1\)) states that the drug does lower blood pressure. Hypothesis testing helps determine if the observed effects in clinical trials are statistically significant or if they could have occurred by random chance.
  2. Quality Control:
    • Scenario: A factory produces light bulbs, and the quality control team wants to ensure that the average lifespan of the bulbs is 1000 hours.
    • Hypothesis Testing: The null hypothesis (\(H_0\)) could be that the mean lifespan of the bulbs is 1000 hours. The alternative hypothesis (\(H_1\)) might be that the mean lifespan is not 1000 hours. Hypothesis testing helps the team decide whether to accept the production process or take corrective actions.
  3. Marketing:
    • Scenario: A company launches a new advertising campaign and wants to know if it has increased sales.
    • Hypothesis Testing: The null hypothesis (\(H_0\)) might state that the advertising campaign has no effect on sales, while the alternative hypothesis (\(H_1\)) states that the campaign has increased sales. Hypothesis testing helps the company determine if the increase in sales is statistically significant.
  4. Education:
    • Scenario: An educator wants to test if a new teaching method is more effective than the traditional method.
    • Hypothesis Testing: The null hypothesis (\(H_0\)) could be that there is no difference in effectiveness between the new and traditional methods. The alternative hypothesis (\(H_1\)) might be that the new method is more effective. Hypothesis testing helps in making data-driven decisions about adopting new teaching strategies.
  5. Environmental Science:
    • Scenario: Researchers want to determine if a new policy has reduced pollution levels in a city.
    • Hypothesis Testing: The null hypothesis (\(H_0\)) might state that the policy has no effect on pollution levels, while the alternative hypothesis (\(H_1\)) states that the policy has reduced pollution levels. Hypothesis testing helps in evaluating the effectiveness of environmental policies.

These examples illustrate how hypothesis testing is a crucial tool in various fields for making informed decisions based on data.

Important Terminology
  • Null Hypothesis (\(H_0\)): The hypothesis that there is no effect or no difference. It is the default assumption that any observed effect is due to random chance. It is the hypothesis that researchers aim to test against.

  • Alternative Hypothesis (\(H_1\) or \(H_a\)): The hypothesis that there is an effect or a difference. It is what researchers want to prove.

  • Test Statistic: A standardized value that is calculated from sample data during a hypothesis test. It is used to decide whether to reject the null hypothesis.

  • P-value: The probability of obtaining test results at least as extreme as the observed results, assuming that the null hypothesis is true. A smaller p-value indicates stronger evidence against the null hypothesis.

  • Significance Level (\(\alpha\)): A threshold set by the researcher which the p-value must be below in order to reject the null hypothesis. Common significance levels are 0.05, 0.01, and 0.10.

  • Critical Value: The value that the test statistic must exceed in order to reject the null hypothesis. It is determined based on the significance level and the distribution of the test statistic.

  • Power of a Test: The probability that the test correctly rejects a false null hypothesis (i.e., it does not make a type II error). Higher power indicates a greater ability to detect an effect when there is one.

  • Type I Error: The error made when the null hypothesis is true, but is incorrectly rejected. The probability of making a type I error is denoted by \(\alpha\).

  • Type II Error: The error made when the null hypothesis is false, but is incorrectly accepted. The probability of making a type II error is denoted by \(\beta\).

  • Confidence Interval: A range of values derived from the sample data that is likely to contain the population parameter. It provides an estimate of the parameter with a certain level of confidence (e.g., 95%).

  • One-tailed Test: A hypothesis test in which the region of rejection is on only one side of the sampling distribution. It tests for the possibility of the relationship in one direction.

  • Two-tailed Test: A hypothesis test in which the region of rejection is on both sides of the sampling distribution. It tests for the possibility of the relationship in both directions.

Introduction

  • A statistical hypothesis is typically a statement regarding a set of parameters of a population distribution.

  • It is termed a hypothesis because it is not known whether it is true.

  • The main challenge is to devise a method to determine whether the values of a random sample from this population align with the hypothesis.

  • Consider a population with distribution \(F_\theta\), where \(\theta\) is unknown.

  • We aim to test a specific hypothesis about \(\theta\).

  • This hypothesis is denoted by \(H_0\) and is referred to as the null hypothesis.

  • For instance, if \(F_\theta\) is a normal distribution function with mean \(\theta\) and variance equal to 1, three possible null hypotheses about \(\theta\) are:

\[ H_{0}: \theta = 1 \]

\[ H_{0}: \theta > 1 \]

\[ H_{0}: \theta \leq 1 \]

  • The null hypothesis in the first case fully specifies the population distribution.

  • The null hypotheses in the second and third cases do not.

Simple and Composite Hypotheses
  • A hypothesis that fully specifies the population distribution when true is known as a simple hypothesis (e.g., \(H_{0}: \theta = 1\)).

  • A hypothesis that does not fully specify the population distribution is referred to as a composite hypothesis (e.g., \(H_{0}: \theta > 1\), \(H_{0}: \theta \leq 1\)).

Testing a Null Hypothesis

  • To test a specific null hypothesis \(H_0\), we take a sample of size \(n\) from the population, say \(X_1, X_2, \ldots, X_n\).

  • Based on these \(n\) values, we decide whether to accept or reject \(H_0\).

  • We define a region \(C\) in the \(n\)-dimensional space. This region is called the critical region.

  • If the sample \(X_1, X_2, \ldots, X_n\) falls within the critical region \(C\), we reject \(H_0\). Otherwise, we accept \(H_0\).

  • In simple terms, the critical region \(C\) helps us determine the outcome of the statistical test.

\[ \text{accept } H_0 \text{ if } \left( X_1, X_2, \ldots, X_n \right) \notin C \]

and

\[ \text{reject } H_0 \text{ if } \left( X_1, X_2, \ldots, X_n \right) \in C \]

Types of Errors in Hypothesis Testing
  • When developing a procedure for testing a given null hypothesis \(H_0\), it is crucial to recognize that two different types of errors can occur.

  • A type I error occurs if the test incorrectly rejects \(H_0\) when it is actually true.

  • A type II error occurs if the test incorrectly accepts \(H_0\) when it is actually false.

Note

The goal of a statistical test for \(H_0\) is not to definitively determine its truth but to assess if the data is consistent with \(H_0\).

Significance Level and Classical Approach

  • \(H_0\) should be rejected only if the observed data is highly unlikely under \(H_0\).

  • The classical method involves specifying a value \(\alpha\), known as the level of significance.

  • The test is designed so that the probability of rejecting \(H_0\) when it is true does not exceed \(\alpha\).

  • Common choices for \(\alpha\) are 0.1, 0.05, and 0.005.

  • This approach ensures that the probability of a type I error (incorrectly rejecting \(H_0\)) is controlled and does not exceed the chosen \(\alpha\).

Example
  • For instance, consider testing the hypothesis that the mean of a normal distribution with parameters \((\theta, 1)\) is equal to 1.

  • The test rejects the null hypothesis if the point estimate of \(\theta\) (i.e., the sample mean) deviates more than \(\frac{1.96}{\sqrt{n}}\) from 1.

  • As we will discuss in the next section, the value \(\frac{1.96}{\sqrt{n}}\) is selected to achieve a significance level of \(\alpha = 0.05\).
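
The claim that the \(\frac{1.96}{\sqrt{n}}\) cutoff gives a type I error probability of about 0.05 can be checked by simulation. The sketch below is an illustration only: the sample size `n = 25`, the number of trials, and the random seed are arbitrary assumptions. It draws repeated samples from \(N(1, 1)\), so that \(H_0\) is true, and counts how often the rule rejects.

```python
import numpy as np

rng = np.random.default_rng(0)  # seed chosen arbitrarily for reproducibility

n = 25            # sample size (assumed for illustration)
trials = 100_000  # number of simulated samples (assumed)
theta0 = 1.0      # value of theta under H0

# Each row is one sample of size n drawn from N(theta0, 1), so H0 is true.
samples = rng.normal(loc=theta0, scale=1.0, size=(trials, n))
sample_means = samples.mean(axis=1)

# Rejection rule from the example: |sample mean - 1| > 1.96 / sqrt(n)
reject = np.abs(sample_means - theta0) > 1.96 / np.sqrt(n)

# Fraction of samples in which a true H0 was rejected
type_i_rate = reject.mean()
print(f"Estimated type I error rate: {type_i_rate:.4f}")
```

The estimated rate should land close to 0.05, with small deviations caused by simulation noise.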

4 Hypothesis Tests Concerning the mean of a normal population

4.1 With known Variance (Z-test)

  • Let \(X_1, X_2, \ldots, X_n\) be a sample of size \(n\) from a normal distribution with an unknown mean \(\mu\) and a known variance \(\sigma^2\).

  • We are interested in testing the null hypothesis:

\[ H_0 : \mu = \mu_0 \]

  • Against the alternative hypothesis:

\[ H_1 : \mu \neq \mu_0 \]

  • Where \(\mu_0\) is a specified constant.

  • Since \(\bar{X} = \frac{1}{n} \sum_{i=1}^{n} X_i\) is a natural point estimator of \(\mu\), it is reasonable to accept \(H_0\) if \(\bar{X}\) is not too far from \(\mu_0\).

  • Thus, the critical region of the test would be of the form:

\[ C = \{X_1, \ldots, X_n : |\bar{X} - \mu_0| > c\} \]

for some suitably chosen value \(c\).

  • To ensure that the test has a significance level \(\alpha\), we must determine the critical value \(c\) in the above equation such that the probability of a type I error is equal to \(\alpha\). This means \(c\) must satisfy:

\[ P_{\mu_0} \{|\bar{X} - \mu_0| > c\} = \alpha \]

where \(P_{\mu_0}\) denotes that the probability is computed under the assumption that population mean, \(\mu = \mu_0\).

  • When \(\mu = \mu_0\), \(\bar{X}\) follows a normal distribution with mean \(\mu_0\) and variance \(\frac{\sigma^2}{n}\). Therefore, the standardized variable \(Z\) defined by:

\[ Z = \frac{\bar{X} - \mu_0}{\sigma / \sqrt{n}} \]

will have a standard normal distribution.

  • The probability of a type I error is given by:

\[ P \left( |\bar{X} - \mu_0| > c \right) = \alpha \]

  • Equivalently, this can be written as:

\[ 2P \left( Z > \frac{c \sqrt{n}}{\sigma} \right) = \alpha \]

  • Where \(Z\) is a standard normal random variable. We know that:

\[ P \left( Z > z_{\alpha/2} \right) = \frac{\alpha}{2} \]

  • Therefore, we have:

\[ \frac{c \sqrt{n}}{\sigma} = z_{\alpha/2} \]

  • Solving for \(c\), we get:

\[ c = \frac{z_{\alpha/2} \sigma}{\sqrt{n}} \]

  • Thus, the test at significance level \(\alpha\) is to reject \(H_0\) if:

\[ |\bar{X} - \mu_0| > \frac{z_{\alpha/2} \sigma}{\sqrt{n}} \]

  • And accept \(H_0\) otherwise. Equivalently, we can reject \(H_0\) if:

\[ \sqrt{n} \frac{|\bar{X} - \mu_0|}{\sigma} > z_{\alpha/2} \]

  • And accept \(H_0\) if:

\[ \sqrt{n} \frac{|\bar{X} - \mu_0|}{\sigma} \leq z_{\alpha/2} \]
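
The formula for \(c\) derived above can be evaluated numerically. The following is a minimal sketch, assuming illustrative values \(\sigma = 2\) and \(n = 16\) that do not come from the text. It computes \(z_{\alpha/2}\) and \(c\), then confirms that the resulting type I error probability equals \(\alpha\).

```python
import numpy as np
from scipy import stats

alpha = 0.05   # significance level
sigma = 2.0    # known population standard deviation (assumed value)
n = 16         # sample size (assumed value)

# z_{alpha/2}: the point with upper-tail probability alpha/2 under N(0, 1)
z_crit = stats.norm.isf(alpha / 2)

# Critical distance c = z_{alpha/2} * sigma / sqrt(n)
c = z_crit * sigma / np.sqrt(n)

# Under H0 the sample mean has standard deviation sigma / sqrt(n), so
# P(|Xbar - mu0| > c) = 2 * P(Z > c * sqrt(n) / sigma)
type_i_prob = 2 * stats.norm.sf(c * np.sqrt(n) / sigma)

print(f"z_crit = {z_crit:.4f}, c = {c:.4f}, P(type I error) = {type_i_prob:.4f}")
```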

Problem

If a signal of value \(\mu\) is sent from location A, then the value received at location B is normally distributed with mean \(\mu\) and standard deviation 2. That is, the random noise added to the signal is an \(N(0, 4)\) random variable. There is reason for the people at location B to suspect that the signal value \(\mu = 8\) will be sent today. Test this hypothesis if the same signal value is independently sent five times and the average value received at location B is \(\bar{X} = 9.5\).
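
One way to work through the signal problem is the two-sided Z-test derived above. The sketch below assumes a significance level of 0.05, since the problem does not fix one.

```python
import numpy as np
from scipy import stats

mu0 = 8.0      # hypothesized signal value
sigma = 2.0    # known standard deviation of the noise
n = 5          # number of times the signal was sent
x_bar = 9.5    # observed average at location B
alpha = 0.05   # significance level (assumed; not given in the problem)

# Test statistic: sqrt(n) * (Xbar - mu0) / sigma
z = np.sqrt(n) * (x_bar - mu0) / sigma

# Two-sided p-value: 2 * P(Z >= |z|)
p_value = 2 * stats.norm.sf(abs(z))

reject_null = p_value < alpha
print(f"Z-score: {z:.4f}")
print(f"P-value: {p_value:.4f}")
print(f"Reject the null hypothesis: {reject_null}")
```

Under these assumptions the test statistic is about 1.68, which is below 1.96, so the data do not contradict \(\mu = 8\) at the 5% level.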

4.1.1 Choosing the Significance Level

  • The appropriate significance level \(\alpha\) depends on the specific context and consequences of the hypothesis test.

  • If rejecting the null hypothesis \(H_0\) would lead to significant costs or consequences, a more conservative significance level (e.g., 0.05 or 0.01) should be chosen.

  • If there is a strong initial belief that \(H_0\) is true, strict evidence is required to reject \(H_0\), implying a lower significance level.

  • The test can be described as follows: For an observed value \(v\) of the test statistic \(\sqrt{n} \frac{|\bar{X} - \mu_0|}{\sigma}\), reject \(H_0\) if the probability of the test statistic being at least as large as \(v\) under \(H_0\) is less than or equal to \(\alpha\).

  • This probability is known as the p-value of the test. \(H_0\) is accepted if \(\alpha\) is less than the p-value and rejected if \(\alpha\) is greater than or equal to the p-value.

  • In practice, the significance level is sometimes not set in advance. Instead, the p-value is calculated from the data, and decisions are made based on the p-value.

  • If the p-value is much larger than any reasonable significance level, \(H_0\) is accepted. Conversely, if the p-value is very small, \(H_0\) is rejected.

4.1.2 Hypothesis Testing Summary: Z-Test

Caption: Summary of hypothesis testing for a sample from a \(N(\mu, \sigma^2)\) population with known \(\sigma^2\).

| Sample and Population Details | |
|---|---|
| Sample | \(\{X_1, X_2, \ldots, X_n\}\) |
| Population | \(N(\mu, \sigma^2)\) |
| Known Parameter | \(\sigma^2\) |
| Sample Mean | \(\bar{X} = \frac{1}{n} \sum_{i=1}^{n} X_i\) |
| Significance Level | \(\alpha\) |

| Hypothesis | Test Statistic (TS) | Reject if | p-Value if TS = t |
|---|---|---|---|
| \(H_0: \mu = \mu_0\) vs \(H_1: \mu \neq \mu_0\) | \(\sqrt{n}(\bar{X} - \mu_0)/\sigma\) | \(\lvert TS \rvert > z_{\alpha/2}\) | \(2P\{Z \geq \lvert t \rvert\}\) |
| \(H_0: \mu \leq \mu_0\) vs \(H_1: \mu > \mu_0\) | \(\sqrt{n}(\bar{X} - \mu_0)/\sigma\) | \(TS > z_{\alpha}\) | \(P\{Z \geq t\}\) |
| \(H_0: \mu \geq \mu_0\) vs \(H_1: \mu < \mu_0\) | \(\sqrt{n}(\bar{X} - \mu_0)/\sigma\) | \(TS < -z_{\alpha}\) | \(P\{Z \leq t\}\) |

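
The three rows of the Z-test summary can be collected into one helper. This is a sketch rather than a library routine: the function name `z_test`, its `alternative` argument, and the numbers in the usage loop are all assumptions made for illustration.

```python
import numpy as np
from scipy import stats

def z_test(x_bar, mu0, sigma, n, alternative="two-sided"):
    """Return (test statistic, p-value) for a Z-test with known sigma."""
    ts = np.sqrt(n) * (x_bar - mu0) / sigma
    if alternative == "two-sided":   # H1: mu != mu0
        p_value = 2 * stats.norm.sf(abs(ts))
    elif alternative == "greater":   # H1: mu > mu0
        p_value = stats.norm.sf(ts)
    elif alternative == "less":      # H1: mu < mu0
        p_value = stats.norm.cdf(ts)
    else:
        raise ValueError("alternative must be 'two-sided', 'greater' or 'less'")
    return ts, p_value

# Illustrative values (assumed): x_bar = 52, mu0 = 50, sigma = 6, n = 36
for alt in ("two-sided", "greater", "less"):
    ts, p = z_test(52, 50, 6, 36, alternative=alt)
    print(f"{alt:>9}: TS = {ts:.3f}, p-value = {p:.4f}")
```

In each case \(H_0\) is rejected at level \(\alpha\) when the returned p-value is at most \(\alpha\), which is equivalent to the "Reject if" column.
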
Problem

Imagine you’re the quality control manager at a company that prides itself on the precision of its product weights. The company claims that the average weight of their product is exactly 100 grams. But, as a diligent manager, you decide to put this claim to the test. You randomly select a sample of 30 products and measure their weights. To your surprise, the average weight of your sample is 110 grams! Now, you need to determine if this difference is statistically significant or just a fluke. Assume the population standard deviation as 15 grams.

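# Note: the values below come from the chapter's opening problem
# (H0: mu >= 500 vs H1: mu < 500), not from the 100-gram problem stated above.
# The standard deviation of 10 g is treated as the known population value.
import numpy as np
from scipy import stats

# Given data
sample_mean = 495
population_mean = 500
std_dev = 10
sample_size = 30
alpha = 0.05

# Calculate the Z-score
z_score = (sample_mean - population_mean) / (std_dev / np.sqrt(sample_size))

# Calculate the p-value (left-tailed test: P(Z <= z))
p_value = stats.norm.cdf(z_score)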

# Determine if we reject the null hypothesis
reject_null = p_value < alpha

# Output the results
print(f"Z-score: {z_score}")
print(f"P-value: {p_value}")
print(f"Reject the null hypothesis: {reject_null}")
Z-score: -2.7386127875258306
P-value: 0.00308494966027208
Reject the null hypothesis: True
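
For the 100-gram claim stated in the problem above, a two-sided Z-test is the natural reading, since the claim is that the mean is exactly 100 grams. The following is a minimal sketch under that reading, with \(\alpha = 0.05\) assumed because the problem does not specify a level.

```python
import numpy as np
from scipy import stats

claimed_mean = 100   # company's claimed average weight (grams)
sample_mean = 110    # observed sample average (grams)
sigma = 15           # population standard deviation (grams)
sample_size = 30
alpha = 0.05         # significance level (assumed; not given in the problem)

# Test statistic: sqrt(n) * (Xbar - mu0) / sigma
z_score = (sample_mean - claimed_mean) / (sigma / np.sqrt(sample_size))

# Two-sided p-value: 2 * P(Z >= |z|)
p_value = 2 * stats.norm.sf(abs(z_score))

reject_null = p_value < alpha
print(f"Z-score: {z_score:.4f}")
print(f"P-value: {p_value:.6f}")
print(f"Reject the null hypothesis: {reject_null}")
```

With a test statistic of roughly 3.65, the p-value is far below 0.05, so under these assumptions the 10-gram difference is unlikely to be a fluke.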

4.2 With unknown Variance (T-test)

  • Let \(X_1, X_2, \ldots, X_n\) be a sample of size \(n\) from a normal distribution with an unknown mean \(\mu\) and an unknown variance \(\sigma^2\).

  • Say, we are interested in testing the null hypothesis:

\[ H_0 : \mu = \mu_0 \]

  • Against the alternative hypothesis:

\[ H_1 : \mu \neq \mu_0 \]

  • Where \(\mu_0\) is a specified constant.

  • In the previous case (with known variance), for a significance level (\(\alpha\)) we accepted the null hypothesis if:

\[ \left| \frac{\bar{X} - \mu_0 }{\sigma/\sqrt{n}} \right| \leq z_{\alpha/2} \]

  • But in this case, \(\sigma\) is unknown.

  • When \(\mu = \mu_0\), the statistic \(T\) given below has a t-distribution with \(n-1\) degrees of freedom.

\[ T = \frac{\bar{X} - \mu_0 }{S/\sqrt{n}} \]

where \(S\) is the sample standard deviation.

  • Hence, for \(H_0: \mu = \mu_0\) against \(H_1: \mu \neq \mu_0\), the two-sided t-test proceeds analogously to the Z-test:
  • reject the null hypothesis (\(H_0\)) if:

\[ \left| \frac{\bar{X} - \mu_0}{S/\sqrt{n}} \right| > t_{\alpha/2,\, n-1} \]

  • accept \(H_0\) if:

\[ \left| \frac{\bar{X} - \mu_0}{S/\sqrt{n}} \right| \leq t_{\alpha/2,\, n-1} \]
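
To see how replacing \(\sigma\) with \(S\) changes the rejection threshold, the sketch below compares the two-sided t critical value with the corresponding z critical value. The sample sizes are arbitrary choices for illustration.

```python
from scipy import stats

alpha = 0.05

# Two-sided z critical value: upper alpha/2 point of N(0, 1)
z_crit = stats.norm.isf(alpha / 2)

# Two-sided t critical values with n - 1 degrees of freedom
t_crits = {n: stats.t.isf(alpha / 2, df=n - 1) for n in (5, 10, 30, 100)}

for n, t_crit in t_crits.items():
    print(f"n = {n:3d}: t critical = {t_crit:.3f}, z critical = {z_crit:.3f}")
```

The t critical value is larger than the z critical value for every sample size and approaches it as \(n\) grows, reflecting the extra uncertainty from estimating \(\sigma\).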

4.2.1 Hypothesis Testing Summary: T-Test

Caption: Summary of hypothesis testing for a sample from a \(N(\mu, \sigma^2)\) population with unknown \(\sigma^2\).

| Sample and Population Details | |
|---|---|
| Sample | \(\{X_1, X_2, \ldots, X_n\}\) |
| Population | \(N(\mu, \sigma^2)\) |
| Sample Mean | \(\bar{X} = \frac{1}{n} \sum_{i=1}^{n} X_i\) |
| Sample Variance | \(S^2 = \frac{1}{n-1} \sum_{i=1}^{n} (X_i-\bar{X})^2\) |
| Significance Level | \(\alpha\) |

| Hypothesis | Test Statistic (TS) | Reject if | p-Value if TS = t |
|---|---|---|---|
| \(H_0: \mu = \mu_0\) vs \(H_1: \mu \neq \mu_0\) | \(\sqrt{n}(\bar{X} - \mu_0)/S\) | \(\lvert TS \rvert > t_{\alpha/2, n-1}\) | \(2P\{T_{n-1} \geq \lvert t \rvert\}\) |
| \(H_0: \mu \leq \mu_0\) vs \(H_1: \mu > \mu_0\) | \(\sqrt{n}(\bar{X} - \mu_0)/S\) | \(TS > t_{\alpha, n-1}\) | \(P\{T_{n-1} \geq t\}\) |
| \(H_0: \mu \geq \mu_0\) vs \(H_1: \mu < \mu_0\) | \(\sqrt{n}(\bar{X} - \mu_0)/S\) | \(TS < -t_{\alpha, n-1}\) | \(P\{T_{n-1} \leq t\}\) |

\(T_{n-1}\) is a t-random variable with \(n - 1\) degrees of freedom: \(P(T_{n-1} > t_{\alpha, n-1}) = \alpha\).

Problem

A public health official claims that the mean home water use is at most 350 gallons a day. To verify this claim, a study of 20 randomly selected homes was conducted, and the average daily water use of these 20 homes was as follows:

340 344 362 375 356 386 354 364 332 402 340 355 362 322 372 324 318 360 338 370

Do the data contradict the official’s claim?

import numpy as np
from scipy import stats

# Given data
data = [340, 344, 362, 375, 356, 386, 354, 364, 332, 402, 340, 355, 362, 322, 372, 324, 318, 360, 338, 370]
sample_mean = np.mean(data)
sample_std = np.std(data, ddof=1)
sample_size = len(data)
population_mean = 350
alpha = 0.05

# Calculate the T-score
t_score = (sample_mean - population_mean) / (sample_std / np.sqrt(sample_size))

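# Calculate the p-value (two-sided). For the one-sided claim in the problem
# (H0: mu <= 350 vs H1: mu > 350) the p-value is 1 - cdf(t_score), which is
# half of the two-sided value here because t_score > 0.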
p_value = 2 * (1 - stats.t.cdf(np.abs(t_score), df=sample_size-1))

# Determine if we reject the null hypothesis
reject_null = p_value < alpha

# Output the results
print(f"Sample Mean: {sample_mean}")
print(f"Sample Standard Deviation: {sample_std}")
print(f"T-score: {t_score}")
print(f"P-value: {p_value}")
print(f"Reject the null hypothesis: {reject_null}")
Sample Mean: 353.8
Sample Standard Deviation: 21.847798877449275
T-score: 0.7778411328447066
P-value: 0.4462410900531899
Reject the null hypothesis: False
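
Because the official's claim is "at most 350", the matching test from the summary table is \(H_0: \mu \leq 350\) against \(H_1: \mu > 350\). The sketch below runs that one-sided test with `scipy.stats.ttest_1samp`; it assumes a SciPy version that supports the `alternative` argument (1.6 or later).

```python
from scipy import stats

# Daily water use (gallons) of the 20 sampled homes
data = [340, 344, 362, 375, 356, 386, 354, 364, 332, 402,
        340, 355, 362, 322, 372, 324, 318, 360, 338, 370]
alpha = 0.05

# One-sided test: H0: mu <= 350 vs H1: mu > 350
result = stats.ttest_1samp(data, popmean=350, alternative="greater")

reject_null = result.pvalue < alpha
print(f"T-score: {result.statistic:.4f}")
print(f"One-sided p-value: {result.pvalue:.4f}")
print(f"Reject the null hypothesis: {reject_null}")
```

The one-sided p-value is half of the two-sided one because the test statistic is positive, and the conclusion is unchanged: the data do not contradict the claim.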

5 Hypothesis Tests Concerning the variance of a normal population

  • Let \(X_1, X_2, \ldots, X_n\) be a sample of size \(n\) from a normal distribution with an unknown mean \(\mu\) and variance \(\sigma^2\).

  • We are interested in testing the null hypothesis:

\[ H_0 : \sigma^2 = \sigma^2_0 \]

  • Against the alternative hypothesis:

\[ H_1 : \sigma^2 \neq \sigma^2_0 \]

  • Where \(\sigma^2_0\) is a specified constant.

  • We know from the discussion on sampling distributions that, when \(H_0\) is true, \(\frac{(n-1)S^2}{\sigma^2_0}\) has a chi-squared distribution with \(n-1\) degrees of freedom.

\[ \frac{(n-1)S^2}{\sigma^2_0} \sim \chi^2_{n-1} \]

  • Also,

\[ P_{H_0} \left( \chi^2_{1-\alpha/2,\,\, n-1} \leq \frac{(n-1)S^2}{\sigma^2_0} \leq \chi^2_{\alpha/2,\,\, n-1} \right) = 1 - \alpha \]

  • In this case the test statistic (TS) is:

\[ TS = \frac{(n-1)S^2}{\sigma^2_0} \]

  • The p-value for this case is:

\[ \text{p-value} = 2 \, \min\left( P(\chi^2_{n-1} < TS),\; 1 - P(\chi^2_{n-1} < TS) \right) \]
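
The two-sided test statistic and p-value above can be computed directly from raw data. The following sketch uses a hypothetical helper `variance_test_two_sided` and made-up measurements; neither appears in the original text.

```python
import numpy as np
from scipy import stats

def variance_test_two_sided(data, sigma0_sq):
    """Return (TS, p-value) for H0: sigma^2 = sigma0_sq vs H1: sigma^2 != sigma0_sq."""
    data = np.asarray(data, dtype=float)
    n = data.size
    s_sq = data.var(ddof=1)              # sample variance S^2
    ts = (n - 1) * s_sq / sigma0_sq      # TS = (n - 1) S^2 / sigma0^2
    lower_tail = stats.chi2.cdf(ts, df=n - 1)
    p_value = 2 * min(lower_tail, 1 - lower_tail)
    return ts, p_value

# Hypothetical measurements, made up for illustration
sample = [9.8, 10.2, 10.4, 9.9, 10.0, 10.1, 9.7, 10.3]
ts, p_value = variance_test_two_sided(sample, sigma0_sq=0.04)
print(f"Test Statistic: {ts:.3f}")
print(f"P-value: {p_value:.4f}")
```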

5.0.1 Hypothesis Testing Summary: chi-square Test

Caption: Summary of hypothesis testing for a sample from a \(N(\mu, \sigma^2)\) population.

| Sample and Population Details | |
|---|---|
| Sample | \(\{X_1, X_2, \ldots, X_n\}\) |
| Population | \(N(\mu, \sigma^2)\) |
| Sample Mean | \(\bar{X} = \frac{1}{n} \sum_{i=1}^{n} X_i\) |
| Sample Variance | \(S^2 = \frac{1}{n-1} \sum_{i=1}^{n} (X_i-\bar{X})^2\) |
| Significance Level | \(\alpha\) |

| Hypothesis | Test Statistic (TS) | Reject if | p-Value if TS = t |
|---|---|---|---|
| \(H_0: \sigma^2 = \sigma^2_0\) vs \(H_1: \sigma^2 \neq \sigma^2_0\) | \(\frac{(n-1)S^2}{\sigma^2_0}\) | \(TS \notin \left[\chi^2_{1-\alpha/2,\, n-1},\; \chi^2_{\alpha/2,\, n-1} \right]\) | \(2 \min\left( P(\chi^2_{n-1} < t),\; 1 - P(\chi^2_{n-1} < t) \right)\) |
| \(H_0: \sigma^2 \leq \sigma^2_0\) vs \(H_1: \sigma^2 > \sigma^2_0\) | \(\frac{(n-1)S^2}{\sigma^2_0}\) | \(TS > \chi^2_{\alpha,\, n-1}\) | \(P\left(\chi^2_{n-1} \geq t\right)\) |
| \(H_0: \sigma^2 \geq \sigma^2_0\) vs \(H_1: \sigma^2 < \sigma^2_0\) | \(\frac{(n-1)S^2}{\sigma^2_0}\) | \(TS < \chi^2_{1-\alpha,\, n-1}\) | \(P\left(\chi^2_{n-1} \leq t\right)\) |

Problem

A machine that automatically controls the amount of ribbon on a tape has recently been installed. This machine will be judged to be effective if the standard deviation \(\sigma\) of the amount of ribbon on a tape is less than 0.15 cm. If a sample of 20 tapes yields a sample variance of \(S^2 = 0.025\) cm\(^2\), are we justified in concluding that the machine is ineffective? Assume the level of significance as 0.05.

import numpy as np
from scipy.stats import chi2

# Given data
sample_variance = 0.025
sample_size = 20
population_variance = 0.15**2
alpha = 0.05

# Calculate the test statistic
test_statistic = (sample_size - 1) * sample_variance / population_variance

# Two-sided critical values (printed for reference only)
chi2_critical_low = chi2.ppf(alpha / 2, df=sample_size - 1)
chi2_critical_high = chi2.ppf(1 - alpha / 2, df=sample_size - 1)

# Calculate the upper-tailed p-value for H0: sigma^2 <= 0.15^2 vs H1: sigma^2 > 0.15^2
p_value = 1 - chi2.cdf(test_statistic, df=sample_size - 1)

# Determine if we reject the null hypothesis (one-sided test: compare the p-value with alpha)
reject_null = p_value < alpha

# Output the results
print(f"Test Statistic: {test_statistic}")
print(f"Chi-square Critical Low: {chi2_critical_low}")
print(f"Chi-square Critical High: {chi2_critical_high}")
print(f"P-value: {p_value}")
print(f"Reject the null hypothesis: {reject_null}")
Test Statistic: 21.111111111111114
Chi-square Critical Low: 8.906516481987971
Chi-square Critical High: 32.85232686172969
P-value: 0.33069403418551535
Reject the null hypothesis: False
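
For the one-sided alternative \(H_1: \sigma^2 > \sigma^2_0\) in the summary table, the rejection region uses the single upper critical value \(\chi^2_{\alpha,\, n-1}\). A sketch of the ribbon problem under that reading, with \(H_0: \sigma^2 \leq 0.15^2\) taken as the null hypothesis:

```python
from scipy.stats import chi2

sample_variance = 0.025   # S^2 in cm^2
sample_size = 20
sigma0_sq = 0.15**2       # boundary value of the variance under H0
alpha = 0.05
df = sample_size - 1

# Test statistic: (n - 1) S^2 / sigma0^2
test_statistic = df * sample_variance / sigma0_sq

# Upper critical value chi^2_{alpha, n-1}: P(chi2_{n-1} > value) = alpha
upper_critical = chi2.isf(alpha, df=df)

# Upper-tailed p-value: P(chi2_{n-1} >= TS)
p_value = chi2.sf(test_statistic, df=df)

reject_null = test_statistic > upper_critical
print(f"Test Statistic: {test_statistic:.3f}")
print(f"Upper critical value: {upper_critical:.3f}")
print(f"P-value: {p_value:.4f}")
print(f"Reject the null hypothesis: {reject_null}")
```

The test statistic falls below the upper critical value, so the sample does not justify concluding that the machine is ineffective.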

Additional Topic

Equality of Means of two normal populations
  • Comparing the means of two different normal populations is common in hypothesis testing.

  • Example scenarios include comparing average test scores of students from two schools or average lifespans of two brands of light bulbs.

  • Use a two-sample z-test to test the equality of means of two normal populations whose variances are known (and possibly unequal).

  • Null Hypothesis (\(H_0\)): The means of the two populations are equal.

  • Alternative Hypothesis (\(H_1\)): The means of the two populations are not equal.

  • Test statistic for the two-sample z-test:

\[ z = \frac{\left(\bar{X} - \bar{Y}\right) - \left(\mu_X -\mu_Y\right)}{\sqrt{\frac{\sigma_X^2}{n_X} + \frac{\sigma_Y^2}{n_Y}}} = \frac{\left(\bar{X} - \bar{Y}\right)}{\sqrt{\frac{\sigma_X^2}{n_X} + \frac{\sigma_Y^2}{n_Y}}} \]

  • where,

    • \(\bar{X}\) and \(\bar{Y}\): Sample means
    • \(\sigma_X^2\) and \(\sigma_Y^2\): Population variances
    • \(n_X\) and \(n_Y\): Sample sizes of the two groups
  • Compare the absolute value of the calculated z-value to the critical value \(z_{\alpha/2}\) from the standard normal distribution table for the chosen significance level (\(\alpha\)).

  • If \(|z| > z_{\alpha/2}\), reject the null hypothesis and conclude that there is a significant difference between the means of the two populations.
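
The two-sample z-test can be sketched as a small function. The helper name `two_sample_z_test` and the school test-score numbers in the usage example are hypothetical, chosen only to illustrate the calculation.

```python
import numpy as np
from scipy import stats

def two_sample_z_test(x_bar, y_bar, var_x, var_y, n_x, n_y):
    """Return (z, two-sided p-value) for H0: mu_X = mu_Y with known variances."""
    # Standard error of Xbar - Ybar
    se = np.sqrt(var_x / n_x + var_y / n_y)
    z = (x_bar - y_bar) / se
    p_value = 2 * stats.norm.sf(abs(z))
    return z, p_value

# Hypothetical test-score data for two schools
alpha = 0.05
z, p_value = two_sample_z_test(x_bar=78.0, y_bar=75.0,
                               var_x=64.0, var_y=81.0,
                               n_x=40, n_y=50)

z_crit = stats.norm.isf(alpha / 2)   # critical value z_{alpha/2}
reject_null = abs(z) > z_crit
print(f"z = {z:.4f}, p-value = {p_value:.4f}, reject H0: {reject_null}")
```

Comparing \(|z|\) with \(z_{\alpha/2}\) and comparing the p-value with \(\alpha\) lead to the same decision.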

Important

Additionally, refer to section 8.4, titled “TESTING THE EQUALITY OF MEANS OF TWO NORMAL POPULATIONS,” in “INTRODUCTION TO PROBABILITY AND STATISTICS FOR ENGINEERS AND SCIENTISTS” by Sheldon M. Ross.