where \(S^2 = \frac{\sum_{i = 1}^n (X_i - \bar{X})^2}{n-1}\) is the sample variance and \(\chi^2_{n-1}\) is the chi-squared distribution with \(n-1\) degrees of freedom.
Hypotheses
We may wish to test whether there is evidence to suggest that the population variance differs from some hypothesized value \(\sigma_0^2\).
As before, we start with a null hypothesis (\(H_0\)) that the population variance equals a specified value (\(\sigma^2 = \sigma_0^2\)).
We test this against the alternative hypothesis \(H_A\) which can either be one-sided (\(\sigma^2 < \sigma_0^2\) or \(\sigma^2 > \sigma_0^2\)) or two-sided (\(\sigma^2 \neq \sigma_0^2\)).
Test Statistic
Recall that our test statistic is calculated assuming the null hypothesis is true. Hence, if we are testing \(H_0: \sigma^2 = \sigma_0^2\), the test statistic we use is: \[
\chi^2 = \dfrac{(n-1)S^2}{\sigma_0^2}
\] where \(\chi^2 \sim \chi^2_{n-1}\).
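As a minimal sketch of the formula above, the statistic can be computed from raw data in R. The function name `chisq_var_stat` and the data vector are hypothetical and made up for illustration; only `var()` and `length()` are base R.

```r
# Hypothetical helper: chi-square statistic for H0: sigma^2 = sigma0_sq
chisq_var_stat <- function(x, sigma0_sq) {
  n  <- length(x)
  s2 <- var(x)                 # sample variance (divisor n - 1)
  (n - 1) * s2 / sigma0_sq     # (n - 1) S^2 / sigma_0^2
}

# Made-up measurements, for illustration only
x <- c(4.1, 5.3, 3.8, 4.9, 5.6, 4.4)
chisq_var_stat(x, sigma0_sq = 0.25)
```

Under \(H_0\) and the assumptions of the test, the returned value would be compared against a chi-square distribution with `length(x) - 1` degrees of freedom.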
Chi-square distribution
Assumptions
For the following inference procedures to be valid we require:
A simple random sample from the population
A normally distributed population (very important, even for large sample sizes)
Warning
If the population is not approximately normally distributed, the chi-squared distribution may not accurately represent the sampling distribution of the test statistic.
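The sensitivity to non-normality can be explored with a small simulation. The sketch below is an illustration under assumed settings (sample size 30, 5000 replications, an arbitrary seed, and an exponential population chosen as one example of a skewed distribution); the helper `type1_rate` is hypothetical. In both cases the null hypothesis is true, since both populations have variance 1.

```r
set.seed(1)  # arbitrary seed so the run is reproducible

# Hypothetical helper: estimated type I error rate of the upper-tailed test
# when samples of size n come from `rgen`, a population with variance sigma0_sq
type1_rate <- function(rgen, n, sigma0_sq, alpha = 0.05, reps = 5000) {
  crit  <- qchisq(alpha, df = n - 1, lower.tail = FALSE)
  stats <- replicate(reps, (n - 1) * var(rgen(n)) / sigma0_sq)
  mean(stats >= crit)   # proportion of samples that reject H0
}

rate_normal <- type1_rate(function(n) rnorm(n, sd = 1),  n = 30, sigma0_sq = 1)
rate_expo   <- type1_rate(function(n) rexp(n, rate = 1), n = 30, sigma0_sq = 1)
c(normal = rate_normal, exponential = rate_expo)
```

For the normal population the estimated rate should sit close to the nominal 0.05, while for the skewed population it is typically noticeably higher, even though the sample size is moderately large.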
Rejection Regions and \(p\)-values for the chi-square test concerning one variance
| Alternative | Reject \(H_0\) if… | \(p\)-value |
|---|---|---|
| \(H_A: \sigma^2 < \sigma_0^2\) | \(\chi^2_{\text{obs}} \leq \chi^2_{1-\alpha}\) | Area to the left of \(\chi^2_{\text{obs}}\) |
| \(H_A: \sigma^2 > \sigma_0^2\) | \(\chi^2_{\text{obs}} \geq \chi^2_\alpha\) | Area to the right of \(\chi^2_{\text{obs}}\) |
| \(H_A: \sigma^2 \neq \sigma_0^2\) | \(\chi^2_{\text{obs}} \geq \chi^2_{\alpha/2}\) or \(\chi^2_{\text{obs}} \leq \chi^2_{1-\alpha/2}\) | Double the smaller of the areas to the left and right of \(\chi^2_{\text{obs}}\) |

Here \(\chi^2_\alpha\) denotes the value with area \(\alpha\) to its right under the \(\chi^2_{n-1}\) curve.
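The rejection rules can be sketched in R with `qchisq()`. This assumes the convention that \(\chi^2_\alpha\) is the value with area \(\alpha\) to its right, so a lower-tailed test rejects for small values of the statistic and an upper-tailed test rejects for large ones. The function name `reject_h0` and its arguments are hypothetical.

```r
# Hypothetical helper: TRUE if H0 is rejected at level alpha.
# chisq_obs is assumed to be a single number.
reject_h0 <- function(chisq_obs, df, alpha = 0.05,
                      alternative = c("two.sided", "less", "greater")) {
  alternative <- match.arg(alternative)
  switch(alternative,
    # lower-tailed: area alpha to the LEFT of the critical value
    less      = chisq_obs <= qchisq(alpha, df),
    # upper-tailed: area alpha to the RIGHT of the critical value
    greater   = chisq_obs >= qchisq(alpha, df, lower.tail = FALSE),
    # two-tailed: area alpha / 2 in each tail
    two.sided = chisq_obs <= qchisq(alpha / 2, df) ||
                chisq_obs >= qchisq(alpha / 2, df, lower.tail = FALSE)
  )
}

reject_h0(14.45, df = 5, alternative = "greater")
```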
Critical Region (upper-tailed)
The rejection region associated with an upper-tailed test for the population variance. Note that the critical value will depend on the chosen significance level (\(\alpha\)) and the d.f.
Critical Region (lower-tailed)
The rejection region associated with a lower-tailed test for the population variance. Note that the critical value will depend on the chosen significance level (\(\alpha\)) and the d.f.
Critical Region (two-tailed)
The rejection region associated with a two-tailed test for the population variance. Note that the critical values will depend on the chosen significance level (\(\alpha\)) and the d.f.
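A rejection region of this kind can be drawn in base R. The sketch below shades the upper-tailed region only, and the values `n <- 6` and `alpha <- 0.05` are assumptions chosen for illustration; the lower-tailed and two-tailed versions would shade the left tail or both tails in the same way.

```r
n <- 6; alpha <- 0.05          # illustrative values (assumed)
df <- n - 1
crit <- qchisq(alpha, df = df, lower.tail = FALSE)  # upper-tail critical value

# Density curve of the chi-square distribution with df degrees of freedom
curve(dchisq(x, df = df), from = 0, to = 20,
      ylab = "Density", xlab = expression(chi^2))

# Shade the area to the right of the critical value (the rejection region)
xs <- seq(crit, 20, length.out = 200)
polygon(c(crit, xs, 20), c(0, dchisq(xs, df = df), 0),
        col = "grey70", border = NA)
abline(v = crit, lty = 2)      # dashed line at the critical value
```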
P-values
Similarly, we can find \(p\)-values from chi-squared tables or R.
\(p\)-value for lower-tailed: \[\Pr(\chi^2 < \chi^2_{\text{obs}})\]
\(p\)-value for upper-tailed: \[\Pr(\chi^2 > \chi^2_{\text{obs}})\]
\(p\)-value for two-tailed: \[2 \min\left\{\Pr(\chi^2 < \chi^2_{\text{obs}}),\ \Pr(\chi^2 > \chi^2_{\text{obs}})\right\}\]
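These tail areas can be computed with `pchisq()`. The following is a minimal sketch; the function name `chisq_var_pvalue` is hypothetical, and the two-tailed case uses the "double the smaller tail area" rule.

```r
# Hypothetical helper: p-value of the chi-square test for one variance
chisq_var_pvalue <- function(chisq_obs, df,
                             alternative = c("two.sided", "less", "greater")) {
  alternative <- match.arg(alternative)
  lower <- pchisq(chisq_obs, df)                      # area to the left
  upper <- pchisq(chisq_obs, df, lower.tail = FALSE)  # area to the right
  switch(alternative,
    less      = lower,
    greater   = upper,
    two.sided = 2 * min(lower, upper)                 # double the smaller tail
  )
}

chisq_var_pvalue(14.45, df = 5, alternative = "greater")
```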
Beyond Burgers claim to have 18 grams of fat. A random sample of 6 burgers had a mean of 19.45 grams and a standard deviation of 0.85 grams. Suppose that the quality assurance team at the company will only accept a \(\sigma\) of at most 0.5. Use the 0.05 level of significance to test the null hypothesis \(\sigma = 0.5\) against the appropriate alternative.
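n <- 6                            # sample size from the example
par(mar = c(4, 4, 0, 0) + 0.1)    # trim the plot margins
# Density of the chi-square distribution with n - 1 = 5 degrees of freedom
curve(dchisq(x, df = n - 1), from = 0, to = 20, ylab = "Density", xlab = expression(chi^2))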
Under the null hypothesis, the test statistic \(\chi^2 = (n-1)S^2/0.5^2\) follows a chi-square distribution with df = 5.
Critical value
The critical value is the point on the chi-square curve with 5 df that leaves 5 percent probability in the upper tail (since we are doing an upper-tailed test). In R: qchisq(alpha, df=n-1, lower.tail = FALSE), which gives approximately 11.07 for alpha = 0.05. Verify using the \(\chi^2\) table.
Observed Test Statistic
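Compute the observed test statistic, which we denote by \(\chi^2_{\text{obs}}\). With \(n = 6\), sample standard deviation \(s = 0.85\) and \(\sigma_0 = 0.5\), we get \(\chi^2_{\text{obs}} = (6-1)(0.85)^2/0.5^2 = 14.45\).
Since the observed test statistic falls in the rejection region, i.e. \(\chi^2_{\text{obs}} > \chi^2_{\alpha}\), we reject the null hypothesis in favour of the alternative.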
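The rejection-region decision can be checked in R. This sketch assumes that 0.85 is the sample standard deviation, which is how the calculation of 14.45 treats it; the variable names are illustrative.

```r
n <- 6; s <- 0.85; sig0 <- 0.5; alpha <- 0.05   # s is taken as the sample SD

x_obs <- (n - 1) * s^2 / sig0^2                        # observed test statistic
crit  <- qchisq(alpha, df = n - 1, lower.tail = FALSE) # upper-tail critical value

c(observed = x_obs, critical = crit)
x_obs > crit   # TRUE means the statistic falls in the rejection region
```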
P-value in R
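n <- 6; s <- 0.85; sig0 <- 0.5                          # sample size, sample SD, hypothesized SD
x_obs <- ((n - 1) * s^2) / (sig0^2)                     # observed test statistic
pval <- pchisq(x_obs, df = n - 1, lower.tail = FALSE)   # upper-tail p-value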
Alternatively, we could compute the \(p\)-value, which in this case is 0.013. Since this is smaller than the significance level of 0.05, we reject the null hypothesis in favour of the alternative. Verify using the \(\chi^2\) table.
P-value from tables
Using the chi-square distribution table, we can see that our observed test statistic falls between two tabulated values. We can use the neighbouring values to bound our \(p\)-value.
Approximate P-value
It is clear from the visualization that \[\begin{align}
\Pr(\chi^2_{5} > \chi^2_{0.025}) &> \Pr(\chi^2_{5} > \chi^2_{\text{obs}})\\
\Pr(\chi^2_{5} > \chi^2_{\text{obs}}) &> \Pr(\chi^2_{5} > \chi^2_{0.01})
\end{align}\]
The \(p\)-value, \(\Pr(\chi^2_{5} > 14.45)\), can then be bounded as: \[\begin{align}
0.01 < p\text{-value } < 0.025
\end{align}\]
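The table lookup can be mirrored in R by computing the two neighbouring table entries with `qchisq()`. This sketch assumes the table has columns for upper-tail areas 0.025 and 0.01; the variable names are illustrative.

```r
df <- 5
tail_probs <- c(0.025, 0.01)   # upper-tail areas assumed to be table columns
table_vals <- qchisq(tail_probs, df = df, lower.tail = FALSE)
names(table_vals) <- tail_probs
table_vals                     # the two neighbouring table entries

x_obs <- 14.45
# TRUE if the observed statistic lies between the two table entries,
# which implies 0.01 < p-value < 0.025
table_vals[1] < x_obs && x_obs < table_vals[2]
```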
Conclusion
Since:
the \(p\)-value (0.013) is less than \(\alpha\) = 0.05 OR
the observed test statistic (\(\chi^2_{\text{obs}}\) = 14.45) is larger than the critical value \(\chi^2_{\alpha}\)
we reject the null hypothesis in favour of the alternative. More specifically, at the 5% significance level there is sufficient evidence to suggest that the population variance \(\sigma^2\) is greater than \(0.5^2 = 0.25\).