
STAT 205: Introduction to Mathematical Statistics
University of British Columbia Okanagan
Population
\(\theta = ?\)
\(\downarrow\)
Sample
\(\hat \theta\)

Sample
\(\downarrow\)
\(\hat \theta\)
point estimates
Sample
\(\downarrow\)
\[ \begin{align} \Big[\hat \theta - \text{ME}, \hat \theta + \text{ME}\Big] \\ \text{ confidence intervals} \end{align} \]
Sample
\(\downarrow\)
\[\begin{align} H_0: \theta = \theta_0 \quad\quad \text{ vs } \quad\quad H_A: \begin{cases} \theta < \theta_0\\ \theta > \theta_0\\ \theta \neq \theta_0 \end{cases} \end{align}\]
Assuming \(H_0: \theta = \theta_0\) is true, we get our null distribution
\(\downarrow\)

Observed test statistics falling in the rejection region \(\implies\) reject \(H_0\).
Null distribution with two-tailed critical regions (\(\alpha\) = 0.05).
In this simulation, 52 out of 1000 samples (5.2%) generated under the null hypothesis fall in the rejection regions.
All of the red dots represent cases where we rejected the null hypothesis when we shouldn't have.
In other words, the 52 unlucky samples are examples of Type I errors.
The simulation demonstrates the Type I error rate, which theoretically is equal to \(\alpha\) (default 0.05).
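A simulation of this kind can be sketched in a few lines. The settings below (a two-tailed z test with known \(\sigma\), \(\mu_0 = 0\), \(n = 30\), and the seed) are assumptions chosen for illustration, since the settings of the simulation pictured above are not given, so the count of rejections will not match 52 exactly.

```python
import random
from statistics import NormalDist, mean


def simulate_type1_rate(mu0=0.0, sigma=1.0, n=30, alpha=0.05, reps=1000, seed=205):
    """Fraction of samples drawn under H0 whose z statistic falls in the
    two-tailed rejection region (all defaults are illustrative assumptions)."""
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)  # upper critical value
    se = sigma / n ** 0.5                         # standard error of the mean
    rejections = 0
    for _ in range(reps):
        # Generate one sample with H0 true: the population mean really is mu0.
        sample = [rng.gauss(mu0, sigma) for _ in range(n)]
        z = (mean(sample) - mu0) / se
        if abs(z) > z_crit:
            rejections += 1                       # a Type I error
    return rejections / reps


rate = simulate_type1_rate()
print(f"Simulated Type I error rate: {rate:.3f}")
```

With more replications the simulated rate should settle closer to \(\alpha\).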
iClicker: Identifying a Type I Error
Exercise 1 A university administrator is investigating whether the average GPA of students at her university is different from the national average of 3.2. She collects a random sample of students and performs a hypothesis test. Let \(\mu\) represent the true average GPA of students at her university and
\[ H_0: \mu = 3.2 \quad\quad H_a: \mu \neq 3.2 \]
Under which condition would the administrator commit a Type I error?
She concludes the university's average GPA is not 3.2 when it actually is.
She concludes the university's average GPA is not 3.2 when it actually is not.
She concludes the university's average GPA is 3.2 when it actually is.
She concludes the university's average GPA is 3.2 when it actually is not.
Type I Error
A Type I Error occurs when we incorrectly reject the null hypothesis (\(H_0\)) even though it is actually TRUE.
The probability of making a Type I error is \(\alpha\), i.e. the significance level.
Why don't we just set our significance level to 1%?
Well, there is a tradeoff with another type of error...
Reality
| | \(H_0\) True | \(H_0\) False |
|---|---|---|
| Reject \(H_0\) | \(\textcolor{red}{\textsf{Type I error}}\) | \(\textcolor{green}{\textsf{Correct}}\) |
| Fail to reject \(H_0\) | \(\textcolor{green}{\textsf{Correct}}\) | \(\textcolor{red}{\textsf{Type II error}}\) |
Here the columns represent the reality or underlying truth (which we never know), and the rows represent the decision we make based on the hypothesis test.
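The four cells of the table can also be written as a small lookup. This is only an illustrative sketch; the function name `classify_outcome` is hypothetical and not part of the course material.

```python
def classify_outcome(h0_is_true: bool, reject_h0: bool) -> str:
    """Return the cell of the decision table for a given reality and decision."""
    if reject_h0:
        # Top row of the table: we rejected H0.
        return "Type I error" if h0_is_true else "Correct"
    # Bottom row of the table: we failed to reject H0.
    return "Correct" if h0_is_true else "Type II error"


for h0_is_true in (True, False):
    for reject_h0 in (True, False):
        print(f"H0 true: {h0_is_true!s:5}  reject H0: {reject_h0!s:5}  ->",
              classify_outcome(h0_is_true, reject_h0))
```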
Error Calculations
Exercise 2 A packaging machine is supposed to fill 500 g bags of rice. Historical data suggest the fill weights are approximately Normal with known \(\sigma\) = 12 g. At \(\alpha\) = 0.05 you plan to take a sample of size 36 to test:
\[ \begin{align} H_0: \mu = 500 \quad \text{vs} \quad H_A: \mu < 500 \end{align} \]
If the true mean is \(\mu = 496\), what are the probability of a Type II error and the power of the test?
Figure 1: Null distribution with lower-tailed critical value at \(\bar x\) = 496.7
\[ \begin{align} \alpha &= \Pr(\text{Type I Error})\\ &= \Pr(\text{Reject }H_0 \mid \textcolor{NavyBlue}{H_0 \text{ is true}})\\ &= \Pr(\text{Reject }H_0 \mid \textcolor{NavyBlue}{\mu = 500})\\ &= \Pr(\textcolor{NavyBlue}{\bar X} < 496.7 \mid \textcolor{NavyBlue}{\mu = 500}) = 0.05 \end{align} \]
where \(\textcolor{NavyBlue}{\bar X \sim \text{Normal}(\mu_{\bar X} = 500, \sigma_{\bar X} = 12/\sqrt{36})}\)
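The critical value of 496.7 can be recovered from the null distribution. A minimal sketch using Python's standard library, with the numbers from Exercise 2 (the variable names are illustrative):

```python
from math import sqrt
from statistics import NormalDist

mu0, sigma, n, alpha = 500, 12, 36, 0.05

se = sigma / sqrt(n)                      # standard error: 12 / 6 = 2
null_dist = NormalDist(mu=mu0, sigma=se)  # sampling distribution of X-bar under H0

# Lower-tailed test: the critical value cuts off area alpha in the left tail.
x_crit = null_dist.inv_cdf(alpha)
print(round(x_crit, 1))                   # 496.7

# Sanity check: the area below the critical value under H0 is alpha.
type1 = null_dist.cdf(x_crit)
print(round(type1, 4))                    # 0.05
```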
Figure 2: Null distribution (green) next to the true sampling distribution of \(\bar X\) (orange).
Figure 3: A Type II error occurs if we fail to reject the null when the null is false.
Power (green shaded region) is the probability of rejecting the null when the null is false.
\[ \begin{align} \beta &= \Pr(\text{Type II Error})\\ &= \Pr(\text{Fail to reject }H_0 \mid \textcolor{orange}{H_0 \text{ is false}})\\ &= \Pr(\text{Fail to reject }H_0 \mid \textcolor{orange}{\mu = 496})\\ &= \Pr(\textcolor{orange}{\bar X} \geq 496.7 \mid \textcolor{orange}{\mu = 496}) = 0.3612 \end{align} \]
where \(\textcolor{orange}{\bar X \sim \text{Normal}(\mu_{\bar X} = 496, \sigma_{\bar X} = 12/\sqrt{36})}\)
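The value of \(\beta\) can be checked numerically. The sketch below recomputes the critical value from the null distribution and then evaluates the upper-tail area under the true sampling distribution (variable names are illustrative):

```python
from math import sqrt
from statistics import NormalDist

mu0, mu_true, sigma, n, alpha = 500, 496, 12, 36, 0.05
se = sigma / sqrt(n)

# Critical value comes from the null distribution (mu = 500).
x_crit = NormalDist(mu=mu0, sigma=se).inv_cdf(alpha)

# Type II error: fail to reject (X-bar >= critical value) when mu is really 496.
true_dist = NormalDist(mu=mu_true, sigma=se)
beta = 1 - true_dist.cdf(x_crit)
print(f"beta = {beta:.4f}")
```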
Figure 4: Power is the probability of rejecting the null when the null is false.
\[ \begin{align} \text{Power} &= \Pr(\text{Reject }H_0 \mid \textcolor{orange}{H_0 \text{ is false}})\\ &= \Pr(\text{Reject }H_0 \mid \textcolor{orange}{\mu = 496})\\ &= \Pr(\textcolor{orange}{\bar X} < 496.7 \mid \textcolor{orange}{\mu = 496}) = 0.6388 \end{align} \]
where \(\textcolor{orange}{\bar X \sim \text{Normal}(\mu_{\bar X} = 496, \sigma_{\bar X} = 12/\sqrt{36})}\)
\[ \begin{align} \text{Power} &= \Pr(\text{Reject }H_0 \mid \textcolor{orange}{H_0 \text{ is false}})\\ &= \Pr(\text{Reject }H_0 \mid \textcolor{orange}{\mu = 496})\\ &= \Pr(\textcolor{orange}{\bar X} < 496.7 \mid \textcolor{orange}{\mu = 496}) = 0.6388\\ &= 1 - \Pr(\textcolor{orange}{\bar X} \geq 496.7 \mid \textcolor{orange}{\mu = 496}) \\ &= 1 - \Pr(\text{Failing to reject }H_0 \mid \textcolor{orange}{\mu = 496})\\ & = 1 - \Pr(\text{Type II Error}) \end{align} \]
\[\text{Power} = 1 - \beta\]
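Power can also be read as a long-run rejection rate when the true mean is 496. The following is a rough Monte Carlo sketch under the Exercise 2 settings; the seed and the number of replications are arbitrary assumptions, and sample means are drawn directly from the true sampling distribution of \(\bar X\) rather than from individual bag weights.

```python
import random
from math import sqrt
from statistics import NormalDist

mu0, mu_true, sigma, n, alpha = 500, 496, 12, 36, 0.05
se = sigma / sqrt(n)
x_crit = NormalDist(mu=mu0, sigma=se).inv_cdf(alpha)   # about 496.7

rng = random.Random(205)   # arbitrary seed for reproducibility
reps = 10_000              # arbitrary number of simulated samples

# Each draw is one sample mean when the true mean is 496.
rejections = sum(rng.gauss(mu_true, se) < x_crit for _ in range(reps))
simulated_power = rejections / reps

# Exact power and beta from the Normal distribution, for comparison.
exact_power = NormalDist(mu=mu_true, sigma=se).cdf(x_crit)
beta = 1 - exact_power

print(f"simulated power = {simulated_power:.3f}")
print(f"exact power     = {exact_power:.4f}")
print(f"1 - beta        = {1 - beta:.4f}")
```

The simulated rejection rate should land close to the exact value of \(1 - \beta\), with some simulation noise.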
iClicker
Example 1 All else equal, if we reduce \(\alpha\) from 0.05 to 0.01, what happens to power?
A test with low power means we might not detect a real effect, leading to Type II errors.
A high-power test increases the likelihood of detecting true differences.
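To see the effect of \(\alpha\) concretely, the sketch below reuses the rice-bag setting from Exercise 2 (lower-tailed z test, \(\mu_0 = 500\), \(\sigma = 12\), \(n = 36\), true mean 496) and compares the two significance levels. The helper name `power_lower_tail` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def power_lower_tail(mu_true, mu0=500, sigma=12, n=36, alpha=0.05):
    """Power of a lower-tailed z test: Pr(X-bar < critical value | mu = mu_true)."""
    se = sigma / sqrt(n)
    x_crit = NormalDist(mu0, se).inv_cdf(alpha)   # critical value under H0
    return NormalDist(mu_true, se).cdf(x_crit)    # rejection probability under the truth


power_05 = power_lower_tail(496, alpha=0.05)
power_01 = power_lower_tail(496, alpha=0.01)
print(f"alpha = 0.05: power = {power_05:.4f}")
print(f"alpha = 0.01: power = {power_01:.4f}")
```

A smaller \(\alpha\) moves the critical value further into the tail, so the rejection region shrinks and power drops.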
Effect of sample size on Type II errors
If we increase sample size but keep \(\alpha\) = 0.01 fixed, what happens to power?
The probability of making a Type II error with sample sizes of 36, 41, 43, 46, 51, 56, 61, 66, 76, 86, 96, 106, 116, 126, and 136 (one panel per sample size).
Effect of sample size on Power
If we increase sample size but keep \(\alpha\) = 0.01 fixed, what happens to power?
\[ \begin{align} \bar X &\sim \text{Normal}(\mu_{\bar X} = 496, \sigma_{\bar X} = 12/\sqrt{36}) & \text{(True)}\\ \bar X &\sim \text{Normal}(\mu_{\bar X} = 500, \sigma_{\bar X} = 12/\sqrt{36}) & \text{(Null)} \end{align} \]
Figure 5: The null/true distribution with a sample size of 36
\[ \begin{align} \bar X &\sim \text{Normal}(\mu_{\bar X} = 496, \sigma_{\bar X} = 12/\sqrt{108}) & \text{(True)}\\ \bar X &\sim \text{Normal}(\mu_{\bar X} = 500, \sigma_{\bar X} = 12/\sqrt{108}) & \text{(Null)} \end{align} \]
The null/true distribution with a sample size of 36 (solid) versus 108 (dotted)
\[ \begin{align} \bar X &\sim \text{Normal}(\mu_{\bar X} = 496, \sigma_{\bar X} = 12/\sqrt{36}) & \text{critical } \bar X = 495.35 \\ \bar X &\sim \text{Normal}(\mu_{\bar X} = 496, \sigma_{\bar X} = 12/\sqrt{108}) & \text{critical } \bar X = 497.31 \end{align} \]
Null distribution (green) next to the true sampling distribution of \(\bar X\) (orange).
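The two critical values follow from the null distribution (\(\mu_0 = 500\)) at \(\alpha = 0.01\). A short check, assuming the same lower-tailed z test as in Exercise 2:

```python
from math import sqrt
from statistics import NormalDist

mu0, sigma, alpha = 500, 12, 0.01

critical_values = {}
for n in (36, 108):
    se = sigma / sqrt(n)
    # Lower-tail critical value of X-bar under H0 for this sample size.
    critical_values[n] = NormalDist(mu0, se).inv_cdf(alpha)
    print(f"n = {n:3d}: SE = {se:.4f}, critical x-bar = {critical_values[n]:.2f}")
```

The larger sample has a smaller standard error, so its critical value sits closer to 500.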
Effect of sample size on significance level
If we increase sample size but keep \(\alpha\) = 0.01 fixed, what happens to the significance level?
\[ n \uparrow \implies \text{SE} \downarrow \implies \text{overlap} \downarrow \]
Type II error decreases: \(\beta \downarrow\)
Power increases: \(1-\beta \uparrow\)
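The chain above can be traced numerically. The sketch below assumes the Exercise 2 setting (lower-tailed z test, \(\mu_0 = 500\), true mean 496, \(\sigma = 12\)) with \(\alpha = 0.01\), and tabulates \(\beta\) and power for a few sample sizes; the function name is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def beta_and_power(n, mu_true=496, mu0=500, sigma=12, alpha=0.01):
    """Return (SE, beta, power) for a lower-tailed z test with sample size n."""
    se = sigma / sqrt(n)
    x_crit = NormalDist(mu0, se).inv_cdf(alpha)
    power = NormalDist(mu_true, se).cdf(x_crit)   # Pr(reject H0 | mu = mu_true)
    return se, 1 - power, power


sample_sizes = [36, 56, 76, 108, 136]
results = [beta_and_power(n) for n in sample_sizes]
for n, (se, beta, power) in zip(sample_sizes, results):
    print(f"n = {n:3d}  SE = {se:.3f}  beta = {beta:.4f}  power = {power:.4f}")
```

The significance level stays at 0.01 throughout because it is fixed by design; only \(\beta\) and power change with \(n\).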
Effect of \(\alpha\) on Power
All else equal, if we increase our significance level \(\alpha\), what happens to power?
Significance Level (\(\alpha\))
Sample Size (\(n\))
In hypothesis tests, effect size refers to how different the true mean (\(\mu\)) is from the hypothesized value (\(\mu_0\)).
Larger effect sizes make it easier to detect a difference, while smaller effect sizes require larger sample sizes for detection.
For one-sample tests for \(\mu\), the effect size (often denoted by \(\delta\)) is the magnitude of the difference between the true mean and the hypothesized mean.
\[ \delta = |\mu - \mu_0| \]
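As a rough illustration of how effect size drives power, the sketch below keeps the Exercise 2 design fixed (\(\mu_0 = 500\), \(\sigma = 12\), \(n = 36\), \(\alpha = 0.05\), lower-tailed) and varies the true mean. The particular true means tried are arbitrary choices for illustration.

```python
from math import sqrt
from statistics import NormalDist

mu0, sigma, n, alpha = 500, 12, 36, 0.05
se = sigma / sqrt(n)
x_crit = NormalDist(mu0, se).inv_cdf(alpha)   # fixed by the null distribution

powers = {}
for mu_true in (499, 498, 496, 494):          # hypothetical true means below 500
    delta = abs(mu_true - mu0)                # effect size
    powers[delta] = NormalDist(mu_true, se).cdf(x_crit)
    print(f"delta = {delta}: power = {powers[delta]:.4f}")
```

The further the true mean is from 500, the less the two sampling distributions overlap and the higher the power.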
Significance Level (\(\alpha\))
Sample Size (\(n\))
Effect Size (\(\delta\))
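These three factors can be combined in a single expression. As a sketch, assume the lower-tailed z test of Exercise 2 with known \(\sigma\), let \(\delta = \mu_0 - \mu > 0\), and write \(z_\alpha\) for the value satisfying \(\Phi(z_\alpha) = 1 - \alpha\), where \(\Phi\) is the standard Normal CDF (this notation is introduced here for illustration):

\[ \begin{align} \text{Power} &= \Pr\left(\bar X < \mu_0 - z_\alpha \frac{\sigma}{\sqrt{n}} \;\middle|\; \mu = \mu_0 - \delta\right)\\ &= \Pr\left(Z < \frac{\delta \sqrt{n}}{\sigma} - z_\alpha\right)\\ &= \Phi\left(\frac{\delta \sqrt{n}}{\sigma} - z_\alpha\right) \end{align} \]

With \(\delta = 4\), \(\sigma = 12\), \(n = 36\) and \(\alpha = 0.05\), this gives \(\Phi(2 - 1.645) = \Phi(0.355) \approx 0.6388\), in agreement with the power found in Exercise 2. A larger \(\alpha\) (smaller \(z_\alpha\)), a larger \(n\), or a larger \(\delta\) each increase the argument of \(\Phi\) and therefore the power.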
Type I Error (False Positive)
A Type I Error occurs when we incorrectly reject the null hypothesis (\(H_0\)) even though it is actually true. \(\Pr(\text{Type I error} \mid H_0 \text{ true}) = \alpha\).
Type II Error (False Negative)
A Type II Error occurs when we fail to reject the null hypothesis (\(H_0\)) even though the alternative hypothesis (\(H_A\)) is actually true. \(\Pr(\text{Type II error} \mid H_A \text{ true}) = \beta\).
Power (\(1- \beta\))
The power of a test is the probability of correctly rejecting \(H_0\) when \(H_A\) is true. It measures the sensitivity of a test to detect an effect when one truly exists.
Identifying Statistical Power
Exercise 3 In the decision matrix shown below, which cell represents statistical power?
| Reality | \(H_0\) True | \(H_0\) False |
|---|---|---|
| Reject \(H_0\) | (a) Type I error | (b) Correct |
| Fail to reject \(H_0\) | (c) Correct | (d) Type II error |
(e) None of the above