
STAT 205: Introduction to Mathematical Statistics
University of British Columbia Okanagan
So far, we have done inference for population means \(\mu\).
The exact same ideas extend to population proportions.
| Population | Sample Statistic | Sampling Distribution |
|---|---|---|
| \(\mu\) | \(\bar X\) | \(\bar X \sim N(\mu, \sigma/\sqrt{n})\) |
| \(p\) | \(\hat p\) | \(\hat p \sim ?\) |
What is the sampling distribution of \(\hat p\)?
Let’s look at an example to help us conceptualize.
Proportion of children diagnosed with ADHD
Suppose we are interested in the proportion of children in our region who have been diagnosed with ADHD.
Two possible outcomes:
✅ the child has been diagnosed with ADHD
❌ the child has NOT been diagnosed with ADHD
Binary outcome:
✅ 1
❌ 0
We can denote random variable
\[ X_i = \begin{cases} 1 & \text{ADHD diagnosis}\\ 0 & \text{No diagnosis} \end{cases} \] If we take a simple random sample of size \(n\) our estimated proportion is:
\[ \hat p = \frac{X_1 + X_2 + \dots + X_n}{n} = \bar X \]
Under certain conditions, the CLT tells us that \(\bar X\) (and hence \(\hat p\)) is approximately Normally distributed.
So all that is left to do is recognize the population distribution and identify the mean and variance.
\[ X_i \sim \text{Bernoulli}(p) \]
where \(p\) is the probability that a randomly selected child has an ADHD diagnosis.
For \(X_i \sim \text{Bernoulli}(p)\), \(\quad \quad \boxed{\text{recall } \bar X \sim N(\mu, \sigma/\sqrt{n})}\)
\[ \begin{align} \mathbb{E}[X_i] =\mu & = p &\mathbb{Var}(X_i) = \sigma^2 = p(1-p)\\ & & \implies \sigma = \sqrt{p(1-p)} \end{align} \]
\[ \hat p \sim N\left(p, \sqrt{\frac{p(1-p)}{n}} \right) \]
Sampling Distribution of \(\hat p\)
\[ \hat p \sim N\left(\mu_{\hat p} = p, \sigma_{\hat p} = \sqrt{\frac{p(1-p)}{n}} \right) \]
Conditions for Using the Sampling Distribution of \(\hat p\)
To ensure the CLT applies and the distribution of \(\hat p\) is approximately Normal, we need:
\[ \begin{align} np \geq 10 && \text{and} && n(1-p) \geq 10 \end{align} \]
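As a quick illustration (not part of the original derivation; the values \(p = 0.13\) and \(n = 80\) are borrowed from the exercise later in these notes), we can simulate the sampling distribution of \(\hat p\) in R and compare it to the Normal approximation:

```r
# Simulate the sampling distribution of p-hat for Bernoulli(p) data
# (illustrative values: p = 0.13, n = 80, which satisfy np >= 10 and n(1-p) >= 10)
set.seed(205)
p <- 0.13
n <- 80
phat <- replicate(10000, mean(rbinom(n, size = 1, prob = p)))
mean(phat)               # should be close to p = 0.13
sd(phat)                 # should be close to sqrt(p * (1 - p) / n)
sqrt(p * (1 - p) / n)    # theoretical standard deviation, approx 0.0376
```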
This test uses a test statistic, i.e., a function of the sample data used to decide whether to reject or fail to reject \(H_0\).
It requires one or more critical values, which separate the
rejection region (RR): the set of observed test statistics for which \(H_0\) will be rejected
“acceptance”/fail-to-reject region: the set of observed test statistics for which we would fail to reject \(H_0\)
Our base assumption for hypothesis testing is that \(H_0\) is true.
\[H_0: \mu = \mu_0\] null dist. for mean
\[ \begin{align} \bar X &\sim N(\mu_\bar X, \sigma_{\bar X})\\ \bar X &\sim N(\mu_0, \sigma/\sqrt{n}) \end{align} \]
\[H_0: p = p_0\] null dist. for proportions
\[ \begin{align} \hat p &\sim N(\mu_{\hat p}, \sigma_{\hat p})\\ \hat p &\sim N\left(p_0, \sqrt{\frac{p_0(1-p_0)}{n}}\right) \end{align} \]
As before, we standardize our sample statistic to \(N(0,1)\).
standardization for mean
\[ Z = \frac{\bar X - \mu_{\bar X}}{\sigma_{\bar X}} \] where \(\bar X \sim N(\mu_{\bar X}, \sigma_{\bar X})\)
standardization for proportions
\[ Z = \frac{\hat p - \mu_{\hat p}}{\sigma_{\hat p}} \] where \(\hat p \sim N(\mu_{\hat p}, \sigma_{\hat p})\)
Under \(H_0\), substituting the hypothesized values, these standardizations become:
standardization for mean
\[ Z = \frac{\bar X - \mu_0}{\dfrac{\sigma}{\sqrt{n}}} \] where \(Z \sim N(0,1)\)
standardization for proportions
\[ Z = \frac{\hat p - p_0}{ \sqrt{\dfrac{p_0(1-p_0)}{n}} } \] where \(Z \sim N(0,1)\)
Critical Value approach for proportions
Check Assumptions. If satisfied,

1. State hypotheses \[\begin{equation} H_0 : p = p_0 \quad \text{ vs. } \quad H_A: \begin{cases} p \neq p_0& \text{ two-sided test} \\ p < p_0&\text{ one-sided (lower-tail) test} \\ p > p_0&\text{ one-sided (upper-tail) test} \end{cases} \end{equation}\]
2. Find the critical value(s): \[\begin{cases} P(-z_{crit} < Z < z_{crit}) = 1 - \alpha &\text{ if } H_A: p \neq p_0 \\ P(Z < z_{crit}) = \alpha &\text{ if } H_A: p < p_0 \\ P(Z > z_{crit}) = \alpha &\text{ if } H_A: p > p_0 \end{cases}\]
3. Compute the test statistic \(z_{obs} = \dfrac{\hat p - p_0}{\sqrt{p_0(1-p_0)/n}}.\)
4. Conclusion: reject \(H_0\) if \(z_{obs} \in\) rejection region; otherwise, fail to reject \(H_0\).
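These steps can be sketched in R for the two-sided case; the inputs below (14 successes out of 80 trials, with \(p_0 = 0.13\)) are taken from Exercise 1 later in these notes:

```r
# Critical-value approach, two-sided test of H0: p = p0 vs HA: p != p0
# (inputs from Exercise 1: x = 14 successes out of n = 80 trials, p0 = 0.13)
x <- 14; n <- 80; p0 <- 0.13; alpha <- 0.05
phat   <- x / n
z_obs  <- (phat - p0) / sqrt(p0 * (1 - p0) / n)  # observed test statistic
z_crit <- qnorm(1 - alpha / 2)                   # critical value (about 1.96)
abs(z_obs) > z_crit   # TRUE => reject H0; FALSE => fail to reject H0
```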
Assumptions for Hypothesis Tests for Proportions
ChatGPT
Exercise 1 According to a November 2023 research survey conducted by Pew Research, about 13% of all U.S. teens have used the generative artificial intelligence (AI) chatbot ChatGPT in their schoolwork. Suppose we wish to investigate whether this is the proportion of students on our campus using generative AI to do homework. We survey 80 randomly selected UBCO students and find that 14 have admitted to using generative AI to do homework. Perform a formal hypothesis test to determine whether the proportion at UBCO differs from that of U.S. teens.
We need to check the success-failure condition: that \(np \geq 10\) and \(n(1-p) \geq 10\).
Important
We use the hypothesized value \(p_0\) for these checks.
Here \(n = 80\) and our hypothesized value for \(p\) is 0.13:
\(np_0 = 80 \times 0.13 = 10.4\) ✅
\(n(1-p_0) = 80 \times 0.87 = 69.6\) ✅
State null and alternative hypotheses:
\(H_0: p = 0.13\)
\(H_A: p \neq 0.13\)
Find the critical values: for a two-sided test at \(\alpha = 0.05\), the critical values are \(\pm z_{0.025} = \pm 1.96\).
Calculate the test statistic:
\[ \begin{equation} Z_{obs} = \dfrac{\hat p - p_0}{\sqrt{p_0(1-p_0)/n}} = \dfrac{0.175 - 0.13}{\sqrt{0.13(1-0.13)/80}} \approx 1.197 \end{equation} \]
Conclusion: since \(|Z_{obs}| \approx 1.197 < 1.96\), the test statistic does not fall in the rejection region, so we fail to reject \(H_0\).
Alternatively we could have used the \(p\)-value approach …
\(p\)-value
Definition 1 The \(p\)-value is the probability, calculated assuming that the null hypothesis is true, of obtaining a value of the test statistic at least as contradictory to \(H_0\) as the value calculated from the available sample.
In other words, it quantifies the chances of obtaining the observed data or data more favorable to the alternative than our current data set if the null hypothesis were true.
\(p\)-value approach for proportions
1. State hypotheses \[\begin{equation} H_0 : p = p_0 \quad \text{ vs. } \quad H_A: \begin{cases} p \neq p_0& \text{ two-sided test} \\ p < p_0&\text{ one-sided (lower-tail) test} \\ p > p_0&\text{ one-sided (upper-tail) test} \end{cases} \end{equation}\]
2. Compute the test statistic \(z_{obs} = \dfrac{\hat p - p_0}{\sqrt{p_0(1-p_0)/n}}.\)
3. Calculate the \(p\)-value: \[\begin{cases} 2P(Z \geq |z_{obs}|) &\text{ if } H_A: p \neq p_0 \\ P(Z \leq z_{obs}) &\text{ if } H_A: p < p_0 \\ P(Z \geq z_{obs}) &\text{ if } H_A: p > p_0 \end{cases}\]
4. Conclusion: reject \(H_0\) if the \(p\)-value is less than \(\alpha\) (typically 0.05); otherwise, fail to reject \(H_0\).
Returning to Exercise 1, and using the same hypotheses, the \(p\)-value can be calculated as follows:
\[\begin{align} 2P(Z \geq |z_{obs}|) &= 2P(Z \geq |1.1968127|) \\ &= 2 (0.1156898)\\ &= 0.2313796\\ \end{align}\]
Since this \(p\)-value (\(\approx 0.231)\) is greater than \(\alpha = 0.05\), we fail to reject \(H_0\).
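The same calculation can be done in R with `pnorm` (a small sketch; the numbers are from Exercise 1):

```r
# Two-sided p-value for Exercise 1: x = 14 out of n = 80, p0 = 0.13
x <- 14; n <- 80; p0 <- 0.13
z_obs   <- (x / n - p0) / sqrt(p0 * (1 - p0) / n)  # approx 1.1968
p_value <- 2 * pnorm(-abs(z_obs))                  # approx 0.2314
```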
If 13% of students use ChatGPT to do homework, we would expect \(0.13\times80= 10.4\) in our survey to answer “yes”.
As our sample moves farther from that expected value, the null hypothesis becomes less and less plausible.
This provides increasing evidence against the null \(H_0\).
[Figure: the null distribution of the test statistic for testing \(H_0: p = 0.13\) vs \(H_A: p \neq 0.13\), with the \(p\)-value shaded in yellow for observed counts of 14, 15, …, 20 yes’s. The shaded area (the \(p\)-value) shrinks as the observed count moves farther from the expected 10.4.]
| \(p\)-value | Evidence against \(H_0\) | Significance code in R |
|---|---|---|
| \(0.1 \leq p \leq 1\) | no evidence | |
| \(0.05 < p \leq 0.10\) | weak evidence | . |
| \(0.01 < p \leq 0.05\) | sufficient evidence | * |
| \(0.001 < p \leq 0.01\) | strong evidence | ** |
| \(0 < p \leq 0.001\) | very strong evidence | *** |
We build a \(100(1-\alpha)\)% confidence interval based on:
\[ \begin{align} \text{point estimate} &\pm \textcolor{red}{\boxed{\text{Margin of Error}}}\\ \hat p \ &\pm \textcolor{red}{\boxed{z_{\alpha/2} \times \sqrt{\frac{\hat p(1-\hat p)}{n}}}} \end{align} \]
Alternatively, we could express our CI as:
\[ (\hat p - \textcolor{red}{\boxed{\text{ME}}}, \hat p + \textcolor{red}{\boxed{\text{ME}}}) \]
Hypothesis tests use \(p_0\) in the standard error \[ \sigma_{\hat p} = \sqrt{\frac{p_0(1-p_0)}{n}} \]
Confidence Intervals use \(\hat p\) in the standard error \[ \sigma_{\hat p} = \sqrt{\frac{\hat p(1-\hat p)}{n}} \]
Exercise 1: 95% CI based on 14 out of 80 yes’s \(\left(\frac{14}{80} = 0.175\right)\)
\[ \begin{align} \hat p &\pm z_{\alpha/2} \sqrt{\frac{\hat p(1-\hat p)}{n}}\\ 0.175 &\pm 1.96 \sqrt{\frac{0.175(1- 0.175)}{80}}\\ 0.175 &\pm 0.08326396 \\ [0.0917&, 0.2583] \end{align} \]
Since \(p_0 = 0.13\) lies within this CI, we would fail to reject the null hypothesis that \(p = 0.13\).
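The interval above can be reproduced in R (a sketch; `qnorm(0.975)` gives the exact \(z_{0.025}\) rather than the rounded 1.96):

```r
# 95% CI for p based on 14 yes's out of 80
phat <- 14 / 80                                   # 0.175
n <- 80
me <- qnorm(0.975) * sqrt(phat * (1 - phat) / n)  # margin of error, approx 0.0833
c(phat - me, phat + me)                           # approx (0.0917, 0.2583)
```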
Exercise 1: 95% CI based on 17 out of 80 yes’s \(\left(\frac{17}{80} = 0.2125\right)\)
\[ \begin{align} \hat p &\pm z_{\alpha/2} \sqrt{\frac{\hat p(1-\hat p)}{n}}\\ 0.2125 &\pm 1.96 \sqrt{\frac{0.2125(1- 0.2125)}{80}}\\ 0.2125 &\pm 0.08964289 \\ [0.1229&, 0.3021] \end{align} \]
Since \(p_0 = 0.13\) does not lie within this CI, we have sufficient evidence to reject the null hypothesis that \(p = 0.13\).
Margin of Error for a Proportion I
Suppose the sample proportion value changed from \(\hat p=0.2125\) to \(\hat p=0.4\). What happens to the margin of error? (Assume the same \(\alpha\) and \(n\))
Margin of Error for a Proportion II
Suppose the sample proportion value changed from \(\hat p=0.2125\) to \(\hat p=0.9\). What happens to the margin of error? (Assume the same \(\alpha\) and \(n\))
Margin of Error for a Proportion III
All else equal, for which value of \(p\) is the standard error largest?
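One way to answer these three questions is to compute the margin of error directly (a sketch assuming \(\alpha = 0.05\) and \(n = 80\) as in Exercise 1):

```r
# Margin of error as a function of p, holding alpha = 0.05 and n = 80 fixed
me <- function(p, n = 80, alpha = 0.05) {
  qnorm(1 - alpha / 2) * sqrt(p * (1 - p) / n)
}
me(0.2125)  # baseline
me(0.4)     # larger: p(1 - p) grows as p moves toward 0.5
me(0.9)     # smaller: p(1 - p) shrinks as p moves toward 0 or 1
me(0.5)     # the maximum: p(1 - p) is largest at p = 0.5
```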
As we have seen before, we can do sample size calculations to achieve a desired margin of error.
Lightbulbs
Exercise 2 Suppose that we want to estimate the true proportion of defective light bulbs in a very large shipment, and that we want to be at least 95% confident that the error in our estimate is at most 0.03. How large a sample will we need if …
\[ \begin{align} \text{point estimate} &\pm \textcolor{red}{\boxed{\text{Margin of Error}}}\\ \hat p \ &\pm \textcolor{red}{\boxed{z_{\alpha/2} \times \sqrt{\frac{p(1-p)}{n}}}} \end{align} \]
If we have a desired margin of error \(E\), we simply solve for \(n\):
\[ \begin{align} z_{\alpha/2} \times \sqrt{\frac{p(1-p)}{n}} &= E\\ \sqrt{\frac{p(1-p)}{n}} &= \frac{E}{z_{\alpha/2}}\\ \frac{p(1-p)}{n} &= \left(\frac{E}{z_{\alpha/2}}\right)^2\\ \implies n &= p(1-p) \left(\frac{z_{\alpha/2}}{E}\right)^2 \end{align} \]
Using the conservative formula (worst case when \(p = 0.5\)),
\[ \begin{align} n &= p(1-p) \left(\frac{z_{\alpha/2}}{E}\right)^2\\ &= 0.5(1-0.5) \left(\frac{z_{0.025}}{0.03}\right)^2\\ &= 0.5(1-0.5) \left(\frac{1.959964}{0.03}\right)^2\\ &= 1067.0718946 \implies n \text{ must be at least } 1068 \end{align} \]
Using the formula with prior information that \(p\) is at most 0.08, we use the worst-case value \(p = 0.08\) (anything lower will lead to a smaller required sample size).
\[ \begin{align} n &= p(1-p) \left(\frac{z_{\alpha/2}}{E}\right)^2\\ &= 0.08(1-0.08) \left(\frac{z_{0.025}}{0.03}\right)^2\\ &= 0.0736 \left(\frac{1.959964}{0.03}\right)^2\\ &= 314.1459658 \implies n \text{ must be at least } 315 \end{align} \]
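Both calculations can be wrapped in a small R helper (a sketch; `ceiling` rounds up since \(n\) must be an integer at least as large as the formula's value):

```r
# Required sample size for margin of error E at confidence level 1 - alpha;
# defaults to the conservative worst case p = 0.5
n_required <- function(E, p = 0.5, alpha = 0.05) {
  ceiling(p * (1 - p) * (qnorm(1 - alpha / 2) / E)^2)
}
n_required(0.03)            # 1068 (no prior information)
n_required(0.03, p = 0.08)  # 315  (prior information: p at most 0.08)
```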
```r
prop.test(x, n, p = NULL,
          alternative = c("two.sided", "less", "greater"),
          conf.level = 0.95, correct = TRUE)
```

- `x`: count of successes
- `n`: count of trials
- `p`: hypothesized \(p_0\)
- `conf.level`: \(1-\alpha\)
- `correct`: whether Yates' continuity correction is applied

With `x = 14`: note that when we run this test, our \(p\)-values agree with what we obtained earlier; however, the test statistic and CI do not. We will return to this later.
1-sample proportions test without continuity correction
data: 14 out of 80, null probability 0.13
X-squared = 1.4324, df = 1, p-value = 0.2314
alternative hypothesis: true p is not equal to 0.13
95 percent confidence interval:
0.1072064 0.2725754
sample estimates:
p
0.175
With `x = 17`:
1-sample proportions test without continuity correction
data: 17 out of 80, null probability 0.13
X-squared = 4.8143, df = 1, p-value = 0.02822
alternative hypothesis: true p is not equal to 0.13
95 percent confidence interval:
0.1371239 0.3142216
sample estimates:
p
0.2125
With `x = 20`:
1-sample proportions test without continuity correction
data: 20 out of 80, null probability 0.13
X-squared = 10.186, df = 1, p-value = 0.001415
alternative hypothesis: true p is not equal to 0.13
95 percent confidence interval:
0.1680623 0.3548467
sample estimates:
p
0.25
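One connection worth noting: without the continuity correction, the `X-squared` statistic reported by `prop.test` is simply the square of our \(z\) statistic, which is why the \(p\)-values agree even though the reported statistics look different:

```r
# Without continuity correction, prop.test's X-squared equals z^2
z_obs <- (14 / 80 - 0.13) / sqrt(0.13 * (1 - 0.13) / 80)
z_obs^2                 # approx 1.4324, matching X-squared for x = 14
2 * pnorm(-abs(z_obs))  # approx 0.2314, matching the reported p-value
```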
Proportion inference is just mean inference on 0/1 data
The quality of the Normal approximation depends strongly on p
Note: There is more than one way to construct a confidence interval for \(p\).
Comments
When the observed test statistic falls in the rejection region, we will necessarily have a significant \(p\)-value, and vice versa. Observing…