(Known population variance \(\sigma^2\))
University of British Columbia Okanagan
Sampling Distribution of the Sample Mean
Let \(X_1,\dots,X_n\) be a random sample from a population with mean \(\mu\) and variance \(\sigma^2\). The sample mean, defined as \(\bar{X}=\frac{1}{n}\sum_{i=1}^n X_i\), has
\[\mathbb{E}(\bar{X})=\mu\]
\(\mathrm{Var}(\bar{X})=\frac{\sigma^2}{n}\)
\(\mathrm{SD}(\bar{X})=\frac{\sigma}{\sqrt{n}}\) \(\leftarrow\) we call this the standard error
If the population is normal or \(n\) is sufficiently large, then
\[
\bar{X}\sim\mathcal{N}\!\left(\mu,\frac{\sigma^2}{n}\right)
\]
This distribution is called the sampling distribution of the sample mean.
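As a quick sanity check, these facts can be verified by simulation in R. This is a sketch only: the population parameters (\(\mu = 100\), \(\sigma^2 = 5\), matching the figure later in this lecture) and the sample size \(n = 25\) are illustrative choices.

```r
# Simulate the sampling distribution of the sample mean
# (hypothetical normal population: mu = 100, sigma^2 = 5; n = 25 is arbitrary)
set.seed(1)
n     <- 25
xbars <- replicate(10000, mean(rnorm(n, mean = 100, sd = sqrt(5))))

mean(xbars)  # should be close to mu = 100
sd(xbars)    # should be close to sigma/sqrt(n) = sqrt(5)/5 ~ 0.447
```

The average of the simulated \(\bar X\) values lands near \(\mu\), and their standard deviation near \(\sigma/\sqrt{n}\), as the boxed results state.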
Last lecture, we studied the sampling distribution of the sample mean \(\bar X\).
Today, we will see how to construct a confidence interval (CI) for the population mean \(\mu\) based on that sampling distribution.
For simplicity, we assume (for now) that we are sampling from a normal population with known standard deviation \(\sigma\).
The sampling distribution of \(\bar X\) assuming samples are drawn from a normal population with mean 100 and variance 5.
In most real-world settings, the population mean \(\mu\) is unknown and we rely on sample data to learn about it.
The sample mean \(\bar X\) is a point estimate of \(\mu\).
A point estimate gives a single best guess, but provides no information about the uncertainty associated with this guess.
We can do better than a single number.
Rather than reporting a single number, we report a best guess together with a plus–minus.
This plus–minus is called the margin of error (ME).
The ME tells us how uncertain we are about our estimate.
The sampling distribution of the sample mean describes the distribution of \(\bar X\) across repeated samples drawn from the same population.
In practice, however, we usually observe only one sample.
The observed value of the statistic \(\bar X\) can be viewed as one realization from its sampling distribution.
Since the sampling distribution tells us how much \(\bar X\) varies from sample to sample, we can use it to construct an interval that likely contains \(\mu\).
❓ What is a confidence interval?
A confidence interval (CI) provides an interval or range of plausible values for the population parameter.
It is constructed using a sample statistic and its sampling distribution.
A CI provides some prescribed degree of confidence \(C\) of capturing the true parameter (typically 90%, 95%, or 99%).
The general form of a confidence interval (CI) in this unit:
\[\text{point estimate} \pm \text{(Margin of Error)}\]
\[ \begin{align} \text{or } \big[\text{point est.} - \text{ME}&,\ \text{point est.} + \text{ME}\big]\\ \big[L&, U\big] \quad\quad \text{where }L <U \end{align} \]
For these types of CIs, the point estimate is the midpoint of the interval and the ME is half its width.
In general, let \(\theta\) be some population parameter of interest.
A point estimator of \(\theta\) is a statistic, denoted by \(\hat{\theta}\).
The “hat” notation reminds us that \(\hat{\theta}\) is a sample-based estimate computed from the data, distinguishing it from the population parameter \(\theta\), which is fixed and unknown.
In general, there may be many possible estimators for a parameter.
For example, estimators of the population mean \(\mu\) include the sample mean, the sample median, and the midrange.
The sample mean \(\bar X\) is especially appealing because it is unbiased for \(\mu\).
Unbiased Estimator
A statistic \(\hat \theta\) is said to be an unbiased estimator, or its value an unbiased estimate, of \(\theta\) if and only if:
\[ \mathbb{E}[\hat \theta]= \theta \]
We know that when we use \(\bar X\) to estimate \(\mu\), the estimate will almost surely be wrong, i.e. \(\Pr(\bar X = \mu) = 0\)
To examine this error, recall that if the population is normal or \(n\) is large,
\[ Z = \frac{\bar X - \mu}{\sigma/\sqrt{n}} \sim N(0,1) \]
Figure 1: The sampling distribution of \(Z = \frac{\bar X - \mu}{\sigma/\sqrt{n}} \sim N(0,1)\)
As shown in Figure 1, we can assert the following:
\[ \begin{align} \Pr(-z_{\alpha/2} < Z < z_{\alpha/2}) &= 1- \alpha\\ \Pr(-z_{\alpha/2} < \frac{\bar X - \mu}{\sigma/\sqrt{n}} < z_{\alpha/2}) &= 1- \alpha \end{align} \]
where \(z_{\alpha/2}\) represents the \(z\)-score that cuts off an area of \(\alpha/2\) in the upper tail of the standard normal curve.
Rearranging this inequality, we can write …
\[ \begin{align} &\Pr\left(-z_{\alpha/2} \cdot \frac{\sigma}{\sqrt{n}} < \bar X - \mu < z_{\alpha/2} \cdot \frac{\sigma}{\sqrt{n}} \right) = 1- \alpha\\ &\Pr\left(- \bar X -z_{\alpha/2} \cdot \frac{\sigma}{\sqrt{n}} < - \mu < - \bar X + z_{\alpha/2} \cdot \frac{\sigma}{\sqrt{n}} \right)= 1- \alpha\\ &\Pr\left(\bar X + z_{\alpha/2} \cdot \frac{\sigma}{\sqrt{n}} > \mu > \bar X - z_{\alpha/2} \cdot \frac{\sigma}{\sqrt{n}} \right)= 1- \alpha\\ &\Pr\left( \bar X - z_{\alpha/2} \cdot \frac{\sigma}{\sqrt{n}} < \mu < \bar X + z_{\alpha/2} \cdot \frac{\sigma}{\sqrt{n}} \right)= 1- \alpha \end{align} \]
The previous probability statement describes how the random variable \(\bar X\) behaves across repeated samples.
Once we observe a sample and compute \(\bar x\), the resulting interval is fixed (no more randomness/probability).
Since the parameter is also considered fixed, there is no more randomness associated with this interval: it either contains the true population parameter or it doesn’t.
Interpreted through repeated sampling: If we repeatedly sample from the population, we would expect \((1-\alpha)100\)% of the confidence intervals to contain \(\mu\).
Correct Interpretation
✅ Over repeated sampling, this confidence interval procedure is expected to produce intervals that contain \(\mu\) about \((1-\alpha)100\%\) of the time.
✅ We are \((1-\alpha)100\%\) confident that \(\mu\) lies in the \((1-\alpha)100\%\) CI.
Incorrect Interpretations
❌ We are confident that the interval contains the true population parameter \(\mu\).
❌ There is a \((1-\alpha)100\%\) probability that \(\mu\) lies in this interval.
Large Sample Confidence Intervals
For large (\(n \geq 30\)) random samples from a population with mean \(\mu\) and variance \(\sigma^2\), a \((1- \alpha)100\)% confidence interval for \(\mu\) is given by:
\[ \bar x \pm \underbrace{z_{\alpha/2}\cdot \frac{\sigma}{\sqrt{n}}}_{\text{Margin of Error}} \]
The confidence level \(C = (1 - \alpha)100\)% describes the reliability of the CI procedure.
The quantity \(\alpha\) is called the significance level.
We can choose the confidence level to suit the situation. Common choices include 90%, 95%, and 99%.
Constructing a 95% confidence interval
Exercise 1 A random sample of size \(n\) = 100 has a sample mean of \(\bar x\) = 21.6. Assuming the population standard deviation is known to be \(\sigma =\) 5.1, construct a 95% confidence interval for \(\mu\).
To find the \(z_{\alpha/2}\) in R, you can use the qnorm() function
There are several ways to obtain \(z_{\alpha/2}\).
Choose whichever way makes the most sense to you.
Tip
I highly recommend that you draw the Normal curve to help you visualize.
qnorm() is the quantile function for the normal distribution with mean equal to mean and standard deviation equal to sd.
- p: vector (or single value) of probabilities
- mean (default value is 0)
- sd (default value is 1)
- lower.tail: logical; if TRUE (default), probabilities are \(\Pr(X \leq x)\), otherwise \(\Pr(X > x)\)
Warning
\(z_{\alpha}\) denotes the upper-tail \(z\)-scores of the standard normal distribution, i.e. \(\Pr(Z > z_{\alpha})= \alpha\) where \(Z \sim N(0,1)\)

The default of qnorm(p) is to find lower-tail quantiles associated with lower-tail probabilities.


For a 95% confidence interval, we set \(\alpha = 0.05\), so we need the cutoff \(z_{0.025}\).
The value \(-z_{0.025}\) is just the same number with a minus sign in front.
Because the standard normal curve is symmetric around zero, we don’t need to find both numbers.
Either value can be used to construct the interval, so compute the one you find easiest.
Find the \(z_{\alpha/2} = z_{0.025}\) by finding the \(z\)-score that gives 2.5% in the upper-tail:
Find the \(z_{\alpha/2} = z_{0.025}\) by finding the \(z\)-score that gives 97.5% in the lower-tail:
Find the \(-z_{\alpha/2} = z_{1-\alpha/2} = z_{0.975}\) that gives 97.5% in the upper-tail:
Find the \(-z_{\alpha/2} = z_{1-\alpha/2} = z_{0.025}\) that gives 2.5% in the lower-tail:
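All four routes above can be carried out with `qnorm()`. The first two return \(z_{0.025} \approx 1.96\); the last two return \(-z_{0.025} \approx -1.96\).

```r
# Four equivalent ways to find the 95% CI cutoff with qnorm()
qnorm(0.025, lower.tail = FALSE)  # z-score with 2.5% in the upper tail, ~ 1.96
qnorm(0.975)                      # z-score with 97.5% in the lower tail, ~ 1.96
qnorm(0.975, lower.tail = FALSE)  # 97.5% in the upper tail, ~ -1.96
qnorm(0.025)                      # 2.5% in the lower tail,  ~ -1.96
```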
Since our Z-table gives lower-tail probabilities, it will make sense to either find
\(\Pr(Z < -z_{0.025}) = 0.025\)

\(\Pr(Z < z_{0.025}) = 1- 0.025 = 0.975\)


Notice how the table gives lower-tail probabilities.

\[ \begin{align} \Pr(Z < -z_{0.025}) &= 0.025 \\ &\phantom{= 0.975} \end{align} \]
\[ \begin{align} \Pr(Z < z_{0.025}) &= 1- 0.025 \\ &= 0.975 \end{align} \]

The form of a 95% CI for \(\mu\) is therefore given by:
\[ \begin{align} \text{point estimate} &\pm \textcolor{red}{\boxed{\text{Margin of Error}}}\\ \bar x &\pm \textcolor{red}{\boxed{z_{0.025} \times \sigma_{\bar x}}} \phantom{asdflj}\\ \bar x &\pm \textcolor{red}{\boxed{1.96 \times \sigma/\sqrt{n}}}\\ \end{align} \]
where \(\sigma_{\bar x} = \sigma/\sqrt{n}\) is the standard error (SE) of the point estimate.
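Putting the pieces of Exercise 1 together in R (a sketch, using the values given in the exercise):

```r
# Exercise 1: 95% CI for mu with n = 100, xbar = 21.6, sigma = 5.1
n     <- 100
xbar  <- 21.6
sigma <- 5.1
z     <- qnorm(0.975)              # z_{0.025} ~ 1.96
me    <- z * sigma / sqrt(n)       # margin of error ~ 1.0
c(lower = xbar - me, upper = xbar + me)  # roughly (20.6, 22.6)
```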
iClicker: finding \(z_{\alpha/2}\)
What is the \(z_{\alpha/2}\) for a 98% confidence interval?
2.33
2.05
2.58
None of the above
Warning
For this CI to be exact, the sample must be drawn from a normal population with \(\sigma\) known.
When it is not guaranteed that our population is normal, this interval is approximate for large \(n\) (thanks to the CLT)
iClicker
Exercise 2 (Margin of Error) Suppose a confidence interval for a population mean has the form \([a, b]\). Which of the following is the margin of error?
iClicker
Exercise 3 (Margin of Error) What does a larger margin of error indicate?
iClicker
Exercise 4 (Margin of Error) Two confidence intervals are centered at the same point estimate. One is wider than the other. Which statement is true?
Let’s consider a subset of cereal from the Breakfast Cereal data from Kaggle.
Cereal calories
Exercise 5 Based on this sample, a nutrition researcher wants to estimate the mean number of calories per serving for breakfast cereals. Given the sample size of \(n\) = 12 and the sample mean of \(\bar x\) = 109.17 compute a 98% confidence interval for \(\mu\). You may assume the population standard deviation is \(\sigma = 19.8\) calories
Small Sample Size
Because this is a small sample, we cannot rely on the CLT. Before we construct our CI, we need to ask ourselves whether it is reasonable to assume a normal population.
While formal tests for normality exist, for now we will use graphical checks.
Histogram
Look for a bell-shaped, symmetric distribution
Mild skewness is usually fine for large samples
Normal Q–Q Plot
If points fall roughly along a straight line → normality is reasonable
Systematic curves or strong deviations → non-normality
📌 In practice, Q–Q plots are preferred over histograms.
The Q–Q plot on the right includes shaded bands showing typical sampling variation under normality.
✅ Because the points fall within these bands, the normality assumption appears reasonable
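A minimal R sketch of this graphical check. Note the `calories` vector below is made-up illustrative data, not the actual Kaggle cereal sample:

```r
# Hypothetical calorie values for n = 12 cereals (illustrative only)
calories <- c(100, 120, 110, 90, 130, 110, 100, 120, 105, 115, 95, 115)

qqnorm(calories)  # plot sample quantiles against theoretical normal quantiles
qqline(calories)  # add a reference line through the first and third quartiles
```

If the points track the reference line without systematic curvature, the normality assumption is reasonable.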

We are 98% confident that the true mean calories per serving for breakfast cereals lies somewhere between 95.9 and 122.5 calories.
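The interval reported above can be computed directly in R using the values from Exercise 5:

```r
# Exercise 5: 98% CI for mean calories; n = 12, xbar = 109.17, sigma = 19.8
n     <- 12
xbar  <- 109.17
sigma <- 19.8
z     <- qnorm(0.99)           # 98% CI => alpha = 0.02, z_{0.01} ~ 2.326
me    <- z * sigma / sqrt(n)   # margin of error ~ 13.3
round(c(xbar - me, xbar + me), 1)  # 95.9 122.5
```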
Interpretation matters
Confidence intervals must be interpreted in context. Generic statements such as
We are 98% confident that \(\mu\) lies within this interval.
without referencing the population and units will receive partial credit or no credit on tests and assignments.
❌ We are 98% confident that the sample mean calories per serving falls within this interval.
Why this is wrong
The sample mean \(\bar x\) is known exactly; it is the center of the interval. The CI expresses uncertainty about the unknown population mean \(\mu\), not about \(\bar x\).
❌ There is a 98% probability that the true mean calories per serving lies in this interval.
Why this is wrong
Once the interval is computed, it is fixed. Probability refers to the procedure, not this particular interval.
❌ 98% of breakfast cereals have calorie counts that fall within 95.9 and 122.5 calories.
Why this is wrong
A confidence interval for the mean does not describe the distribution of individual observations or the proportion of the population within the interval.
The CIs we have constructed so far were symmetric CIs that split \(\alpha\) evenly between the two tails.
This is not a requirement; it is, however, the most common choice.
Consider instead if we split a 5% significance level to have 2% in the lower tail and 3% in the upper tail…
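The two cutoffs for this unequal split can be found with `qnorm()`, just as before:

```r
# Unequal tail split: alpha = 0.05 with 2% in the lower tail, 3% in the upper
qnorm(0.02)                      # lower cutoff, ~ -2.054
qnorm(0.03, lower.tail = FALSE)  # upper cutoff, ~  1.881
```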


Q: What would be an advantage of using the symmetric confidence interval on the left over the non-symmetric confidence interval on the right?










In the most extreme case, we place all of \(\alpha\) in one tail resulting in a one-sided confidence interval
There are two natural versions: a lower confidence bound, giving the interval \(\left(\bar x - z_{\alpha}\cdot\frac{\sigma}{\sqrt{n}},\ \infty\right)\), and an upper confidence bound, giving \(\left(-\infty,\ \bar x + z_{\alpha}\cdot\frac{\sigma}{\sqrt{n}}\right)\).
In both cases, the confidence interval has infinite width.
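As a sketch, here is a one-sided 95% lower confidence bound, reusing the numbers from Exercise 1 for illustration:

```r
# One-sided 95% lower confidence bound for mu (Exercise 1's numbers)
n <- 100; xbar <- 21.6; sigma <- 5.1
z <- qnorm(0.95)            # all of alpha = 0.05 in one tail: z_{0.05} ~ 1.645
xbar - z * sigma / sqrt(n)  # lower bound ~ 20.76; interval (20.76, Inf)
```

Note the one-sided cutoff \(z_{0.05} \approx 1.645\) is smaller than the two-sided \(z_{0.025} \approx 1.96\).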

Comments
\(L\) and \(U\) will be used to denote the lower and upper confidence limits, respectively.
For this interval \((L, U)\) to be useful, it should be as narrow as possible while still achieving the desired confidence level.