Uncertainty, standard errors and confidence intervals
Recap of sampling from populations
Uncertainty in research and estimation
Sampling distribution revisited
Standard error of the mean
Confidence intervals: what they are, and what they are not
Distributions describe how often different values occur - in a sample or a population. They can take any shape - if they’re “algebraically tractable”, we can describe them with a formula.
Normal distribution - defined by mean, standard deviation, and proportions of scores expected above/below critical values
Key idea
Assuming a distribution of a particular shape, how common is a given value?
Key idea
Assuming a distribution of a particular shape, is the value we’re interested in above or below a specific cut-off point?
E.g. is an individual who attends 190 social events per year among the top 10% of event-goers?
Example 1: Average individual drinks 730 cups of coffee per year. John drinks 1100 cups of coffee per year. Is John in top 5% of the distribution (shaded)?
Example 2: A critical value for top 5% on an anxiety scale is 23.5. A study participant receives a score of 19. Are they in the top 5%?
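The two examples can be checked with a minimal Python sketch. The mean of 730 cups and the critical value of 23.5 come from the examples; the standard deviation of 200 cups is NOT given anywhere and is a made-up value used purely for illustration, so the cut-off it produces should not be read as a real result.

```python
from statistics import NormalDist

# Example 1: the mean (730 cups) is from the example; the SD of 200 is an
# ASSUMED value, chosen only so that a cut-off can be computed.
coffee = NormalDist(mu=730, sigma=200)

# Value above which the top 5% of the distribution lies (95th percentile).
cutoff_top5 = coffee.inv_cdf(0.95)
john_in_top5 = 1100 > cutoff_top5

# Example 2: the critical value is given directly, so we only compare.
participant_in_top5 = 19 > 23.5

print(round(cutoff_top5, 1), john_in_top5, participant_in_top5)
```

With a different assumed SD the answer for John could change, which is why both the mean and the SD are needed to describe a normal distribution.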
Last week - example of a population distribution with known mean and SD
In research - the setup is often the other way around. We don’t know what the population distribution looks like, and we’re trying to figure this out from our sample.
THE PROBLEM: Each time we take a sample, we get different values
So there is uncertainty around how close the sample mean is to the true population value.
Standard errors and confidence intervals are tools we use to quantify that uncertainty.
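The problem can be made concrete with a small simulation, sketched here in Python. The population values (mean 100, SD 12) are hypothetical and chosen only for illustration; in real research they are exactly what we do not know.

```python
import random
from statistics import mean

random.seed(1)  # fixed seed so the sketch is reproducible

# Hypothetical population: mean 100, SD 12 (assumed values, not real data).
POP_MEAN, POP_SD = 100, 12

def draw_sample_mean(n):
    """Draw n scores from the assumed population and return their mean."""
    return mean(random.gauss(POP_MEAN, POP_SD) for _ in range(n))

# Five samples of 4 people each: every sample gives a different mean.
sample_means = [draw_sample_mean(4) for _ in range(5)]
print([round(m, 1) for m in sample_means])
```

None of the five means is likely to equal 100 exactly, even though every sample comes from the same population.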
What is an estimate?
An estimate can take many different forms. For example, we might be interested in comparing groups, in which case the estimate can be the difference in group means on some variable of interest. Or we might want to know whether two variables are associated, where the estimate is some measure of association (e.g. a correlation coefficient, or a b value which will be covered later in the term).
The average person…
drinks 730 cups of coffee per year (twice as much for academics, incl. students) ☕
spends 192 minutes a day watching TV 📺
eats 250 cloves of garlic per year 🧄
takes 3500 steps each day 🚶
falls asleep in 7 minutes 😴
Doomscrolling
“… refers to a unique media habit where social media users persistently attend to negative information in their newsfeeds about crises, disasters, and tragedies.”
- Sharma, Lee, and Johnson (2022)
Research question…
How much does an average person doomscroll?
How do we know if our estimate is accurate and close to the real population value?
Each time we take a sample, we get a different estimate.
We can describe every normal distribution using:
Mean - the central value
Standard deviation (SD) - the average difference from the mean
Proportions of scores at cut-off points
Around 68% of scores are within 1 SD of the mean
95% of scores are within \(\pm\) 1.96 SDs of the mean
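These proportions can be verified from the standard normal distribution. The sketch below uses Python's standard library; the choice of language is an assumption made for illustration.

```python
from statistics import NormalDist

z = NormalDist()  # standard normal distribution: mean 0, SD 1

# Proportion of scores between -1 and +1 SD, and between -1.96 and +1.96 SD.
within_1_sd = z.cdf(1) - z.cdf(-1)
within_196_sd = z.cdf(1.96) - z.cdf(-1.96)

print(round(within_1_sd, 4))    # 0.6827
print(round(within_196_sd, 4))  # 0.95
```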
Same rules apply!
The mean of a sampling distribution will be centered on the population value
LANGUAGE CHANGE: when talking about standard deviation in the context of a sampling distribution, we call it standard error.
Standard deviation
The average difference between each score and the sample mean
Standard error
Standard deviation of sample means
The average difference between each sample mean and the population value
Standard error is a useful metric for quantifying uncertainty in estimates - it describes the extent to which sample means differ from each other in a sampling distribution
We can use it to construct an interval within which a certain percentage of sample means will fall
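A simulation can illustrate this idea. The sketch below, in Python, approximates a sampling distribution by drawing many samples from a hypothetical population (mean 100, SD 15, samples of 25; all three are assumed values). It then checks how many sample means fall within 1.96 standard errors of the population value.

```python
import random
from statistics import mean, stdev

random.seed(42)  # fixed seed so the sketch is reproducible

# ASSUMED population and sample size, for illustration only.
POP_MEAN, POP_SD, N = 100, 15, 25

# Approximate the sampling distribution with many repeated samples.
sample_means = [
    mean(random.gauss(POP_MEAN, POP_SD) for _ in range(N))
    for _ in range(10_000)
]

# The standard error is the standard deviation of the sample means.
se = stdev(sample_means)

# Share of sample means within 1.96 standard errors of the population value.
inside = sum(abs(m - POP_MEAN) <= 1.96 * se for m in sample_means)
share_within = inside / len(sample_means)

print(round(mean(sample_means), 2), round(se, 2), round(share_within, 3))
```

The mean of the sample means should land close to 100, and roughly 95% of them should fall inside the interval, mirroring the rule for scores in a normal distribution.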
Let’s make things more complicated…
Warning
Error 404: Sampling distribution not found.
Sampling distributions don’t exist “in the wild”. They are a hypothetical statistical concept.
Remember: standard error refers to the standard deviation of the sampling distribution (created by re-sampling and computing the mean an infinite number of times), but we only have access to one sample with one mean.
Therefore, if we want to use the standard error to construct an interval, we need to estimate it from our sample.
Equation:
\[ SE = \frac{s}{\sqrt N} \]
Translation:
\[ \text{standard error} = \frac{\text{sample standard deviation}}{\text{(the square root of) the sample size}} \]
In R:
We collect a sample of 4 individuals.
Each person reports their daily doomscrolling time (in minutes): 86, 114, 97, 107
The mean for the sample is 101 minutes
The standard deviation is:
\[ s = \sqrt\frac{\sum(x_i - \bar{x})^2}{N - 1} = \sqrt\frac{(86-101)^2 + (114-101)^2 + (97 - 101)^2+(107-101)^2}{4 - 1} = 12.19 \]
\[ SE = \frac{s}{\sqrt{N}} = \frac{12.19}{\sqrt{4}} = 6.095 \]
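The same calculation can be sketched in Python (a language chosen here for illustration). `statistics.stdev` computes the sample standard deviation, which divides by N - 1.

```python
from math import sqrt
from statistics import mean, stdev

scores = [86, 114, 97, 107]  # daily doomscrolling times in minutes

n = len(scores)
sample_mean = mean(scores)   # 101
s = stdev(scores)            # sample SD (divides by N - 1)
se = s / sqrt(n)             # standard error of the mean

print(sample_mean, round(s, 2), round(se, 3))
```

The unrounded standard error is about 6.096; the worked example gets 6.095 because it rounds the standard deviation to 12.19 before dividing.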
Average doomscrolling time for the sample: 101 minutes
Standard deviation: 12.19
Standard error: 6.095
\[ \text{Lower CI limit} = \text{sample mean} - 1.96 \times\text{SE} \\ \text{Upper CI limit} = \text{sample mean} + 1.96 \times\text{SE} \]
\[ \text{Lower CI limit} = 101 - 1.96 \times6.095 = 89.054\\ \text{Upper CI limit} = 101 + 1.96 \times6.095 = 112.946 \]
You might see in a paper…
“The average doomscrolling time in our sample was 101 minutes (SD = 12.19), 95% CI [89.05, 112.95].”
📌 Sampling distribution of the mean will have a normal shape as long as the sample size is large enough
Smaller samples don’t approximate the normal sampling distribution very well. Because of this, we can’t rely on the value 1.96 to give us accurate intervals.
Instead, we can use the t-distribution
Looks like the normal distribution, but isn’t.
Defined by degrees of freedom (df) - calculated as N-1 (number of observations minus 1)
The “critical t value” will change for different degrees of freedom.
Instead of multiplying the standard error by 1.96, we multiply by the critical t value.
Critical t gets closer to 1.96 with larger samples - the t-distribution itself will approximate the normal distribution more closely
For example, in our sample of 4, the df is 4 - 1 = 3, and the critical t value for df = 3 is 3.182
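The way the critical t value depends on the degrees of freedom can be checked numerically. The Python sketch below is one possible approach, not the method used in the lecture: it integrates the t density with Simpson's rule and finds the critical value by bisection. The helper names `t_pdf`, `t_cdf` and `critical_t` are hypothetical.

```python
from math import exp, lgamma, pi, sqrt

def t_pdf(t, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_c = lgamma((df + 1) / 2) - lgamma(df / 2)
    return exp(log_c) / sqrt(df * pi) * (1 + t * t / df) ** (-(df + 1) / 2)

def t_cdf(t, df, steps=1000):
    """P(T <= t) for t > 0, using Simpson's rule on the interval [0, t]."""
    h = t / steps
    total = t_pdf(0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 + total * h / 3

def critical_t(df, coverage=0.95):
    """Two-sided critical t value for the given coverage, found by bisection."""
    target = 1 - (1 - coverage) / 2  # 0.975 for 95% coverage
    lo, hi = 0.0, 100.0
    for _ in range(40):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

for df in (3, 10, 30, 100, 1000):
    print(df, round(critical_t(df), 3))
```

For df = 3 this should reproduce the value of 3.182, and the values should shrink towards 1.96 as df grows. In practice a statistics package would supply these quantiles directly.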
Average doomscrolling time for the sample: 101 minutes
Standard error: 6.095
Critical t value: 3.182
\[ \text{CI Limits} = \text{mean} \pm3.182 \times\text{SE} \\ \text{CI Limits} = 101 \pm3.182 \times\text{6.095} \\ \text{CI Limits} = [81.606, 120.394] \]
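The limits based on the critical t value can be placed next to the limits based on 1.96. This is a short Python sketch using the rounded values from the worked example; the helper name `ci_limits` is hypothetical.

```python
def ci_limits(mean, se, critical_value):
    """Return (lower, upper) limits: mean -/+ critical value * standard error."""
    margin = critical_value * se
    return mean - margin, mean + margin

SAMPLE_MEAN, SE = 101, 6.095  # rounded values from the worked example

z_lower, z_upper = ci_limits(SAMPLE_MEAN, SE, 1.96)   # normal critical value
t_lower, t_upper = ci_limits(SAMPLE_MEAN, SE, 3.182)  # critical t for df = 3

print(round(z_lower, 3), round(z_upper, 3))  # 89.054 112.946
print(round(t_lower, 3), round(t_upper, 3))  # 81.606 120.394
```

Only the multiplier changes between the two intervals; the mean and the standard error are the same.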
The interval is wider than the one based on 1.96 ([89.05, 112.95]). This is to be expected - we have a tiny sample (N = 4), so there is a lot of uncertainty around whether the estimate of 101 minutes is actually representative of the population.
The larger the sample, the tighter the confidence intervals, because the standard error shrinks and the critical t gets smaller and smaller (note how t approaches 1.96 as the sample size, and with it the df, increases)
We take samples over and over again, compute the mean for each, and construct confidence intervals around that mean - 95% of them will contain the population value, the remaining 5% will not.
This is known as an interval with 95% coverage. 95% is the most common value that we choose, but it can take on other values as well (e.g. 50%, 90%, 99%).
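Coverage can be illustrated by simulation. The Python sketch below repeatedly draws samples of 4 from a hypothetical normal population (mean 100, SD 12; both are assumed values) and builds an interval around each sample mean. It then counts how often the interval contains the population value, once with the critical t for df = 3 and once with 1.96.

```python
import random
from math import sqrt
from statistics import mean, stdev

random.seed(2024)  # fixed seed so the sketch is reproducible

POP_MEAN, POP_SD = 100, 12  # ASSUMED "true" population, illustration only
N, REPS = 4, 10_000
T_CRIT, Z_CRIT = 3.182, 1.96  # critical t for df = 3, and the normal value

hits_t = hits_z = 0
for _ in range(REPS):
    sample = [random.gauss(POP_MEAN, POP_SD) for _ in range(N)]
    m = mean(sample)
    se = stdev(sample) / sqrt(N)
    # Does the interval around this sample mean contain the population value?
    hits_t += abs(m - POP_MEAN) <= T_CRIT * se
    hits_z += abs(m - POP_MEAN) <= Z_CRIT * se

coverage_t = hits_t / REPS
coverage_z = hits_z / REPS
print(round(coverage_t, 3), round(coverage_z, 3))
```

The t-based intervals should contain the population value in about 95% of samples. The intervals built with 1.96 should fall noticeably short of that at N = 4, which is why the critical t value is used for small samples.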
“The average doomscrolling time in our sample was 101 minutes (SD = 12.19), 95% CI [81.61, 120.39].”
Correct interpretation
ASSUMING THAT our sample is one of the 95% producing confidence intervals that contain the population value, then the population value for time spent doomscrolling per day falls somewhere between 81.61 and 120.39 minutes.
However…
There is no guarantee that the assumption above is correct! And we just have to live our lives not knowing…
No:
“We can be 95% confident that the population value falls between 81.61 and 120.39.”
Also no:
“There is 95% probability that the population value falls between 81.61 and 120.39.”
More general correct interpretation
ASSUMING THAT our sample is one of the 95% producing confidence intervals that contain the population value, then the population value for the estimate of interest falls somewhere between the lower limit and the upper limit of the interval we’ve computed for our sample.
Memorise and practice!
When interpreting estimates and confidence intervals for your sample - always consider them as just one of many different possible estimates
This is why replication is important in science - our sample could easily be the one that misses the population value
Always be wary of studies placing too much certainty on a single finding
\[ SE = \frac{s}{\sqrt N} \]
\[ \text{CI limits} = \text{mean} \pm (1.96 \times \text{SE}) \]
Putting it all into practice:
Research questions
Good and less good hypotheses
Testing hypotheses with Null Hypothesis Significance Testing
A disappointing answer to why we’re so obsessed with the value 95%.