Uncertainty,
standard errors and
confidence intervals

Martina Sladekova

Today…

  1. Recap of sampling from populations

  2. Uncertainty in research and estimation

  3. Sampling distribution revisited

  4. Standard error of the mean

  5. Confidence intervals: what they are, and what they are not

Recap of sampling

Recap

  • Distributions describe how often different values occur - in a sample or a population. They can take any shape - if they’re “algebraically tractable”, we can describe them with a formula.

  • Normal distribution - defined by mean, standard deviation, and proportions of scores expected above/below critical values

Density plot of a normal distribution with shaded proportions. The X axis shows standard deviations. Middle portion is shaded from -1 to +1 SD from the mean, representing 68.2% of scores. Outer portion is shaded from -2 to +2 SD from the mean, representing 95.4% of scores.

Recap

Key idea

Assuming a distribution of a particular shape, how common is a given value?

  • E.g. is an individual who attends 57 social events per year unusual?

    • In a normal distribution with M = 127 (social events attended per year) and SD = 40, only about 4% of people attend fewer than 57 events per year.
pnorm(57, mean = 127, sd = 40)
[1] 0.04005916

density plot of a normal distribution centred at 127 with standard deviation of 40. Vertical line crosses x axis at the value 57

Recap

Key idea

Assuming a distribution of a particular shape, is the value we’re interested in above or below a specific cut-off point?

  • E.g. is an individual who attends 190 social events per year among the top 10% of event-goers?

    • In a normal distribution with M = 127 (social events attended per year) and SD = 40, an individual would have to attend 178 events or more to be in the top 10% of event goers.
    • Therefore a person who attends 190 events is in the top 10%.
qnorm(p = 0.9, mean = 127, sd = 40)
[1] 178.2621

density plot of a normal distribution centred at 127 with standard deviation of 40. Vertical line crosses x axis at the value 190

Let’s practice (1)

Example 1: Average individual drinks 730 cups of coffee per year. John drinks 1100 cups of coffee per year. Is John in top 5% of the distribution (shaded)?

density plot of a normal distribution centred at 730 with standard deviation of 200. Top 5% of the distribution are shaded.
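One way to check Example 1 in R is to find the cut-off for the top 5% and compare John's value against it. This is only a sketch: the SD of 200 comes from the plot description (the example text does not state it), and the names `cutoff` and `john` are made up for illustration.

```r
# Cut-off for the top 5% of a normal distribution with M = 730.
# SD = 200 is an assumption taken from the plot description.
cutoff <- qnorm(p = 0.95, mean = 730, sd = 200)
cutoff          # about 1059 cups per year

# John drinks 1100 cups per year
john <- 1100
john > cutoff   # TRUE: under these assumptions John is in the top 5%
```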

Let’s practice (2)

Example 2: A critical value for top 5% on an anxiety scale is 23.5. A study participant receives a score of 19. Are they in the top 5%?

density plot of a normal distribution centred at 15.3 with standard deviation of 5.

Let’s practice (2)

Example 2: A critical value for top 5% on an anxiety scale is 23.5. A study participant receives a score of 19. Are they in the top 5%?

density plot of a normal distribution centred at 15.3 with standard deviation of 5. Top 5% are shaded. Vertical line crosses the x axis at 19
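Example 2 can also be approached from the other direction: instead of looking up the cut-off, we can ask what proportion of scores lies above 19. The sketch below assumes the mean of 15.3 and SD of 5 from the plot description; the variable names are illustrative only.

```r
# Proportion of scores above 19 in a normal distribution with
# M = 15.3 and SD = 5 (both assumed from the plot description)
prop_above <- pnorm(19, mean = 15.3, sd = 5, lower.tail = FALSE)
prop_above          # roughly 0.23, i.e. about 23% of people score higher
prop_above < 0.05   # FALSE: a score of 19 is not in the top 5%

# Cross-check: the cut-off for the top 5% under the same assumptions
cutoff <- qnorm(p = 0.95, mean = 15.3, sd = 5)
cutoff              # about 23.5, matching the critical value in the example
```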

Roadmap!

Roadmap of the module. Top row contains boxes "Introduction and distributions", "Standard error and confidence intervals" and "null hypothesis significance testing". Second box is labelled as "We're here!". Middle row is "t-test", "correlation" and "chi-square". Bottom row is "equation of a straight line", "linear model with one predictor", "linear model with multiple predictors"

Uncertainty in estimation

Population distribution is often unknown

  • Last week - example of a population distribution with known mean and SD

  • In research - the setup is often the other way around. We don’t know what the population distribution looks like, and we’re trying to figure this out from our sample.

    1. Collect a random sample
    2. Measure the individuals in our sample on some variable (e.g. social events attended, anxiety scores, time spent on social media, stress levels… )
    3. Calculate the mean and SD of the sample and use them to infer what the population distribution could look like (i.e. we want to generalise).

Population distribution is often unknown

  • THE PROBLEM: Each time we take a sample, we get different values

  • So there is uncertainty around how close the sample mean is to the true population value.

  • Standard errors and confidence intervals are tools we use to quantify that uncertainty.

histogram of stress scores ranging from roughly 30 to 70. Orange dot on the distribution represents the mean of about 55. There are error bars surrounding the dot.

Sampling from populations

  • We want to learn something about the population
  • But we only have access to a sample (often a small one)
  • The sample estimate is our best guess


What is an estimate?

An estimate can take many different forms. For example, we might be interested in comparing groups, in which case the estimate can be the difference in group means on some variable of interest. Or we might want to know whether two variables are associated, where the estimate is some measure of association (e.g. a correlation coefficient, or a b value which will be covered later in the term).

Some average facts

The average person…

  • drinks 730 cups of coffee per year (twice as much for academics, incl. students) ☕

  • spends 192 minutes a day watching TV 📺

  • eats 250 cloves of garlic per year 🧄

  • takes 3500 steps each day 🚶

  • falls asleep in 7 minutes 😴

Doomscrolling

“… refers to a unique media habit where social media users persistently attend to negative information in their newsfeeds about crises, disasters, and tragedies.”

- Sharma, Lee, and Johnson (2022)


A gif of a person scrolling endlessly on their phone

Research question…

How much does an average person doomscroll?

A group of stick figures representing the population

A group of stick figures representing the population. Below is a sample of 4 stick figures, labeled with mean of 101 minutes per day.

A group of stick figures representing the population. Below are two samples of stick figures drawn from the population, each showing a different value of the mean

A group of stick figures representing the population. Below are three samples of stick figures drawn from the population, each showing a different value of the mean

Uncertainty in research and estimation

  • Each time we take a sample, we get a different estimate because we are sampling randomly

How do we know if our estimate is accurate and close to the real population value?

dot plot of average time spent scrolling (x axis) and 3 sample IDs (y axis). The dot at each value of the y axis represents the mean of each sample. Top dot is at the value of 93, middle dot is at the value of 109, lower dot is at the value of 101 minutes.

Uncertainty in research and estimation

Each time we take a sample, we get a different estimate.

How do we know if our estimate is accurate and close to the real population value?

  • ANSWER: We don’t. We conduct some statistical wizardry and hope for the best.

Dot plot representing the distribution of means of 30 samples. Dots are scattered across a range of doomscrolling values.

Sampling distributions

  • We can plot the sample estimates in a histogram to see how they’re distributed
  • We’re now working with sample means, not the scores of individual people. Therefore the plot below shows a sampling distribution.
  • x axis shows the sample means, y axis shows how many times each mean occurs

histogram showing sampling distribution of sample means. 3 previously sampled means are highlighted in orange.
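The idea of a sampling distribution can be sketched with a small simulation. The population values below (mean of 100 minutes, SD of 12) are invented purely for illustration, as are the sample size of 4 and the number of repetitions.

```r
set.seed(2024)  # arbitrary seed so the sketch is reproducible

# Hypothetical population of daily doomscrolling times (minutes).
# The mean and SD are illustration values, not estimates from real data.
pop_mean <- 100
pop_sd   <- 12

# Draw 1000 random samples of 4 people and keep the mean of each sample
sample_means <- replicate(1000, mean(rnorm(n = 4, mean = pop_mean, sd = pop_sd)))

head(sample_means)   # each value is the mean of one sample, not one person's score
mean(sample_means)   # close to the population mean

# hist(sample_means) would draw the sampling distribution as a histogram
```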

Central Limit Theorem revisited

  • We can repeat the process an infinite number of times
  • For many types of estimates, including means, the sampling distribution will be normal
  • This is because of the Central Limit Theorem
  • This is true as long as the samples that we’re taking are large enough
    • If each sample has only 3 participants, the sampling distribution might not end up being normal
    • Textbooks often say that 30 is enough, but there are situations when we might need more
    • Let’s put a pin in this and we’ll get back to it later 📌

sampling distribution of many means, forming a perfect normal distribution.
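The effect of sample size can be illustrated by sampling from a population that is clearly not normal. The sketch below uses an exponential population (a choice made only for this illustration) and a hand-written `skewness()` helper, and compares samples of 3 with samples of 30.

```r
set.seed(1)  # arbitrary seed so the sketch is reproducible

# Simple skewness helper: values near 0 indicate a symmetric distribution
skewness <- function(x) mean((x - mean(x))^3) / sd(x)^3

# Means of 5000 samples drawn from a strongly skewed (exponential) population
means_n3  <- replicate(5000, mean(rexp(n = 3)))
means_n30 <- replicate(5000, mean(rexp(n = 30)))

skewness(means_n3)    # clearly positive: the sampling distribution is still skewed
skewness(means_n30)   # much closer to 0: the sampling distribution is closer to normal
```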

Standard error

Normal distribution - what we already know

We can describe every normal distribution using:

  • Mean - the central value

  • Standard deviation (SD) - the average difference from the mean

  • Proportions of scores at cut-off points

    • Around 68% of scores are within 1 SD of the mean

    • 95% of scores are within \(\pm\) 1.96 SDs of the mean

Normal sampling distribution

  • Same rules apply!

  • The mean of a sampling distribution will be centered on the population value

  • LANGUAGE CHANGE: when talking about standard deviation in the context of a sampling distribution, we call it standard error.

Standard deviation

The average difference between each score and the sample mean

Standard error

  • Standard deviation of sample means

  • The average difference between each sample mean and the population value

Normal sampling distribution

  • We know that 95% of sample means will fall within 1.96 standard errors of the population mean
  • We can use this knowledge to construct an interval around the mean - 95% of sample means will fall within this interval.

sampling distribution of many means, forming a perfect normal distribution. Distance of 1.96 standard errors from the mean is highlighted in orange. Edges of the highlighted area are 90.35 (left) and 107.46 (right)
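Both ideas (the standard error as the SD of sample means, and 95% of sample means falling within 1.96 standard errors) can be checked with a simulation. This is a sketch under made-up assumptions: a normal population with a mean of 100 and SD of 12, and samples of 4.

```r
set.seed(42)  # arbitrary seed so the sketch is reproducible

pop_mean <- 100   # hypothetical population mean
pop_sd   <- 12    # hypothetical population SD
n        <- 4     # size of each sample

# Means of 10000 random samples
sample_means <- replicate(10000, mean(rnorm(n, mean = pop_mean, sd = pop_sd)))

# Standard error = standard deviation of the sample means
se_simulated <- sd(sample_means)
se_theory    <- pop_sd / sqrt(n)   # 6 for these illustration values
c(simulated = se_simulated, theory = se_theory)

# Proportion of sample means within 1.96 standard errors of the population mean
within <- mean(abs(sample_means - pop_mean) <= 1.96 * se_theory)
within   # close to 0.95
```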

Standard error

  • Standard error is a useful metric for quantifying uncertainty in estimates - it describes the extent to which samples differ from each other in a sampling distribution

  • We can use it to construct an interval within which a certain percentage of sample means will fall

  • Let’s make things more complicated…

Standard error

illustrative meme of the guy from matrix saying "What if I told you that sampling distributions aren't real?"

Estimating the standard error from the sample

Warning

Error 404:Sampling distribution not found.

  • Sampling distributions don’t exist “in the wild”. They are a hypothetical statistical concept.

  • Remember: standard error refers to the standard deviation of the sampling distribution (created by re-sampling and computing the mean an infinite number of times), but we only have access to one sample with one mean.

  • Therefore, if we want to use the standard error to construct an interval, we need to estimate it from our sample.

Estimating the standard error from the sample

Equation:

\[ SE = \frac{s}{\sqrt N} \]

Translation:

\[ \text{standard error} = \frac{\text{sample standard deviation}}{\text{(the square root of) the sample size}} \]

In R:

# data$variable stands in for your own data frame and column
se <- sd(data$variable) / sqrt(length(data$variable))
  • Note that the SE will be smaller for larger samples (because we’re dividing by a larger number).
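To see how quickly the standard error shrinks, we can hold the sample SD fixed and vary the sample size. The SD of 12.19 and the sample sizes below are illustration values only.

```r
s <- 12.19                # sample SD, held fixed for illustration
n <- c(4, 16, 100, 400)   # hypothetical sample sizes

se <- s / sqrt(n)
data.frame(n = n, se = se)
# Quadrupling the sample size halves the standard error
```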

Estimating the standard error from the sample

Example:
  • We collect a sample of 4 individuals.

  • Each person reports their daily doomscrolling time (in minutes): 86, 114, 97, 107

  • The mean for the sample is 101 minutes

  • The standard deviation is:

\[ s = \sqrt{\frac{\sum(x_i - \bar{x})^2}{N - 1}} = \sqrt{\frac{(86-101)^2 + (114-101)^2 + (97 - 101)^2+(107-101)^2}{4 - 1}} = 12.19 \]

  • Which makes the standard error:

\[ SE = \frac{s}{\sqrt{N}} = \frac{12.19}{\sqrt{4}} = 6.095 \]
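The same calculation can be reproduced in R. The vector name `scroll` is made up for this sketch. Note that `sd()` divides by N - 1, and that R carries the unrounded SD forward, so the standard error comes out as roughly 6.096 rather than the 6.095 obtained from the rounded SD of 12.19.

```r
scroll <- c(86, 114, 97, 107)    # daily doomscrolling times (minutes)

m  <- mean(scroll)               # 101
s  <- sd(scroll)                 # about 12.19 (sd() divides by N - 1)
se <- s / sqrt(length(scroll))   # about 6.096

c(mean = m, sd = s, se = se)
```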

Confidence intervals

Confidence intervals

Average doomscrolling time for the sample: 101 minutes

Standard deviation: 12.19

Standard error: 6.095

\[ \text{Lower CI limit} = \text{sample mean} - 1.96 \times\text{SE} \\ \text{Upper CI limit} = \text{sample mean} + 1.96 \times\text{SE} \]

\[ \text{Lower CI limit} = 101 - 1.96 \times6.095 = 89.054\\ \text{Upper CI limit} = 101 + 1.96 \times6.095 = 112.946 \]

illustrative image of a confidence interval. Dot in the middle represents the mean. The left and right edges of the error bar around the mean represent lower and upper limits of the confidence interval.
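The interval above can be reproduced in R. The sketch uses the rounded values from the slide (mean of 101, SE of 6.095); the variable names are illustrative.

```r
m  <- 101     # sample mean
se <- 6.095   # standard error, rounded as on the slide

# 1.96 is the rounded 97.5th percentile of the standard normal distribution
z <- qnorm(0.975)
z             # about 1.96

ci_z <- c(lower = m - 1.96 * se, upper = m + 1.96 * se)
round(ci_z, 2)   # 89.05 and 112.95
```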

You might see in a paper…

“The average doomscrolling time in our sample was 101 minutes (SD = 12.19), 95% CI [89.05, 112.95].”

Confidence intervals for small samples

  • 📌 Sampling distribution of the mean will have a normal shape as long as the sample size is large enough

  • Smaller samples don’t approximate the normal sampling distribution very well. Because of this, we can’t rely on the value 1.96 to give us accurate intervals.

illustrative image of a confidence interval. Dot in the middle represents the mean. The left and right edges of the error bar around the mean represent lower and upper limits of the confidence interval.
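A simulation can show what goes wrong when 1.96 is used with a tiny sample. The sketch below assumes a hypothetical normal population (mean 100, SD 12), repeatedly draws samples of 4, builds an interval with 1.96 around each sample mean, and counts how often the interval contains the population mean.

```r
set.seed(7)  # arbitrary seed so the sketch is reproducible

pop_mean <- 100   # hypothetical population mean
pop_sd   <- 12    # hypothetical population SD
n        <- 4     # tiny sample

covers <- replicate(5000, {
  x     <- rnorm(n, mean = pop_mean, sd = pop_sd)
  se    <- sd(x) / sqrt(n)          # SE estimated from the sample
  lower <- mean(x) - 1.96 * se
  upper <- mean(x) + 1.96 * se
  lower <= pop_mean && pop_mean <= upper
})

mean(covers)   # noticeably below 0.95: these intervals are too narrow
```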

The t-distribution

  • Instead, we can use the t-distribution

    • Looks like a normal distribution, but isn’t.

    • Defined by degrees of freedom (df) - calculated as N-1 (number of observations minus 1)

    • The “critical t value” will change for different degrees of freedom.

      • It’s the value we use instead of 1.96 to calculate 95% confidence intervals

Density plot of a bell-shaped t-distribution, with 3 degrees of freedom. Shape is similar to normal, but tails are longer. Critical value displayed on the plot for 3 degrees of freedom is 3.182

The t-distribution

  • Instead of multiplying the standard error by 1.96, we multiply by the critical t value.

  • Critical t gets closer to 1.96 with larger samples - the t-distribution itself will approximate the normal distribution more closely

  • For example, in our sample of 4, the df is 4 - 1 = 3. Move the slider to df = 3 to see that the critical t value for 3 is 3.182

Density plot of a bell-shaped t-distribution, with 3 degrees of freedom. Shape is similar to normal, but tails are longer. Critical value displayed on the plot for 3 degrees of freedom is 3.182
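Critical t values can be looked up in R with `qt()`. For a 95% interval we ask for the 97.5th percentile, because the remaining 5% is split equally between the two tails. The degrees of freedom below are arbitrary illustration values.

```r
df     <- c(3, 10, 30, 100, 1000)   # illustrative degrees of freedom
t_crit <- qt(0.975, df = df)

round(data.frame(df = df, t_crit = t_crit), 3)
# df = 3 gives 3.182 and df = 100 gives 1.984;
# the values keep shrinking towards qnorm(0.975), which is about 1.96
```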

t-based confidence intervals:

Average doomscrolling time for the sample: 101 minutes

Standard error: 6.095

Critical t value: 3.182

\[ \text{CI Limits} = \text{mean} \pm3.182 \times\text{SE} \\ \text{CI Limits} = 101 \pm3.182 \times\text{6.095} \\ \text{CI Limits} = [81.606, 120.394] \]
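The t-based interval can be computed by hand in R or taken from `t.test()`, which reports a 95% confidence interval for the mean by default. The sketch uses the four doomscrolling times from the earlier example; `scroll` is an illustrative name. Because R does not round the SD and SE along the way, its limits can differ from the hand calculation in the second decimal place.

```r
scroll <- c(86, 114, 97, 107)   # daily doomscrolling times (minutes)

n      <- length(scroll)
m      <- mean(scroll)
se     <- sd(scroll) / sqrt(n)
t_crit <- qt(0.975, df = n - 1)   # critical t for df = 3

# Interval computed by hand
ci_t <- c(m - t_crit * se, m + t_crit * se)
ci_t

# Same interval from the built-in one-sample t-test
ci_builtin <- as.numeric(t.test(scroll)$conf.int)
ci_builtin
```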

t-based confidence intervals

  • Compare the new confidence interval [81.61, 120.39] with the confidence interval we got using the value 1.96: [89.05, 112.95]. The new CI is wider!

two confidence intervals. First one is the one calculated using the Z score of 1.96. The second one is calculated using t-distribution. The second interval is wider.

t-based confidence intervals:

  • This is to be expected - we have a tiny sample (N = 4), so there is a lot of uncertainty around whether the estimate of 101 minutes is actually representative of the population.

  • The larger the sample, the tighter the confidence intervals, because the standard error shrinks and the critical t gets smaller and smaller (note how t approaches 1.96 as the sample size, and therefore the df, increases)

Density plot of a bell-shaped t-distribution, with 100 degrees of freedom. Shape is now closer to normal, with lighter tails. Critical value displayed on the plot for 100 degrees of freedom is 1.98

Confidence intervals across samples

  • We take samples over and over again, compute the mean for each, and construct confidence intervals around that mean - 95% of them will contain the population value, the remaining 5% will not.

  • This is known as an interval with 95% coverage. 95% is the most common value that we choose, but it can take on other values as well (e.g. 50%, 90%, 99%).

dot plot of means from our 3 samples, now with confidence intervals. There's a red line going through the middle of the plot representing the population mean. Only 3 out of 4 intervals cross this line.

gif of many confidence intervals being generated for many samples. 95% of confidence intervals include the population value. 5% miss it entirely.
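Coverage can be checked with a simulation in which, unlike in real research, we know the population value. The sketch assumes a hypothetical normal population (mean 100, SD 12) and samples of 4, and uses `t.test()` to build a 95% t-based interval for each sample.

```r
set.seed(123)  # arbitrary seed so the sketch is reproducible

pop_mean <- 100   # hypothetical population mean (unknown in real research)
pop_sd   <- 12    # hypothetical population SD
n        <- 4

covers <- replicate(5000, {
  x  <- rnorm(n, mean = pop_mean, sd = pop_sd)
  ci <- t.test(x)$conf.int            # 95% t-based interval for this sample
  ci[1] <= pop_mean && pop_mean <= ci[2]
})

mean(covers)   # close to 0.95: about 5% of the intervals miss the population mean
```

Each individual interval either contains the population mean or it does not; the 95% describes the procedure across many samples.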

How to interpret confidence intervals

\[ \text{"The average doomscrolling time in our sample was} \\ \text{101 minutes (SD = 12.19) 95% CI [81.61, 120.39]."} \]

Correct interpretation

ASSUMING THAT our sample is one of the 95% producing confidence intervals that contain the population value, then the population value for time spent doomscrolling per day falls somewhere between 81.61 and 120.39 minutes.

However…

There is no guarantee that the assumption above is correct! And we just have to live our lives not knowing…

image of many confidence intervals generated for many samples. 95% of confidence intervals include the population value. 5% miss it entirely.

How *not* to interpret confidence intervals

No:

“We can be 95% confident that the population value falls between 81.61 and 120.39.”

  • “95%” in the name refers to coverage, not to how confident we’re feeling.

Also no:

“There is 95% probability that the population value falls between 81.61 and 120.39.”

  • Think back to the “ladder” of confidence intervals - each of the intervals on that plot shows different limits. So the probability cannot be 95% for every single one of them.

How to interpret confidence intervals

Correct interpretation

ASSUMING THAT our sample is one of the 95% producing confidence intervals that contain the population value, then the population value for time spent doomscrolling per day falls somewhere between 81.61 and 120.39 minutes.

More general correct interpretation

ASSUMING THAT our sample is one of the 95% producing confidence intervals that contain the population value, then the population value for the estimate of interest falls somewhere between the lower limit and the upper limit of the interval we’ve computed for our sample.

Memorise and practice!

The bigger picture…

  • When interpreting estimates and confidence intervals for your sample - always consider them as just one of many different possible estimates

  • This is why replication is important in science - our sample could easily be the one that misses the population value

  • Always be wary of studies placing too much certainty on a single finding

image of many confidence intervals generated for many samples. 95% of confidence intervals include the population value. 5% miss it entirely.

Summary

  • We want to estimate some population value (e.g. some average value - a mean)
  • Confidence intervals help us quantify uncertainty around that estimate
  • To construct a confidence interval, we use the standard error which we can estimate as:

\[ SE = \frac{s}{\sqrt N} \]

  • Lower and upper limits of a 95% confidence interval can be estimated as (replacing 1.96 with critical t for small samples):

\[ \text{CI limits} = \text{mean} \pm (1.96 \times \text{SE}) \]

  • When sampling repeatedly, 95% of samples produce confidence intervals that contain the true population value. We don’t know if our sample is one of them - we only (rightly or wrongly) assume that it is.

illustrative image of a confidence interval. Dot in the middle represents the mean. The left and right edges of the error bar around the mean represent lower and upper limits of the confidence interval.

Next week:

Putting it all into practice:

  • Research questions

  • Good and less good hypotheses

  • Testing hypotheses with Null Hypothesis Significance Testing

  • A disappointing answer to why we’re so obsessed with the value 95%.

References

Sharma, Bhakti, Susanna S. Lee, and Benjamin K. Johnson. 2022. “The Dark at the End of the Tunnel: Doomscrolling on Social Media Newsfeeds.” Technology, Mind, and Behavior 3 (1). https://doi.org/10.1037/tmb0000059.