Linear Model 2: The F Awakens

Week 09

Jenny Terry

Survey Study on Statistics Attitudes & Career Goals

Interested in a Research Career?

Looking Ahead (and Behind)

  • The story so far…

    • Fundamentals of NHST & Statistical Tests
    • The Linear Model - Using the Equation of a Line to Make Predictions
  • This week:

    • The Linear Model - Evaluating the Model
  • Coming up:

    • The Linear Model - Models with Multiple Predictors
    • Questionable Research Practices

Today’s Objectives

First, a recap:

  • Linear model

    • Recap of how we can use the model to make predictions

    • The model error term & other Lecture 08 questions answered

After this lecture, you will (begin to) understand:

  • Evaluating model fit (conceptually) with \(R^2\) and the \(F\)-statistic

  • Evaluating the statistical significance of the prediction with p-values and CIs

Talk to Me!

Open the Lecture Google Doc: bit.ly/and24_lecture09

Using the Linear Model to Make Predictions (Recap)

The Linear Model as a Statistical Model

Models take sample data and use known mathematical properties of the world around us to make predictions about the population from which our sample came.

Vocabulary: The General Model Equation

A conceptual representation of all statistical models, with the following form:

\[outcome = model + error\]

Vocabulary: The Linear Model Equation

A particularly common type of statistical model, with the following form:

\[y_{i} = b_{0} + b_{1}\times x_{1i} + e_{i}\]

The (Pesky) Error Term

  • The error term includes everything that separates your model from actual reality (e.g., individual differences, unmeasured variables, and measurement errors).

  • It is different to - but often conflated with - model residuals (which we touch on today, but you’ll meet formally next semester)

    • The error term represents the way observed data differs from the actual population (unobservable)

    • A residual represents the way observed data differs from the values the model predicts from the sample data (observable)

  • So, we cannot use the linear model error term when calculating predictions, but we know it is there, so it is represented in the equation.

The Linear Model Equation

\[y_{i} = b_{0} + b_{1}\times x_{1i} + e_{i}\]

  • \(y_i\): The outcome (\(y\)) for an individual’s actual score (\(i\)) is equal to (or, is predicted by)…
  • \(b_0\): … the value of beta-zero (the model’s intercept)…
  • \(+\): … plus…
  • \(b_1\): … the value of beta-one (the model’s slope)…
  • \(\times\): … multiplied by…
  • \(x_{1i}\): … the value of the predictor (\(x_1\)) for an individual’s actual score (\(i\))…
  • \(+\): … plus…
  • \(e_i\): … the error (\(e\)) for the individual’s actual score (\(i\)).

The Linear Model Equation

\[y_{i} = b_{0} + b_{1}\times x_{1i} + e_{i}\]

  • \(b_{0} + b_{1}\times x_{1i}\) is the model

  • \(b_0\) (intercept) is the value of \(y\) when \(x\) is 0

  • \(b_1\) (slope) is the change in \(y\) for every unit change in \(x\)

  • The model uses sample data to estimate \(b_0\) and \(b_1\) in the population

  • Once we know \(b_0\) and \(b_1\), we can estimate \(y_i\) (outcome) for any value of \(x_{1i}\)
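
A minimal sketch of how this looks in R (assuming a data frame called my_tib with columns named outcome and predictor; these names are illustrative, not the lecture data):

# Estimate b0 and b1 from the sample data
my_model <- lm(outcome ~ predictor, data = my_tib)

# b0 (intercept) and b1 (slope)
coef(my_model)

# Use the estimated b0 and b1 to predict the outcome for any predictor values
predict(my_model, newdata = data.frame(predictor = c(2, 5)))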

Your Questions from Lecture 08

1) Does \(y_{i} = b_{0} + b_{1}\times x_{1i} + e_{i}\) apply to the whole dataset or each participant’s scores?

  • Models take sample data and use known mathematical properties of the world around us to make predictions about the population from which our sample came
  • So, the model uses information from the whole dataset to set \(b_0\) and \(b_1\)
  • We then use \(b_0\) and \(b_1\) to make predictions about the outcome (\(y\)) value for any individual score on the predictor variable (\(x_1\))

Using the Model to Predict Masculinity

  • Insert our outcome and predictor

\[ Masculinity_i = b_0 + b_1\times Femininity_{1i} + e_i \]

  • Add the \(b\) values (that we got from R in Lecture 8):

    • Intercept (\(b_{0}\)): the predicted value of masculinity when femininity is 0

      • = 8.82
    • Slope (\(b_{1}\)): change in masculinity associated with a unit change in femininity (note the sign change)

      • = -0.80

\[Masculinity_i = \hat{8.82} - \hat{0.8}\times Femininity_{1i} + e_i\]

Your Questions from Lecture 08

2) How come it doesn’t use - - 0.80? Doesn’t this turn it into a +?

  • Two negatives do make a positive…

  • … but there is initially a \(+\) in the equation… \[ Masculinity_i = b_0 + b_1\times Femininity_{1i} + e_i \]


  • … so we’re adding a positive and a negative…

  • … which gives us a negative.

\[Masculinity_i = \hat{8.82} - \hat{0.8}\times Femininity_{1i} + e_i\]

Using the Model to Predict Masculinity

\[Masculinity_i = \hat{8.82} - \hat{0.8}\times Femininity_{1i} + e_i\]

For someone with a fairly low (on a scale of 1-9) femininity rating of 3:

\[Masculinity_i = \hat{8.82} - \hat{0.8}\times 3 + e_i\]

\[Masculinity_i = 6.42 + e_i\]

For someone with a fairly high (on a scale of 1-9) femininity rating of 8:

\[Masculinity_i = \hat{8.82} - \hat{0.8}\times 8 + e_i\]

\[Masculinity_i = 2.42 + e_i\]
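
We can check that arithmetic quickly in R:

# Predicted masculinity for femininity ratings of 3 and 8
8.82 - 0.80 * 3   # 6.42
8.82 - 0.80 * 8   # 2.42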

Evaluating our Model: Is it any Good?

Statistical Models

  • When we’ve been using the linear model so far, we’ve been making predictions about individuals from a population by extrapolating from a sample of that population

  • In psychology, we don’t usually use it to make predictions as such, but we tend to focus on what \(b_1\) (the slope) tells us about the strength and direction of the relationship between our predictor and outcome

  • But, how do we know if what our model is telling us about \(b_1\) is meaningful?

    • First, is the model itself any good?

      • \(R^2\) and the \(F\)-statistic
    • Next, is the value of the slope (\(b_1\)) important?

      • p-values and confidence intervals

The Simplest Model

  • A good linear model should fit the data better than the simplest possible model

  • The simplest model is the null (aka, intercept-only) model - where there is no relationship between the predictor(s) and outcome

  • Here’s our linear model again: \(y_{i} = b_{0} + b_{1}\times x_{1i} + e_{i}\)

  • If there is no relationship between the predictor and outcome then \(b_1 = 0\)

  • If we replace \(b_1\) with \(0\) then we get: \(y_i=b_0+0\times x_{1i} + e_i\)

  • But \(0\times x_{1i} = 0\) so we can then drop \(b_1\times x_{1i}\) from the equation to get: \(y_i = b_0 + e_i\)

  • When there are no predictors, \(b_0\) is the model (which is just the mean of \(y\))

  • So, a linear model with no predictors predicts the outcome by the mean of the outcome
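
A minimal sketch of the null model in R, assuming the sleep_tib data used in this lecture’s example (with sleep as the outcome):

# Intercept-only (null) model: no predictors
null_model <- lm(sleep ~ 1, data = sleep_tib)

coef(null_model)       # the only estimate is the intercept...
mean(sleep_tib$sleep)  # ...which is just the mean of the outcome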

Mean Model Residuals

  • The lines from each point to the mean represent the distance between the sleep score of each participant and the mean sleep score for the sample (aka the residual error)

  • We can use these residuals to calculate the overall amount of error and, therefore, variance in our mean model

Linear Model Residuals

  • The lines from each point to the model’s line represent the distance between the sleep score of each participant and the model’s predicted sleep score for the sample (aka the residual error)
  • So, we can also calculate the overall amount of error and, therefore, variance in our linear model
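
As a sketch, both sets of residuals can be pulled out of R and summarised as sums of squared residuals (assuming the null_model above, and fitting the linear model from this lecture’s sleep example):

# Linear model from the sleep example (predictor: pos_psy)
sleep_model <- lm(sleep ~ pos_psy, data = sleep_tib)

# Overall error in each model, as a sum of squared residuals
sum(residuals(null_model)^2)   # error around the mean model
sum(residuals(sleep_model)^2)  # error around the linear model (smaller if the model helps)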

Goodness of Fit with \(R^2\)

Vocabulary: \(R^2\)

\(R^2\) = how much of the variance in the outcome is explained by the model

  • In other words, do our predictors do a good job of explaining changes in the outcome?

  • \(R^2\) compares the variance in the linear model with the variance in the mean model: it tells us the proportion of the total variance (around the mean) that the linear model accounts for

  • \(R^2\) is a Goodness of Fit measure for linear models

  • \(R^2\) will appear in your output as a decimal, but is usually reported as a percentage on a 0-100% scale

    • e.g., “… the model explains \(x\) % of the variance in the outcome…”
  • The higher the \(R^2\) value, the better the model (but, low values are not always a cause for concern)

  • But, is that amount of variance sufficiently large that we would conclude the effect is meaningful in the population?
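
Before moving on, here is a sketch of where \(R^2\) comes from, using the null_model and sleep_model fitted above: it is the proportional reduction in error when we move from the mean model to the linear model.

ss_mean  <- sum(residuals(null_model)^2)   # error around the mean model
ss_model <- sum(residuals(sleep_model)^2)  # error around the linear model

(ss_mean - ss_model) / ss_mean  # R-squared, as a proportion
summary(sleep_model)$r.squared  # the same value, straight from R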

The \(F\)-statistic

Vocabulary: \(F\)-statistic

\(F\) = whether the variance explained significantly differs from zero

  • The \(F\)-statistic also uses (a slightly different, but related) comparison of the amount of error in the mean model and the amount of error in your linear model

  • So, using null hypothesis significance testing (NHST), we can get an associated p-value for our \(F\)-statistic
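
As a sketch, R can make that comparison directly: comparing the mean model with the linear model (both fitted above) returns the \(F\)-statistic and its p-value.

# Does the linear model explain significantly more variance than the mean model?
anova(null_model, sleep_model)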

NHST Recap

To get a p-value

  • We first obtain data, from which we calculate…
  • A test statistic that represents the relationship of interest, which we compare to…
  • The distribution of that test statistic under the null hypothesis (where there is no relationship) to get…
  • The probability (p) of getting a test statistic at least as large as the one we have if the null hypothesis is true so that we can…
  • Evaluate our competing hypotheses using a previously decided alpha (\(\alpha\)) level (usually 0.05)
  • If our p-value is lower than \(\alpha\), we say the result is statistically significant and reject the null hypothesis

The \(F\)-statistic

  • In this case, the \(F\)-statistic is our test statistic that represents the relationship of interest, which we compare to…
  • The distribution of the \(F\)-statistic under the null hypothesis (where there is no relationship between the predictor and the outcome) to get…
  • The probability (p) of getting an \(F\)-statistic at least as large as the one we’ve observed if the null hypothesis is true, so that we can…
  • Evaluate our competing hypotheses using a previously decided \(\alpha\) level (usually 0.05)
  • If our p-value is lower than \(\alpha\), we say the result is statistically significant and reject the null hypothesis
  • The \(F\)-statistic and its p-value will appear in your R output, so let’s have a quick look…

Example: Predicting Better Sleep from Positive Psychology

Research Question

  • Are positive psychology attributes (e.g., gratitude, optimism, mindfulness) associated with better sleep?

Operationalisation

  • Predictor: Positive psychology attributes

  • Outcome: Sleep quality & quantity

  • Model: \(Sleep_{i} = b_{0} + b_{1}\times PositivePsychology_{1i} + e_{i}\)
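
The output shown below comes from fitting this model in R and asking for its summary, along these lines:

# Fit the linear model and print the full summary
sleep_model <- lm(sleep ~ pos_psy, data = sleep_tib)
summary(sleep_model)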

Is it a Good Model?


Call:
lm(formula = sleep ~ pos_psy, data = sleep_tib)

Residuals:
    Min      1Q  Median      3Q     Max 
-12.526  -1.706   0.408   1.978   5.389 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept)   3.5202     1.1502   3.060  0.00239 ** 
pos_psy       2.2872     0.3149   7.262 2.74e-12 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 2.855 on 331 degrees of freedom
Multiple R-squared:  0.1374,    Adjusted R-squared:  0.1348 
F-statistic: 52.74 on 1 and 331 DF,  p-value: 2.744e-12

  • How much of the variance in sleep is accounted for by positive psychology attributes? 🤔

  • Is the \(F\)-statistic statistically significant? 🤔

  • What does this suggest about the fit of the model? 🤔

Interim Summary

  • How do we know if what our model is telling us about \(b_1\) is important?

    • First, is the model itself any good?

      • \(R^2\) and the \(F\)-statistic ✅
  • Specifically, a good model will:

    • Fit the data better than the simplest possible model (\(R^2\) & \(F\)-statistic)

    • Explain a lot of variance in the outcome (\(R^2\))

    • Explain an amount of variance that significantly differs from zero (\(F\)-statistic)

  • How do we know if what our model is telling us about \(b_1\) is important?

    • Next, is the value of the slope (\(b_1\)) meaningful?

      • p-values and confidence intervals

Evaluating our Model: Is \(b_1\) Meaningful?

What is so Special about \(b_1\)?

  • It is a (parameter) estimate of the true relationship in the population

  • When we’ve been using the linear model so far, we’ve been making predictions

  • In psychology, we don’t usually make predictions as such, but instead interpret what \(b_1\) tells us about the strength and direction of the relationship between our predictor and outcome

  • In other words, we treat \(b_1\) as an effect size

\(b_1\) as an Effect Size

Example: Predicting Better Sleep from Positive Psychology

Research Question

  • Are positive psychology attributes (e.g., gratitude, optimism, mindfulness) associated with better sleep?

Operationalisation

  • Predictor: Positive psychology attributes

  • Outcome: Sleep quality & quantity

  • Model: \(Sleep_{i} = b_{0} + b_{1}\times PositivePsychology_{1i} + e_{i}\)

Hypotheses

  • What is the null hypothesis? 🤔
  • What is the alternative hypothesis? 🤔

Hypotheses

Null Hypothesis

  • No relationship between \(x\) and \(y\)

  • No relationship between positive psychology and sleep

  • \(b_1 = 0\)

Alternative Hypothesis

  • Relationship between \(x\) and \(y\)

  • Relationship between positive psychology and sleep

  • \(b_1 \ne 0\)

Significance of \(b_1\)

  • Is our estimate of \(b_1\) different enough from 0 to believe that it is not 0 in the population?

  • We - well, R - tests this with NHST.

  • First, we need a test statistic, so we…:

1) Scale the estimate of \(b_1\) by its standard error (variation of estimates)

ChallengR!

What do you get when you divide a normally distributed value by its standard error? 🤔

Calculating the Test Statistic for \(b_1\)

1) Scale the estimate of \(b_1\) by its standard error (variation of estimates):


\[\frac{b_{1}}{SE_{b_1}} = t\]

2) Compare the value of t to the t-distribution to get p, just as we’ve seen before

3) If p is smaller than our chosen \(\alpha\) level, our predictor is considered to be statistically significant
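
As a quick sketch of step 1, using the \(b_1\) estimate and standard error from the sleep model output we saw earlier:

# t for b1 = estimate / standard error (values from the R output)
2.2872 / 0.3149  # roughly 7.26, the t value R reports (any difference is rounding)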

Confidence Intervals (Recap)

  • We can also compute a confidence interval (CI) for \(b_1\) (see Lecture 3)

  • Assuming that our sample is one of the 95% of samples producing confidence intervals that contain the population value, then the population value for the estimate of interest falls somewhere between the lower limit and the upper limit of the interval we’ve computed for our sample

  • But, we cannot know if our sample is one of the 95% producing a confidence interval that contains the population value

  • So, they do NOT mean we can be 95% confident that the result lies between the lower and upper limits of our computed interval

Confidence Intervals for \(b_1\)

  • CIs are useful as a measure of uncertainty in our estimate

    • Is there a relatively large difference between the upper and lower limits?

      • If yes, there is a large amount of uncertainty

      • If no, there is a small amount of uncertainty

    • For example:

      • 95% CI = [1.28, 10.34] is a wide CI with a large amount of uncertainty

      • 95% CI = [1.28, 1.34] is a narrow CI with a small amount of uncertainty

Confidence Intervals for \(b_1\)

  • CIs can also tell us about the statistical significance of \(b_1\)

    • Does the CI cross (contain) 0?

      • If it does, and our interval is one that contains the population value, it is possible the population value is 0.

      • If it doesn’t, and our interval is one that contains the population value, it is unlikely the population value is 0.

    • For example:

      • 95% CI = [-1.28, 1.34] crosses 0 (tip: look for limits with opposite signs)

      • 95% CI = [1.28, 1.34] does not cross 0
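
A minimal sketch of getting those limits in R, assuming the sleep_model fitted earlier:

# 95% confidence intervals for the intercept and the slope (b1)
confint(sleep_model, level = 0.95)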

Is \(b_1\) (Statistically) Significant?


Call:
lm(formula = sleep ~ pos_psy, data = sleep_tib)

Residuals:
    Min      1Q  Median      3Q     Max 
-12.526  -1.706   0.408   1.978   5.389 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept)   3.5202     1.1502   3.060  0.00239 ** 
pos_psy       2.2872     0.3149   7.262 2.74e-12 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 2.855 on 331 degrees of freedom
Multiple R-squared:  0.1374,    Adjusted R-squared:  0.1348 
F-statistic: 52.74 on 1 and 331 DF,  p-value: 2.744e-12

  • Is the p-value associated with \(b_1\) statistically significant? 🤔


🤫 Scientific Notation Converter 🤫

Where’s the CI?!

term        estimate  std.error  statistic  p.value    conf.low  conf.high
(Intercept) 3.520190  1.1502372  3.060403   0.0023911  1.257493  5.782887
pos_psy     2.287165  0.3149414  7.262193   0.0000000  1.667626  2.906704
  • Is there much uncertainty in the estimate of \(b_1\), according to the CI? 🤔

  • What can we infer about the statistical significance of \(b_1\) from the CI? 🤔
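
A table like the one above can be produced with, for example, the broom package (a sketch, assuming broom is installed and sleep_model is the fitted model):

# Tidy coefficient table including 95% confidence intervals
broom::tidy(sleep_model, conf.int = TRUE, conf.level = 0.95)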

\(b_1\) as an Effect Size


Call:
lm(formula = sleep ~ pos_psy, data = sleep_tib)

Residuals:
    Min      1Q  Median      3Q     Max 
-12.526  -1.706   0.408   1.978   5.389 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept)   3.5202     1.1502   3.060  0.00239 ** 
pos_psy       2.2872     0.3149   7.262 2.74e-12 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 2.855 on 331 degrees of freedom
Multiple R-squared:  0.1374,    Adjusted R-squared:  0.1348 
F-statistic: 52.74 on 1 and 331 DF,  p-value: 2.744e-12

  • The value of \(b_1\) is also useful and interpretable by itself

  • It can tell us the direction of the relationship between the predictor and outcome

    • Is the direction of the relationship positive or negative? 🤔
  • It can also tell us the strength of the relationship between the predictor and outcome

    • More on this next week!

Putting It All Together

  • \(R^2\) tells us the percentage of outcome variance our model explains

  • The \(F\)-statistic and its associated p-value tells us whether the amount of variance explained is significantly different from 0

  • We can also get an associated p-value and confidence interval for \(b_1\) that tells us whether \(b_1\) is significantly different from 0

  • We can also use the confidence interval as a measure of uncertainty

  • \(b_1\) is also an effect size that tells us the strength and direction of the relationship between the predictor and the outcome

Reporting Results in APA Style

The model explained a statistically significant proportion of the variance in sleep quality and quantity, \(R^2\) = 13.7%, F(1, 331) = 52.74, p < 0.001. Positive psychology attributes positively and statistically significantly predicted sleep quality and quantity, \(b_1\) = 2.29 [1.67, 2.91], t(331) = 7.26, p < 0.001.

Further Support