Linear Model 3: Return of the yi

Week 10

Jenny Terry

Survey Study on Statistics Attitudes & Career Goals

Interested in a Research Career?

Looking Ahead (and Behind)

  • The story so far…

    • The Linear Model - Equation of a Line

    • The Linear Model - Evaluating the Model with p-values, CIs, \(F\), & \(R^2\)

  • This week:

    • The Linear Model - Adding predictors; Comparing models; Comparing predictors
  • Coming up:

    • Questionable Research Practices

Objectives

After this lecture, you will (begin to) understand:

  • How to extend the linear model equation to two, three, or more predictors

  • How to compare linear models using the \(R^2\)-change and \(F\)-change statistics

  • How to interpret the relationship of each predictor with the outcome

  • How to compare predictors using standardised beta coefficients

Talk to Me!

Open the Lecture Google Doc: bit.ly/and24_lecture10

Three Types of Research Questions

When we are using linear models with more than one predictor (aka “multiple regression”), there are usually three stages we go through, each of which aligns with a slightly different research question:

  1. Which model is better (e.g., a model with one predictor vs. a model with three predictors)?
  2. Does each predictor in the better model have a statistically significant relationship with the outcome (and which direction will those relationships be in)?
  3. Which predictor in the better model has the biggest impact upon the outcome?

We’ll look at these in turn, but first, let’s just see what the linear model looks like with more than one predictor…

Adding Predictors to the Linear Model

Extending the Equation

The equation:

  • One-predictor model: \(y_{i} = b_{0} + b_{1}\times x_{1i} + e_{i}\)

    • Predicts the outcome \(y\) based on a predictor \(x_1\)
  • Two-predictor model: \(y_{i} = b_{0} + b_{1}\times x_{1i} + b_{2}\times x_{2i} + e_{i}\)

    • Predicts the outcome \(y\) based on a predictor \(x_1\) and another predictor \(x_2\)
  • Three–predictor model: \(y_{i} = b_{0} + b_{1}\times x_{1i} + b_{2}\times x_{2i} + b_{3}\times x_{3i} + e_{i}\)

    • Predicts the outcome \(y\) based on a predictor \(x_1\) and \(x_2\) and \(x_3\)
  • \(n\)-predictor model: \(y_{i} = b_{0} + b_{1}\times x_{1i} + ... + b_{n}\times x_{ni} + e_{i}\)

    • Predicts the outcome \(y\) based on as many predictors as you like!
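In R, this extension is just a matter of adding each new predictor to the model formula with +. A minimal sketch, using hypothetical variable names (y, x1, x2, x3) and a hypothetical data frame (dat):

```r
# One-, two-, and three-predictor models differ only in the formula
m1 <- lm(y ~ x1, data = dat)
m2 <- lm(y ~ x1 + x2, data = dat)
m3 <- lm(y ~ x1 + x2 + x3, data = dat)
```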

1. Comparing Linear Models

Comparing Linear Models

  • We can compare linear models with different numbers of predictors as long as they are hierarchical (aka nested)

  • Hierarchical models must be the same except for the addition of something new

    • The model with one predictor is nested in the two and three predictor models because they are the same except for the addition of the extra predictor(s)

    • Similarly, a two-predictor model would be nested in a three-predictor model

    • However, we couldn’t remove a variable from the two-predictor model and replace it with a different one to create a different two-predictor model and compare those

    • In that case, changing the variable would mean it is no longer the same and, therefore, no longer nested

What is a ‘good’ model?

  • In Lecture 09, we learned that a good model will:

    • Fit the data better than the simplest possible model (\(R^2\) & \(F\)-statistic)

    • Explain a lot of variance in the outcome (\(R^2\))

    • Explain an amount of variance that significantly differs from zero (\(F\)-statistic)

  • When we are comparing models, we can use the \(R^2\) and \(F\) values to instead ask, which is the better fitting model?

It’s all Greek to me!

Vocabulary: \(\Delta\)

\(\Delta\) just means “change”

\(R^2\) Change

How do we interpret \(R^2\)? 🤔

Vocabulary: \(R^2\)

\(R^2\) = percentage of the variance in the outcome explained by the model

  • A larger value means better fit (more variance is explained)

Vocabulary: \(R^2\) Change

\(R^2 \Delta = R^2_\text{Model 2} - R^2_\text{Model 1}\)

  • The difference in \(R^2\) values between two models

  • The model with the larger \(R^2\) value is the better fitting model (more variance is explained)

  • The larger the \(R^2\Delta\) value, the greater the improvement in the better fitting model
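As a sketch of how this might look in R (assuming two nested models, here called model1 and model2, have already been fitted with lm()):

```r
# Extract each model's R^2 with broom::glance(), then take the difference
r2_model1 <- broom::glance(model1)$r.squared
r2_model2 <- broom::glance(model2)$r.squared

r2_change <- r2_model2 - r2_model1  # positive if Model 2 explains more variance
```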

\(F\) Change

How do we interpret \(F\)? 🤔

Vocabulary: \(F\)-statistic

\(F\) = whether the variance explained significantly differs from zero

  • Compares a model with predictor(s) to the null model

  • An \(F\)-statistic with a p-value lower than 0.05 means that it is statistically significant, so we can reject the null hypothesis that the null model fits as well as the predictor model

Vocabulary: \(F\) Change

\(F\Delta\) = an \(F\)-statistic testing whether the extra variance explained by the larger model (the \(R^2\Delta\)) significantly differs from zero

  • Compares nested models with predictors to one another

  • It is not simply the difference between the two models’ \(F\)-statistics; it is calculated from the change in residual variance between the two models (R’s anova() function does this for us)

  • An \(F\Delta\) with a p-value lower than 0.05 means that the larger model is statistically significantly better, so we can reject the null hypothesis that one model fits as well as the other model
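In R, we don’t calculate \(F\Delta\) by hand; passing two nested models (here the hypothetical model1 and model2) to anova() does it for us:

```r
# anova() compares the nested models and reports the F-change
# statistic and its associated p-value
anova(model1, model2)
```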

Let’s look at an example to see how this looks in practice…

Example: Predicting Sleep from 1 vs. 3 Predictors

  • Tout et al. (2023) were interested in the effect of positive psychology and emotional regulation upon sleep

  • Participants took part in a cross-sectional, self-report survey that asked them to rate their:

    • Positive psychology attributes (a composite of gratitude, optimism, self-compassion, and mindfulness)

    • Adaptive emotional regulation strategies (a composite of acceptance, positive refocusing, refocus on planning, positive reappraisal, perspective taking)

    • Maladaptive emotional regulation strategies (a composite of self-blame, rumination, catastrophising, other-blame)

    • Sleep quality and quantity (a composite of subjective sleep quality, sleep literacy, sleep duration, sleep efficiency, sleep disturbances, sleep medication, daytime dysfunction)

Research Question & Hypothesis

Research Question

  1. Which model is better, a model with just positive psychology attributes as a predictor or a model with positive psychology attributes and adaptive emotional regulation strategies and maladaptive emotional regulation?

Hypotheses

  • The model with positive psychology attributes and adaptive emotional regulation strategies and maladaptive emotional regulation will fit the data better than a model with just positive psychology attributes.

Operationalisation - Model 1

  • Predictors:

    1. Positive psychology attributes (\(PosPsych\))
  • Outcome: Sleep quality & quantity (\(Sleep\))

  • Model 1: \(Sleep_{i} = b_{0} + b_{1}\times PosPsych_{1i} +e_{i}\)

Operationalisation - Model 2

  • Predictors:

    1. Positive psychology attributes (\(PosPsych\))

    2. Adaptive emotion regulation attributes (\(AdaptEmoReg\))

    3. Maladaptive emotion regulation attributes (\(MalEmoReg\))

  • Outcome: Sleep quality & quantity (\(Sleep\))

  • So, what will our 3-predictor model equation look like? 🤔

  • Model 2: \(Sleep_{i} = b_{0} + b_{1}\times PosPsych_{1i} + b_2\times AdaptEmoReg_{2i} + b_3\times MalEmoReg_{3i} +e_{i}\)

Running the Analysis

We run the analysis in a series of stages:

  1. Fit Models

    1. Fit Model 1

    2. Fit model 2

  2. Compare \(R^2\) Values

    1. Calculate \(R^2\) for Models 1 & 2

    2. Calculate \(R^2\Delta\)

  3. Calculate \(F\Delta\)

Running the Analysis

Step 1a: Fit Model 1

Model 1

sleep_lm1 <- sleep_tib |>   
  lm(sleep ~ pos_psy, data = _)  

summary(sleep_lm1)

Call:
lm(formula = sleep ~ pos_psy, data = sleep_tib)

Residuals:
    Min      1Q  Median      3Q     Max 
-12.526  -1.706   0.408   1.978   5.389 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept)   3.5202     1.1502   3.060  0.00239 ** 
pos_psy       2.2872     0.3149   7.262 2.74e-12 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 2.855 on 331 degrees of freedom
Multiple R-squared:  0.1374,    Adjusted R-squared:  0.1348 
F-statistic: 52.74 on 1 and 331 DF,  p-value: 2.744e-12

Running the Analysis

Step 1b: Fit Model 2

Model 2

sleep_lm2 <- sleep_tib |>   
  lm(sleep ~ pos_psy + adapt_er + mal_er, data = _)  

summary(sleep_lm2)

Call:
lm(formula = sleep ~ pos_psy + adapt_er + mal_er, data = sleep_tib)

Residuals:
     Min       1Q   Median       3Q      Max 
-12.2262  -1.5042   0.3768   2.0303   4.7376 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept)   8.5671     2.0735   4.132 4.57e-05 ***
pos_psy       2.1531     0.4154   5.183 3.82e-07 ***
adapt_er     -0.5004     0.3316  -1.509  0.13229    
mal_er       -0.9886     0.3690  -2.679  0.00776 ** 
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 2.823 on 329 degrees of freedom
Multiple R-squared:  0.1621,    Adjusted R-squared:  0.1545 
F-statistic: 21.22 on 3 and 329 DF,  p-value: 1.371e-12

Running the Analysis

Step 2a: Get the \(R^2\) values for each model

Model 1

broom::glance(sleep_lm1)
r.squared adj.r.squared sigma statistic p.value df logLik AIC BIC deviance df.residual nobs
0.1374355 0.1348296 2.855112 52.73944 0 1 -820.8574 1647.715 1659.139 2698.2 331 333

Model 2

broom::glance(sleep_lm2)
r.squared adj.r.squared sigma statistic p.value df logLik AIC BIC deviance df.residual nobs
0.1621142 0.1544739 2.822512 21.21831 0 3 -816.0243 1642.049 1661.089 2621.002 329 333

Running the Analysis

Step 2b: Calculate the \(R^2\Delta\) value from the \(R^2\) values

  • \(R^2 \Delta = R^2_\text{Model 2} - R^2_\text{Model 1}\)

  • \(R^2 \Delta = 0.162 - 0.137\)

  • \(R^2 \Delta = 0.025\)

  • \(R^2 \Delta = 2.5\%\)

  • Model 2 accounts for 2.5% more of the variance in sleep, indicating Model 2 is the better fitting model

Running the Analysis

Step 3: Get & interpret the \(F\Delta\) and associated p-value

anova(sleep_lm1, sleep_lm2) |> broom::tidy()
term                                df.residual      rss df    sumsq statistic   p.value
sleep ~ pos_psy                             331 2698.200 NA       NA        NA        NA
sleep ~ pos_psy + adapt_er + mal_er         329 2621.002  2 77.19761  4.845095 0.0084371


\(F\Delta\) = 4.85, p < 0.01, so we can conclude that Model 2 accounts for statistically significantly more variance and is, therefore, the better fitting model

2. Relationships Between the Predictors & the Outcome

Three Types of Research Questions

  • Now we know that Model 2 is the better fitting model, we can turn to examining the relationships between the predictors and the outcome

  • We can do this by examining the slope values (\(b_n\)) for each predictor and asking whether they are statistically significantly different from 0 (see Lecture 09 for a refresher)

  • This is very similar to how we interpret a linear model with one predictor - let’s take a look…

Interpreting the Model Intercept & Slope

Intercept (\(b_0\)):

  • One predictor model: value of \(y\) when \(x_1\) is 0
  • Two predictor model: value of \(y\) when \(x_1\) and \(x_2\) are 0

  • Three predictor model: value of \(y\) when \(x_1\) and \(x_2\) and \(x_3\) are 0

Slopes (\(b_1\), \(b_2\), ..., \(b_n\)):

  • One predictor model: \(b_1\) = change in \(y\) for every unit change in \(x_1\)
  • Two predictor model: \(b_2\) = change in \(y\) for every unit change in \(x_2\) … when the other predictor in the model is held constant (i.e., when the other variables do not change)

  • Three predictor model: \(b_3\) = change in \(y\) for every unit change in \(x_3\) … when the other predictors in the model are held constant

Interpreting the Model Intercept & Slope

  • The interpretation of \(b_0\) (the intercept) doesn’t change

    • It is always the value of \(y\) (the outcome) when the predictor(s) are at 0
  • The interpretation of \(b_n\) coefficients (a given slope) changes a little

    • They always represent the change in \(y\) (the outcome) for every unit change in \(x_n\) (a given predictor)…
    • … but when there are other predictors in the model, the relationship between the outcome and predictor assumes the other predictors are held constant
  • It doesn’t matter if there are two, five, ten, or fifty predictors - the \(b\)-values will always be interpreted in this same way

  • Let’s have a look at our example…

Research Question & Hypothesis

Research Question

  1. Does each predictor in the better model have a statistically significant relationship with sleep (and which direction will those relationships be in)?

Hypotheses

  • Positive psychology attributes and adaptive emotional regulation strategies have a positive relationship with sleep quality and quantity
  • Maladaptive emotional regulation strategies would have a negative relationship with sleep quality and quantity

Operationalisation - Model 2

Predictors:

  • \(x_1\) Positive psychology attributes (\(PosPsych\))

  • \(x_2\) Adaptive emotion regulation attributes (\(AdaptEmoReg\))

  • \(x_3\) Maladaptive emotion regulation attributes (\(MalEmoReg\))

  • Outcome (\(y\)): Sleep quality & quantity (\(Sleep\))

  • Model: \(Sleep_{i} = b_{0} + b_{1}\times PosPsych_{1i} + b_2\times AdaptEmoReg_{2i} + b_3\times MalEmoReg_{3i} +e_{i}\)

Running the Analysis

term         estimate std.error statistic   p.value  conf.low  conf.high
(Intercept) 8.5670973 2.0734842  4.131740 0.0000457  4.488138 12.6460569
pos_psy     2.1530694 0.4154330  5.182711 0.0000004  1.335829  2.9703095
adapt_er   -0.5003627 0.3316101 -1.508889 0.1322869 -1.152706  0.1519809
mal_er     -0.9885696 0.3690450 -2.678724 0.0077616 -1.714555 -0.2625841


  • \(b_0\) (intercept) = 8.57 (the value of sleep when all predictors are at 0)

  • \(b_1\) (slope for \(PosPsych\)) = 2.15, p < .001, 95% CI [1.34, 2.97]

  • \(b_2\) (slope for \(AdaptEmoReg\)) = -0.50, p = .132, 95% CI [-1.15, 0.15]

  • \(b_3\) (slope for \(MalEmoReg\)) = -0.99, p < .001, 95% CI [-1.71, -0.26]

  • Each slope represents the relationship between the predictor and the outcome when the other predictors in the model are held constant

3. Comparing Predictors

Three Types of Research Questions

  • First, we determined that the model with 3 predictors was the better fitting model

  • Second, we determined that positive psychology and maladaptive emotional regulation are statistically significant predictors of sleep

  • We also learned that positive psychology had a positive relationship with sleep and maladaptive emotional regulation had a negative relationship with sleep

  • Now we can ask, which is the best predictor?

Comparing Predictors

  • We can compare predictors to ascertain which is the “best” predictor

  • However, we cannot compare the beta values in their current (raw, unstandardised) form, because they are in different units

  • These different units reflect how the predictor variables are measured

    • e.g., seconds, kilograms, centimetres, pounds sterling, etc.

    • In psychology, we often use self-report scales where the units are Likert scale points…

Likert Scales - Maths Anxiety

Likert Scales - Trait Anxiety

Comparing Predictors

  • The two Likert Scales use different units of measurement, so cannot be directly compared

  • Similarly, we couldn’t directly compare reaction time in seconds with the amount of money someone earns, or the distance someone walks with their Likert Scale responses, etc.

  • However, we can make them comparable by standardising the slope values

Unstandardised vs Standardised Betas

Vocabulary: Unstandardised Betas

  • Change in the outcome for each unit change in the predictor

  • Depends on original scale of measurement

  • Usually denoted by \(b\)

Vocabulary: Standardised Betas

  • Change in the outcome (in standard deviations) for each standard deviation change in the predictor

  • Does not depend on original scale of measurement

  • Usually denoted by \(β\)
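One way to obtain standardised betas is to z-score every variable (mean 0, SD 1) before fitting the model. A sketch, assuming a hypothetical all-numeric data frame dat:

```r
# scale() converts each column to z-scores; refitting the model on the
# z-scored data gives slopes in standard-deviation units (i.e., betas)
dat_z <- data.frame(scale(dat))
lm(y ~ x1 + x2 + x3, data = dat_z)
```

In practice, packages such as parameters can produce standardised coefficients directly from an existing model (e.g., parameters::standardize_parameters()).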

Research Question & Hypothesis

Research Question

  1. Which predictor in the better model has the biggest impact upon sleep?

Hypotheses

  • Positive psychology attributes will predict sleep quality and quantity better than maladaptive emotional regulation strategies
  • Note that we’ve dropped the non-significant predictor, adaptive emotional regulation strategies
  • Note we’re not saying anything about the direction of the effect here, just the magnitude

Operationalisation - Model 2

  • Predictors:

    • \(x_1\) Positive psychology attributes (\(PosPsych\))

    • \(x_2\) Adaptive emotion regulation attributes (\(AdaptEmoReg\))

    • \(x_3\) Maladaptive emotion regulation attributes (\(MalEmoReg\))

  • Outcome (\(y\)): Sleep quality & quantity (\(Sleep\))

  • Model: \(Sleep_{i} = b_{0} + b_{1}\times PosPsych_{1i} + b_2\times AdaptEmoReg_{2i} + b_3\times MalEmoReg_{3i} +e_{i}\)

Running the Analysis

Statistical Interpretation


Parameter   Coefficient    SE   CI CI_low CI_high      t df_error     p
(Intercept)       0.000 0.050 0.95 -0.099   0.099  0.000      329 1.000
pos_psy           0.349 0.067 0.95  0.217   0.481  5.183      329 0.000
adapt_er         -0.092 0.061 0.95 -0.212   0.028 -1.509      329 0.132
mal_er           -0.154 0.057 0.95 -0.267  -0.041 -2.679      329 0.008


  • Look at the absolute values of \(β\) (ignore the positive/negative sign): which is the ‘bigger’ predictor of sleep? 🤔

  • Remember, we can assume non-statistically significant \(β\) values are not important predictors of the outcome (we’d report them, but not interpret them as meaningful)

Running the Analysis

Applied Interpretation

What should people focus on for better sleep quality/quantity? 🤔

  • This analysis suggests that the best thing we can do to improve sleep is to focus on increasing positive psychology attributes

  • Decreasing maladaptive emotional regulation strategies will also help, but not as much

  • Working on adaptive emotional regulation strategies is unlikely to improve sleep (its relationship with sleep was not statistically significant)

Lecture Summary

  • You can extend the linear model to include multiple predictors: \(y_{i} = b_{0} + b_{1}\times x_{1i} + ... + b_{n}\times x_{ni} + e_{i}\)
  • We can ask three main types of question to evaluate multiple-predictor linear models:
  1. Hierarchical (nested) models can be compared with \(R^2\Delta\) and \(F\Delta\) to determine which model is best
  2. We can then examine the statistical significance of each predictor to establish whether it is an important predictor of the outcome
  3. We can also interpret the relative importance of the predictors in the best model using standardised betas (\(β\))

Next Week

  • Next week, there are unfortunately no statistics! 😭

  • Martina and I are going to tell you all about Questionable Research Practices (QRPs)

  • QRPs are “a range of activities that intentionally or unintentionally distort data in favour of a researcher’s own hypotheses…” (Fortt 2021)

  • At best, bad science. At worst, academic misconduct and outright fraud! 😈

  • We’ll explore some different QRPs, look at some famous examples, tell you how the field is trying to solve the problems (i.e., Open Science), and consider some emerging critiques of that movement.