r | 95% CI |
---|---|
-0.76 | [-0.8, -0.7] |
Week 06
Studies ongoing
Nominate faculty for a University Education Award!
Nominate staff who have inspired you, or made a positive difference to your experience
Nominated staff see all nominations - makes a huge difference!
Deadline: Friday 8 March (tomorrow)
Nominate classmates for a SavioR Award!
Preparing for the TAP
Read the Take-Away Paper Information page carefully!
Attempt the sample take-away paper (on the Cloud)
Tonight’s Skills Lab will talk through some portions of the paper and do Q&A
After this lecture you will understand:
The concepts behind statistical correlation
How to interpret the values of the correlation coefficient r
How to read a correlation matrix
How to interpret and report significance tests of r
The relationship between correlation and causation 👀
Putting our statistical “grammar” into practice
For each statistical analysis, we will have the same elements:
We want to believe true things about the world, and disbelieve false things
Statistics is a system to help us make decisions about whether, and to what degree, we believe something is supported by evidence
Quantifies how two quantities change in relation to each other
When one variable changes, does the other…
Change in a similar way?
Change in the opposite way?
Not change very much at all?
The Fundamental Question
To what degree do two variables behave the same way - do they covary?
Variance should be familiar already!
Covariance (“co” = with) is a similar idea - and calculated in a similar way
Vocabulary: Variance
How much scores deviate from the mean, on average
Calculate how far each data point is from the mean of x, multiply the deviations by themselves (i.e. square them), add them all together, divide by N - 1
\[\text{variance} = s^2 = \frac {\sum\nolimits_{i = 1}^{N} {(x_{i} - \bar{x})(x_{i} - \bar{x})}}{N - 1}\]
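The recipe above can be sketched in a few lines of Python. This is only an illustration on made-up scores (not the lecture's data), and the function name `variance` is chosen just for this example.

```python
def variance(xs):
    """Sample variance: sum of squared deviations from the mean, divided by N - 1."""
    n = len(xs)
    mean_x = sum(xs) / n
    # Multiply each deviation by itself (i.e. square it)
    squared_deviations = [(x - mean_x) * (x - mean_x) for x in xs]
    return sum(squared_deviations) / (n - 1)


# Made-up scores, for illustration only
scores = [2, 4, 4, 6, 9]
print(variance(scores))  # 7.0
```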
Vocabulary: Covariance
How much pairs of scores deviate from their (respective) means in the same way, on average
Calculate how far each data point is from the mean for both x and y, multiply them together, add all those together, divide by N - 1
\[\text{covariance}_{xy} = \frac {\sum\nolimits_{i = 1}^{N} {(x_{i} - \bar{x})(y_{i} - \bar{y})}}{N - 1}\]
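The same steps can be sketched in Python. The data are made up for illustration, and `covariance` is a name chosen for this example. Note that the covariance of a variable with itself is just its variance.

```python
def covariance(xs, ys):
    """Sample covariance: sum of paired deviation products, divided by N - 1."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Multiply each x deviation by the matching y deviation
    products = [(x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)]
    return sum(products) / (n - 1)


# Made-up paired scores: y tends to go down as x goes up
x = [1, 2, 3, 4, 5]
y = [9, 7, 6, 3, 0]

print(covariance(x, y))  # -5.5 (negative: the scores deviate in opposite directions)
print(covariance(x, x))  # 2.5 (the variance of x)
```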
Psychology very frequently collects gender as a variable
Is this a useful way to categorise people?
Gender and Sexuality Questionnaire about gender and attraction
Research Question
Are femininity and masculinity actually dichotomous? What is the nature of the relationship between them?
People who rated their femininity high tended to rate their masculinity low, and vice versa
We might like to know:
What is the nature of this relationship?
How strong is it? What direction does it go?
Should we believe that it’s real (i.e. representative of people, or of first-year psychology students, in general)?
We talked a moment ago about covariance
\[\text{covariance}_{xy} = \frac {\sum\nolimits_{i = 1}^{N} {(x_{i} - \bar{x})(y_{i} - \bar{y})}}{N - 1}\]
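Same problem as last week with the difference in the means: the raw value depends on the scale the variables are measured on, so it is hard to interpret on its own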
Let’s standardise by dividing by an estimate of the noise
\[r = \frac{\text{covariance}_{xy}}{s_{x}s_{y}}\]
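A minimal Python sketch of this standardisation, on made-up data with helper names chosen for this example, shows why it is useful: rescaling x changes the covariance a great deal but leaves r untouched.

```python
import math


def covariance(xs, ys):
    """Sample covariance, dividing by N - 1."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)


def pearson_r(xs, ys):
    """r = covariance divided by the product of the two standard deviations."""
    sd_x = math.sqrt(covariance(xs, xs))  # covariance of x with itself is its variance
    sd_y = math.sqrt(covariance(ys, ys))
    return covariance(xs, ys) / (sd_x * sd_y)


# Made-up paired scores, for illustration only
x = [1, 2, 3, 4, 5]
y = [9, 7, 6, 3, 0]
x_rescaled = [100 * value for value in x]  # same data on a different scale

print(covariance(x, y), covariance(x_rescaled, y))  # covariance changes with the scale
print(round(pearson_r(x, y), 3), round(pearson_r(x_rescaled, y), 3))  # r does not
```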
Typically used with two (or more) continuous variables
r quantifies the strength and direction of the relationship
Absolute value of r (between 0 and 1) gives the strength of the relationship
The sign of r (positive or negative) gives the direction of the relationship
r | 95% CI |
---|---|
-0.76 | [-0.8, -0.7] |
Pop Quiz
How can we interpret this value of r?
The negative sign (-) means as femininity increases, masculinity tends to decrease (and vice versa)
The absolute value of .76 is very strong!
We now have our data, from which we calculated…
Our test statistic r (-.76)
We also know the distribution of r with different degrees of freedom
How likely is an r of -.76 (or larger in magnitude) if in fact femininity and masculinity have a true r of 0?
i.e. the null hypothesis is in fact true
We will again use \(\alpha\) = .05 in this case
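One way to make this question concrete is a permutation sketch. Note that this is a different technique from the one the lecture relies on (the known distribution of r for given degrees of freedom): here we shuffle one variable many times so that any true relationship is destroyed, then count how often the shuffled r is at least as large in magnitude as the observed one. The data, function names, number of shuffles, and seed are all assumptions made for this illustration.

```python
import math
import random


def pearson_r(xs, ys):
    """Pearson's r: covariance divided by the product of the standard deviations."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs) / (n - 1))
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys) / (n - 1))
    return cov / (sd_x * sd_y)


def permutation_p_value(xs, ys, n_shuffles=5000, seed=1):
    """Proportion of shuffles whose |r| is at least as large as the observed |r|."""
    rng = random.Random(seed)
    observed = abs(pearson_r(xs, ys))
    shuffled = list(ys)
    count = 0
    for _ in range(n_shuffles):
        rng.shuffle(shuffled)  # break any real pairing between x and y
        if abs(pearson_r(xs, shuffled)) >= observed:
            count += 1
    return count / n_shuffles


# Made-up ratings with a strong negative relationship
x = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
y = [10, 9, 9, 7, 6, 6, 4, 3, 2, 1]

p = permutation_p_value(x, y)
print(p < 0.05)  # compare against alpha = .05
```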
Parameter1 | Parameter2 | r | 95% CI | t(304) | p |
---|---|---|---|---|---|
gender_fem | gender_masc | -0.76 | [-0.8, -0.7] | -20.1 | < .001 |
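The t value in this table is linked to r and the degrees of freedom by the standard conversion t = r × √df / √(1 − r²). The sketch below applies it to the rounded values shown; that the software behind the table uses exactly this formula is an assumption. With the rounded r of -0.76 the result is about -20.4 rather than the reported -20.1, presumably because the table's t was computed from the unrounded r.

```python
import math


def t_from_r(r, df):
    """Convert a correlation coefficient to a t statistic with df = N - 2."""
    return r * math.sqrt(df) / math.sqrt(1 - r * r)


t = t_from_r(-0.76, 304)
print(round(t, 1))
```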
Reporting Correlation
There was a significant negative correlation between femininity and masculinity, r(304) = -.76, p < .001.
Leading 0s
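We reported both r and p without leading 0s (e.g. as -.76 and not -0.76). The rule is this: omit the leading 0 when a value cannot be greater than 1 or less than -1, as is the case for r and p.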
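As a small illustration of this convention, the sketch below formats a value without its leading 0. The function name and the default of two decimal places are choices made for this example only.

```python
def format_no_leading_zero(value, decimals=2):
    """Format a number such as r or p without a leading 0, e.g. -0.76 -> '-.76'."""
    text = f"{abs(value):.{decimals}f}"  # e.g. "0.76"
    if text.startswith("0."):
        text = text[1:]  # drop the leading 0, keep the decimal point
    sign = "-" if value < 0 else ""
    return sign + text


print(f"r(304) = {format_no_leading_zero(-0.76)}")  # r(304) = -.76
```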
Correlations are often presented in matrices
Each cell contains the correlation coefficient r for the variables in that row and column
| | gender_comfortable | gender_masc | gender_fem | gender_stability |
|---|---|---|---|---|
| gender_comfortable | 1.00 | -0.31 | 0.17 | 0.61 |
| gender_masc | -0.31 | 1.00 | -0.76 | -0.28 |
| gender_fem | 0.17 | -0.76 | 1.00 | 0.18 |
| gender_stability | 0.61 | -0.28 | 0.18 | 1.00 |
Pop Quiz
Why is there a diagonal line of 1s?
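A small sketch may help with this question. It builds a correlation matrix from made-up data (the variable names `a`, `b`, `c` and their values are invented for illustration). The diagonal cells correlate each variable with itself, and the matrix is mirrored around that diagonal because the correlation of a with b is the same as that of b with a.

```python
import math


def pearson_r(xs, ys):
    """Pearson's r: covariance divided by the product of the standard deviations."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs) / (n - 1))
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys) / (n - 1))
    return cov / (sd_x * sd_y)


# Made-up ratings, for illustration only
data = {
    "a": [1, 2, 3, 4, 5, 6],
    "b": [6, 5, 5, 3, 2, 1],
    "c": [2, 1, 4, 3, 6, 5],
}

names = list(data)
# One cell per (row, column) pair of variables
matrix = {row: {col: pearson_r(data[row], data[col]) for col in names} for row in names}

for row in names:
    print(row, [round(matrix[row][col], 2) for col in names])
```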
More useful version with GGally::ggscatmat()
Scatterplots, distributions, and r values
Our analysis showed that higher ratings of femininity tended to correspond to lower ratings of masculinity, and vice versa
Can we conclude from this that being more feminine causes you to be less masculine?
No distinction between cause and effect
Which is the chicken and which is the egg?
Which came first: femininity or masculinity?
A new study shows a rise in depression and stress among young people parallels the growth in smartphone and social media use. https://t.co/AxyseUyBxn
— NPR (@NPR) March 14, 2019
Vocabulary: Tertium quid
An unmeasured third variable that influences two other measured quantities
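A short simulation can show how this plays out. In the sketch below, x and y never influence each other; both are simply built from a shared third variable z plus independent noise. All values, the sample size, and the seed are invented for illustration.

```python
import math
import random


def pearson_r(xs, ys):
    """Pearson's r: covariance divided by the product of the standard deviations."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs) / (n - 1))
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys) / (n - 1))
    return cov / (sd_x * sd_y)


rng = random.Random(42)
n = 1000

# z is the third variable; x and y each depend on z but not on each other
z = [rng.gauss(0, 1) for _ in range(n)]
x = [z_i + rng.gauss(0, 0.5) for z_i in z]
y = [z_i + rng.gauss(0, 0.5) for z_i in z]

r_xy = pearson_r(x, y)

# Take z's contribution back out: what is left of x and y is unrelated noise
x_rest = [x_i - z_i for x_i, z_i in zip(x, z)]
y_rest = [y_i - z_i for y_i, z_i in zip(y, z)]
r_rest = pearson_r(x_rest, y_rest)

print(round(r_xy, 2))    # strong positive r, driven entirely by z
print(round(r_rest, 2))  # close to 0 once z is accounted for
```

If z had not been measured, the strong r between x and y would be all we could see.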
As it turns out…
The original study didn’t measure or have access to data on social media and smartphone use
Nurture a healthy skepticism of claims that two things are “linked”
What evidence do they have? Or NOT have?
What other explanations have not been considered or accounted for?
In everyday language, “correlated” means “related to in some way, usually causally”
Vocabulary: Correlation
The (standardised) degree to which two variables covary. Calculated as covariance divided by the product of the standard deviations. Takes values between -1 and 1, and quantifies both the strength (absolute value) and direction (sign) of the relationship.
“Correlation” is a technical term!
Do not say two things are “correlated” unless you report r as evidence!
Instead: variables “have a relationship”/“are related to each other”
Website that collects examples of spurious correlations
Can you suggest a “third thing” that might influence both?
Content warning: examples involve death rates, self-harm rates
More practice with interpreting r with this fun little game
The correlation coefficient r quantifies the strength and direction of relationships between variables
The p-value associated with r is the probability of encountering a value of r as large as the one we have, or larger, if in fact the true value of r in the population is 0
Correlation DOES NOT IMPLY CAUSATION!!!!!!!
Hybrid teaching and disability support study: rebrand.ly/hybrid_ds
ChatGPT and AI at University study: rebrand.ly/gpt_uni
Prepare for the TAP!
Read the TAP Information page
Sample TAP in Skills Lab TONIGHT
Next week’s (07) practicals will:
Contain a short and optional study
Have a quiz that is practice only (i.e. will not contribute to your quiz mark)!
Be based only on this week’s lecture and tutorial