Recap, Z-scores, and Unusual Cats.
https://canvas.sussex.ac.uk/courses/35783/quizzes
https://canvas.sussex.ac.uk/courses/35783/pages/module-contacts
Research question:
Is CBT (Cognitive Behavioural Therapy) effective for treating social anxiety?
Hypothesis:
Participants who receive the CBT intervention will show lower social anxiety levels than participants who don’t receive an intervention.
Some other examples [made up data]:
Hypothesis: The more we procrastinate, the more stressed we feel.
Some other examples [made up data]:
Research question: Is there a relationship between caffeine consumption and productivity?
In quantitative research, we often (but not always):
Start with a theory,
Devise an experiment (or a cross-sectional study) to test that theory
Collect the data
Describe our sample <- last term
Test hypotheses <- this term
Forman and Leavens (2024) - The Effect of Transparency on Unsolvable Task Engagement in Domestic Cats (Felis catus) using Citizen Science
A study of social behaviours - e.g. looking at the owner while completing an unsolvable puzzle
Sample of 21 cats (each cat completed multiple trials)
On average, how long do cats spend on a task before looking at their owner?
What is the shortest and longest time?
What is the variance of scores in our sample? How do scores in our sample differ from each other?
Are there any “unusual” cats in our sample?
Measures of central tendency:
Mean: the average value \(\frac{\sum{x_i}}{n}\) = 16.56
Median: the value exactly in the middle = 16.1
Mode: the most common value (around 15)
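As a minimal sketch of these three measures, the snippet below uses Python's standard `statistics` module on a short list of made-up latencies. The values are placeholders for illustration only, not the data from the study, so the results will not match the figures above.

```python
import statistics

# Made-up look latencies in seconds (NOT the study's data).
latencies = [12.0, 15.0, 15.0, 16.1, 18.4, 20.3, 25.3]

mean_latency = sum(latencies) / len(latencies)   # sum of x_i divided by n
median_latency = statistics.median(latencies)    # middle value of the sorted scores
mode_latency = statistics.mode(latencies)        # most frequent value

print(mean_latency, median_latency, mode_latency)
```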
There are 21 cats in our sample
| cat_name | look_latency |
|---|---|
| Bubbles | 25.3 |
| Commodore | 25.9 |
2 cats out of 21 represents a proportion of 2/21 = 0.095.
The empirical probability that a cat takes more than 25 s to look at its owner is 0.095, or 9.5 percent.
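The arithmetic above can be sketched in a few lines of Python. The function name `empirical_probability` is a hypothetical helper introduced here for illustration; the counts (2 and 21) come from the sample described above.

```python
def empirical_probability(n_matching, n_total):
    """Proportion of observations that meet a condition."""
    return n_matching / n_total

# 2 of the 21 cats took more than 25 s to look at their owner.
p = empirical_probability(2, 21)

print(round(p, 3))   # 0.095
print(f"{p:.1%}")    # 9.5%
```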
Sample ≠ Population
Population distribution
Describes the frequency with which scores of a variable occur in the population.
Sample distribution
Describes the frequency with which scores of a variable occur in the sample.
Two cornerstones of statistical research:
The normal distribution is:
Symmetrical (skewness of 0)
Bell-shaped
Unimodal (only has one mode)
Defined by mean and standard deviation
Mean, median, and mode converge on one value
There are infinitely many possible combinations of means and standard deviations
Therefore there are infinitely many possible normal distributions
But not every bell-shaped distribution is a normal distribution
We know that the normal distribution has:
More scores in the middle
Fewer and fewer scores in the tails, the further away we get from the mean - the centre of the distribution
Proportions matter
We expect certain proportions of scores at certain distances away from the mean:
~68% of scores will be within 1 standard deviation of the mean
~95% of scores will be within 1.96 standard deviations of the mean
~99% of scores will be within 2.58 standard deviations of the mean
~68% of scores will be within 1 standard deviation of the mean:
This means that the shaded area contains approximately 68% of the scores.
~95% of scores will be within 1.96 standard deviations of the mean:
This means that the shaded area contains approximately 95% of the scores.
~99% of scores will be within 2.58 standard deviations of the mean:
This means that the shaded area contains approximately 99% of the scores. The remaining <1% will be in the unshaded tails.
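These proportions can be checked numerically. The sketch below uses `NormalDist` from Python's standard library; `proportion_within` is a hypothetical helper name. It computes the area under a normal curve between k standard deviations below and above the mean, which should come out close to 0.68, 0.95, and 0.99 for the three distances listed.

```python
from statistics import NormalDist

# Standard normal distribution: mean 0, SD 1.
standard_normal = NormalDist(mu=0, sigma=1)

def proportion_within(k):
    """Proportion of a normal distribution within k SDs of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)

for k in (1, 1.96, 2.58):
    print(k, round(proportion_within(k), 3))
```

Because the result depends only on the number of standard deviations, the same proportions hold for any normal distribution, whatever its mean and SD.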
A way of assessing how unusual/uncommon a score is with reference to the mean
Mean of our sample: 16.56
SD of our sample: 4.71
\[ Z = \frac{X-M}{SD} \]
We are converting scores into standard deviation units.
Dracula is a cat from our sample
Dracula spent only 7 seconds on the task before turning to his owner
Is Dracula unusual?
Calculate Dracula’s Z-score:
\[ Z = \frac{X-M}{SD} \]
\[ Z = \frac{7-16.56}{4.71} \]
\[ Z = -2.03 \]
Dracula’s looking latency is 2.03 standard deviations below the mean.
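The same calculation as a short Python sketch. `z_score` is a hypothetical helper name; the inputs are Dracula's latency and the sample mean and SD given above.

```python
def z_score(x, mean, sd):
    """Convert a raw score into standard deviation units: Z = (X - M) / SD."""
    return (x - mean) / sd

# Dracula: X = 7, sample mean M = 16.56, sample SD = 4.71.
dracula_z = z_score(7, 16.56, 4.71)

print(round(dracula_z, 2))  # -2.03
```

The negative sign shows that the score lies below the mean.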
We can convert the whole sample into Z-scores:
Standardisation - the shape of the distribution remains the same but the mean and SD change:
Mean = 0
SD = 1
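A minimal sketch of standardising a whole sample, using made-up latencies (placeholders, not the study's data) and assuming the sample SD with an n - 1 denominator, which is what `statistics.stdev` computes. After the conversion, the Z-scores have a mean of 0 and an SD of 1, up to floating-point rounding.

```python
import statistics

# Made-up look latencies in seconds (NOT the study's data).
latencies = [7.0, 12.5, 15.0, 16.1, 18.4, 20.3, 25.3]

m = statistics.mean(latencies)
sd = statistics.stdev(latencies)  # sample SD (n - 1 denominator)

# Apply Z = (X - M) / SD to every score.
z_scores = [(x - m) / sd for x in latencies]

print(round(statistics.mean(z_scores), 6))   # mean of the Z-scores, 0 up to rounding
print(round(statistics.stdev(z_scores), 6))  # SD of the Z-scores, 1 up to rounding
```

The order of the scores and their relative spacing are unchanged; only the units differ.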
Standardisation allows us to work with probabilities:
~68% of scores will be within Z-scores of -1 to +1
~95% of scores will be within Z-scores of -1.96 to +1.96
~99% of scores will be within Z-scores of -2.58 to +2.58
Standardisation allows comparisons:
Across measurement scales
Accounting for the characteristics of a normal distribution
With reference to the population
This allows us to quantify whether something is “unusual” or “surprising” with reference to the population.
Option 1: Use a “Z-table” - ancient tablets from 1800 BC. Pre-calculated and printed in special books or at the end of textbooks.
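A Z-table lists cumulative probabilities of the standard normal distribution. As a rough illustration of what a table lookup returns, the sketch below computes the same kind of value with `NormalDist` from Python's standard library, using Dracula's Z-score of -2.03. It assumes that looking latencies are normally distributed in the population, which the slides have not established.

```python
from statistics import NormalDist

z = -2.03  # Dracula's Z-score

# Proportion of a standard normal distribution at or below z
# (the value a Z-table would list for this Z-score).
p_below = NormalDist().cdf(z)

print(round(p_below, 4))  # 0.0212
```

Under that assumption, only about 2% of scores would be as low as Dracula's or lower.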