Day 20 - Lesson 3.1
Interpret the correlation.
Understand the basic properties of correlation, including how the correlation is influenced by outliers.
Distinguish correlation from causation.
Activity: How safe is Barbie?
We continue with the Barbie Bungee context today by asking students to think about how trustworthy their prediction will be based on their data. They clearly identified a linear relationship yesterday, but how strong is that linear relationship? We will use the correlation r to answer the question.
Most students have some familiarity with the correlation from a previous Algebra class. Typically they know that r must be between -1 and 1 and that values closer to -1 or 1 indicate a stronger linear relationship.
Calculating r by hand
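For reference, the correlation formula is commonly written in the form below. The notation is an assumption for this sketch: n is the number of data points, x̄ and ȳ are the sample means, and s_x and s_y are the sample standard deviations, so each factor in the sum is a z-score (the zx and zy mentioned later in this section).

```latex
r = \frac{1}{n-1} \sum_{i=1}^{n}
    \left( \frac{x_i - \bar{x}}{s_x} \right)
    \left( \frac{y_i - \bar{y}}{s_y} \right)
  = \frac{1}{n-1} \sum_{i=1}^{n} z_{x_i} \, z_{y_i}
```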
Most importantly, students will not be expected to use this formula to calculate r on the AP Exam (yes, we know it’s on the formula sheet… but it’s never been on the exam).
So, is there value in having students go through one of these calculations? Well…maybe. There are some properties of correlation that are nicely explained by looking at the formula:
Correlation makes no distinction between explanatory and response variables. If we decided to switch our explanatory and response variables, the zx and zy would swap places in the formula. Because multiplication is commutative, we will get the same correlation value.
r does not change when we change units. z-scores tell us the number of standard deviations above or below the mean. This value does not depend on the units. If a value measured in inches is 1.78 standard deviations above the mean, then the same value measured in centimeters will still be 1.78 standard deviations above the mean.
The correlation r has no units of measurement. Suppose the explanatory variable is measured in inches. Then the numerator of zx would be measured in inches, and when divided by the standard deviation (which is also measured in inches), the units would cancel out. The same happens for zy, so r has no units.
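The three properties above can be checked numerically with a short sketch. It computes r as the sum of z-score products divided by n - 1, then swaps the variables and converts inches to centimeters. The data values and the names rubber_bands, distance_in, and distance_cm are made up for illustration and are not from the Barbie Bungee activity itself.

```python
from statistics import mean, stdev


def z_scores(values):
    """Standardize values using the sample mean and sample standard deviation."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]


def correlation(x, y):
    """Compute r as the sum of z_x * z_y over all points, divided by n - 1."""
    zx, zy = z_scores(x), z_scores(y)
    return sum(a * b for a, b in zip(zx, zy)) / (len(x) - 1)


# Hypothetical data (invented for this example): number of rubber bands
# and the distance dropped, in inches.
rubber_bands = [1, 2, 3, 4, 5, 6]
distance_in = [10.5, 14.0, 18.5, 21.0, 26.5, 29.0]

# Same distances converted to centimeters.
distance_cm = [d * 2.54 for d in distance_in]

r_xy = correlation(rubber_bands, distance_in)  # explanatory = rubber bands
r_yx = correlation(distance_in, rubber_bands)  # variables swapped
r_cm = correlation(rubber_bands, distance_cm)  # units changed

print(f"r (x, y):        {r_xy:.4f}")
print(f"r (y, x):        {r_yx:.4f}")
print(f"r (centimeters): {r_cm:.4f}")
```

All three printed values should agree up to floating-point rounding, since swapping the variables only reorders each product and changing units leaves every z-score unchanged.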
Correlation vs. Causation
Quite possibly the most important lesson for students to take away from an introductory statistics course is that Correlation Does Not Mean Causation. We use this PowerPoint to give some ridiculous examples to start the conversation (did you know that increased ice cream sales cause more shark attacks?). Tyler Vigen’s Spurious Correlations website and this website also have many other examples.
Correlation Guessing Game
We use this applet to generate random scatterplots, and then have students “Guess the correlation”. This way, students really “experience” the correlation r, rather than simply looking at 10 examples of scatterplots and their associated r values. To save time, create teams of 2 or 4 students. Putting students in teams also encourages them to discuss and refine their guesses. Tell the whole class to pay close attention as each student (or team) makes a guess. Students (or teams) who go first in Round 1 will be at a disadvantage because they did not get to see several examples before making their guess. This is why a Round 2 (in reverse order) is necessary. An alternate way to run this activity is March Madness style, where two students face off and each makes a guess about the same scatterplot. Winners move on.
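If the applet is unavailable, a round of the game could be simulated along the following lines. This is only a rough sketch, not how the applet works: the function names random_scatter and closest_guess, the team names, and the guesses are all hypothetical. It simulates points with a chosen target correlation, computes the actual sample r, and picks the closest guess. Plotting the points is left to whatever graphing tool is at hand.

```python
import random
from statistics import mean, stdev


def pearson_r(x, y):
    """Sample correlation: sum of z-score products divided by n - 1."""
    mx, my, sx, sy = mean(x), mean(y), stdev(x), stdev(y)
    return sum((a - mx) / sx * (b - my) / sy for a, b in zip(x, y)) / (len(x) - 1)


def random_scatter(target_r, n=50, rng=None):
    """Simulate n points from a population whose correlation is target_r.

    The sample r will vary around target_r, especially for small n.
    """
    rng = rng or random.Random()
    x = [rng.gauss(0, 1) for _ in range(n)]
    noise_sd = (1 - target_r ** 2) ** 0.5
    y = [target_r * xi + noise_sd * rng.gauss(0, 1) for xi in x]
    return x, y


def closest_guess(guesses, actual_r):
    """Return the name of the team whose guess is nearest to actual_r."""
    return min(guesses, key=lambda team: abs(guesses[team] - actual_r))


# One round with made-up guesses from two hypothetical teams.
rng = random.Random(2024)  # fixed seed so the round can be replayed
x, y = random_scatter(target_r=rng.uniform(-1, 1), rng=rng)
actual_r = pearson_r(x, y)
guesses = {"Team A": 0.6, "Team B": -0.2}
winner = closest_guess(guesses, actual_r)
print(f"Actual r = {actual_r:.2f}; closest guess: {winner}")
```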