Unit 2: Exploring Two-Variable Data

Updated: Sep 22, 2019

The new College Board Course and Exam Description (CED) has presented the new standard for content and pacing in AP Statistics, and we have been making some adjustments to our daily lesson plans. Just as there were some changes we made in Unit 1: Exploring One-Variable Data (mosaic plots and a new definition of percentile), there are some changes that the College Board has put into Unit 2: Exploring Two-Variable Data that require us to make a few tweaks.


Nonlinear Data

In the past, we used non-linear data as the caboose of the course, just after completing inference for linear regression. Now, we will teach these lessons along with all the other two-variable analysis. Here is our new pacing guide:

One of the bonuses of this new schedule is that students will now have the option to use a non-linear model to make their final prediction for the Barbie Bungee Finale.


Outliers, Influential Points, and High-Leverage Points, Oh My!

Holy moly. There is a lot of vocabulary to keep track of here. Let's look at the CED definition for each:


Outlier: An outlier in regression is a point that does not follow the general trend shown in the rest of the data and has a large residual when the Least Squares Regression Line (LSRL) is calculated.

Notice that the CED requires that an outlier "has a large residual." Not following the general trend is not enough on its own.
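If it helps to see the "large residual" idea in action, here is a minimal sketch that fits the LSRL and flags points with large residuals. The data are made up, and the cutoff of 2 standard deviations of the residuals is our own assumption for illustration; the CED does not give a numeric rule for "large."

```python
def lsrl(xs, ys):
    """Return (slope, intercept) of the least-squares regression line."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


def residuals(xs, ys):
    """Residual = actual y minus predicted y for each point."""
    slope, intercept = lsrl(xs, ys)
    return [y - (intercept + slope * x) for x, y in zip(xs, ys)]


# Made-up data: y is roughly 2x, except for the point at x = 5.
xs = [1, 2, 3, 4, 5, 6, 7, 8]
ys = [2.1, 3.9, 6.2, 8.0, 16.0, 12.1, 13.8, 16.2]

res = residuals(xs, ys)

# Standard deviation of the residuals (n - 2 in the denominator).
s = (sum(r ** 2 for r in res) / (len(xs) - 2)) ** 0.5

# ASSUMPTION: "large" means more than 2 standard deviations of the residuals.
flagged = [(x, y) for x, y, r in zip(xs, ys, res) if abs(r) > 2 * s]
print(flagged)  # [(5, 16.0)]
```

Changing the cutoff changes which points get flagged, which is a nice reminder for students that "large" is a judgment call.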



Influential point: An influential point in regression is any point that, if removed, changes the relationship substantially. Examples include much different slope, y intercept, and/or correlation.

This one feels pretty familiar. Usually we have students find the slope, y-intercept, and correlation of a set of data, then remove a point and re-calculate each of these values. If any of the values changes substantially (what does this even mean?), then the point is influential.
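The remove-and-recalculate routine can be sketched in a few lines. This is only an illustration with made-up data; the function names `summarize` and `without_point` are hypothetical helpers of ours, and deciding whether a change counts as "substantial" is still left to the reader.

```python
def summarize(xs, ys):
    """Return (slope, y-intercept, r) for the LSRL of ys on xs."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    r = sxy / (sxx * syy) ** 0.5
    return slope, intercept, r


def without_point(xs, ys, i):
    """Recalculate the summary with the point at index i removed."""
    return summarize(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])


# Made-up data: six points near y = x + 1, plus one point far to the right.
xs = [1, 2, 3, 4, 5, 6, 14]
ys = [2.0, 2.9, 4.2, 4.8, 6.1, 7.0, 3.0]

full = summarize(xs, ys)
dropped = without_point(xs, ys, 6)  # remove the point (14, 3.0)

for name, with_pt, without_pt in zip(("slope", "y-intercept", "r"), full, dropped):
    print(f"{name:12s} with: {with_pt:6.2f}   without: {without_pt:6.2f}")
```

In this data set the slope and correlation are close to 0 with the point and close to 1 without it, so most of us would be comfortable calling the point influential.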


High leverage point: A high-leverage point in regression has a substantially larger or smaller x-value than the other observations have.

So here we are looking at only one variable (x). If one of the points has an x-value that is substantially larger or smaller (doesn't this sound like the one-variable definition of outlier?), then the point is considered high-leverage.
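Since this sounds so much like the one-variable outlier idea, one way to make it concrete is to borrow the 1.5 × IQR rule from Unit 1 and apply it to the x-values alone. To be clear, this is our assumption for illustration: the CED does not give a numeric cutoff for "substantially larger or smaller." The quartiles below are the medians of the lower and upper halves of the sorted data, and the function names are hypothetical.

```python
from statistics import median


def iqr_fences(values):
    """Return the (lower, upper) fences from the 1.5 x IQR rule."""
    data = sorted(values)
    half = len(data) // 2          # middle value is excluded when n is odd
    q1 = median(data[:half])       # median of the lower half
    q3 = median(data[-half:])      # median of the upper half
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr


def high_leverage_candidates(xs):
    """ASSUMPTION: treat x-values outside the fences as high-leverage."""
    low, high = iqr_fences(xs)
    return [x for x in xs if x < low or x > high]


# Made-up x-values; the y-values play no role in this check.
xs = [1, 2, 3, 4, 5, 6, 14]
print(high_leverage_candidates(xs))  # [14]
```

Notice that the y-values never appear in the code, which matches the definition: leverage is only about x.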


When Should I Teach Unit 2?


Some teachers prefer to save this unit until the end of the course and pair it with inference for linear regression. This is a good option to consider if you are trying to save some days, as you won't have to review anything before jumping into inference. We prefer to teach this unit early in the year, give students time to forget everything, and then refresh it at the end of the course.


Copyright © 2020 Stats Medic
