This blog post was written by former AP Statistics Chief Reader and statistics professor Allan Rossman and is borrowed (with permission) from his blog Ask Good Questions.

This blog is about asking good questions to teach introductory statistics, so let me tell you about my all-time favorite question. I want to emphasize from the outset that I had nothing to do with writing it. I’m just a big fan.

I am referring to question #6, called an investigative task, on the 2009 AP Statistics exam. I’ll show you the question piece-by-piece, snipped from the College Board website. You can find this question and many other released AP Statistics exams here.

Here’s how the question begins:

Oh dear, I have to admit that this is an inauspicious start. Frankly, I think this is a boring, generic context for a statistics question. Even worse, there’s no mention of real data. What’s so great about this? Nothing at all, but please read on …

I think this is a fine question, but I admit that it’s a fairly routine one. Describing the parameter in a study is an important step, and I suspect that students find this much more challenging than many instructors realize. I would call this an adequate question, perhaps a good question, certainly not a great question. So, I don’t blame you if you’re wondering why this is my all-time favorite question. Please read on …

Now we’re getting somewhere. I think this is pretty clever: presenting students with a statistic that they have almost certainly never encountered before, and asking them to figure out something about the unknown statistic based on what they know. The question is not particularly hard, but it does ask students to apply something they know to a new situation. Students should realize that right-skewed distributions tend to have a larger mean than median, so the ratio mean/median should be greater than 1 with these data.
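To make the mean/median idea concrete, here is a minimal Python sketch. The two data sets are made up for illustration and are not the values from the exam; the function name is also just a label chosen here.

```python
from statistics import mean, median


def mean_median_ratio(values):
    """Return the sample mean divided by the sample median."""
    return mean(values) / median(values)


# Hypothetical data (not from the exam): a long right tail pulls the mean up
# while the median stays put.
right_skewed = [1, 2, 2, 3, 3, 3, 4, 5, 9, 15]
# Hypothetical symmetric data: the mean and the median coincide.
symmetric = [1, 2, 3, 4, 5]

skewed_ratio = mean_median_ratio(right_skewed)
symmetric_ratio = mean_median_ratio(symmetric)
print(skewed_ratio, symmetric_ratio)
```

The ratio exceeds 1 for the skewed list and equals 1 for the symmetric one, which is the behavior part (b) asks students to anticipate.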

Part (b) also helps students to prepare for what comes next …

Now we’re talking! I think part (c) makes this a great question. To answer this part well, students have to understand the reasoning process of statistical significance, and they have to apply that reasoning process in a situation that they have almost surely never encountered or even thought about: making an inference about the symmetry or skewness of a population distribution. This is extremely challenging, but I think this assesses something very important: whether students can apply what they have learned to a novel situation that goes a bit beyond what they studied.

Notice that this question does not use words such as hypothesis or test or reject or strength of evidence or p-value. The key word in the question is plausible. Students have to realize that the simulation analysis presented allows them to assess the plausibility of the assumption underlying the simulation: that the population follows a normal distribution. Then they need to recognize that they can assess plausibility by seeing whether the observed value of the sample statistic is unusual in the simulated (null) distribution of that statistic. It turns out that the observed value of the mean/median ratio (1.03) is not very unusual in the simulated (null) distribution, because 14/100 of the simulated samples produced a statistic more extreme than the observed sample value. Therefore, students should conclude that the simulation analysis reveals that a normally distributed population could plausibly have produced the observed sample.
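The reasoning in part (c) can be sketched as a short simulation in Python. The observed ratio of 1.03 and the 100 simulated samples come from the discussion above; the sample size of 10 and the normal population's mean and standard deviation are assumptions made only for this sketch, so the proportion it prints need not match the 14/100 reported on the exam.

```python
import random
from statistics import mean, median


def simulate_ratios(num_samples, sample_size, mu, sigma, seed=0):
    """Simulate mean/median ratios for samples drawn from a normal population."""
    rng = random.Random(seed)
    ratios = []
    for _ in range(num_samples):
        sample = [rng.gauss(mu, sigma) for _ in range(sample_size)]
        ratios.append(mean(sample) / median(sample))
    return ratios


def proportion_at_least(ratios, observed):
    """Fraction of simulated ratios at or above the observed ratio."""
    return sum(r >= observed for r in ratios) / len(ratios)


OBSERVED_RATIO = 1.03  # observed value quoted in the post

# ASSUMPTIONS for illustration only: sample size, mu, and sigma are not
# given in the text above.
null_ratios = simulate_ratios(num_samples=100, sample_size=10, mu=27.0, sigma=3.0)
tail_proportion = proportion_at_least(null_ratios, OBSERVED_RATIO)
print(tail_proportion)
```

If the printed proportion is not small, the observed ratio is not unusual under the normal model, and a normal population remains plausible. A very small proportion would point toward right skew instead.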

A common student error is not recognizing the crucial role that the observed value (1.03) of the statistic plays. More specifically, two common student errors are:

Commenting that the simulated distribution is roughly symmetric, and concluding that it’s plausible that the population distribution is normal. Students who make this error are failing to notice the distinction between the simulated distribution of sample statistics and the population distribution of mpg values.

Commenting that the simulated distribution of sample statistics is centered around the value 1, which is the expected value of the statistic from a normal population, and concluding that it’s plausible that the population distribution is normal. Students who make this error are failing to realize that the simulation assumed a normal population in the first place, which is why the distribution of simulated sample statistics is centered around the value 1.
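The second error is easier to see with a side-by-side comparison. The Python sketch below simulates the mean/median ratio under two different assumed populations; the populations, their parameters, and the sample size are all hypothetical choices made for illustration.

```python
import random
from statistics import mean, median


def center_of_ratios(draw, num_samples=1000, sample_size=10):
    """Average mean/median ratio over many samples built from draw()."""
    ratios = []
    for _ in range(num_samples):
        sample = [draw() for _ in range(sample_size)]
        ratios.append(mean(sample) / median(sample))
    return mean(ratios)


rng = random.Random(1)

# ASSUMED populations: a normal one and a right-skewed (exponential) one.
normal_center = center_of_ratios(lambda: rng.gauss(27.0, 3.0))
skewed_center = center_of_ratios(lambda: rng.expovariate(1 / 27.0))
print(normal_center, skewed_center)
```

The center of the simulated distribution follows whatever population the simulation assumed: close to 1 for the normal population and well above 1 for the skewed one. It therefore says nothing about the real population until the observed ratio is compared against it.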

If this question ended here, it would be one of my all-time favorites. But it doesn’t end here. There’s a fourth part, which catapults this question into the exalted status of my all-time favorite. Once again (and for the last time!), please read on…

Wow, look at what’s happening here! Students are being told that they don’t have to restrict their attention to common statistics that they have been taught. Rather, this question asks students to exercise their intellectual power to create their own statistic! Moreover, they should know enough to predict how their statistic will behave in a certain situation (namely, a right-skewed distribution). This part of the question not only asks students to synthesize and apply what they have learned, but it also invites students to exercise an intellectual capability that they probably did not even realize they possess. Some common (good) answers from students include the following statistics, both of which should take a value greater than 1 with a right-skewed distribution:

(maximum – median) / (median – minimum)

(upper quartile – median) / (median – lower quartile)
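Both student-created statistics are easy to try out. Here is a minimal Python sketch, using made-up data and function names chosen for illustration. Quartiles come from `statistics.quantiles` (Python 3.8+) with its default method, which may differ slightly from the convention in a given textbook, and the sketch assumes the median is strictly greater than the minimum and the lower quartile, so the denominators are nonzero.

```python
from statistics import median, quantiles


def range_skew_ratio(values):
    """(maximum - median) / (median - minimum)."""
    mid = median(values)
    return (max(values) - mid) / (mid - min(values))


def quartile_skew_ratio(values):
    """(upper quartile - median) / (median - lower quartile)."""
    lower_q, mid, upper_q = quantiles(values, n=4)
    return (upper_q - mid) / (mid - lower_q)


# Hypothetical data (not from the exam).
right_skewed = [1, 2, 2, 3, 3, 3, 4, 5, 9, 15]
symmetric = [1, 2, 3, 4, 5, 6, 7, 8, 9]

for data in (right_skewed, symmetric):
    print(range_skew_ratio(data), quartile_skew_ratio(data))
```

Both statistics come out greater than 1 for the right-skewed list and equal to 1 for the symmetric list. The quartile version is less sensitive to a single extreme value than the version based on the maximum and minimum.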

There you have it: my all-time favorite question from an introductory statistics exam. I encourage you to ask this question, or some variation of it, of your students. I suggest asking this in a low-stakes setting and then discussing it with students afterward. Encourage them to realize that the reasoning processes they learn in class can be applied to new situations that they have not explicitly studied, and also help them to recognize that they are developing the intellectual power to create new analyses of their own.

Check out other valuable blog posts about teaching introductory statistics from Allan Rossman on his blog Ask Good Questions.

I love this. "Make your own statistic" is one of my favorite statistical activities, though I will admit that, back when I taught AP Stats, the rush to complete the course never left me as much time for it as I would have liked. The German Tank problem (which I learned about from The Practice of Statistics, but which has apparently been around for a while) is a great activity for exploring sampling distributions and making your own stat.

When I made a quantitative sampling distribution web applet, available at https://stats.cpm.org/quantsamples/, I intentionally included an option to generate your own Custom statistics and create distributions, to open this up for testing. You could us…