# Tell the Whole Story: Evidence for Ha by Josh Tabor

Updated: Apr 19, 2019

*Today we have Josh Tabor as our guest blogger. Josh is a high school statistics teacher at Canyon del Oro High School in Arizona. He is also the co-author of *__Statistical Reasoning in Sports__*, *__Statistics and Probability with Applications__* (SPA), and *__The Practice of Statistics__* (TPS). Josh is a question leader at the AP Statistics Exam Reading each year, an AP Statistics consultant for the College Board, and also one of my greatest mentors. Josh has transformed, and continues to transform, the way I teach statistics.*

By the end of the year, my students can write a conclusion to a significance test with great proficiency: “Because the *p*-value of 0.03 is less than α = 0.05, we reject *H*0. There is convincing evidence that…” However, if you ask them to interpret the *p*-value, they will often look at you with a mixture of fear and confusion. I am convinced that the root of this problem is that students aren’t thinking about the question the *p*-value is trying to answer.

**How I know this matters**

This really hit home as I graded __Question #5 on the 2013 AP exam__.

Here is the short version of this item: An observational study was conducted to see if there was a relationship between meditation and blood pressure among men who lived in a retirement community. Of the 11 men who meditated daily, 0 had high blood pressure while 8 of the 17 who didn’t meditate had high blood pressure.

Part (a) of the item asked students to recognize that __correlation doesn’t imply causation__, since this was an observational study.

Part (b) asked students to explain why a two-sample *z* test for a difference in proportions was not appropriate (the large counts condition isn’t met).

Because the two-sample *z* test is inappropriate, Part (c) asked students to use the results of a simulation to draw a conclusion about the study. The item presented a graph that showed the simulated sampling distribution of the difference in proportions under the assumption that the null hypothesis was true (*p*med = *p*not). Here are the results of 100 trials of this simulation, using the __One Categorical Variable applet.__
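The applet’s internals aren’t shown here, but the randomization idea behind the graph can be sketched in plain Python. This is a minimal, illustrative version: it assumes the applet shuffles the 8 “high blood pressure” labels among all 28 men (which is what “the null hypothesis is true” means here) and records the difference in proportions each time. The counts (11 meditators with 0 high-BP, 17 non-meditators with 8 high-BP) come from the study above; the seed and variable names are my own.

```python
import random

# Study data: 11 meditators (0 with high BP), 17 non-meditators (8 with high BP)
n_med, n_not = 11, 17
total_high_bp = 0 + 8

# Observed difference in proportions: p-hat(med) - p-hat(not), about -0.47
observed_diff = 0 / 11 - 8 / 17

def simulate_once():
    """One trial: shuffle the 8 'high BP' labels among all 28 men,
    as if meditation had no effect on blood pressure."""
    labels = [1] * total_high_bp + [0] * (n_med + n_not - total_high_bp)
    random.shuffle(labels)
    p_med = sum(labels[:n_med]) / n_med
    p_not = sum(labels[n_med:]) / n_not
    return p_med - p_not

random.seed(2013)  # arbitrary seed, for reproducibility only
diffs = [simulate_once() for _ in range(100)]  # 100 trials, as in the item

# How often does chance alone produce a difference as extreme as the study's?
as_extreme = sum(d <= observed_diff for d in diffs)
print(f"observed difference: {observed_diff:.2f}")
print(f"trials at least as extreme: {as_extreme} out of {len(diffs)}")
```

The proportion of trials at least as extreme as the observed difference is exactly the quantity the rest of this post is driving at.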

**The problem**

On the AP Exam, what did over 99% of the students do at this point? *They completely ignored the data from the study!* That is, fewer than 1% of students thought to compare the proportion of meditators who had high blood pressure (0/11 = 0) to the proportion of non-meditators who had high blood pressure (8/17 = 0.47).

Here’s what students should have been thinking: “In the actual study, the difference in proportions was –0.47. Wow! This seems like a pretty big difference. But, maybe there is no difference in the true proportions, and the researchers got a difference this big by chance alone. Hmmm…I wonder how likely it is to get a difference like this if there is no difference between meditators and non-meditators?”

The answer to this last question is the *p*-value, which can be easil