Updated: Aug 29
Juan Gómez is in his 24th year of teaching. He teaches AP Statistics, and this past summer was his first as an AP Reader. He has participated in the California Department of Education's Literature List for Math & Science, reviewed test items for the Smarter Balanced Consortium, and presented at a variety of national conferences. Currently, Juan volunteers with Open Intro's Advanced High School Statistics project, helping with AP Statistics content sequencing and pacing, and also helps with translating math curriculum. Juan is "all in" with the "Experience First, Formalize Later" approach to teaching AP Statistics.
There are lots of valid reasons to avoid the AP Statistics Exam Reading. Why would anyone volunteer to grade hundreds (maybe thousands?) of exam questions when they could instead be enjoying a drink poolside? It completely makes sense to want to spend your summer break enjoying some well-deserved time with family. But, if you’re like me, you may be just a bit curious about the process used to grade AP Statistics Exams.
Here’s what I learned during my first time reading the AP Statistics Exams:
1. There is an extensive amount of training to ensure the grading is fair
During my first few years of teaching AP Statistics, I didn’t know what to tell students who didn’t quite get the scores they felt they deserved. Did they get “that reader” who was super picky and looking to mark things incorrect? Was the pressure to grade so many exams by the end of the week causing some readers to hurry through answers? To be honest, I pictured the Swedish Chef from the Muppets tossing papers left and right in an effort to get through as many as possible. You’ll be relieved to know that none of these scenarios is even remotely true!
Here’s how the grading process is organized. A few days before the reading started, I received an email letting me know which Free Response question I would be reading. Once we arrived in Kansas City, the majority of the first day was spent reviewing rubrics, reading training papers, and calibrating our own scoring with pre-graded papers. The grading itself happens at a table of 8 readers, with a table leader who answers questions, backreads papers, and generally makes sure that all readers at a table are grading consistently according to the rubric.
The conversations that happened at this table over the course of the reading were pure gold! It certainly helps when you have an experienced table leader (shout out to Ashley Schrekengost, who answered all of my questions with grace and patience. Thank you!) and a veteran crew of high school and college teachers at your table who enjoy dicing up the rubric and don’t mind the many “what if…?” questions that come up. For example, I may have mentioned in my class that when describing experimental design, placing slips of paper in a hat and choosing them from the hat is a random process. But, “What if a student does not shake the hat? Is this a random selection?” (It is not.) And, “What if they don’t shake the hat but blindly reach into the hat?” (This is random.) As a result of this conversation, I will be emphasizing to students the need to “shake the hat” or “reach in blindly” to ensure the random part of the process.
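For teachers who like to demonstrate this in class, here is a minimal sketch of the "slips of paper in a hat" idea in Python. The function name `draw_from_hat`, the student names, and the group sizes are all made up for illustration; shuffling the list plays the role of shaking the hat.

```python
import random

def draw_from_hat(names, n_treatment, seed=None):
    """Randomly split names into a treatment group and a control group.

    Hypothetical helper for illustration only. Shuffling the list is the
    code equivalent of shaking the hat before drawing slips.
    """
    rng = random.Random(seed)   # a seed makes a classroom demo repeatable
    hat = list(names)           # copy so the original roster is unchanged
    rng.shuffle(hat)            # "shake the hat"
    treatment = hat[:n_treatment]   # first slips drawn
    control = hat[n_treatment:]     # everyone left in the hat
    return treatment, control

# Hypothetical roster of six subjects
roster = ["Ana", "Ben", "Cam", "Dee", "Eli", "Fay"]
treatment, control = draw_from_hat(roster, 3, seed=2024)
print("Treatment:", treatment)
print("Control:  ", control)
```

Without the shuffle, the first three names on the roster would always land in the treatment group, which is the code version of not shaking the hat.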
Having a table leader who backreads the papers I had already scored ensured that no one person could unilaterally influence a score on a question. Since each reader only reads one question in each exam booklet, the idea of one person grading an entire exam and having that influence a student’s exam score was a non-starter. After having been an AP Exam reader, I can reassure my students that readers go out of their way to look for work that supports the scoring on the rubric. This leads me to my next big takeaway from the AP Statistics reading:
2. Reading rubrics in a group is better than reading them alone!
Before I was allowed to start grading student papers, all readers went through training on how to apply the rubric. Sample responses were given that fully met the criteria, some that partially met the criteria, and something new to me: which answers were enough to “get the student in the door” to show partial understanding. As a teacher, gauging the exemplary and incomplete answers is never the problem. The tough part is knowing when a student communicates just enough understanding to be marked as partially correct.
Let’s play the “What if…?” game again. What if a question asks a student to compare statistical advantages of one study design to another and a student describes an advantage both designs share? (Not enough.) What if a student describes a statistical advantage of one design but doesn’t provide a comparison of one design to the other? Well, this opens the door to consider a partially correct pathway as long as the student response builds from this initial thought. Figuring out how to mark these responses can be tough even with a well-written rubric. I found that the table conversations revolving around “What if a student wrote this…?” usually concerned the responses that were hardest to mark. Most of the time, we were able to resolve our initial confusion, but every once in a while our table leader consulted with a question leader, whose main responsibility was to field these clarifying questions. Having yet another layer of support from the people who had created the rubrics was very helpful!
While a reader gains a lot of clarity on the rubric for the question they are grading, the lunchtime sessions, where question leaders talk through the rubric for each of the free response questions, are not to be missed. During these sessions I found myself taking notes about which types of answers were receiving “E” marks and what readers looked for in student answers. Spoiler: most questions did not require paragraphs of writing to get full credit, although in every single case, context was a must!
3. My class will change after having read the AP Statistics exam
As I headed to the airport for my flight back home after having read hundreds (maybe thousands?!) of exams, it slowly dawned on me how much faster a reader I had become. Having to calibrate on sample papers before each day of reading made me realize that I could replicate this grading experience with my students. Jason Molesky’s website has some resources on how to do this using a FRAPPY. Knowing how to elegantly communicate a statistical process is key to success on the AP Statistics exam. Asking students to grade sample papers, so that they learn what is vital and what isn’t, will hopefully help them write succinctly.
As I thought more about the AP reading experience, two themes about how my class would change this year crystallized:
Best Practices for Students:
Throughout this school year I plan on sharing some of the best practices I learned during the AP reading with students. Here are a few:
Include context for all FRQs. This includes describing/comparing distributions, interpreting slope, r, r^2, and conclusions for confidence intervals and significance tests.
Include all parts of a description. This includes giving units of measure and describing every outcome, for example what happens when a coin lands on heads and what happens when it lands on tails.
Be sure to answer the question being asked. If the question is a yes-or-no question, be sure to start with yes or no, then follow up with a statistical explanation.
Be succinct in your writing. Having a sentence frame for interpretation can be very helpful in constructing an elegant solution. My students this year will definitely hear: “ ______% of the variation in (y-variable context) can be accounted for by the least-squares regression line using x = (x-variable context).” It may get students to buy in if you share that the rubrics often have expected correct answers that use commonly used sentence frames. Matching student writing to expected correct answers on the rubric can help demystify the grading process for students.
My students are typically great writers, but the best practices listed above will help them refine their ideas without forgetting vital parts.
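To show students how the r^2 sentence frame above works with actual numbers, here is a small Python sketch. The data set, the variable contexts, and the function names are invented for illustration; the frame itself is the one quoted above.

```python
from statistics import fmean

def r_squared(x, y):
    """Compute r^2 for paired data from the sums of squared deviations."""
    x_bar, y_bar = fmean(x), fmean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    syy = sum((yi - y_bar) ** 2 for yi in y)
    return sxy ** 2 / (sxx * syy)

def interpret_r_squared(x, y, x_context, y_context):
    """Fill in the sentence frame with a percentage and both contexts."""
    pct = 100 * r_squared(x, y)
    return (
        f"{pct:.1f}% of the variation in {y_context} can be accounted for "
        f"by the least-squares regression line using x = {x_context}."
    )

# Hypothetical data: hours studied and quiz score for five students
hours = [1, 2, 3, 4, 5]
score = [2, 4, 5, 4, 5]
print(interpret_r_squared(hours, score, "hours studied", "quiz score"))
```

Because the function requires both contexts as arguments, it mirrors the point of the frame: the number alone is not an interpretation until the variables are named.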
Avoid Common Student Errors:
One of the benefits of seeing hundreds of student responses is seeing the variety of mistakes students make. These are a few of the tips I’ll give students this year inspired by what I saw during the reading:
When a summary statistic, the slope of an LSRL, or another statistic is given, do include context but don’t round the given value.
Avoid writing parallel solutions. Students who are not sure which of two competing solution paths to take may try to hedge their bets by writing two ways to get an answer. AP readers are trained to grade the weaker of two parallel solutions.
Show your work (no naked answers). Statistics requires both calculation and communication of the process used. For example, when calculating a z-score, students should write the formula they are using, the formula with numbers substituted into it, or better yet both.
Always include context! This one bears repeating. Of the six written response questions, five explicitly called for context to be included as a component in the rubric. While the Investigative Task did not call for it directly, it did ask students to compare two different outcomes.
The task of helping students figure out what to include and what to avoid in their FRQ answers became many times easier after having participated as an AP reader.
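The "no naked answers" tip about z-scores can be illustrated with a short Python sketch that prints the formula, the substitution, and the result together. The function name and the numbers (a score of 82, a mean of 75, a standard deviation of 5) are hypothetical and are not taken from any exam question.

```python
def z_score_with_work(x, mean, sd):
    """Return the z-score and a string showing the work behind it."""
    if sd <= 0:
        raise ValueError("standard deviation must be positive")
    z = (x - mean) / sd
    work = (
        "z = (x - mean) / sd\n"          # the formula being used
        f"z = ({x} - {mean}) / {sd}\n"   # the formula with numbers substituted
        f"z = {z:.2f}"                   # the final value
    )
    return z, work

# Hypothetical values for illustration
z, work = z_score_with_work(82, 75, 5)
print(work)
```

The three printed lines correspond to what a student would write on paper: formula, substitution, answer. An interpretation in context would still need to follow.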
Which leads me to …
How To Become an AP Exam Reader
Hopefully I’ll get a chance to see you at next year’s reading!