I was driving my ’58 Plymouth Fury on a long trip out of Boulder, Colorado to a strange town in Maine, when I stopped at a hotel along the way for a needed caffeine boost. A man with glasses and a way with words came over and asked, “So, what do you know about percentile rank and the normal curve?” It was strange for a pick-up line, but…
Ok, ok, that’s not what happened. Frankly, I doubt that Stephen King could have taught me these fundamental concepts in a conversation as well as he taught them to me in one of his novels. Here’s what really happened.
I was taking a research methods class, going through a phase in which I transformed from Geek Type E (English Lit Major) to Geek Type S (Social Scientist). The statistics gave me trouble, given my earlier career as a math-avoidant bibliophile. I understood, on a basic level, the concept of the normal curve. I was willing to accept that many kinds of data follow a distribution pattern where there are a few data points at one end, a lot in the middle, and a few at the other end. We’ve all heard of the bell curve.
Percentiles and percentile ranks (definitions differ slightly) also made intuitive sense to me as an overachieving nerd who understood that my SAT score could also be expressed as the percentage of test-takers who scored at or below it.
What I couldn’t get was how these two concepts relate to one another, and the biggest stumbling block was that percentile rank is not an equal-interval score. That means that the difference between the 25th and 30th percentiles is not the same as the difference between the 55th and 60th percentiles, or the 90th and the 95th percentiles. 30-25 = 5. 60-55 = 5. 95-90 = 5. So why is it that when it comes to percentiles, 5 does not mean the same thing as 5? The technical explanation is that percentile ranks are tied to the normal curve: scores bunch up near the middle, so percentile ranks there sit closer together on the underlying scale than percentile ranks out in the tails.
I still didn’t get it.
Enter the Master of Horror. In a serendipitous moment, I picked up a copy of King’s The Long Walk. In the novel, 100 boys living in a contemporary dystopia participate in an event called the Walk. They have to maintain a speed of 4 miles an hour, there are no stops or resting breaks, and failure to keep moving or abide by any of the strict rules of the contest results in immediate death. Soldiers stay with the walkers, ready to shoot at any moment. The contest ends when there is only one walker alive.
As the novel opens, a few boys are shot right away. They have mental or physical problems that immediately take them out of the contest. Most of the others keep on going, until there are big losses around the midpoint. When it gets down to the last five walkers… well, what they go through is unbearable. The tension of wondering who will make it, who can keep lifting his feet and putting them down, who can suppress the psychological terror of it all long enough to keep on going, is vintage King.
It’s also a near-perfect representation of percentile rank and why differences are not equal. I realized that if you plotted how long each boy walked before being shot, you would end up with a normal curve. Since there were 100 boys, each boy’s placement could be equated to percentile rank. Now I could see that 5 does not always equal 5. The difference between walkers who came in at, say, 55th and 60th places was inconsequential. It is easy to imagine them switching ranks because there was very little to distinguish them from one another. But the difference between coming in as the winner and coming in fifth was profound. The boys in the middle were all about the same; the boys who died at the beginning and at the end were very different, both from the group as a whole and from one another. Aha!
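For the Geek Type Ss who would rather see numbers, here is a minimal sketch of the same idea in Python. It assumes, purely for illustration, that walk times follow a normal curve with a mean of 60 hours and a standard deviation of 20 hours; those figures are invented and do not come from the novel. The helper names `hours_at_percentile` and `gap_in_hours` are likewise hypothetical.

```python
from statistics import NormalDist

# Assumption for illustration only: walk times are normally distributed
# with a mean of 60 hours and a standard deviation of 20 hours.
# These numbers are made up; they are not taken from the novel.
walk_hours = NormalDist(mu=60, sigma=20)


def hours_at_percentile(rank):
    """Hours walked by a boy at the given percentile rank (1 to 99)."""
    return walk_hours.inv_cdf(rank / 100)


def gap_in_hours(lower_rank, upper_rank):
    """Difference in hours walked between two percentile ranks."""
    return hours_at_percentile(upper_rank) - hours_at_percentile(lower_rank)


# Each pair is five percentile ranks apart, but the gaps in hours differ.
for lower, upper in [(25, 30), (55, 60), (90, 95)]:
    gap = gap_in_hours(lower, upper)
    print(f"{lower}th to {upper}th percentile: {gap:.1f} hours apart")
```

Under these assumed numbers, the five-rank gap between the 90th and 95th percentiles comes out more than twice as large as the five-rank gap between the 55th and 60th. Change the mean or the standard deviation and the hours change, but the pattern holds: small gaps in the crowded middle, large gaps in the tails.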
If it still doesn’t make sense to you Primary Geek Type Es, read the book and then come back to this post. You’ll see what I mean. Students, take note, however: reading horror novels as a method of studying for your statistics classes is generally not recommended. On the other hand, we could explore probability calculations for encountering scary creatures in dark, wooded areas, or incidence and prevalence rates for vampire infections in the general population…