What Olympic Scoring Can Teach You About Polling

Have you noticed that Hillary Clinton's lead in the Real Clear Politics average of averages has declined from 7.6 points on August 9th to 6.4 points on August 11th?

This might seem puzzling, given that almost every national and state poll has shown Clinton increasing her lead. Does it indicate a softening of support, a tightening of the race, or some really bad political event for the Clinton campaign on August 10th? In fact, what you are seeing can be illustrated with an example from the Olympics (borrowed from my book Everydata).

Could an Olympic athlete lose out on a medal simply because of the way her scores are calculated?

Yes - and here's how.

In diving, gymnastics, and other sports, an athlete's score may be calculated by taking all of the judges' scores for an event, dismissing the highest and lowest scores, and then calculating the average.

This tactic--known as mean trimming (the "mean" is what most people think of when you say "average")--can help avoid having a judge's bias or personal preference affect the outcome. Mean trimming is one way in which people try to handle outliers--by simply eliminating them.
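To make the idea concrete, here is a minimal Python sketch of mean trimming. The two divers and their scores are invented for illustration (they are not from any real competition), and the function names are my own. The point is simply that the same set of scores can produce a different ranking depending on whether you trim.

```python
def plain_mean(scores):
    """Ordinary average of all the judges' scores."""
    return sum(scores) / len(scores)


def trimmed_mean(scores):
    """Drop the single highest and single lowest score, then average the rest."""
    if len(scores) < 3:
        raise ValueError("need at least three scores to trim one from each end")
    kept = sorted(scores)[1:-1]
    return sum(kept) / len(kept)


# Hypothetical scores from seven judges (made up for this example).
diver_a = [8.0, 8.0, 8.0, 8.0, 8.0, 8.0, 10.0]  # one very high score
diver_b = [8.0, 8.0, 8.5, 8.0, 8.5, 8.0, 8.0]   # no extreme scores

print("plain:  ", plain_mean(diver_a), plain_mean(diver_b))
print("trimmed:", trimmed_mean(diver_a), trimmed_mean(diver_b))
```

With these made-up numbers, diver A comes out ahead on the plain average, but diver B comes out ahead once the top and bottom scores are dropped.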

But it's possible that mean trimming could have affected the medal standings in at least one event (the women's 10-meter platform), according to Nationalistic Judging Bias in the 2000 Olympic Diving Competition, a paper by John W. Emerson and Silas Meredith that looked at the diving scores from the 2000 Olympics.

Does mean trimming--this specific method of dealing with potential outliers--work? Ask yourself:

  • What would happen if more than one judge were biased in favor of an athlete? The Olympic system--as it's commonly used--only eliminates the one highest and one lowest value.
  • Is it fair that mean trimming treats the highest and lowest values as if they're outliers regardless of whether they truly are or not?
  • Is a high or low score--whether it's an outlier or not--actually a sign of bias?
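The first question is easy to try out. In this rough sketch (the scores are hypothetical, and trimmed_mean is just an illustrative helper name), two of seven judges inflate their scores. Trimming removes only one of the two, so the other still pulls the result up.

```python
def trimmed_mean(scores):
    """Drop the single highest and single lowest score, then average the rest."""
    kept = sorted(scores)[1:-1]
    return sum(kept) / len(kept)


# Hypothetical panel: five judges agree on 7.5, two inflate to 9.5.
fair_panel = [7.5, 7.5, 7.5, 7.5, 7.5, 7.5, 7.5]
two_biased = [7.5, 7.5, 7.5, 7.5, 7.5, 9.5, 9.5]

print(trimmed_mean(fair_panel))  # every score is 7.5, so trimming changes nothing
print(trimmed_mean(two_biased))  # one 9.5 survives the trim and lifts the average
```

In this invented case the trimmed score still ends up above 7.5, even though the highest score was thrown out.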

This last question is one of the most interesting to explore. Yes, nationalistic bias may exist--Emerson and Meredith found that "most judges gave some type of nationalistic bump to their countrymen without giving a similar bump to non-countrymen." But consider the Chinese diving judge. His average score for Chinese divers at the 2000 Olympics was 1.48 points higher than his average score for non-Chinese divers. Seems like a bias, right? But when the researchers analyzed the data, they actually found that he was "apparently the least biased judge" based on his scores.

How is this possible?

Because the Chinese judge, it turns out, scored both Chinese and non-Chinese divers higher than the other judges did, on average. And the Chinese divers were really good; in fact, their average scores were 1.44 points higher than those of non-Chinese divers. So, when the researchers looked at the relative amount by which this judge's scores were higher for Chinese divers than for all other divers, it was actually smaller than the amount by which other judges elevated the scores of divers from their home countries relative to all others. Does it make sense, then, to discard his scores in this scenario?
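Here is a back-of-the-envelope way to see it, using only the two figures quoted above. This is my own simplification, not the method Emerson and Meredith used: it assumes the 1.44-point gap in the divers' average scores is roughly what an unbiased judge would have shown.

```python
# Figures quoted above from Emerson and Meredith's paper.
judge_gap = 1.48  # Chinese judge: average for Chinese divers minus average for others
skill_gap = 1.44  # Chinese divers' average score minus non-Chinese divers' average

# Assumption (mine, not the paper's): skill_gap is what an unbiased judge would show,
# so whatever is left over is a rough stand-in for a nationalistic "bump".
excess = round(judge_gap - skill_gap, 2)
print(excess)
```

Under that assumption, almost all of the 1.48-point gap is accounted for by how well the Chinese divers actually dove.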

Now, back to our Hillary Clinton polling example. What changed in the RCP average of averages between August 9th and August 11th? It turns out that a McClatchy/Marist poll taken in early August had Hillary Clinton up by 14 points. The average of averages drops polls after a week or two, and that poll dropped out of the average. Just as with mean trimming, when you lose an outlier, the average can change--but it doesn't necessarily mean the race is tightening.
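A small sketch shows the mechanics. Everything here is an assumption for illustration: the polls, their dates, and the seven-day window are invented (they are not the polls RCP actually averaged, and I don't know RCP's exact rules), and the numbers were chosen so the arithmetic echoes the headline figures.

```python
from datetime import date, timedelta
from statistics import mean

# Hypothetical polls: (label, last day in the field, Clinton lead in points).
POLLS = [
    ("Poll A (outlier)", date(2016, 8, 3), 14),
    ("Poll B", date(2016, 8, 5), 6),
    ("Poll C", date(2016, 8, 6), 7),
    ("Poll D", date(2016, 8, 7), 5),
    ("Poll E", date(2016, 8, 8), 6),
    ("Poll F", date(2016, 8, 10), 8),
]


def average_lead(polls, as_of, window_days=7):
    """Average the leads of polls that ended in the window_days up to as_of."""
    cutoff = as_of - timedelta(days=window_days)
    leads = [lead for _, ended, lead in polls if cutoff < ended <= as_of]
    return mean(leads)


print(average_lead(POLLS, date(2016, 8, 9)))   # window still includes Poll A
print(average_lead(POLLS, date(2016, 8, 11)))  # Poll A has aged out, Poll F is in
```

In this made-up example the average falls from 7.6 to 6.4 even though the newest poll (+8) is better for Clinton than any of the other non-outlier polls. The drop comes entirely from the +14 poll leaving the window.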

If you want a gold medal in being a better consumer of polls, keep this in mind as you're considering changes in the average of averages.

Today's blog post includes excerpts from my book Everydata and was co-authored by Mike Gluck.

Excerpt from: John H. Johnson, PhD, and Mike Gluck. "Everydata: The Misinformation Hidden in the Little Data You Consume Every Day."