08/11/2008 05:59 pm ET | Updated May 25, 2011

How We Choose Polls to Plot: Part I

Since adding the maps and upgrading this site, we have received a number of good questions about how the charts and trend lines work and why we choose to include the poll results that we do. I want to answer a few of those questions this week before we all get swept up in the conventions and the final stretch of the fall campaign.

Our approach to charting and aggregating poll data follows the lead and philosophy of our co-founder Charles Franklin. And while I am tempted to describe that approach as well entrenched, the reality is that in many ways it has evolved and will continue to evolve.

Since launching this site nearly two years ago, Franklin and I have continued to discuss (and occasionally debate) some of the technical issues offline. Most of the time we agree, but I tend to propose ways to change or tinker with our approach, and Franklin usually succeeds in convincing me to stay the course.

In considering some of the issues that came up more recently, I thought it might be helpful to take this dialogue online. Hopefully, we can both answer some of the questions readers have asked and also seek further input on those issues we have not completely resolved.

So with that introduction out of the way, here is the first question for Franklin:

Over the last few weeks, in commenting on the "likely voter" subgroups reported by Gallup and other national pollsters, I have essentially recommended that we focus on the more stable population of registered voters (RV) now, and leave the "likely voter" (LV) models for October (see especially here, here, here and here). Yet as many readers have noticed, when national surveys publish numbers for both likely and registered voters, our practice has been to use the "likely voter" numbers for our charts and tables.

Like the other sites that aggregate polling results from different sources, we face the challenge of how to best choose among many polls that are not strictly comparable to each other. Even if we examine data from one pollster at a time, we will still see methodological changes: Many national pollsters will shift at some point from reporting results from registered voters to "likely voters." Some will shift from one likely voter "model" to another, or will tinker with the mechanics of their model, often without providing any explanation or notice of the change. And no two pollsters are exactly alike in terms of either the mechanics they use or the timing of the changes they make.

As such, two principles guide our practices for selecting results for the charts and tables: First, we want to defer to each pollster's judgment about the most appropriate methodology (be it sampling, questionnaire design or the method used to select the probable electorate). Second, we want a simple, objective set of rules to follow in deciding which numbers to plot on the chart.

In that spirit, when pollsters release results for more than one population of potential voters, our rule is to use the most restrictive. So we give preference to results among "likely" voters over registered voters and to registered voters over results among all adults. In almost all cases, the rule is consistent with the underlying philosophy: The numbers for the more restrictive populations are usually the ones that the pollsters themselves (or their media clients) choose to emphasize.
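The selection rule above can be sketched as a simple preference ordering. This is an illustrative sketch only, not Pollster.com's actual code; the function and variable names are hypothetical, and the ABC News/Washington Post figures are taken from the example discussed below.

```python
# Sketch of the rule described above: when a pollster releases results
# for more than one population, plot the most restrictive one
# (likely voters over registered voters, registered voters over adults).

# Populations ordered from most restrictive to least restrictive.
PREFERENCE = ["likely_voters", "registered_voters", "adults"]

def result_to_plot(released):
    """Return (population, result) for the most restrictive
    population present in a pollster's release.

    `released` maps a population name to that population's result.
    """
    for population in PREFERENCE:
        if population in released:
            return population, released[population]
    raise ValueError("no recognized voter population in release")

# The ABC News/Washington Post release reported both populations:
abc_wapo = {
    "registered_voters": {"Obama": 50, "McCain": 42},
    "likely_voters": {"Obama": 49, "McCain": 46},
}

population, result = result_to_plot(abc_wapo)
# Following the rule, the likely voter numbers go on the chart,
# even though the sponsors emphasized the registered voter numbers.
```

The point of the sketch is that the rule is mechanical: it never consults which numbers the pollster or its media clients chose to headline, which is exactly the tension the ABC News/Washington Post example exposes.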

But there have been some notable exceptions recently, of which last month's ABC News/Washington Post poll provided the most glaring example. ABC News put out a report and filled-in questionnaire with two sets of results: They showed Barack Obama leading John McCain by eight points (50% to 42%) among registered voters, but by only three points (49% to 46%) among likely voters. Following our standard procedure, we included the likely voter numbers in our chart.

However, ABC News emphasized the eight-point registered voter numbers in the headline of their online story ("Obama Leads McCain by Eight But Doubts Loom"). Within the text, they first reported the registered voter numbers and then used the likely voter results to argue that "turnout makes a difference." The 8-point lead also made the headline of the Washington Post story, but they did not report the likely voter results at all, either in the text of the story or in their version of the filled-in questionnaire.

So in this case, the news organizations that sponsored the poll clearly indicated that the RV numbers deserved greater emphasis, yet we followed our rule and included the LV numbers in our charts.

Charles, in cases like these, should we find a way to make an exception? And why not just report on "registered" voters until after the conventions?

Update: Franklin answers in Part II.