How We Choose Polls to Plot: Part II

Mark started this conversation with "Why we choose polls to plot: Part I," asking how we decide to handle likely voter (LV) vs. registered voter (RV) vs. adult (A) samples in our horse race estimates. The question was driven home by the Washington Post/ABC poll, which reported quite different results for its A, RV and LV subsamples, but it is a good problem in general. So let's review the bidding.
The first rule for Pollster is that we don't cherry pick. We make every effort to include every poll, even when it hurts. So even when we see a poll way out of line with other polls and with what we "know" has to be true, we keep that poll in our data and in our trend estimates. There are two reasons. First, once you start cherry picking, you never know when to stop. Second, we designed our trend estimator to be quite resistant to the effect of any one poll (though when there are few polls this can't always hold). That rule has served us well. Whatever else may be wrong with Pollster, we are never guilty of including just the polls (or pollsters) we like.
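To make "resistant to any one poll" concrete, here is a minimal sketch, not Pollster's actual estimator, using locally weighted regression with robustifying iterations; these downweight points that sit far from the local fit, so one wild poll has limited pull on the curve. All data are invented for illustration.

```python
# A minimal sketch (NOT Pollster's actual estimator) of an outlier-resistant
# trend line. Lowess with robustifying iterations (it > 0) downweights points
# far from the local fit, so a single wild poll has limited pull on the curve.
import numpy as np
from statsmodels.nonparametric.smoothers_lowess import lowess

rng = np.random.default_rng(0)
days = np.sort(rng.uniform(0, 120, 80))            # hypothetical field dates
margin = 4 + 0.02 * days + rng.normal(0, 1.5, 80)  # hypothetical vote margins
margin[40] += 12.0                                 # one poll way out of line

robust = lowess(margin, days, frac=0.3, it=3)  # with robustifying passes
naive = lowess(margin, days, frac=0.3, it=0)   # ordinary local regression

# The robust fit barely moves near the outlier; the naive fit chases it.
i = int(np.argmin(np.abs(robust[:, 0] - days[40])))
print(f"robust trend near the outlier: {robust[i, 1]:.2f}")
print(f"naive trend near the outlier:  {naive[i, 1]:.2f}")
```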
But what do we do when one poll gives more than one answer? The ABC/WP poll is a great example, with results for all three subgroups: adults, registered voters and likely voters. Which should we use? And how do we choose in a way that remains consistent with our prime directive: never cherry pick?
Part of the answer is to have a rule for inclusion and stick to it stubbornly. (I hear Mark sighing that you can do too much of this stubborn thing.) But again the ABC/WP example is a good one. Their RV result was more in line with other recent polls, while their LV result showed the race a good deal closer. If we didn't have a firm, fixed rule, we'd be sorely tempted to take the result that was "right" because it agreed with other data. This would build a bias into our data that would underestimate the actual variation in polling, because we'd systematically pick results closer to other polls. Even worse would be picking the number that was "right" because it agreed with our personal political preferences. But that problem doesn't arise so long as we have a fixed rule for which populations to include in cases of multiple results. Which is what we have.
That rule for election horse races is "take the sample that is most likely to vote," as determined by the pollster that conducted the survey. If the pollster was content to survey just adults, then so be it. That was their call. If they were content with registered voters, again, use that. But if they offer more than one result, use the one intended to best represent the electorate. That is the likely voter sample, when available.
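In code, the rule is just a fixed preference order. A minimal sketch (the population labels and the example numbers are my own invention, not Pollster's actual data schema):

```python
# A minimal sketch of the inclusion rule: from whatever populations a poll
# reports, take the one most likely to vote. The labels "LV", "RV", "A"
# and the example numbers are shorthand for illustration only.
PREFERENCE = ("LV", "RV", "A")  # likely > registered > all adults

def pick_result(poll_results):
    """poll_results maps a population label to that subsample's estimate;
    return the single (population, estimate) pair to plot."""
    for pop in PREFERENCE:
        if pop in poll_results:
            return pop, poll_results[pop]
    raise ValueError("poll reports no recognized population")

# A poll reporting all three populations yields the LV number;
# a poll reporting only RV yields the RV number.
print(pick_result({"A": 51, "RV": 49, "LV": 47}))  # ('LV', 47)
print(pick_result({"RV": 50}))                     # ('RV', 50)
```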
We know there are a variety of problems with likely voter screens: evidence that who is a likely voter can change over the campaign, and the problem of new voters. But the pollster "solves" these problems to the best of their professional judgment when they design the sample and when they calculate results. If a pollster doesn't "believe" their LV results, then it is a strange professional judgment to report them anyway. If they think that RV results "better" represent the electorate than their LV results, they need to reconsider why they are defining LV as they do. Our decision rule says "trust the pollster" to make the best call their professional skills can make. It might not be the one we would make, but that's why the pollster is getting the big bucks. And our rule puts responsibility squarely on the pollster's shoulders as well, which is where it should be. (By the way, calling the pollster and asking which result they think is best is both impractical for every poll, AND suffers the same problems we would introduce if we chose which results to use.)
But still, doesn't this ignore data? Yes, it does. Back in the old days, I included multiple results from any poll that reported more than one vote estimate. If a pollster gave adult, RV and LV results, then that poll appeared three times in the data, once for each population. But as I worked with these data, I decided that was a mistake. First, it was confusing because there would be multiple results for a poll: three dots instead of one in the graph. It would also give more influence to pollsters who reported more than one population than to pollsters who reported only LV or RV. Finally, not that many polls report more than one number. Yes, some pollsters sometimes do, but the vast majority decide which population to represent and then report that result. End of story. So by trying to include multiple populations from a single poll, we were letting a small minority of cases create considerable confusion for little gain.
The one gain that IS possible is comparing, within a single survey, the effect of likelihood of voting on the estimate. The ABC/WP poll is a very positive example of this. By giving us all three results, they let us see the effect of their turnout model on the vote estimate. Those who report only LV results hide from us what the consequences might be of making the LV screen a bit looser or a bit tighter. So despite our decision rule, I applaud the Post/ABC folks for providing more data. That can never be bad. But so few pollsters do it that we can't exploit such comparisons in our trend data. There just aren't enough cases.
What would be ideal is to compare adult, RV and LV subsamples from every pollster, then gauge the effect of each screen on the vote. But since few do this, we end up comparing LV samples from one pollster with RV samples from another and adult samples from others. That gives us some idea of the effect of sample selection, but it also confounds differences between survey organizations with differences in the likely voter screens. Still, it is the best we can do with the data we have.
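If enough pollsters did report multiple populations, the two effects could in principle be separated with a simple fixed-effects regression. A hedged sketch, with a made-up data frame and generic pollster names, not Pollster's actual files:

```python
# A sketch, with invented data, of separating house effects from population
# effects. In practice most pollsters report only one population, so
# C(pollster) and C(population) become nearly collinear: exactly the
# confounding described above.
import pandas as pd
import statsmodels.formula.api as smf

df = pd.DataFrame({
    "margin":     [4.0, 2.5, 6.0, 3.0, 5.5, 1.5],  # hypothetical margins
    "pollster":   ["P1", "P1", "P2", "P3", "P2", "P3"],
    "population": ["RV", "LV", "RV", "LV", "A",  "LV"],
})
fit = smf.ols("margin ~ C(pollster) + C(population)", data=df).fit()
print(fit.params)  # population coefficients net of pollster (house) effects
```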
So let's take a look at what difference the sample makes. The chart below shows the trend estimate using all the polls, and the LV, RV and adult samples separately. We currently have 109 LV samples, 136 RV and 37 adult. There are some visible differences. The RV (blue) trend is generally more favorable to Obama than the LV (red) trend, though they mostly agreed in June and July. But the differences are not large: all three sub-population trend estimates fall within the 68% confidence interval around the overall trend estimate (gray line).

There is good reason to think that likely voters are usually a bit more Republican than registered voter or adult samples. The data are consistent with that, amounting to differences that are large enough to notice, if not to statistically distinguish with confidence. Perhaps more useful is to notice the scatter of points and how blue and red points intermingle. While there are some differences on average, the spread of both RV and LV samples (and adult) is pretty large. The differences in samples make detectable differences, but the points do not belong to different regions of the plot. They largely overlap, and we shouldn't exaggerate their differences.
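For readers who want to build this kind of comparison themselves, here is a hedged sketch with simulated data; the gray band is a crude residual-based stand-in for the 68% interval, not Pollster's actual method.

```python
# A rough, simulated stand-in for the chart described above: one lowess
# trend per population plus the overall trend with a one-standard-deviation
# residual band (a crude proxy for a 68% interval, not Pollster's method).
import numpy as np
import matplotlib.pyplot as plt
from statsmodels.nonparametric.smoothers_lowess import lowess

rng = np.random.default_rng(1)
n = 282  # roughly the 109 LV + 136 RV + 37 adult samples mentioned above
days = np.sort(rng.uniform(0, 150, n))
pop = rng.choice(["LV", "RV", "A"], size=n, p=[109/282, 136/282, 37/282])
margin = 3 + 0.02 * days - (pop == "LV") + rng.normal(0, 2.0, n)

overall = lowess(margin, days, frac=0.3)
band = np.std(margin - np.interp(days, overall[:, 0], overall[:, 1]))
plt.fill_between(overall[:, 0], overall[:, 1] - band, overall[:, 1] + band,
                 color="gray", alpha=0.3, label="overall trend, rough band")

for label, color in [("LV", "red"), ("RV", "blue"), ("A", "green")]:
    m = pop == label
    fit = lowess(margin[m], days[m], frac=0.4)
    plt.plot(fit[:, 0], fit[:, 1], color=color, label=label)
    plt.scatter(days[m], margin[m], s=8, color=color, alpha=0.4)

plt.xlabel("days into campaign"); plt.ylabel("margin"); plt.legend()
plt.show()
```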
There is a valid empirical question still open: do LV samples predict election outcomes more accurately than RV samples? And when in the election cycle does that benefit kick in, if ever? That is a good question that research might answer, and the answer might lead me to change my decision rule for which results to include. But if RV samples should outperform LV samples, then the polling community has a lot of explaining to do about why they use LV samples at all. Until LV samples are proven worse than RV (or adult) samples, I'll stick to the fixed, firm, stubbornly-clung-to rule we have. And if we should ever change, I'll want to stick stubbornly to that one, too. The worst thing we could do is to make up our minds every day about which results to include based on which results we "like."

[Update: In Part III of this thread, Mark Blumenthal responds to some of the comments below and poses a new question.]
