06/18/2010 10:27 am ET | Updated May 25, 2011

Murray: Are Nate Silver's Pollster Ratings 'Done Right'?

Patrick Murray is director of the Monmouth University Polling Institute

The motto of Nate Silver's website,, is "Politics Done Right." Questions have been raised whether his latest round of pollster ratings lives up to that claim.

After Mark Blumenthal noted errors and omissions in the data used to arrive at Research 2000's rating, I asked to examine Monmouth University's poll data. I found a number of errors in the 17 poll entries he attributes to us - including six polls that were actually conducted by another pollster before our partnership with the Gannett New Jersey newspapers started, one eligible poll that was omitted, one incorrect candidate margin, and even two incorrect election results that affected the error scores of four polls. [Nate emailed that he will correct these errors in his update later this summer.]

In the case of prolific pollsters, like Research 2000, these errors may not have a major impact on the ratings. But just one or two database errors could significantly affect the ratings of pollsters with relatively limited track records - such as the 157 (out of 262) organizations with fewer than 5 polls to their credit. Some observers have called on Nate to demonstrate transparency in his own methods by releasing that database. Nate has refused to do this (with a somewhat dubious justification), but at least he now has a process for pollsters to verify their own data.

Basic errors in the database are certainly a problem, but the issue that has really generated buzz in the polling community is his new "transparency bonus." This is based on the premise that pollsters who were members of the National Council on Public Polls or had committed to the American Association for Public Opinion Research (AAPOR) Transparency Initiative as of June 1, 2010 exhibit superior polling performance. These pollsters are awarded a very sizable "transparency bonus" in the latest ratings.

Others have remarked on the apparent arbitrariness of this "transparency bonus" cutoff date. Many, if not most, pollsters who signed onto the initiative by June 1, 2010 were either involved in the planning or attended the AAPOR national conference in May. A general call to support the initiative did not go out until June 7.

Nate claims that, regardless of how a pollster made it onto the list, these pollsters are simply better at election forecasting, and he provides the results of a regression analysis as evidence. The problem is that the transparency score misses most researchers' threshold for being significant (p<.05). In="" fact,="" of="" the="" three="" variables="" in="" his="" equation="" -="" transparent,="" partisan,="" and="" Internet="" polls="" only="" partisan="" polling="" shows="" a="" significant="" relationship.="" Yet,="" Pollster="" Introduced="" Error="" (PIE)="" calculation="" awards="" "transparent"="" penalizes="" polls,="" but="" leaves="" untouched.="" Moreover,="" model="" explains="" 3%="" total="" variance="" pollster="" raw="" scores="" (i.e.="" error).="">

I decided to run some ANOVA tests on the effect of the transparency variable on pollster raw scores for the full list of pollsters as well as sub-groups at various levels of polling output (e.g. pollsters with more than 10 polls, pollsters with only 1 or 2 polls, etc.). The F values for these tests range from only 1.2 to 3.6 under each condition, and none are significant at p<.05. In="" other="" words,="" there="" may="" be="" more="" that="" separates="" pollsters="" ="">

I also ran a simple means analysis. The average error among all pollsters is +.54 (positive error is bad, negative is good). Among "transparent" pollsters, the average score is -.63 (se=.23), while among other pollsters it is +.68 (se=.28). A potential difference, to be sure.

I then isolated the more prolific pollsters - the 63 organizations with at least 10 polls. Among this group, the 19 "transparent" pollsters have an average error score of -.32 (se=.23) and the other 44 pollsters average +.03 (se=.17). The difference is now less stark.

On the flip side, organizations with fewer than 10 polls to their credit have an average error score of -1.38 (se=.73) if they are "transparent" - all 8 of them - and a mean of +.83 (se=.28) if they are not. That's a much larger difference. Could it be that the real contributing factor to pollster performance is the number of polls conducted over time?

Consider that 70% of "transparent" pollsters on Nate's list have 10 or more polls to their credit, but only 19% of the "non-transparent" organizations have been equally as prolific. In effect, "non-transparent" pollsters are penalized for being affiliated with a large number of colleagues who have only a handful of polls to their name - i.e. pollsters who are prone to greater error.

To assess the tangible effect of the transparency bonus (or non-transparency penalty) on pollster ratings, I re-ran Nate's PIE calculation using a level playing field for all 262 pollsters on the list to rank order them. [I set the group mean error to +.50, which is approximately the mean error among all pollsters.] Comparing the relative pollster ranking between his and my lists produced some intriguing results. The vast majority of pollster ranks (175) did not change by more than 10 spots on the table. On its face, this first finding raises questions about the meaningfulness of the transparency bonus.

Another 67 pollsters moved between 11 to 40 ranks between the two lists, 11 shifted by 41 to 100 spots, and 9 pollsters gained more than 100 spots in the rankings, solely due to the transparency bonus. Of this last group, only 2 of the 9 had more than 15 polls recorded in the database. This raises the question of whether these pollsters are being judged on their own merits or riding others' coattails, as it were.

Nate says that the main purpose of his project is not to rate pollsters' past performance but to determine probable accuracy going forward. The complexity of his approach boggles the mind - his methodology statement contains about 4,800 words including 18 footnotes. It's all a bit dazzling, but in reality it seems like he's making three left turns to go right.

Other poll aggregators use less elaborate methods - including straightforward means - and have been just as, or even more, accurate with their election models (see here and here). I wonder if, with the addition of this transparency score, Nate has taken one left turn too many.