In Part II of this series on how to answer the question, "can I trust this poll," I argued that we need better ways to assess "likely voter" samples: What kinds of voters do pollsters select and how do they choose or model the likely voter population? Regular readers will recall how hard it can be to convince pollsters to disclose methodological details. In this final installment, I want to review the past efforts and propose an idea to promote more complete disclosure in the future.
First, let's review the efforts to gather details of pollster methods carried out over the last two years by this site, the American Association for Public Opinion Research (AAPOR) and the Huffington Post.
- Pollster.com - In September 2007, I made a series of requests of pollsters that had released surveys of likely caucus goers in Iowa. I asked for information about their likely voter selection methods and for estimates of the percentage of adults represented by their surveys. A month later, seven pollsters -- including all but one of the active AAPOR members -- had responded fully to my requests, five provided partial responses and five answered none of my questions. I had originally planned to make similar requests regarding polls for the New Hampshire and South Carolina primaries, but the responses trickled in so slowly and required so much individual follow-up that I limited the project to Iowa (I reported on the substance of their responses here).
- AAPOR - In the wake of the New Hampshire primary polling snafu, AAPOR appointed an Ad Hoc Committee to investigate the performance of primary polls in New Hampshire and, ultimately, in three other states: South Carolina, California and Wisconsin. They made an extensive request of pollsters, asking not only for things the AAPOR code requires pollsters to disclose but also for more complete information, including individual-level data for all respondents. Despite allowing pollsters over a year to respond, only 7 of 21 provided information beyond minimal disclosure, and despite the implicit threat of AAPOR censure, three organizations failed to respond with even the minimal information mandated by AAPOR's ethical code (see the complete report).
- HuffingtonPost - Starting in August 2008, as part of their "Huffpollstrology" feature, the Huffington Post asked a dozen different public pollsters to provide response and refusal rates for their national polls. Six replied with response and refusal rates, two responded with limited calling statistics that did not allow for response rate calculations and four refused to respond (more on Huffpollstrology's findings here).
The disclosure requirements in the ethical codes of survey organizations like AAPOR and the National Council on Public Polls (NCPP) gained critical mass in the late 1960s. George Gallup, the founder of the Gallup Organization, was a leader in this effort, according to Albert Golin's chapter in a published history of AAPOR (The Meeting Place). In 1967, Gallup proposed creating what would ultimately become NCPP:
The disclosure standards [Gallup] was proposing were meant to govern "polling organizations whose findings regularly appear in print and on the air....also [those] that make private or public surveys for candidates and whose findings are released to the public." It was clear from his prospectus that the prestige of membership (with all that it implied for credentialing) was thought to be sufficient to recruit public polling agencies, while the threat of punitive sanctions (ranging from a reprimand to expulsion) would reinforce their adherence to disclosure standards [p. 185].
Golin adds that Gallup's efforts were aimed at a small number of "black hat" pollsters in hopes of "draw[ing] them into a group that could then exert peer influence over their activities." Ultimately, this vision evolved into AAPOR's Standards for Minimal Disclosure and NCPP's Principles of Disclosure.
Unfortunately, as the experiences of the last year attest, larger forces have eroded the ability of groups like AAPOR and NCPP to exert "peer pressure" on the field. A new breed of pollsters has emerged that cares little about the "prestige of membership" in these groups. Last year, nearly half the surveys we reported at Pollster.com had no sponsor other than the businesses that conducted them. These companies either disseminate polling results for their market value, make their money by selling subscription access to their data, or both. They know that the demand for new horse race results will drive traffic to their websites and expose their brand on cable television news networks. As such, they see little benefit to a seal of approval from NCPP or AAPOR and no need for exposure in more traditional, mainstream media outlets to disseminate their results.
The recent comments of Tom Jensen, the communications director at Public Policy Polling (PPP), are instructive:
Perhaps 10 or 20 years ago it would have been a real problem for PPP if our numbers didn't get run in the Washington Post but the fact of the matter is people who want to know what the polls are saying are finding out just fine. Every time we've put out a Virginia primary poll we've had three or four days worth of explosion in traffic to both our blog and our main website.
So when pressured by AAPOR, many of these companies feel no need to comply (although I should note for the record that PPP responded to my Iowa queries last year and responded to the AAPOR Ad Hoc Committee request for minimal disclosure, but no more). The process of "punitive sanctions" moves too slowly and draws too little attention to motivate compliance among non-AAPOR members. Although the AAPOR Ad Hoc Committee made its requests in March 2008, its Standards Committee is still processing the "standards case" against those who refused to comply. In February, AAPOR issued a formal censure, its first in more than ten years, of a Johns Hopkins researcher for his failure to disclose methodological details. If you can find a single reference to it in the Memeorandum news compilation for the two days following the AAPOR announcement, your eyes are better than mine.
Meanwhile, the peer pressure that Gallup envisioned continues to work on responsible AAPOR and NCPP members, leaving them feeling unfairly singled out and exposed to attack by partisans and competitors. I got an earful of this sentiment a few weeks ago from Keating Holland, the polling director at CNN, as we were both participating in a panel discussion hosted by the DC AAPOR chapter. "Disclosure sounds like a great idea in the confines of a group full of AAPOR people," he said, "but it has real world consequences, extreme real world consequences . . . as a general principle, disclosure is a stick you are handing to your enemies and allowing them to beat you over the head with it."
So what do we do? I have an idea, and it's about scoring the quality of pollster disclosure.
To explain what I mean, let's start with the disclosure information that both AAPOR and NCPP consider mandatory -- the information that their codes say should be disclosed in all public reports. While the two standards are not identical, they largely agree on these elements (only AAPOR considers the release of response rates mandatory, while NCPP says pollsters should provide response rate information on request):
- Who sponsored/conducted the survey?
- Dates of interviewing
- Sampling method (e.g. RDD, List, Internet)
- Population (e.g. adults, registered voters, likely voters)
- Sample size
- Size and description of the subsample, if the survey report relies primarily on less than the total sample
- Margin of sampling error
- Survey mode (e.g. live interviewer, automated, internet, via cell phone?)
- Complete wording and ordering of questions mentioned in or upon which the release is based
- Percentage results of all questions reported
- [AAPOR only] The AAPOR response rate or a sample disposition report
NCPP goes farther and spells out a second level of disclosure -- information pertaining to publicly released results that its members should provide on written request:
- Estimated coverage of target population
- Respondent selection procedure (for example, within household), if any
- Maximum number of attempts to reach respondent
- Exact wording of introduction (any words preceding the first question)
- Complete wording of questions (per Level I disclosure) in any foreign languages in which the survey was conducted
- Weighted and unweighted size of any subgroup cited in the report
- Minimum number of completed questions to qualify a completed interview
- Whether interviewers were paid or unpaid (if live interviewer survey mode)
- Details of any incentives or compensation provided for respondent participation
- Description of weighting procedures (if any) used to generalize data to the full population
- Sample dispositions adequate to compute contact, cooperation and response rates
They also have a third level of disclosure that "strongly encourages" members to "release raw datasets" for publicly released results and "post complete wording, ordering and percentage results of all publicly released survey questions to a publicly available web site for a minimum of two weeks."
The relatively limited nature of the mandatory disclosure items made sense given the print and broadcast media into which public polls were disseminated when these standards were created. But now, as Pollster reader Jan Werner points out via email, things are different:
When I argued in past decades for fuller disclosure, the response was always that broadcast time or print space were limited resources and too valuable to waste on details that were only of interest to a few specialists. The Internet has now removed whatever validity that excuse may once have had, but we still don't get much real information about polls conducted by the news media, including response rates.
So here is my idea: We make a list of all the elements above, adding the likely voter information I described in Part II. We gather and record whatever methodological information pollsters choose to publicly release into our database for every public poll that Pollster.com collects. We then use the disclosed data to score the quality of disclosure of every public survey release. Aggregation of these scores would allow us to rate the quality of disclosure for each organization and publish the scores alongside polling results.
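As a rough illustration only, the scoring and aggregation steps might be sketched in Python like this. Everything here is an assumption made for the example: the item names are shorthand for a handful of the elements listed above, every item counts equally, and the pollster names and function names are hypothetical. A real index would need the careful design discussed further on.

```python
from collections import defaultdict

# Hypothetical shorthand for a few of the disclosure elements listed above.
ITEMS = [
    "sponsor",
    "dates",
    "sampling_method",
    "population",
    "sample_size",
    "margin_of_error",
    "mode",
    "question_wording",
    "response_rate",
    "likely_voter_method",
]

def disclosure_score(disclosed):
    """Fraction of ITEMS that a poll release publicly discloses (0.0 to 1.0)."""
    found = set(disclosed) & set(ITEMS)
    return len(found) / len(ITEMS)

def pollster_scores(polls):
    """Average the per-poll scores for each polling organization."""
    by_pollster = defaultdict(list)
    for pollster, disclosed in polls:
        by_pollster[pollster].append(disclosure_score(disclosed))
    return {name: sum(s) / len(s) for name, s in by_pollster.items()}

# Made-up example records, not real polls.
polls = [
    ("Pollster A", ["sponsor", "dates", "population", "sample_size",
                    "margin_of_error"]),
    ("Pollster A", ["sponsor", "dates", "population", "sample_size",
                    "margin_of_error", "mode", "question_wording"]),
    ("Pollster B", ITEMS),
]
scores = pollster_scores(polls)
```

With these made-up records, the first organization discloses 5 and then 7 of the 10 items, for an average of 0.6, while the second discloses everything and scores 1.0.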
Now imagine what could happen if we made the disclosure scores freely available to other web sites, especially the popular poll aggregators like RealClearPolitics, Fivethirtyeight and the Polling Report. What if all of these sites routinely reported disclosure quality scores with polling results the way they do the margin of error? If that happened, it could create a set of incentives for pollsters to improve the quality of their disclosure in a way that enhances their reputations rather than making them feel as if they are handing a club to their enemies.
Imagine what might happen if we could create a database available for free to anyone for non-commercial purposes (via Creative Commons license) of not just poll results, sample sizes and survey dates, but also a truly rich set of methodological data appended to each survey. We might help create the tools that would allow pollsters to refine their best practices and the next wave of ordinary number crunchers to find ways to decide which polls are worthy of our trust.
The upside is that this system would not require badgering of pollsters or a reliance on a slow and limited process of "punitive sanctions." It would also not place undue emphasis on any one element of disclosure (as the "Huffpollstrology" feature does with response rates). We would record whatever is in the public domain, and if pollsters want to improve their scores, they can choose what new information to release. If a particular element is especially burdensome, they can skip it.
The principal downside is that turning this idea into a reality requires considerable work and far more resources than I have at my disposal. We would need to expand both our database and our capacity to gather and enter data. In other words, we would need to secure funding, most likely from a foundation, to make this idea a reality.
The scoring procedure would have to be thought out very carefully, since different types of polls may require different kinds of disclosure. We would need to structure and weight the index so that different categories of poll get scored fairly. I am certain that to succeed, any such project would need considerable input from pollsters and research academics. The index and scoring would also need to be utterly transparent. We would want to set up a page or data feed so that anyone on the Internet could see the disclosed information for any poll, to evaluate how any survey was scored.
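One possible way to treat different categories of poll fairly is to score each release only against the items that apply to it. The Python sketch below shows that idea under loud assumptions: the weights, the item names and the applicability rules are all hypothetical placeholders, not a proposed standard.

```python
# Hypothetical weights; a real index would set these with outside input.
WEIGHTS = {
    "question_wording": 2.0,
    "sample_size": 1.0,
    "response_rate": 1.0,
    "interviewer_paid": 1.0,     # only meaningful for live-interviewer polls
    "likely_voter_method": 2.0,  # only meaningful for likely voter samples
}

def applicable_items(mode, population):
    """Return the items a poll of this mode and population can be scored on."""
    items = set(WEIGHTS)
    if mode != "live":
        items.discard("interviewer_paid")
    if population != "likely voters":
        items.discard("likely_voter_method")
    return items

def weighted_score(disclosed, mode, population):
    """Weighted share of applicable items that were disclosed (0.0 to 1.0)."""
    items = applicable_items(mode, population)
    possible = sum(WEIGHTS[i] for i in items)
    earned = sum(WEIGHTS[i] for i in items if i in disclosed)
    return earned / possible

# Made-up example: the same two disclosed items, scored for two poll types.
disclosed = {"question_wording", "sample_size"}
automated = weighted_score(disclosed, "automated", "adults")
live_lv = weighted_score(disclosed, "live", "likely voters")
```

In this example the automated poll of adults is not penalized for omitting interviewer pay or a likely voter method, so it earns 3 of 4 possible points (0.75), while the live-interviewer likely voter poll is held to all five items and earns 3 of 7.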
For the moment, at least, this is more an idea than a plan, and it may be little more than fanciful "pie in the sky" that gets not much further than this blog posting. Nevertheless, in my five years of participating in this amazing revolution of news and information on the internet that we used to call "the blogosphere," I have come to a certain faith that ideas become a reality when we put them out in the public domain and offer them up for comment, criticism and revision.
So, dear readers, what do you think? Want to help make it reality?
[Note: I will be participating in a panel tomorrow on "How to Get the Most Out of Polling" at this week's Netroots Nation conference. This series of posts previews the thoughts I am hoping to summarize tomorrow.]