I want to pick up where I left
off on Tuesday, when I wrote about the way national surveys screen for
primary voters. How well have the pollsters in early primary states done in
disclosing how tightly they "screen" to identify the voters that will actually
turn out to vote (or caucus)? Not very well, unfortunately.
For those just dropping in, here is the basic dilemma: Voter
turnout in primary elections, and especially in caucus states like Iowa, is typically much
lower than in the general election. A pre-election survey that aims to track
and ultimately project the outcome of the "horse-race" -- the measure of voter
preferences "if the election were held today" -- needs to represent the population
of "likely voters." When the expected turnout is very low, that becomes a difficult
task, especially when polling many months before an election.
And in Iowa and South Carolina, if
history is a guide, that turnout will be a very small fraction of eligible
adults,** as the following table shows:
When a pollster uses a random digit telephone methodology,
they begin by randomly sampling adults in all households with landline telephone
service. They need to use some mechanism to identify a probable electorate from
within a sample of all adults. If recent history is a guide, the probable
electorate in Iowa
-- Democrats and Republicans -- will fall
in the high single digits as a percentage of eligible adults. South Carolina's turnout is better, but is
still unlikely to exceed 30% of adults. And while the New Hampshire primary typically draws the
highest turnout of any of the presidential primaries, it still attracts less
than half of the eligible adults in the state. Despite all the attention the New Hampshire primary
receives, many voters that ultimately cast ballots in the November general election
(roughly 30% in 2000) choose to skip their state's storied primary.
A pollster may not want to "screen" so that the size of
their likely voter sample matches the exact level of turnout. Most campaign pollsters
I have worked with prefer to shoot for a slightly more expansive universe, both
to capture those genuinely uncertain about whether they will vote and to
account for the presumption that "refusals" (those who hang up on their own
before answering any questions) are more likely to be non-voters.
Nonetheless, the degree to which pollsters screen matters a
great deal. If, hypothetically, one Democratic primary poll captures 10% of
eligible adults while another captures 40%, the results could easily be very
different (and I'll definitely put more faith in the first).
It also matters greatly how
the pollster goes about identifying likely voters. I wrote quite a bit about
that process in October 2004 as it applies to random digit dial (RDD) surveys
of general election voters. In extremely low turnout contests, such as the Iowa caucuses, most
campaign pollsters now rely on samples drawn from lists of registered voters
that include the vote history of individual voters. Most of the Democratic
pollsters I know agree with Mark Mellman, who asserted in a must-read column
in The Hill earlier this year that,
"the only accurate way to poll the Iowa caucuses starts with the party's voter
So, based on the information they routinely release, what do
we know about the way the recent polls in Iowa, New Hampshire and South
Carolina screened for likely voters? As the many
question marks in the tables below show, not much.
The gold star for disclosure goes to the automated pollster
SurveyUSA. Of 22 survey organizations active so far in these states, they are
the only organization that routinely releases (and makes available on their web
site) all of the information necessary to determine how tightly they screen. Every
release includes a simple statement like the one from their May
poll of New Hampshire:
2,000 state of New
Hampshire adults were interviewed by SurveyUSA
05/04/07 through 05/06/07. . . Of the 2,000 NH adults, 1,756 were registered to
vote. Of them, 551 were identified by SurveyUSA as likely to vote in the
Republican NH Primary, 589 were identified by SurveyUSA as likely to vote in
the Democratic NH Primary, and were included in this survey.
I did the simple math using the numbers above (which are weighted
values). For SurveyUSA's May survey, Democratic likely voters represented 29%
of adults and Republican likely voters represented 28%, for a total of 57% of
all New Hampshire
adults. Their screen is a very reasonable fit for a survey fielded eight months
before the primary.
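For anyone who wants to check my arithmetic, here is a minimal sketch of the calculation. The counts are the ones in the SurveyUSA release quoted above; the function name `incidence` and the variable names are my own, chosen only for illustration.

```python
def incidence(passed_screen: int, base: int) -> float:
    """Return the share of `base` that passed the screen, as a percentage."""
    if base <= 0:
        raise ValueError("base must be positive")
    return 100.0 * passed_screen / base


# Weighted counts from the SurveyUSA May 2007 New Hampshire release.
ADULTS = 2000
LIKELY_REP = 551
LIKELY_DEM = 589

dem_pct = incidence(LIKELY_DEM, ADULTS)
rep_pct = incidence(LIKELY_REP, ADULTS)
total_pct = incidence(LIKELY_DEM + LIKELY_REP, ADULTS)

print(f"Democratic likely voters: {dem_pct:.0f}% of adults")
print(f"Republican likely voters: {rep_pct:.0f}% of adults")
print(f"Combined likely voters:   {total_pct:.0f}% of adults")
```

The same function works for any poll that reports both the number of adults (or registered voters) interviewed and the number who passed the screen, which is exactly why those two numbers are worth disclosing.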
Honorable mention for disclosure also goes to two Iowa polls. First, the Des Moines Register poll conducted by Selzer
and Company. Ann Selzer provided me with very complete information upon
request last year. Her first Iowa caucus
survey last year used a registered voter list sample and screened to reach a
population that represents roughly 11% of the eligible adults (assuming 2.0
million registered voters in Iowa
and 2.2 million eligible adults).
Second, the poll conducted in March by the University of Iowa.
While their survey asked an open-ended vote question (rendering the results
incomparable with those included in our Iowa chart),
they did at least provide the basic numbers concerning their likely voter screen. They
interviewed 298 Democratic likely caucus-goers and 178 Republican caucus-goers
out of 1,290 "registered Iowa
voters" (for an incidence of 37% of registered voters). Unfortunately, they did
not specify whether they used a registered voter list or a random digit sample,
although given the incidence of registered voters in Iowa, we can assume that the percentage of
eligible adults that passed the screen was probably in the low 30s.
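Here is a rough sketch of how I get from the 37% figure to something in the low 30s. It is a back-of-the-envelope conversion, not the University of Iowa's own calculation. It assumes the 1,290 respondents represent all registered voters, and it borrows the approximate Iowa totals I used for the Selzer estimate (about 2.0 million registered voters and 2.2 million eligible adults). Those totals and all the names in the code are my assumptions.

```python
# Counts reported by the University of Iowa poll.
DEM_CAUCUS_GOERS = 298
REP_CAUCUS_GOERS = 178
REGISTERED_INTERVIEWED = 1290

# Assumed rough statewide totals (approximations, not from the poll itself).
IOWA_REGISTERED = 2_000_000
IOWA_ELIGIBLE_ADULTS = 2_200_000

# Share of registered voters who passed the likely caucus-goer screen.
pct_of_registered = (
    100.0 * (DEM_CAUCUS_GOERS + REP_CAUCUS_GOERS) / REGISTERED_INTERVIEWED
)

# Scale down by the share of eligible adults who are registered,
# treating unregistered adults as non-participants.
registered_share = IOWA_REGISTERED / IOWA_ELIGIBLE_ADULTS
pct_of_eligible = pct_of_registered * registered_share

print(f"Passed screen: {pct_of_registered:.0f}% of registered voters")
print(f"Passed screen: roughly {pct_of_eligible:.0f}% of eligible adults")
```

If the survey actually used a random digit sample with self-reported registration, the true share of adults could differ, which is one more reason the sampling method matters.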
And speaking of the sampling frame, only 6 of 22
organizations -- SurveyUSA, Des Moines
Register/Selzer, Fox News, Rasmussen Reports, Zogby, and Winthrop
University -- specified the sampling method they used (random digit dial, RBS or
listed telephone directory). I will give honorable mention to two more
organizations -- Chernoff Newman/MarketSearch and the partnership of Hamilton
Beattie (D) and Ayres McHenry (R) -- that disclosed their sample method to me
upon request earlier this year.
The obfuscation of this information by the remaining 14
pollsters is particularly stunning given that the ethical codes of both the
American Association for Public Opinion Research (AAPOR) and the National
Council on Public Polls (NCPP) explicitly
require the disclosure of the sampling method, also known as the sample
"frame." The NCPP's principles of
disclosure require the following of its member organizations for "all reports of survey findings issued for public release":
Sampling method employed (for
example, random-digit dialed telephone sample, list-based telephone sample,
area probability sample, probability mail sample, other probability sample,
opt-in internet panel, non-probability convenience sample, use of any
The AAPOR code mandates a
definition of the population under study, and a description of the sampling
frame used to identify this population.
Finally, while virtually all of these surveys told us how
many "likely primary voters" they selected, very few provided details on how
they determined that voters (or caucus goers) were in fact "likely" to
participate. The most notable exceptions were the Hamilton
Beattie (D)/Ayres McHenry (R) and Chernoff
Newman/MarketSearch polls in South Carolina,
and the News
7/Suffolk University poll in New
Hampshire. All of these included the questions used
to screen for likely primary voters in the "filled-in" questionnaires that
included full results.
So what should an educated poll consumer do? I have one more
category of diagnostic questions to review, and then I want to propose
something we might be able to do about the very limited methodological
information available to us. For now, here's a two-word hint of what I have in
mind: "upon request."
**Political scientists typically use two statistics to calculate turnout among adults: all adults of voting age (also known as
the voting age population or VAP), or all adults who are eligible to vote (or
the voting-eligible population or VEP). George Mason University Professor
Michael McDonald has helped popularize
VEP as a better way to calculate voter turnout, because it excludes adults
ineligible for voting such as non-citizens and ineligible felons. The perfect
statistic for comparison to telephone surveys of adults would fall somewhere in
between, because adult telephone samples do not reach those living in
institutions or who do not speak English, but might still include non-citizens
that speak English (or Spanish where pollsters use bilingual interviewers).
In a state like California,
with a large non-citizen population, VAP is probably the better statistic for
comparisons to the way polls screen for likely voters. In Iowa,
New Hampshire and South Carolina, however, the choice has very
little impact. Had I used VAP rather than VEP above, the turnout statistics in
the table would have been roughly half a percentage point lower.
CORRECTION: Due to an error in my spreadsheet, the original version of the turnout table above incorrectly displayed turnout as a percentage of VAP rather than VEP. For reference, the table below has turnout as a percentage of VAP.