We've had quite a bit of discussion today in the comments section about the wide variation in results from the South Carolina polls. Reader Ciccina noticed some "fascinating" differences in the percentages reported as undecided, differences that led reader Joshua Bradshaw to ask, "how is it possible to have so widely different poll numbers from the same time period?" There are many important technical reasons for the variation, but they all stem from the same underlying cause: Many South Carolina voters are still uncertain, both about their choices and about whether they will vote (my colleague Charles Franklin has a separate post up this afternoon looking at South Carolina's "endgame" trends).
Take a look at the results of eight different polls released in the last few days. As Ciccina noticed, the biggest differences are in the "undecided" percentage, which varies from 1% to 36%:
1) "Undecided" voters -- Obviously, the differences in the undecided percentage are about much more than the random sampling variation that gives us the so-called "margin of error," but they are surprisingly common. Differences in question wording, context, survey mode and interviewer technique can explain much of the difference. In fact, variations in the undecided percentage are usually the main sources of "house effect" differences among pollsters.
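To see why random sampling alone cannot account for a spread like this, here is a rough sketch of the standard margin-of-error calculation for a simple random sample. The sample size of 600 is a hypothetical round number chosen for illustration, not a figure reported by any of these pollsters, and the formula ignores weighting and design effects.

```python
import math

def margin_of_error(p, n, z=1.96):
    """Approximate 95% margin of error for a proportion p from a simple random sample of size n."""
    return z * math.sqrt(p * (1 - p) / n)

# Hypothetical sample size, assumed for illustration only.
n = 600

# p = 0.5 gives the widest (most conservative) margin.
moe_points = margin_of_error(0.5, n) * 100

# Range of undecided percentages across the polls discussed above.
undecided_low, undecided_high = 1, 36
spread = undecided_high - undecided_low

print(f"Margin of error at n={n}: +/-{moe_points:.1f} points")
print(f"Spread in undecided: {spread} points")
```

Under these assumptions the margin of error is a few points in either direction, far smaller than the 35-point gap between the lowest and highest undecided percentages.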
The key issue is that many voters are less than completely certain about how they will vote and will hesitate when confronted by a pollster's trial heat question. How the pollster handles that hesitation determines the percentage that ultimately get recorded as undecided.
At one extreme is the approach taken by the Clemson University Palmetto Poll. First, their trial-heat question, as reproduced in an online report, appears to prompt for "undecided" as one of the four choices. And just before the vote question, they asked another question that probably suggests to respondents that "undecided" is a common response:
Q1. Thinking about the 2008 presidential election, which of the following best describes your thoughts on this contest?
1. You have a good idea about who you will support
2. You are following the news, but have not decided
3. You are not paying much attention to the news about it
4. Don’t know, no answer
So two of the categories prime respondents with the idea that other South Carolina voters either "have not decided" or are "not paying much attention."
Most pollsters take the opposite approach. They try to word their questions, train their interviewers or structure their automated calls in a way to push voters toward expressing a preference. Most pollsters include an explicit follow-up to those who say they are uncertain, asking which way they "lean." The pollsters that typically report the lowest undecided percentages have probably trained their interviewers to push especially hard for an answer. And SurveyUSA, the pollster with the smallest undecided in South Carolina (1%), typically inserts a pause in their automated script, so that respondents have to wait several seconds before hearing they can "press 9 for undecided."
But it is probably best to focus on the underlying cause of all this variation: South Carolina voters feel a lot of uncertainty about their choice. Four of the pollsters followed up with a question about whether voters might still change their minds, and 18% to 26% said that they might. So many South Carolina Democrats -- like those in Iowa and New Hampshire before them -- are feeling uncertain about their decision. Thus, as reader Russ points out, "the last 24 hours" may count as much in South Carolina as elsewhere.
2) Interviewer or automated? -- A related issue is what pollsters call the survey "mode." Do they conduct interviews with live interviewers or with an automated methodology (usually called "interactive voice response" or IVR) that uses a recording and asks respondents to answer by pressing keys on their touch-tone phones?
Three of the pollsters that released surveys over the last week (SurveyUSA, Rasmussen and PPP) use the IVR method (as does InsiderAdvantage), while the others use live interviewers. One thing to note is that the so-called "Bradley/Wilder effect" (or the "reverse" Bradley/Wilder effect - via Kaus) assumes that respondents alter or hide their preferences to avoid a sense of "social discomfort" with the interviewer. Without an interviewer, there should be little or no effect.
In this case the difference seems to be mostly about the undecided percentage, which is lower for the IVR surveys. In the most recent surveys, the three IVR pollsters report a smaller undecided percentage (7%) than the live interviewer pollsters (17%). That pattern is typical, although pollsters disagree about the reasons. Some say voters are more willing to cast a "secret ballot" without an interviewer involved, while others argue that those willing to participate in IVR polls tend to be more opinionated.
If the Bradley/Wilder effect is operating, we would expect to see it on surveys that use live interviewers, but in this case, the lack of an interviewer seems to work in Obama's favor. He leads Clinton by an average of 17 points on the IVR polls (44% to 27%, with 19% for Edwards), but by only 9 points on the interviewer surveys (37% to 28%, with 17% for Edwards).
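The arithmetic behind that comparison can be sketched as follows. The candidate averages are the ones quoted above; the "remainder" is simply whatever is left after the three named candidates, so it lumps undecided respondents together with supporters of other candidates and will not match the undecided percentages exactly.

```python
# Average support by survey mode, as quoted in the text (percent).
ivr = {"Obama": 44, "Clinton": 27, "Edwards": 19}
live = {"Obama": 37, "Clinton": 28, "Edwards": 17}

def summarize(averages):
    """Return Obama's lead over Clinton and the share not choosing any of the three."""
    lead = averages["Obama"] - averages["Clinton"]
    remainder = 100 - sum(averages.values())  # undecided plus other candidates
    return lead, remainder

ivr_lead, ivr_remainder = summarize(ivr)
live_lead, live_remainder = summarize(live)

print(f"IVR:  Obama +{ivr_lead}, undecided/other {ivr_remainder}%")
print(f"Live: Obama +{live_lead}, undecided/other {live_remainder}%")
print(f"Difference in lead between modes: {ivr_lead - live_lead} points")
```

Note that Clinton's average is nearly identical across the two modes, so the gap in Obama's lead comes mostly from his own number being higher where fewer respondents end up outside the three named candidates.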
3) What Percentage of Adults? -- Four years ago, the turnout of 289,856 South Carolina Democrats amounted to roughly 9% of the eligible adults in the state.* Turnout tomorrow will likely be higher, but how much higher is anyone's guess. Thus, selecting "likely voters" in South Carolina may not be as challenging as it was for the Iowa or Nevada caucuses, but it comes close.
For Iowa, I spent several months requesting the information necessary to try to calculate the percentage of adults represented by each pollster. With the exception of SurveyUSA (who tell us their unweighted Democratic likely voter sample amounted to 33% of the adults they interviewed), none of the South Carolina pollsters have reported incidence data.
So some of the variation in results may come from the tightness of the screen, but we have no way to know for certain.
4) List or RDD? One important related issue is the "sample frame." Three of the South Carolina pollsters (SurveyUSA, ARG and Rasmussen) typically use a random-digit dial (RDD) technique that samples from all landline phones. They have to use screen questions to select likely Democratic primary voters.
At least two (PPP and Clemson) drew samples from lists of registered voters and used the records on the lists to narrow their sampled universe to those they knew had a past history of participating in primaries.
These two methods may also contribute to different results, and pollsters debate the merits of each approach.
5) Demographics? Differences in the likely voter selection methods mean that the South Carolina polls have differences in the kinds of people sampled for each poll. One of the most important characteristics is the percentage of African-Americans. It varies from 42% to 55% among the five pollsters that reported it (I extrapolated an approximate value for Rasmussen from their results by race crosstab).
Another important difference largely hidden from view is the age composition of each sample. Only three pollsters reported an age breakdown. SurveyUSA reports 50% under the age of 50, compared to 43% on the McClatchy/MSNBC/Mason-Dixon survey. PPP had an older sample, with only 23% under the age of 45.
So the bottom line? All of these surveys indicate quite a bit of uncertainty, both about who will vote and about the preferences that their "likely voters" express. Obama appears to have an advantage, but we will not know how large until the votes are counted.
*Kevin: thank you for the edit.