06/20/2008 04:54 pm ET | Updated May 25, 2011

A "Flaw" in the Iowa Poll?

Earlier today, commenter "axt113" noticed what he or she described as a "flaw" in the just-released Iowa poll from SurveyUSA: "It has McCain winning the AA vote 55% to 45." I was distracted and nearly let it pass, but then our friend Ben Smith blogged a similar, though subsequently hedged, comment:

I'm sure it's a very small sample -- this is a poll of Iowa -- but it does raise the red flag when a survey shows Obama losing African-Americans to McCain.

UPDATE: I should have been clearer. It raises a red flag about the poll. Though really, it's mostly just another reason not to read the cross-tabs when they involve tiny, perhaps single-digit, samples.

The problem, which Ben alludes to, is that the weighted value of the African-American subgroup in the Iowa poll is just 2%. If we assume (for the moment) that the black respondents had a weight of 1.0, then those African-American results are based on a sample of 8-12 respondents. Pollsters typically have to weight up the African-American percentage in national surveys, since the black population is typically clustered in urban centers where response rates are lower (causing a non-response bias that needs to be corrected with weighting). I have no idea if such an adjustment would be necessary in Iowa, but if so, it would make that tiny subgroup even tinier.
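To make that arithmetic concrete, here is a rough sketch in Python. The total sample size of 500 is an assumption for illustration only (it is not a figure given above), and the helper name `subgroup_count_range` is made up for this example. The sketch also assumes, as above, a weight of 1.0 and a reported share rounded to the nearest whole percent.

```python
def subgroup_count_range(total_n, reported_pct):
    """Return (low, high) respondent counts whose share of total_n would
    round to reported_pct, assuming every respondent has a weight of 1.0."""
    # A share that rounds to reported_pct lies in
    # [reported_pct - 0.5, reported_pct + 0.5) percent.
    # Integer arithmetic (ceiling division) avoids floating-point surprises.
    low = -(-(total_n * (2 * reported_pct - 1)) // 200)
    high = -(-(total_n * (2 * reported_pct + 1)) // 200) - 1
    return low, high

# total_n=500 is a hypothetical sample size, not a figure from the poll.
low, high = subgroup_count_range(500, 2)
print(f"A 2% subgroup of 500 interviews is roughly {low} to {high} respondents")
```

Under that assumed total, a subgroup reported as 2% works out to somewhere between 8 and 12 people, which is the range cited above. A different total would shift the range proportionally.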

I won't even bother to try to calculate the "margin of error" for 10 respondents. Some statisticians believe that the assumptions of the formulas break down at that level, rendering the calculations largely meaningless. For that reason, many pollsters -- including every firm I've ever worked for -- have a policy of never releasing crosstab results to a client for any subgroup of fewer than 100 or fewer than 50 interviews (or whatever number they feel comfortable with).

Does the fact that 10 interviews produced a screwy result indicate a flawed poll? Not at all. That's the point of random sampling. The more interviews you do, the less error you get. Pull out any subgroup of 10 and you're bound to see very screwy and utterly "random" results. The larger the sample gets, the more those screwy (and offsetting) results cancel out.
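The point about tiny subgroups can be illustrated with a quick simulation. This is only a sketch under stated assumptions: the "true" level of support (50%) is a hypothetical value picked for illustration, the function name `simulate_shares` is invented here, and the simulation treats every interview as an independent random draw, which real polls only approximate.

```python
import random
import statistics

def simulate_shares(true_share, n, trials, seed=0):
    """Draw `trials` random samples of size n and return the share of each
    sample that backs a candidate whose true support is true_share."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    shares = []
    for _ in range(trials):
        hits = sum(1 for _ in range(n) if rng.random() < true_share)
        shares.append(hits / n)
    return shares

# true_share=0.5 is a hypothetical value, not an estimate from any poll.
small = simulate_shares(0.5, 10, 1000)    # subgroups of 10 interviews
large = simulate_shares(0.5, 1000, 1000)  # samples of 1,000 interviews

print("n=10:   min", min(small), "max", max(small),
      "std dev", round(statistics.pstdev(small), 3))
print("n=1000: min", min(large), "max", max(large),
      "std dev", round(statistics.pstdev(large), 3))
```

The subgroups of 10 should scatter widely around the true value, while the samples of 1,000 cluster tightly around it. That is the sense in which the screwy results "cancel out" as the sample grows.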

If anything is flawed, it is arguably the practice of releasing cross-tab results based on such a small subgroup, though in fairness there are trade-offs here. SurveyUSA has a mostly standard set of crosstabs, some of which are based on very tiny subgroups. They do this (I assume) partly because it simplifies their programming tasks and partly because their format includes "row percents" that tell us about the weighted value of their standard demographic measures (race, age, gender, party). In other words, people like me badger pollsters to tell us the demographic composition of their samples. By including smaller subgroups in their standard table, SurveyUSA provides an answer as standard procedure.

Having said all that, Mike_in_CA raises a different but very good question regarding this Iowa poll:

[W]hy would SUSA poll Iowa right now, in the midst of catastrophic flooding? One has to wonder how many people have been forced out of their homes, away from their telephones. Probably not the best time to poll a state.

I do not know the answer, though I would be curious about the likely political skew that might result from those not available to be surveyed. Is the missing population more urban or more rural? The floods appear to be affecting the eastern portions of the state. Are those typically more Democratic or Republican? Readers with knowledge of Iowa are encouraged to comment.