12/19/2007 11:04 pm ET Updated May 25, 2011

The Insider Advantage Crosstabs

For today's puzzle, we have two new polls in Iowa, one from the ABC News/Washington Post partnership and another from the public relations firm InsiderAdvantage. The ABC/Post poll shows both Obama (at 33%) and Clinton (at 29%) significantly ahead of John Edwards (at 20%). The InsiderAdvantage survey -- or at least the result they chose to lead with -- shows that John Edwards (with 30%) has "leapfrogged ahead" of Clinton (26%) and Obama (24%). As our friends at NBC's First Read note, conflicting results like these make it "hard to know what's right or wrong."

Before digging deeper, it is worth highlighting this point from the ABC story:

Applying tighter turnout scenarios can produce anything from a 10-point Obama lead to a 6-point Clinton edge -- evidence of the still-unsettled nature of this contest, two weeks before Iowans gather and caucus. And not only do 33 percent say there's a chance they yet may change their minds, nearly one in five say there's a "good chance" they'll do so.

However, I want to pass along some problematic details about the recent InsiderAdvantage polls. One issue is that InsiderAdvantage sometimes conducts surveys using live interviewers and sometimes using an automated interactive voice response (IVR) method (in which respondents answer by pressing buttons on their touch-tone phones), and its public releases almost never specify which method was used. In this case, I checked with InsiderAdvantage, and they confirmed that the latest Iowa surveys were done with the automated IVR method.

The second problem is potentially bigger. InsiderAdvantage typically emails us a few pages of cross-tabulations, which we have sometimes posted to this site but which they rarely post to their own. We did not receive those cross-tabulations for today's survey, perhaps because of the story I am about to share. The site RealClearPolitics has posted a more limited version for the Republican and Democratic results.

Take a look at the Democratic tab, and if you look closely, you'll see the problem: According to the crosstabs, Barack Obama gets 19.6% of the vote from men, 17.8% from women but 24.3% from all voters. Needless to say, that result is impossible, especially since they report 392 interviews conducted among men, 585 interviews among women and 977 overall (and since 392+585=977).**
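The arithmetic behind that claim can be sketched in a few lines of Python. This is only an illustration using the figures quoted above: pooling by raw interview counts is an assumption (a weighted tabulation would use weighted counts instead), and the function names are made up for this example. The range check does not depend on that assumption, because a total computed on the same basis as its subgroups is an average of them with non-negative weights and so must fall between the lowest and highest subgroup figures.

```python
def pooled_share(shares, sizes):
    """Average the subgroup percentages, weighting each by its interview count."""
    return sum(s * n for s, n in zip(shares, sizes)) / sum(sizes)


def total_is_feasible(total_share, subgroup_shares):
    """A total built from the subgroups must lie within their min-max range."""
    return min(subgroup_shares) <= total_share <= max(subgroup_shares)


# Figures quoted from the crosstabs discussed above.
men_n, women_n = 392, 585
men_share, women_share = 19.6, 17.8
reported_total = 24.3

# Assumption: pool by raw interview counts.
pooled = pooled_share([men_share, women_share], [men_n, women_n])
feasible = total_is_feasible(reported_total, [men_share, women_share])

print(f"Pooled from subgroups: {pooled:.1f}%")
print(f"Reported total {reported_total}% is consistent with the subgroups: {feasible}")
```

Pooled this way, the two gender columns imply a total of roughly 18.5%, and no choice of non-negative weights for men and women can produce 24.3% from 19.6% and 17.8%.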

We had posted the crosstabs for the InsiderAdvantage poll of Republicans in South Carolina earlier this month, but pulled them back when a reader noticed similar inconsistencies (for this posting, we have put the Democratic and Republican crosstabs back up on our server). The story of what happened next should give pause to anyone wondering how much faith to put in their surveys.

I emailed InsiderAdvantage to say that "something seems amiss" in their tabs. Mistakes happen, and I assumed I was simply reporting an error in the cross-tabulations that they would want to correct. Instead, I got some curious replies. I heard first from Matt Towery, the public face of InsiderAdvantage. He referred me to the statistician who weights their data and then offered this explanation:

We have produced many a poll that showed the male female column not seeming to "fit" with the totals. But as [the person who weights the data] will explain, the other weights applied cause the numbers to appear to "disagree" with the male female column. I can only tell you that we've used the same weighting system for going on ten years and it has rarely failed us.

Next, I heard from Gary Reese, an analyst at InsiderAdvantage, who shared his "guess" that "because of gender and age and race weightings, that may make individual cross-tabs read slightly off." The person who weights the data was not available, Reese wrote, but he would check with him and get back to me. The next day, Reese replied with a confirmation:

Was as I wrote yesterday. Multiple weightings of various demographics skew individual weightings that they don't necessarily add up to match the top line.

Now here I have to interject: I too have weighted data for many years, and this explanation is simply wrong. Either the data are weighted consistently (in a process that changes the "weight" given each respondent when the data are tabulated) or they are not. If cross-tabulations are based on weighted data, then the results in subgroups (men, women, etc) should be internally consistent with the total.
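To make the point concrete, here is a minimal sketch of consistent weighting in Python. The respondents, their answers, and their weights are entirely hypothetical and are not drawn from any InsiderAdvantage data. Each respondent carries one weight, and that same weight is used whether the respondent is tabulated in the total column or in a subgroup column.

```python
from collections import defaultdict

# Hypothetical respondents: (gender, supports_candidate, weight).
respondents = [
    ("M", True, 0.6), ("M", False, 0.6), ("M", True, 3.0),
    ("F", False, 0.8), ("F", True, 0.8), ("F", False, 3.0),
]


def weighted_share(rows):
    """Weighted share of supporters among the given respondents."""
    total_weight = sum(w for _, _, w in rows)
    support_weight = sum(w for _, supports, w in rows if supports)
    return support_weight / total_weight


# Split respondents by gender, keeping each respondent's weight unchanged.
by_gender = defaultdict(list)
for row in respondents:
    by_gender[row[0]].append(row)

overall = weighted_share(respondents)
subgroup_share = {g: weighted_share(rows) for g, rows in by_gender.items()}
subgroup_weight = {g: sum(w for _, _, w in rows) for g, rows in by_gender.items()}

# Recombine the subgroup results using their weighted sizes.
recombined = sum(
    subgroup_share[g] * subgroup_weight[g] for g in subgroup_share
) / sum(subgroup_weight.values())

print(f"Overall: {overall:.3f}  Recombined from subgroups: {recombined:.3f}")
```

However many demographic variables go into building the weights, the recombined figure matches the overall figure as long as every column uses the same per-respondent weights.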

They gave me a number for the statistician who weights the data. I called, but heard nothing back, then got caught up in our office move and other more pressing stories. I finally heard back yesterday from Jeff Shusterman, the president of Majority Opinion Research (the company that conducts the InsiderAdvantage surveys), and he confirmed what should have been obvious to Towery and Reese: Only the total column in their crosstabs is weighted. Thus, for reasons that still perplex me, they choose to leave the columns for subgroups unweighted.

Before posting this item, I went back to Towery and Shusterman and asked for an explanation of the purpose of releasing weighted values for all respondents, but unweighted results for subgroups. Here is Shusterman's answer:

The purpose of the InsiderAdvantage/Majority Opinion polls are to provide a snapshot for major media outlets of the race at the time of polling and, as the election day approaches, to accurately predict the outcome of the election for which we have a substantial record of success. This snapshot and eventual prediction are contained in the total column of the cross-tabulations, which is accurately weighted. By contrast, our polls are not conducted to advise campaigns or to provide interesting subtext for academics or bloggers, so we do not weight or place emphasis on the other banner points.

If that's the case, I am not sure I understand why they choose to run "inaccurate" cross-tabulations at all, much less send them to us and to RealClearPolitics. Readers ought to take all of this "interesting subtext" into account when trying to decide which polls to rely on (and we will save for another day the issue of what weighting up subgroups by factors of three or more does to the reported "margin of error").
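For a rough sense of scale on that last point, one common rule of thumb is Kish's approximation for effective sample size: the square of the sum of the weights divided by the sum of the squared weights. The Python sketch below applies it to a purely hypothetical set of weights (200 respondents weighted up by a factor of three, 800 weighted down to one half). These are not InsiderAdvantage's weights, which have not been published, and the simple random sample margin-of-error formula is itself an assumption.

```python
import math


def effective_sample_size(weights):
    """Kish's approximation: (sum of weights)^2 / (sum of squared weights)."""
    total = sum(weights)
    return total * total / sum(w * w for w in weights)


def margin_of_error(n, p=0.5, z=1.96):
    """95% margin of error for a proportion, assuming simple random sampling."""
    return z * math.sqrt(p * (1 - p) / n)


# Hypothetical weights: one group weighted up by three, the rest weighted down.
weights = [3.0] * 200 + [0.5] * 800

n_raw = len(weights)
n_eff = effective_sample_size(weights)
moe_raw = margin_of_error(n_raw)
moe_eff = margin_of_error(n_eff)

print(f"Interviews: {n_raw}, effective sample size: {n_eff:.0f}")
print(f"Margin of error: +/-{moe_raw:.1%} unadjusted, +/-{moe_eff:.1%} adjusted")
```

Under these made-up weights, 1,000 interviews carry about the precision of 500, and the margin of error grows from roughly 3.1 to roughly 4.4 percentage points.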

Back to the issue of the conflicting results from Iowa. As we have reported, pollsters in Iowa have taken many different approaches to defining likely voters. The ABC News/Washington Post surveys have at least disclosed the demographics of their likely caucus-goers and the methods used to select them. InsiderAdvantage has not. Without more of these details, it is hard to do much more than speculate and pass on the good advice from First Read:

Look at the trends of the pollsters who have surveyed the state for multiple cycles, and be careful of pollsters who haven't polled Iowa before.

**Update: Several commenters are fixated on the footnoted paragraph above but appear to have paid little attention to the rest of this post. So to be clear: The contradictory results are "impossible" only if all of the crosstab columns were weighted consistently, which they obviously were not. The results are also "impossible" in terms of the reality the data are supposed to represent, and that is the point. If you are ready to weight all Democratic voters to 48% black, then it makes no sense to release results for the same survey by gender where men are 10.9% black and women are 18.4% black.