American Research Group (ARG) conducts a large amount of state primary polling, and because it contributes more polls than most other organizations it is potentially influential in our estimates of candidate support. This week we saw conflicting results from ARG and Time/SRBI polls of Iowa. (See Mark Blumenthal's analysis here.) The discrepancy between ARG polls and others in Iowa has been an issue here before, as has the question of how much any single poll influences our trend estimates. Today we take another step towards systematically answering that question.
In the Democratic race, ARG has consistently found support for Clinton well above that of other polling organizations. In the chart above, the purple points are the ARG polls and the light blue points are all non-ARG polls. The blue line is the trend estimated with all polls, including ARG, while the red line is the trend estimated without ARG.
This lets us compare three things: ARG polls to other polls, ARG polls to the trend, and the trend with ARG to the trend without ARG.
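For readers who want to try this kind of comparison themselves, here is a minimal sketch. It assumes a simple Gaussian-kernel weighted average as a stand-in smoother, which is not necessarily the estimator behind the charts, and the poll numbers, pollster labels, `kernel_trend`, and `without` are all invented for illustration.

```python
import math

# Hypothetical polls: (day, pollster, percent support).
# Invented numbers for illustration only; NOT the Iowa data in the charts.
polls = [
    (0, "ARG", 31.0), (10, "Other A", 24.0), (30, "ARG", 32.0),
    (45, "Other B", 25.0), (60, "ARG", 30.0), (70, "Other A", 26.0),
    (80, "Other C", 25.0), (90, "ARG", 30.0), (95, "Other B", 27.0),
]

def kernel_trend(data, day, bandwidth=30.0):
    """Gaussian-kernel weighted mean of support around `day`.

    A stand-in smoother (assumption), not the estimator used for the charts.
    """
    weights = [math.exp(-0.5 * ((d - day) / bandwidth) ** 2) for d, _, _ in data]
    total = sum(w * y for w, (_, _, y) in zip(weights, data))
    return total / sum(weights)

def without(data, pollster):
    """Drop every poll from one organization."""
    return [p for p in data if p[1] != pollster]

for day in (0, 45, 90):
    all_polls = kernel_trend(polls, day)                # "blue line"
    no_arg = kernel_trend(without(polls, "ARG"), day)   # "red line"
    print(f"day {day:3d}: all={all_polls:.1f}  "
          f"without ARG={no_arg:.1f}  gap={all_polls - no_arg:+.1f}")
```

The gap between the two printed trends plays the role of the distance between the blue and red lines: it measures how much one organization moves the estimate, separately from how far that organization's individual polls sit from everyone else's.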
In the case of Clinton, ARG polls are consistently far above the results of other polls. This has been widely remarked upon already. And in the Clinton case, the ARG polls have shown some decline in support in Iowa, while other polls have shown an increase in her support. This is also the case in which ARG exerts a significant influence on the trend estimator. The blue trend line (with ARG included) is well above the red trend estimate which excludes ARG. This was especially true early in 2007 when there were few polls and several from ARG, giving them an extra influence due to lack of non-ARG data. As polling frequency has increased the two trend estimates have converged, but the non-ARG estimate remains a couple of points below the overall trend.
Blumenthal has talked about possible reasons for this, and I encourage you to see his post here.
I'm more concerned here with the magnitude of the differences and their effects, so I will leave it to Mark to explain the "why".
It is clear that ARG's estimates for Clinton have consistently been out of line with others, and that this has had an effect on my trend estimates, making Clinton appear more competitive in the first half of 2007.
But let's also look at the other candidates. ARG is less consistent in over- or under-estimating Edwards' support. Some ARG polls have put Edwards below trend, but others have him above trend. While ARG has disagreed with other pollsters in individual polls, the effect of ARG on the trend estimate for Edwards is negligible.
On the other hand, ARG has consistently had Obama below the support found in other polls, and well below the trend estimate. Despite this, the effect of ARG on the trend estimates has been small for Obama, with the blue and red trend estimates consistently quite close to one another.
Finally, Richardson has been a bit underestimated by ARG, but again with little influence on the trend estimates.
Bottom line: ARG had a substantial effect on the Clinton trend estimate until recently. Even now, the substantive effect is not trivial. Estimates including ARG put the trend at 26.2% for Clinton and 24.2% for Edwards, a Clinton lead of 2.0 points. Excluding ARG from the trends, we get Clinton at 24.6% and Edwards at 25.9%, a 1.3 point Edwards lead. Of course both estimates say the race is close in Iowa, and perhaps we should stop there. But the consistent ARG overestimate of Clinton has influenced perceptions and estimates for this race.
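The arithmetic behind those leads is easy to check. The short sketch below uses only the four trend estimates quoted above; the variable and function names are just illustrative.

```python
# Trend estimates quoted in the paragraph above (percent support in Iowa).
with_arg = {"Clinton": 26.2, "Edwards": 24.2}
without_arg = {"Clinton": 24.6, "Edwards": 25.9}

def clinton_lead(estimates):
    """Clinton minus Edwards; a negative value means Edwards leads."""
    return round(estimates["Clinton"] - estimates["Edwards"], 1)

lead_with = clinton_lead(with_arg)           # 2.0  (Clinton ahead)
lead_without = clinton_lead(without_arg)     # -1.3 (Edwards ahead)
swing = round(lead_with - lead_without, 1)   # 3.3-point change in the margin

print(lead_with, lead_without, swing)
```

So dropping ARG moves Clinton down about 1.6 points and Edwards up about 1.7, which together shift the margin by 3.3 points and flip the nominal leader, even though both versions describe a close race.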
If we switch to the Republican side, there is a consistent ARG overestimate of McCain support until very recently. ARG is also a bit high on Giuliani and a bit low on Romney. The Thompson numbers are relatively few and jump around.
Unlike the case of Clinton, the trend estimates are not much affected by the ARG data. The blue and red trend estimates lie very close to one another for all four Republican candidates, despite the high ARG readings for McCain.
There are two bottom lines here. The first is that any pollster can experience consistent house effects that lead to over- or under-estimating support for some candidate. These may be due to sampling methods, filtering for likely voters, question wording or order, weighting methods, or perhaps to mysterious gremlins. ARG is an example of house effects, at least for Clinton and McCain and probably Obama. House effects are important because, once estimated, they let us adjust a poll's result for that organization's systematic lean. That gives better perspective than the raw numbers might. House effects also allow us to say which polls are more in line and which more out of line with others.
A house effect is not in and of itself evidence of bad polling methodology. There may be good reasons for choices that lead to significant house effects, for example deciding to interview likely voters rather than adults, or deciding whether or not to push undecided voters for a preference. So we should be careful here in how we interpret the results. That said, it is crucial to know which organizations are consistently high or low for candidates (or any other variable). The ARG lines in the figures above give a clear reading of that for the Iowa polling.
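One simple way to put a number on a house effect, offered as a rough sketch and not as the method used for the forthcoming estimates, is to average the gap between an organization's readings and the trend built from everyone else's polls on the same dates. The pollster names and figures below are invented, and `house_effect` and `adjusted` are hypothetical helpers.

```python
from statistics import mean

# Hypothetical (poll reading, trend from OTHER pollsters on that date) pairs.
# Invented for illustration; not actual ARG or Iowa figures.
readings = {
    "Pollster X": [(31.0, 25.0), (30.0, 25.5), (30.0, 26.0)],
    "Pollster Y": [(24.0, 25.0), (26.5, 26.0), (25.5, 25.5)],
}

def house_effect(pairs):
    """Average amount by which a pollster's readings exceed the trend."""
    return mean(poll - trend for poll, trend in pairs)

def adjusted(poll, effect):
    """The reading with the estimated house effect removed."""
    return poll - effect

effects = {name: house_effect(pairs) for name, pairs in readings.items()}
for name, effect in effects.items():
    print(f"{name}: house effect {effect:+.1f} points")
```

A large value here says only that the organization differs from the others, not which side is closer to the truth, which is why the estimate should be read as a description and not as a verdict on methodology.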
In the next few days we'll be rolling out a series of posts that look at house effects for all polling organizations across state and national polling. We'll have a systematic look at this, with estimates of the effects for each organization. I hope that will help clarify things.
The second bottom line is that the trend estimates are pretty resistant to the effect of a single polling organization when there are plenty of other polls taken around the same period. But, as in the case of Clinton and ARG, the effect can be quite a bit larger when polling is sparse and a single organization contributes a substantial share of the polls while also exhibiting a significant house effect. In one sense this problem goes away as we approach elections, because the density of polling increases, as does the heterogeneity of polling organizations. But as Iowa illustrates (and we'll see again in other primary states with limited polling), it is not always possible to be sure which polls are misleading us when the evidence is limited.
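A back-of-the-envelope version of this point, assuming a plain average of polls in place of the actual trend estimator: if one organization supplies a given share of the polls and runs a fixed number of points above the others, the overall average moves by roughly that share times the house effect. The function name and the numbers below are hypothetical.

```python
def trend_shift(n_from_pollster, n_total, house_effect):
    """Shift in a simple average of all polls caused by one pollster.

    Assumes a plain (unweighted) mean, not a local trend estimator.
    """
    share = n_from_pollster / n_total
    return share * house_effect

# Same pollster, same hypothetical 5-point house effect, same 3 polls.
sparse = trend_shift(3, 6, 5.0)    # 3 of 6 polls: half the data
dense = trend_shift(3, 30, 5.0)    # 3 of 30 polls: a tenth of the data

print(f"sparse polling: {sparse:.1f} points, dense polling: {dense:.1f} points")
```

The house effect itself has not changed between the two cases; only the share of the data has, which is why the same organization can matter a great deal early in a race and very little once many other pollsters are in the field.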
Stay tuned next week for the next step in examining the house effects in primary polling.
Cross-posted at Political Arithmetik.
Follow Charles Franklin on Twitter: www.twitter.com/PollsAndVotes