My Nationaljournal.com column, on the differences between rolling-average tracking polls and other "traditional" surveys, is now online.
Regular readers may be interested in the chart in the column created by our own Charles Franklin (see below) and spiffed up considerably by the National Journal's Reuben Dalke (see the column). I wondered how the tracking poll trends compare to standard trend estimates that you see on our national chart. The chart that Franklin created plots the trends on the Obama margin (Obama percentage minus McCain percentage) using a loess regression trend line based on the non-overlapping releases from Gallup Daily, Rasmussen Reports and all other national polls. To make for a fair comparison, all three lines are plotted with the same sensitivity.
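For readers who want to try this kind of comparison themselves, here is a minimal sketch of a loess-style trend on the Obama margin. It is not Franklin's actual code: the `loess` function, the `span` parameter and the poll numbers are all made up for illustration. It uses a basic local linear fit with tricube weights and no robustness iterations. Giving every series the same `span` is one way to plot them "with the same sensitivity."

```python
def loess(xs, ys, span=0.5):
    """Basic loess: a weighted local linear fit evaluated at each x.

    span is the fraction of points used in each local fit. This is a
    simplified illustration, not the routine used for the chart.
    """
    n = len(xs)
    k = max(2, int(round(span * n)))  # neighbors per local fit
    fitted = []
    for x0 in xs:
        # The distance to the k-th nearest point sets the bandwidth.
        dists = sorted(abs(x - x0) for x in xs)
        h = dists[k - 1] or 1.0
        # Tricube weights: near points count most, far points get zero.
        w = [(1 - min(abs(x - x0) / h, 1.0) ** 3) ** 3 for x in xs]
        sw = sum(w)
        mx = sum(wi * x for wi, x in zip(w, xs)) / sw
        my = sum(wi * y for wi, y in zip(w, ys)) / sw
        sxx = sum(wi * (x - mx) ** 2 for wi, x in zip(w, xs))
        sxy = sum(wi * (x - mx) * (y - my) for wi, x, y in zip(w, xs, ys))
        slope = sxy / sxx if sxx > 0 else 0.0
        fitted.append(my + slope * (x0 - mx))
    return fitted


# Hypothetical non-overlapping releases (day index, candidate percentages).
days = list(range(10))
obama = [47, 48, 46, 49, 48, 50, 49, 51, 50, 52]
mccain = [45, 44, 45, 44, 45, 43, 44, 43, 44, 42]

# The margin is the Obama percentage minus the McCain percentage.
margins = [o - m for o, m in zip(obama, mccain)]
trend = loess(days, margins, span=0.6)
```

A larger `span` gives a smoother, less sensitive line, and a smaller one follows the individual releases more closely.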
I was also curious how the trends would look if we simply "connected the dots" between the non-overlapping releases of the Gallup and Rasmussen tracking surveys, as well as the "traditional" USA Today/Gallup results (based on "likely voters"). You see that below.
The Gallup Daily line looks more variable than what you are used to seeing on Gallup's Daily release, for three reasons: the time scale is more compressed, we are plotting the Obama-McCain margin rather than separate lines for each candidate, and we are plotting only every third or fifth day, which eliminates the "smoothing" effect of the overlapping intermediate samples.
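To see why dropping the overlapping releases removes that smoothing, here is a small sketch with made-up nightly margins. The function names and the three-night window are assumptions for illustration, not a description of how Gallup or Rasmussen process their data.

```python
def rolling_mean(nightly, window=3):
    """Each release averages the most recent `window` nights."""
    return [sum(nightly[i - window + 1:i + 1]) / window
            for i in range(window - 1, len(nightly))]


def non_overlapping(releases, window=3):
    """Keep every `window`-th release so no two share a night."""
    return releases[::window]


# Hypothetical nightly margins: a three-night bump, then a return.
nightly = [2, 2, 2, 8, 8, 8, 2, 2, 2]

daily_releases = rolling_mean(nightly)
print(daily_releases)                   # [2.0, 4.0, 6.0, 8.0, 6.0, 4.0, 2.0]
print(non_overlapping(daily_releases))  # [2.0, 8.0, 2.0]
```

Consecutive daily releases share two of their three nights, so the line climbs in small steps. The non-overlapping releases share no interviews, so the same shift shows up as a single jump.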
What conclusions do you draw?
Update: In the comments, PatrickM asks:
As to the sampling process for the Gallup tracking survey: I thought the purpose of the tracking survey was to draw a discrete sample each night. Since completion quotas are set for each night, non-respondents must necessarily be "replaced" for that night's calls. Theoretically, all these replacements should balance out if non-response is random.
But Gallup seems to be taking a second bite at this apple by drawing an entirely new sample on the second night and supplementing it with non-respondents from the first night until the nightly completion quota is reached. So theoretically, the 3-day rolling results could include data from the originally drawn sample point AND its doppelganger replacement phone number.
I'm not a sampling expert, but is there anybody out there who can describe the rationale behind why this is OK?
The best explanation I have seen of "rolling cross section design" (a more technically correct term than "rolling average") is Kate Kenski's description of the National Annenberg Election Survey (NAES) in Chapter 4 of Romer, Kenski, Winneg et al., Capturing Campaign Dynamics 2000 & 2004.
The NAES, ongoing now for 2008 but mostly held back for academic analysis, uses the same general "tracking design" as Gallup, but with far more rigor: In 2004, the protocol involved dialing non-contacts as many as 18 times over as many as 14 days.
I won't try to summarize the whole chapter, but this paragraph gets closest to answering Patrick's question:
What is important to note here is that there were strict procedures in place so that no telephone number was treated differently from any of the other numbers selected. Telephone numbers released on Tuesdays were not handled differently from telephone numbers released on Fridays. This protocol ensures that the probability of being interviewed is a random event. By stabilizing the proportion of respondents who completed an interview after having been called numerous times, the representativeness of the daily cross-sections is maximized.
Why is it important that the date of the interview be a random event? If the date of interview is random, then the characteristics of the sample on any given day will not vary systematically.