12/08/2008 03:58 pm ET | Updated May 25, 2011

The Fluctuating Convergence Mystery

The "convergence mystery" gets even more mysterious.


In Survey Practice, I initially raised the question of why the national presidential polls showed a great deal of variance in their results during the month of October, but then converged to a relatively tight cluster in the final predictions. Mark Blumenthal then calculated the variance among state polls, showing that they also exhibited much greater variance during October than in their final predictions.


He suggested that the phenomenon was probably not a deliberate effort by pollsters to change their numbers. Instead, he proposed that pollsters whose numbers were outliers probably looked to see if their polls needed "fixing" - and, sure enough, found reasons to adjust their numbers closer to the mean. Hence the convergence at the end of the campaign.


My original analysis was a weekly average of the national polls, while Mark's was a weekly average of selected state polls (including 12 battleground states with at least 20 polls in October/November). In a further analysis, I looked at the eight tracking polls from October 4 through November 2. The group includes four daily tracking polls for the whole time period, and four that started a bit later - two on Oct. 6, one on Oct. 12, and the last on Oct. 16.


Shown below is the overall graph of their results.


[Graph 1: results of the eight tracking polls, Oct. 4 - Nov. 2]

A quick examination shows a couple of times when the polls converged to a tight cluster before expanding to much greater differences - around Oct. 18 and again around Oct. 28-29.


The next graph shows the same polls, but with the "variance" plotted on the same graph (the pink line). What I hadn't noticed in the graph of all the polls are the three spikes in variance shown below.


[Graph 2: the same polls, with the variance plotted as a pink line]
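
The article does not say exactly how the "variance" was computed. As a minimal sketch, assuming it is the ordinary sample variance of the margins reported by the different polls on the same day, the calculation might look like this in Python. The function name and the numbers are hypothetical illustrations, not the 2008 tracking data.

```python
from statistics import variance


def daily_variance(margins_by_day):
    """Map each day to the sample variance of that day's poll margins.

    Days with fewer than two polls are skipped, because a sample
    variance needs at least two readings.
    """
    return {
        day: variance(margins)
        for day, margins in margins_by_day.items()
        if len(margins) >= 2
    }


# Hypothetical margins (one candidate's lead, in points) from four polls.
# These are made-up numbers for illustration, NOT the 2008 results.
example = {
    "day 1": [5.0, 6.0, 5.0, 7.0],   # polls clustered tightly
    "day 2": [2.0, 9.0, 4.0, 11.0],  # polls disagreeing widely
}

result = daily_variance(example)
for day, value in result.items():
    print(day, round(value, 2))
```

A day on which the polls agree yields a small value, and a day on which they scatter yields a large one, which is what a spike in the pink line would represent under this assumption.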

The next graph is a scatterplot of the variance. The linear regression line indicates a significant decline in variance over the month of October, though clearly there are spikes.


[Graph 3: scatterplot of the variance, with linear regression line]
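
For readers who want to see what such a trend line involves, here is a rough sketch of an ordinary least-squares fit of variance against day number. It assumes the regression in the graph is a simple straight-line fit. The helper name and the input values are hypothetical, and the sketch computes only the slope and intercept, not a significance test.

```python
def ols_slope_intercept(xs, ys):
    """Return (slope, intercept) of the least-squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of squared deviations of x, and of cross-products of x and y.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical daily variances with one spike on day 2.
# Made-up numbers for illustration, NOT the 2008 tracking data.
days = [0, 1, 2, 3, 4, 5]
variances = [6.0, 5.0, 9.0, 4.0, 3.0, 3.0]

slope, intercept = ols_slope_intercept(days, variances)
print("slope per day:", round(slope, 3))
```

A negative slope corresponds to variance falling over the period even though an individual day can sit well above the fitted line.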

The last graph shows the day-by-day fluctuation, with three major spikes, one following each of the three October debates.


[Graph 4: day-by-day fluctuation, showing three major spikes]

The first spike occurs on Oct. 7, the second on Oct. 12-13, and the last on Oct. 20-22. In each case, the spike begins five days after a debate. Keep in mind that the daily tracking polls typically report rolling averages of about three days, so a spike that first shows up fully on day five reflects a shift in the interviews that began about two days after the debate.


These results add to the mystery of convergence, because they 1) show an overall decline in variance over the month, and 2) nevertheless show sudden and temporary spikes in variance, each traceable to interviews starting just two days after a vice presidential or presidential debate. The largest spike follows the third presidential debate on Oct. 15, though it is not fully reflected in the 3-day tracking polls until five days after the debate.


The delayed spikes can be accounted for in this way: The vice presidential debate took place on Oct. 2. The next day, the networks broadcast their interpretations of the debate, and the following day, the polls begin to show quite different results. The debate effect is not complete until the end of the 3-day tracking period, which would mean the first full results would be manifest on Oct. 7, five days after the debate.
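
The date arithmetic in that account can be sketched in a few lines of Python. This is only an illustration under stated assumptions: a two-day lag before interviews reflect the debate (one day for media coverage, then the first affected interview day), a 3-day rolling window, and publication on the day after the window's last interview day. The function and parameter names are hypothetical.

```python
from datetime import date, timedelta


def first_full_release(debate_day, reaction_lag_days=2, window_days=3):
    """Return the first day a rolling average consists only of post-reaction interviews.

    Assumes results are published the day after the last interview
    day in the window (an assumption, not stated in the article).
    """
    first_affected = debate_day + timedelta(days=reaction_lag_days)
    last_interview = first_affected + timedelta(days=window_days - 1)
    return last_interview + timedelta(days=1)


# Debate dates as given or implied in the article.
debates = {
    "vice presidential": date(2008, 10, 2),
    "second presidential": date(2008, 10, 7),
    "final presidential": date(2008, 10, 15),
}

releases = {name: first_full_release(day) for name, day in debates.items()}
for name, day in releases.items():
    print(name, "->", day.isoformat())
```

Under these assumptions the three dates come out as Oct. 7, Oct. 12, and Oct. 20, five days after each debate.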


Similar scenarios suggest that five days after each of the two succeeding debates, new spikes should occur - and they do. Oct. 12 (five days after the second presidential debate) and Oct. 20 (five days after the final presidential debate) find the beginnings of spikes - the first lasting two days, and the second lasting three days, before beginning the downward movement.


There is one last minor spike, from the end of October to the final prediction figures. It's hard to tell if this is random noise, or part of a predictable pattern.


In any case, the mystery is this: Why do the eight tracking polls show more variance in results following the debates? What is there about the debates that would cause different polls to show greater inconsistencies in results than normal? And why do the polls show a month-long decline in variance, except for the three temporary spikes?


I think that Mark's initial suggestion - that pollsters with outlying results tend to "fix" their methodology, and thus have their polls converge toward the mean - may need to be re-examined in light of the tracking poll data. The decline in the variance is gradual over the month, but interrupted by the debate-generated spikes.


Please offer any theories you might have that could explain this phenomenon.


