As noted yesterday, we are now following the PoliticsHome UK poll tracking and seat projection model developed by political scientists Robert Ford, Will Jennings, Mark Pickup and Chris Wlezien. I asked Ford if he could explain how their efforts differ from the model developed by Nate Silver and his colleagues at FiveThirtyEight.com. This is his response.
How does our model differ from Nate Silver's recently unveiled model of UK elections? The very brief answer is that our model involves applying a modified version of "uniform swing" - the same change of vote in each seat, with some modifications - while Nate's involves proportional swing, where the change in each seat depends on how strong each party was there beforehand. Under Silver's model, we should see a greater swing against Labour where Labour start more strongly, and this effect should increase in proportion to Labour's starting strength.
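To make the distinction concrete, here is a minimal sketch of the two ideas in their simplest textbook form. The multiplicative version of proportional swing and all the figures are assumptions for illustration; neither function is the exact specification of either model.

```python
def uniform_swing(prev_share, national_prev, national_now):
    """Add the same national change (in points) to every seat."""
    return prev_share + (national_now - national_prev)


def proportional_swing(prev_share, national_prev, national_now):
    """Scale the seat's previous share by the national ratio (assumed form)."""
    return prev_share * (national_now / national_prev)


# Illustrative figures only: a party falls nationally from 36% to 27%.
for seat_share in (20.0, 40.0, 60.0):
    u = uniform_swing(seat_share, 36.0, 27.0)
    p = proportional_swing(seat_share, 36.0, 27.0)
    print(f"start {seat_share:.0f}%: uniform {u:.1f}%, proportional {p:.1f}%")
```

Under uniform swing the party loses 9 points everywhere; under the proportional form it loses a quarter of its previous share, so the loss is larger where it started stronger.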
Empirically, there is little support for Nate Silver's conception of proportional swing, as shown in this recent paper by my colleague David Voas.
There is no evidence of larger swings in recent elections (including 1997) where parties start off more strongly. There is some evidence that swings are larger where the parties are competing more closely, but in our view Nate's model is a poor way to capture this dynamic.
We agree with Nate that there is plenty of evidence that a naive application of uniform swing is misleading; however, we feel the best approach is to improve on uniform swing rather than abandon it entirely. Two major factors are seldom accounted for in popular applications of uniform swing. Firstly, uniform swing is generally applied deterministically, making no allowance for random variation in swing between seats. Secondly, it is applied too rigidly, making no allowance for systematic deviations identified in the data. We apply a probabilistic model, based upon a formula developed by John Curtice and David Firth for application in the 2005 General Election, where it was employed very successfully to project the result from exit polls. The model allows for a non-normal distribution in swing variations, and calculates a probability of each party winning each seat based on the vote shares expected (from opinion polls or exit polls). The seat totals are simply the sum of the probabilities.
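The difference between deterministic and probabilistic seat counting can be sketched as follows. This is a simplified two-party illustration that swaps in a normal distribution for the non-normal one described above; the standard deviation of 4 points and the list of projected leads are hypothetical, and this is not the Curtice-Firth formula itself.

```python
import math


def win_probability(projected_lead, sd=4.0):
    """P(actual lead > 0) if the actual lead is Normal(projected_lead, sd)."""
    return 0.5 * (1.0 + math.erf(projected_lead / (sd * math.sqrt(2.0))))


# Hypothetical projected leads (in points) for one party in five seats.
projected_leads = [12.0, 3.0, 0.0, -2.0, -15.0]

# Deterministic counting: a seat is won outright if the projected lead is positive.
deterministic_seats = sum(1 for lead in projected_leads if lead > 0)

# Probabilistic counting: the seat total is the sum of the win probabilities.
expected_seats = sum(win_probability(lead) for lead in projected_leads)

print(f"deterministic: {deterministic_seats}, expected: {expected_seats:.2f}")
```

The tied seat and the narrow loss each contribute a fraction of a seat to the expected total, which deterministic counting ignores.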
This model also incorporates systematic differences in swing suggested by the polling data. We anticipate stronger Conservative performance in the marginal seats where they are competing directly with Labour by allowing an extra 2 points of swing to them in such seats. We also anticipate a different pattern of party performance in Scotland - which has its own government and a different party system - by incorporating the latest polling data estimates from Scotland, and adjusting the change in England and Wales to ensure the aggregate change still sums to the same total. These adjustments are based on differentials which have shown up robustly in several recent polls of marginal constituencies and of Scotland.
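The arithmetic behind these two adjustments can be sketched roughly as below. The function names, the 9% Scottish vote weight and all vote figures are hypothetical, and reading "2 points of swing" as 2 points added to the Conservatives and 2 taken from Labour is an assumption about the convention used.

```python
def rest_change(national_change, scotland_change, scotland_weight):
    """Change needed outside Scotland so the weighted average equals the national change."""
    return (national_change - scotland_weight * scotland_change) / (1.0 - scotland_weight)


def add_marginal_swing(con_share, lab_share, extra_swing=2.0):
    """Shift extra_swing points from Labour to the Conservatives in a marginal seat."""
    return con_share + extra_swing, lab_share - extra_swing


# Hypothetical inputs: Labour down 8 points nationally but only 2 in Scotland,
# with Scotland casting 9% of the votes.
outside = rest_change(-8.0, -2.0, 0.09)
check = 0.09 * -2.0 + 0.91 * outside
print(f"Labour change outside Scotland: {outside:.2f}, national check: {check:.2f}")

con, lab = add_marginal_swing(38.0, 36.0)
print(f"Lab-Con marginal after extra swing: Con {con:.1f}, Lab {lab:.1f}")
```

Because Scotland is assumed to move less than the national figure, the change applied elsewhere has to be slightly larger for the totals to agree.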
Nate also makes a variety of adjustments of this kind, but his changes are not as well grounded in empirical evidence from the polling data. Firstly, the transition matrix he applies to vote shares is based upon a weak evidence base - while pollsters provide details of respondents' recalled 2005 vote, the transition matrices calculated from this are subject to bias due to respondents' tendency to misremember their votes - in particular remembering voting for the winning party when they did not. This phenomenon is well established, and British pollsters attempt to correct for it in their weighting. However, any model which uses transitions in vote from polling data is likely to overestimate the extent of switching from the current governing party to opposition parties, because many people who say they voted for the governing party last time did not actually vote for them. We suspect this may contribute to Nate's high estimate of change from Labour to the opposition parties.
Secondly, the changes Nate makes for regional differentials in swing are based on polling data that is two years old and was collected in a very different political environment to the current one - the Conservatives were a long way ahead in the polls while the Lib Dems were far below their current tally. We considered incorporating regional swings based on this data, but rejected the change due to the age of the data. We incorporate changes for Scotland as we have a good evidence base from Scotland specific polling, which is regularly updated.
We do not attempt to model "tactical voting" or the effects of incumbent retirements, because we simply do not have good quality, recent data on the pattern or level of such effects. Our own regression analysis of incumbent effects did not reveal robust effects of incumbent retirements in recent elections, so we are rather surprised to learn that Nate has uncovered some. Modelling effects such as these, where the statistical evidence is weak, requires making strong assumptions. We prefer not to make such assumptions, sticking only to effects where the evidence base is very strong.
On top of our votes to seats projection, we also make efforts to develop a robust estimate of current public opinion. Nate freely admits that his public opinion figures are "educated guesses based on recent cross-tabular results". We employ a state space model to estimate current public opinion every few days, while controlling for systematic "house effect" differences between the pollsters and differences in the sample sizes they employ in their polls. The polling data fed into our model is therefore based on a more systematic aggregation of available public opinion, although, to be fair, our current estimate of public opinion is quite close to Nate's.
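For readers unfamiliar with state space models, the general idea can be sketched with a simple random-walk (local level) Kalman filter. This is a generic illustration, not our actual model: the house effects are treated as known rather than estimated, and the pollster names, poll figures and variance settings are all hypothetical.

```python
def kalman_poll_tracker(polls, house_effects, start_mean, start_var, drift_var):
    """Track one party's share; polls is a date-ordered list of (pollster, share_pct, sample_size)."""
    mean, var = start_mean, start_var
    for pollster, share, n in polls:
        # Predict: true opinion may drift between polls (random walk).
        var += drift_var
        # Remove the pollster's assumed systematic bias (house effect).
        adjusted = share - house_effects.get(pollster, 0.0)
        # Sampling variance of a percentage from a simple random sample of size n.
        obs_var = share * (100.0 - share) / n
        # Update: larger samples have smaller variance and so get more weight.
        gain = var / (var + obs_var)
        mean += gain * (adjusted - mean)
        var *= 1.0 - gain
    return mean, var


# Hypothetical polls and house effects, in percentage points.
polls = [("A", 34.0, 1000), ("B", 36.0, 2000), ("A", 35.0, 1000)]
house_effects = {"A": -0.5, "B": 1.0}

estimate, uncertainty = kalman_poll_tracker(polls, house_effects, 35.0, 4.0, 0.1)
print(f"estimated share: {estimate:.2f}% (variance {uncertainty:.2f})")
```

Each poll pulls the estimate towards its house-adjusted figure, by an amount that depends on how precise the poll is relative to the current uncertainty.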
To sum up, we believe our model has a stronger basis in existing analysis of UK voting patterns, and is based upon techniques that were employed successfully in 2005. Our approach is more sophisticated than other available UK resources, both in terms of its poll aggregation technique and in terms of its seat projection technique. We disagree with Nate's claim that uniform swing models are a low bar to clear - a model based upon a modified uniform swing approach, which employed the probabilistic techniques we use, got the Labour majority in 2005 exactly right based upon exit poll data and early seat declarations. This looks to us like rather a high bar to clear!
Of course, this election is perhaps the most difficult to predict since polling began in Britain, and it may be that uniform swing fails miserably, and that proportional swing of the form Nate proposes manifests strongly next Thursday. We prefer to navigate these uncharted waters with tried and tested methods as a guide; Nate suggests that a radically new environment requires radically new methods. We will all know for sure in a week!
For those interested in learning more, the model used to forecast the 2005 election based upon exit poll data and early results is detailed here. Our seat projection techniques are based on those used in this model.
Further details of the model are also available on our PoliticsHome.com page.