We might pick up a few very late polls this election morning, but even so, it is well and truly time to list some forecasts.
National popular vote. The last run of the model tonight produced 50.8% two party vote share for Obama. Since this is a two party estimate, the Romney estimate is 49.2%.
Modeling proportions of "Other", "Undecided", "DK" has never been a prominent piece of the work I've done for Pollster over the campaign. Estimates of these segments of the electorate are largely dominated by house effects, and a slow downward trend over the course of the campaign.
But if pushed, we'll say 1.5% for "Other", and ramp down the two-party estimates proportionally. That is, 50.1% Obama, 48.4% Romney, 1.5% Other.
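As a rough illustration of the proportional ramp-down, here is a minimal sketch. The function name `rescale_two_party` is made up for this example, and it starts from the rounded 50.8% figure, so its output can differ from the numbers quoted above by about a tenth of a point; the quoted figures presumably come from the model's unrounded estimate.

```python
def rescale_two_party(obama_two_party, other_share):
    """Shrink two-party shares (in percent) to make room for an 'Other' share."""
    # Fraction of the electorate left for the two major-party candidates.
    scale = (100.0 - other_share) / 100.0
    obama = obama_two_party * scale
    romney = (100.0 - obama_two_party) * scale
    return obama, romney, other_share


# Rounded inputs taken from the text: 50.8% two-party share, 1.5% Other.
obama, romney, other = rescale_two_party(50.8, 1.5)
print(f"Obama {obama:.2f}%  Romney {romney:.2f}%  Other {other:.2f}%")
```

The three shares sum to 100 by construction, and the ratio of Obama to Romney support is unchanged by the rescaling.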
Battleground states. Obama ends the campaign with poll leads and forecast wins in all of the battlegrounds except North Carolina. We are not especially confident about the forecast for Florida, with the model estimating that Obama wins there with only 59% probability. Even the North Carolina call for Romney is not especially confident, with just a 71% chance of a Romney win. Nonetheless, in each case, our calls go with the balance of probabilities: North Carolina to Romney, but the remainder of the battlegrounds staying with Obama. Note that North Carolina and Indiana are the only 2008 Obama states predicted to fall to Romney.
The model confidently predicts that Obama wins the election, with the probability of 270 Electoral College votes or more now up to 91.4%.
Our sense is that Electoral College count predictions will be graded on a continuous metric, not on an "all or nothing" basis. The mean, 317.2, is wrong with certainty, and on that basis many would think it a dubious or even frivolous forecast; but it does hedge against our considerable uncertainty with respect to the outcome in Florida (and other states, for that matter). The median of 332 is rationalized by linear loss -- as well as being the "no guts, no glory" estimate.
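To see why the choice of loss function matters, consider a small sketch. The distribution below is a toy example with invented outcomes and probabilities, not the model's output. Under squared-error loss the best point forecast sits at the mean; under linear (absolute-error) loss it sits at the median.

```python
# Hypothetical toy distribution over Obama's Electoral College count.
# These outcomes and probabilities are invented for illustration only.
outcomes = [281, 303, 332, 347]
probs = [0.10, 0.30, 0.45, 0.15]


def expected_loss(guess, loss):
    """Expected loss of a point forecast under the toy distribution."""
    return sum(p * loss(guess, x) for x, p in zip(outcomes, probs))


def squared(guess, actual):
    return (guess - actual) ** 2


def absolute(guess, actual):
    return abs(guess - actual)


candidates = range(270, 361)  # integer forecasts to compare
best_squared = min(candidates, key=lambda g: expected_loss(g, squared))
best_absolute = min(candidates, key=lambda g: expected_loss(g, absolute))
mean = sum(p * x for x, p in zip(outcomes, probs))

print("mean of toy distribution:", mean)
print("best forecast under squared loss:", best_squared)
print("best forecast under absolute loss:", best_absolute)
```

In this toy case the squared-loss forecast is the integer nearest the mean, a count that no combination of states may actually produce, while the absolute-loss forecast is the median, which is always an attainable outcome.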
I would point out that it is hard enough coming up with an Electoral College count prediction (or maybe not, given the run of polling for Obama in the last few days), let alone trying to guess the loss function with which others will "grade" the performance of forecasts. Researchers at the University of Pennsylvania contacted me (and others in this business) seeking my probabilistic forecasts for battleground states; I'm more than happy to send these along and will do so after one last look at any last minute polls released tomorrow morning. The researchers said they would use Brier scoring to assess the probabilistic forecasts. Would that the rules of the forecasting game were always made so clear.
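For readers unfamiliar with Brier scoring, here is a minimal sketch of the common binary form: the mean squared difference between the forecast probability and the 0/1 outcome. The probabilities are the Florida and North Carolina figures quoted above, expressed as the chance of an Obama win; the outcomes are placeholders chosen purely for illustration, and how the researchers will actually implement the scoring is not something this sketch can speak to.

```python
def brier_score(forecasts, outcomes):
    """Mean squared error between forecast probabilities and 0/1 outcomes."""
    if len(forecasts) != len(outcomes):
        raise ValueError("forecasts and outcomes must have the same length")
    return sum((p - o) ** 2 for p, o in zip(forecasts, outcomes)) / len(forecasts)


# Probability of an Obama win: Florida 0.59, North Carolina 1 - 0.71 = 0.29.
forecasts = [0.59, 0.29]
# Placeholder outcomes (1 = Obama wins the state), assumed for illustration.
outcomes = [1, 0]

score = brier_score(forecasts, outcomes)
print(f"Brier score: {score:.4f}")
```

Lower is better: a perfect forecaster scores 0, and always saying 50% scores 0.25, so the score rewards confident calls only when they turn out to be right.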
Technical digression: a clarification. My Twitter feed has been busy with inquiries about how I do the Electoral College simulations. I replied a little tersely to some of these inquiries ("what is it about conditional independence that you don't understand?"; well, not literally, but...). I now understand that the fault is mine, that my exposition of the simulation methods I employ contained errors, and I apologize.
Here's the story. The model uses poll data (and house effect corrections) to generate estimates of Obama and Romney levels of support in the states (and at the national level). The modeling is done simultaneously: if you will, there are up to 52 latent quantities (e.g., Obama support in 50 states, the District of Columbia, plus the national level) moving over time, with polls giving us (noisy) snapshots as to where the latent targets might be on any given day. Patterns of correlation in historical election results supply prior information as to which latent targets are more likely to move together than others. The national trajectory is estimated with considerable precision (there is so much national polling in the last few months of a presidential election); moreover, by construction, fluctuations in the national level target will be correlated with state level fluctuations. Quite aside from the prior, historical information, the polling data strongly suggests correlated trajectories across states. All this is to say that there is a lot of "correlation across states" baked into the model.
Now, to generate a distribution over Electoral College outcomes, I sample repeatedly from the joint distribution of the latent quantities on a given day. Obama's level of support in Ohio on day t is correlated with Obama's level of support in, say, Iowa on that day (and in many other states too). For a given draw from the joint distribution, I simply note if Obama > Romney in state j = 1, ..., 51 (the 50 states plus the District of Columbia); if so, then I assign the Electoral College votes for state j to Obama, otherwise to Romney. Because I sample from the joint distribution for Obama support levels across states, the simulated Electoral College outcomes are likewise correlated. I do not compute the probabilities that Obama > Romney in each state (averaged across Monte Carlo samples) and then flip state-specific coins independently.
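The difference between the two approaches can be sketched in a few lines. This is a toy stand-in, not the actual model: the three states, their means, their Electoral College votes, and the two standard deviations are all invented, and the correlation comes from a single shared "national" shock added to every state rather than from a full joint posterior.

```python
import random
import statistics

# Hypothetical inputs; none of these numbers come from the actual model.
STATES = {
    # name: (mean Obama two-party share in percent, Electoral College votes)
    "A": (50.5, 29),
    "B": (51.5, 18),
    "C": (49.0, 15),
}
NATIONAL_SD = 1.5  # sd of a shock shared by every state (assumed)
STATE_SD = 1.0     # sd of state-specific noise (assumed)


def simulate(n_draws=20000, seed=1):
    rng = random.Random(seed)
    names = list(STATES)
    wins = {s: 0 for s in names}

    # Joint approach: each draw shares one national shock across states,
    # and Electoral College votes are tallied within that draw.
    joint_totals = []
    for _ in range(n_draws):
        shock = rng.gauss(0.0, NATIONAL_SD)
        total = 0
        for s in names:
            mean, votes = STATES[s]
            share = mean + shock + rng.gauss(0.0, STATE_SD)
            if share > 50.0:  # Obama > Romney in two-party terms
                total += votes
                wins[s] += 1
        joint_totals.append(total)

    # Contrast: average to state win probabilities, then flip
    # independent coins. This discards the cross-state correlation.
    p_win = {s: wins[s] / n_draws for s in names}
    coin_totals = []
    for _ in range(n_draws):
        coin_totals.append(
            sum(STATES[s][1] for s in names if rng.random() < p_win[s])
        )
    return joint_totals, coin_totals


joint_totals, coin_totals = simulate()
print("joint:", statistics.mean(joint_totals), statistics.pstdev(joint_totals))
print("coins:", statistics.mean(coin_totals), statistics.pstdev(coin_totals))
```

Both approaches give roughly the same mean vote count, but the joint draws produce a noticeably wider spread, because states tend to fall together. Independent coins would understate the chance of lopsided outcomes in either direction.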
I apologize for misleading readers in previous attempts to describe the modeling and simulation algorithms that underlie these forecasts.
Follow Simon Jackman on Twitter: www.twitter.com/simonjackman