Over the remainder of the election campaign, HuffPost will be presenting graphical summaries of the modeling I've been doing behind-the-scenes for them over the last few months. Some additional background on how it all works seems warranted.
The goal is to summarize the "state of play" in states where we have even a little polling, along with the national trend in voting intentions. Note right away that this isn't a forecast of what will happen in November on Election Day; but rather a synthesis of available polling -- the web site is called Pollster, not Forecaster.
That said, it is important to note that there is a lot more going on than just poll averaging.
Every couple of hours, my computer programs check to see if there are any new polls in the Pollster database. We're at a point in the campaign now where tempo of polling is ramping up and polls are being published throughout the working day. But even on a quiet day, survey houses like Rasmussen or Gallup are updating their rolling, national tracking polls. So at least once a day (but usually more often than that), my computer programs find new polls to integrate into the estimate of where each state is.
The goal is to form today's best estimate of the national and state-by-state levels of support for Obama and Romney. If we literally had no new polling on a given day, then today's best guess is simply yesterday's estimate. But the more interesting case is when, say, we have new national level polls, and a few state-level polls. How do we use that information to make estimates for all of the states we're tracking, even for states without new polling data?
This is where the modeling comes in. My model assumes that states (and the national track) move together, but in clusters. That is, some states follow other states more closely than others. Further, some states follow the national trend more closely than others. I rely on historical election returns to help assign states to clusters, and for initial guesses as to the strength of the clustering; in exploiting the historical data I upweight recent presidential elections relative to less recent ones.
There is a big geographic component to the clustering (e.g., New England, the Plains, the deep South, the Pacific West). Since the national outcome is simply the sum of what happens in the states, all states track the national outcome to some extent; that said, in recent U.S. presidential elections some states track the national swing more closely than other states. For example, Indiana, New York, Arizona, New Mexico and (of course) Ohio and Michigan have tracked the national swing much more closely than, say, Mississippi and South Dakota. Accordingly, polls in, say, Ohio and Michigan tend to more informative about the national trend (and vice versa) than polls from the deep South or the Plains.
In turn, this means that even though we may not have polls in every state on every day, we've got a plausible (if imprecise) model of movement in one state that might be related to movements in another state, and a model that links up state and national level changes in the vote. We'd always like more data than less, but the model lets us produce estimates day by day, even without a steady stream of polling data from each state, "borrowing strength" from state with polling data (or the national level) through the clustering.
The model also deals with the fact that different polls have different sample sizes, and so not all polls contribute equally to the model's estimates. The model also corrects for "house effects" that I talked about in an earlier post, the tendency for some survey houses to produce estimates that are systematically higher or lower for one candidate than other pollsters.
These house effect corrections are vital, letting us deal with the fact that some pollsters contribute many more polls than others; e.g., Rasmussen contributes 150 of the 720 presidential election polls in Pollster's collection of 2012 Obama-Romney head-to-head polls, Public Policy Polling contributes 103 polls, with another 109 survey houses contributing the remaining 467 polls. Without the house effect corrections, the model's estimate would be overwhelmed by any bias contained in the estimates produced these more prolific pollsters. In future posts I'll dig into the specific estimates of house effects, with a specific focus on likely voter versus registered voter filters.
In addition, I also aggressively down-weight polls that come from "one-off" pollsters (appearing just once in the Pollster data) and/or polls that are radically at odds with the rest of the polls for a given state.
So, for each day, for about 38 states (states where we have at least one poll in 2012) -- and for the national level -- I obtain an estimate of the state of play. The estimate is only that -- an estimate -- and so comes with some uncertainty. The uncertainty comes from multiple sources: (1) sampling error in the polls themselves; (2) uncertainty about the house effect corrections; (3) uncertainty about how quickly vote intentions are changing; (4) uncertainty about the strength of the correlations between and within clusters of states and the national level.
I take these sources of uncertainty into account when producing state-by-state probabilities of winning and the Electoral College summary. In states with less polling, we have more uncertainty about the state of play. Happily, these are generally states where the outcome isn't in much doubt (e.g., Utah).
Now that we're well and truly past the conventions, we're going to see a lot more state-level polling, particularly in "battlegrounds" like Ohio, Virginia, Pennsylvania and Florida. Already these states dominate the state-level polling in the Pollster data base: e.g., 41 polls from Florida, 38 from Ohio, 15 from New York, 13 from California, 1 from Utah. More data from more states will mean less uncertainty in the state-by-state estimates and less reliance on the model and its underlying assumptions.
In the meantime, please enjoy the estimates and graphs I'm producing for Pollster.
Follow Simon Jackman on Twitter: www.twitter.com/simonjackman
![]() |
![]() |
|
| Obama | Romney | |
|---|---|---|
| Electoral Votes (270 to win) |
332 | 206 |
| Obama | Romney | |
|---|---|---|
| Total | 65,899,660 | 60,932,152 |
| Percent | 51.1% | 47.2% |
| Democrats* | Republicans | |
|---|---|---|
| Current Senate | 53 | 47 |
| Seats gained or lost | +2 | -2 |
| New Total | 55 | 45 |
| Democrats | Republicans | |
|---|---|---|
| Seats won | 201 | 234 |
How very enlightening. I do find it interesting that in your elitist attitudes towards limiting the voters pool, you carefully avoided anything that could be applied to you.
To find that you, Jimmyforlife, are an advocate for the repeal of the 17th is not really a surprise to me, but, rather, an anticipated attitude.
Just for your edification: non-citizens and convicted felons[until they have petitioned for return of their civil rights and had them restored] are not members of the voting public as it is.
As of right now gallup has Romney ahead by 6 pts amongst likely voters...http://www.gallup.com/poll/154559/US-Presidential-Election-Center.aspx?ref=interactive
I wonder where Huffpo's electoral map would be if they added this poll into their calculations as they should!!!!
http://elections.huffingtonpost.com/pollster/2012-general-election-romney-vs-obama
But it's the Electoral College that matters, hence state polling. It doesn't matter if Romney gets 80% of the vote in the South (hence boosting his standing in national polls) if he's behind in swing states.
270
270
270
I don't care how we get there so long as we get there.
OHIO
OHIO
OHIO
So you "preselect" what polls to use in your model?
Whether they are selling advertising time to campaigns or merely collecting eyeballs and selling advertising time to to traditional merchants, the media outlets - including Huffpost - are making money hand over fist from this manufactured uncertainty and tweaking the nation's collective blood pressure. It is another implementation of the fear-mongering that those in love with power use to try to drive public sentiment, and the willing perpetuation by the media of shoddy information (lies) for corporate gain.
There is nothing newsworthy in these polls. When American voters pull together the gestalt of each candidate, Obama is going to win by a comfortable margin with >56% of the popular vote and 300+ electoral votes. He will enjoy a well-deserved legacy of being one of the least popular presidents of modern time, but he will win easily.
God help us all when the election is over and the Media juggernaut of this country has to find some other way of sucking revenue out of our tired economy.
PolitiFact has chronicled 19 “pants on fire” lies by Mr. Romney and 7 by Mr. Obama since 2007, but Mr. Romney’s whoppers have been qualitatively far worse: the “apology tour,” the “government takeover of health care,” the “$4,000 tax hike on middle class families,” the gutting of welfare-to-work rules, the shipment by Chrysler of jobs from Ohio to China. Said one of his pollsters, Neil Newhouse, “We’re not going to let our campaign be dictated by fact checkers.”
To be sure, the Obama campaign has certainly had its own share of dissembling and distortion, including about Mr. Romney’s positions on abortion and foreign aid. But nothing in it — or in past campaigns, for that matter — has equaled the efforts of the Romney campaign in this realm. Its fundamental disdain for facts is something wholly new.
The voters, of course, may well recoil against these cynical manipulations at the polls. But win or lose, the Romney campaign has placed a big and historic bet on the proposition that facts can be ignored, more or less, with impunity.