I want to take a close look at our House of Representatives summary scorecard, partly to address some of its shortcomings, but mostly to try to get an overall sense of what the available public polling is telling us about the likely outcome. This post may get a bit long and a little esoteric, but there are ultimately two big takeaway messages: The first is that we still see a remarkably large number of races -- 25 to 50 depending on which polls you trust -- where public polling data is inconclusive. The second is that if we assume that the pure "toss-up" races split about evenly between the parties, the Democrats stand to gain 30 to 35 seats on Tuesday.
Many of you have posted comments or sent email asking why we took the approach that we did for classifying House ratings. We did consider many different approaches. None were perfect. We ultimately chose the simplest - replicating the approach used for the statewide races - because the process of creating something more complicated required far more time and computer programming than we had available. In this post I want to try to look at what some alternative approaches would tell us about who is ahead and who is behind.
First, let's review how our scorecard classifications work. We take the average of the last five random sample polls in each district and then classify each district based on the statistical significance of the leader's margin. We classify races where a candidate leads by at least one standard error as "leaning," and we classify leads of at least two standard errors as "strong." The rest we classify as toss-ups, meaning that the surveys provide no conclusive evidence about which candidate is ahead. If no polling is available, we assume no change in party and assign it a "strong" status for the incumbent party.
That last step is important for House races, because we can find no public poll data for 351 of the 435 districts. However, very few of those missing districts are considered even potentially competitive by the various respected handicappers. We currently itemize seven theoretically competitive seats as "no-poll" in the scoreboard (because the Cook Political Report listed these among the seats with the "potential" to become competitive), but Cook considers five of seven incumbents in these districts "likely" to be reelected (i.e. "not considered competitive at this point").
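For readers who think in code, the classification rule described above can be sketched roughly as follows. This is a minimal illustration, not the actual scorecard program. The function names are hypothetical, and the standard error formula (the textbook multinomial formula applied to the pooled sample size of the averaged polls) is my assumption, since the description above does not specify which formula is used.

```python
import math


def margin_standard_error(p_lead, p_trail, n):
    """Standard error of the gap between two shares from the same sample.

    Assumption: textbook multinomial formula; shares are fractions (0.48 = 48%).
    """
    diff = p_lead - p_trail
    return math.sqrt((p_lead + p_trail - diff ** 2) / n)


def classify(polls, incumbent_party):
    """Classify a district from (dem_share, rep_share, sample_size) tuples.

    `polls` is ordered oldest to newest; only the last five are averaged.
    """
    if not polls:
        # No public polling: assume no change in party.
        return f"strong {incumbent_party}"

    last_five = polls[-5:]
    dem = sum(p[0] for p in last_five) / len(last_five)
    rep = sum(p[1] for p in last_five) / len(last_five)
    pooled_n = sum(p[2] for p in last_five)  # assumption: pool the samples

    if dem == rep:
        return "toss-up"

    leader = "Democrat" if dem > rep else "Republican"
    se = margin_standard_error(max(dem, rep), min(dem, rep), pooled_n)
    z = abs(dem - rep) / se  # lead measured in standard errors

    if z >= 2:
        return f"strong {leader}"
    if z >= 1:
        return f"leaning {leader}"
    return "toss-up"
```

Under these assumptions, a single 500-person poll showing a 50% to 40% lead clears the two-standard-error bar, while a 47% to 45% lead in the same poll does not clear even one and stays a toss-up.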
So far, so good. But one big problem, as many of you have pointed out, is that polls in House districts are far less numerous than those in statewide contests. As such, a lot of those "last 5 poll" averages include some pretty stale results. While we have logged in more than 250 new House polls since October 1, there are still only 32 districts with five polls or more to average. Applying the "last 5 polls" filter still leaves 37 polls from September - and 25 polls from the summer months - contributing to the averages that we use to classify districts.
In some cases, those stale results can give a very distorted impression of where the race stands today. Consider Pennsylvania-07, the district currently represented by Republican Curt Weldon. We currently rate that district a toss-up, based on the average of five polls that includes two from September and one from March. Weldon trailed by an average of seven points in the two polls conducted in October - enough to shift the district to "strong" Democrat status.
So I put all of our House data into a big spreadsheet and did some "what-if" analysis. The first question I asked was, what would happen if we had applied a filter so that only polls released since October 1 could be included in our averages. Here is the result:
As the table shows, the net impact on the scoreboard is not dramatic but improves the lot of the Democrats: The count of seats at least leaning Democratic grows from 221 to 222, while the count at least leaning Republican drops from 187 to 184. The number of toss-up seats grows from 27 to 29, and all but two of those toss-up seats (Georgia-12 and Indiana-07) are currently held by Republicans.
The net changes on the scoreboard obscure a bit more reshuffling at the district level. For those keeping track: Four districts (Florida-16, New Hampshire-02, Ohio-2 and Pennsylvania-07) move from toss-up to Democrat, but three (Indiana-07, Iowa-1 and New York-20) shift from leaning or better Democrat to toss-up. Three more seats (Arizona-05, Kentucky-02 and California-50) move from Republican to toss-up based on unfavorable trends since September.
The table also reminds us of the relatively small number of surveys available in many of these districts. The good news is that the average number of polls per district drops only slightly (from 3.4 to 2.9) when we count only the October polls. The bad news is that more than half of the competitive districts have been polled two or fewer times (40) or not at all (12) in October.
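The October-only "what-if" boils down to dropping stale polls before taking the last-five average. Here is a minimal sketch of that filter. The function name and the poll numbers are invented for illustration (they are not the actual results from any district), and the real analysis was done in a spreadsheet, not in code.

```python
from datetime import date


def average_margin(polls, cutoff=None, max_polls=5):
    """Average Democratic-minus-Republican margin over the most recent polls.

    polls: list of (release_date, dem_pct, rep_pct) tuples.
    cutoff: if given, keep only polls released on or after this date.
    Returns None when no polls survive the filter.
    """
    kept = [p for p in polls if cutoff is None or p[0] >= cutoff]
    kept = sorted(kept, key=lambda p: p[0])[-max_polls:]
    if not kept:
        return None
    return sum(dem - rep for _, dem, rep in kept) / len(kept)


# Hypothetical district with one spring poll, two September polls
# and two October polls. All numbers are made up.
polls = [
    (date(2006, 3, 15), 40.0, 50.0),
    (date(2006, 9, 10), 44.0, 46.0),
    (date(2006, 9, 25), 45.0, 45.0),
    (date(2006, 10, 12), 50.0, 43.0),
    (date(2006, 10, 26), 51.0, 44.0),
]

last_five = average_margin(polls)
october_only = average_margin(polls, cutoff=date(2006, 10, 1))
print(last_five, october_only)
```

In this made-up district the last-five average is close to even, while the October-only average shows a seven-point Democratic lead, which is the kind of distortion the stale polls can introduce. A district with no October polls returns `None`, which corresponds to the "no-poll" case.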
While we're at it, there are a few more good "what if" questions we can ask....
[11/5 (11:45): Picking up where we left off yesterday...]
What about partisan polls? As I have noted previously, the House data includes quite a few internal campaign polls, roughly one of every four in our database, and polls from Democratic campaigns outnumber those from Republicans by more than four-to-one (85 to 20). Since October 1, one-in-five House polls have come from partisans, and again those polls have been released mostly by Democrats (42 to 10).
Do all these Democratic polls tilt our scoreboard in favor of the Democrats? Yes, but only slightly. If we focus on the averages filtered to include only polls released since October 1, removing the partisan polls leaves the number of Democratic seats unchanged at 222 and shifts a net two seats to the Republicans (from 184 to 186). The absence of favorable internal polls makes three potential Democratic pickups seem less likely (Florida-13, Nebraska-03, and Ohio-01), but also leaves three other Republican incumbents looking more vulnerable (New York-19, North Carolina-08 and New York-20).
What about the Majority Watch automated polls? Their two waves of October surveys account for roughly a third of the House district polls released in the last month, and as the table shows, removing them from the averages does reduce the Democratic advantage on the scorecard. Keeping our October-only filter on, the number of Democratic seats drops from 222 to 215 with the Majority Watch surveys also removed, while the number of Republican seats increases from 186 to 192.
What is driving the change? Removing the Majority Watch surveys changes our classification of 15 seats. In seven of these districts, Majority Watch conducted the only public polls released in October, and all seven were seats held by Republicans and classified as toss-ups or likely Democratic pick-up using their data. So without any polling data available, our model assumes "no change" and shifts all seven seats to the Republican column. In another eight seats, the absence of the Majority Watch surveys tips the balance in the averages just enough to shift our classification - 6 seats move toward the Republicans and 2 seats move to the Democrats.
Those changes raise an important question: How do the Majority Watch results differ from other pollsters in districts where we have other sources of data available? I count 40 districts in which public polls were released in October by both Majority Watch and other pollsters. So I went back to my big spreadsheet and averaged the averages for those 40 districts two ways: Once including only the Majority Watch surveys, once including only the results from other pollsters.
The results are a bit different. The Majority Watch surveys indicate a 3.3 point lead for the Democratic candidates in those districts (49.1% to 45.8%) compared to a 1.0 point lead by other pollsters (44.6% to 43.6%). But notice that the percentage going to undecided or third party candidates is more than twice as large on the traditional telephone surveys (11.8%) as on the Majority Watch automated surveys (5.1%). So we have two potential explanations for the difference: One is that the automated surveys reach different kinds of voters (who tend to be more opinionated and more Democratic in their preferences). Another is that both types of surveys reach the same mix of voters, but that the absence of a live interviewer better simulates the "secret ballot" and entices more uncertain voters to express their true preference for the Democratic candidates. Which theory explains the difference here? Take your pick.
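The arithmetic behind that comparison is simple enough to check in a few lines. The sketch below uses only the four averages quoted above. The function name is hypothetical, and the last calculation is my own back-of-the-envelope reading of the second theory, not a finding: it asks how the extra undecided voters on the traditional surveys would have to break for the two sets of averages to match.

```python
def summarize(dem_pct, rep_pct):
    """Return (Democratic margin, undecided/third-party share), one decimal."""
    margin = round(dem_pct - rep_pct, 1)
    other = round(100.0 - dem_pct - rep_pct, 1)
    return margin, other


# Averages across the 40 districts polled by both kinds of pollster.
majority_watch = summarize(49.1, 45.8)  # automated surveys
other_pollsters = summarize(44.6, 43.6)  # all other surveys

# How much higher each party runs on the automated surveys.
dem_gap = 49.1 - 44.6
rep_gap = 45.8 - 43.6

# Share of the extra decided vote that goes to the Democrats, assuming the
# whole difference comes from undecided voters stating a preference.
implied_dem_share = dem_gap / (dem_gap + rep_gap)

print(majority_watch, other_pollsters, round(implied_dem_share, 2))
```

Read this way, the gap in undecideds (about 6.7 points) would have to break roughly two-to-one for the Democrats to reconcile the two sets of numbers. That does not tell us which theory is right. It only shows what the second theory would require.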
Another question: What if we remove both the partisan and automated surveys? Unfortunately, at that point, this particular "model" essentially blows up because we have no polls to look at in 39 of the competitive districts. Since more than two thirds of the October "no-poll" districts (28 of 39) are currently held by Republicans, removing these polls shifts the scorecard in the Republican direction. Adding back the pre-October data nets us only five additional districts, but makes virtually no change in the scorecard numbers.
Still, even if we look only at the smaller number of districts with traditional live interviewer surveys conducted by independent pollsters, we see Democrats leading by statistically meaningful margins in nine Republican districts. Moreover, these same surveys show Democrats with significant leads in 11 districts currently held by Democrats and indicate "toss-up" races in another 20 seats now held by Republicans.
[11/5 - 4:30 p.m. - Back again. And finally...]
One more thought about the last paragraph. Those 20 "toss-up" races exclude 9 districts with no traditional polls released during October that are currently rated either "toss-up" or "lean Democrat" by the Cook Political Report.
But let me try to sum this up, following the same formula I used in discussing these results for the Slate Election Scorecard earlier in the week. The math is easier given one important finding: Not a single Democratic candidate in a district now held by a Democrat is currently trailing, regardless of the combination of polls examined. So the text and the table that follow focus on potential Democratic pickups.
- Eight seats currently held by Republicans show a Democrat leading by a statistically meaningful margin regardless of what combination of polls we look at: Arizona-8, Colorado-7, Indiana-2, Indiana-8, North Carolina-11, New Mexico-1, Ohio-18, and Pennsylvania-10.
- One seat deserves its own category: The one and only poll in the Texas-22 district formerly represented by Rep. Tom DeLay shows Democrat Nick Lampson leading. However, a complicated ballot (Republican Shelley Sekula-Gibbs is a write-in candidate) makes this result tenuous.
- Nine more Republican seats look to be in statistically meaningful jeopardy, but only when we count the automated Majority Watch surveys (either because those are the only surveys available or because they tip the balance making the Democrat's lead statistically meaningful): Florida-16, Iowa-1, New Hampshire-2, New York-24, New York-25, New York-26, New York-29, Ohio-15, Pennsylvania-6 and Pennsylvania-7.
- Three more Democrats would show significant leads if we include the internal surveys released by partisan pollsters: Florida-13, Nebraska-3, and Ohio-1.
To sum up: If you trust the automated Majority Watch surveys and assume a pickup in Texas-22, then Democrats are leading in exactly the 18 seats they need to win a majority. If you trust all polls (including those released by partisans on both sides), then they currently lead in enough districts to pick up 21 seats. And they are not currently trailing in any.
But even more important: Polls have been conducted in October in another 29 seats where the averages indicate a statistical tossup. Only two of these seats are currently represented by Democrats. How well the Democrats ultimately do depends on how many of these still-too-close-to-call races they ultimately win. If they split evenly, then Democrats are looking at a gain of between 29 and 34 seats depending on which polls you trust.
But wait -- we need to remember one very important caveat. Even if we exclude the pre-October surveys, we are still looking at something of a time-lapse "snapshot" of voter preference. If Republicans have made late gains nationally over the last week (and at least two new national surveys out today suggest that they have), then these results may overstate the likely Democratic gains. As usual, we will need to wait to see the actual results to know for certain.
UPDATE: On that last note, be sure to see the post by Charles Franklin on late trends in the generic Congressional ballot.