Public approval of President Bush's handling of his job has fallen to 33.1% after a weekend of new polling. Polls from Newsweek (2/28-3/1/07, 31% approve/61% disapprove), Zogby (3/1-2/07, 30%/69%) and Gallup (3/2-4/07, 33%/63%) have pulled the estimate down from last week's high of 35.5%.

This post also introduces a new look at the approval estimates. The central theme of my analysis of presidential approval has been to present the results "in context". The most common failing of polling stories is the exaggeration of results based on single polls, without regard to other nearly simultaneous results. Some reports emphasize the CBS/New York Times poll taken 2/23-27 at 29%, while others stress the ABC/Washington Post poll of 2/22-25 at 36%. The former represents "a new low" while the latter is an "upturn". Yet both are within the range of results we would expect if approval is "really" around 33% (as I predicted here). The myopic focus on individual polls undermines the credibility of probability-sample polling by ignoring the variability that sampling theory predicts.

By looking at polls in context with other polls and over time, I aim to temper our interpretations by focusing on the common trend in approval polls, rather than emphasizing extreme results (in either direction).

The trend estimate, plotted in blue in the figure above, is a local regression fit to the approval series. The advantage of local regression is that it can flexibly fit data with lots of "bumps and wiggles". The local regression will run through the "middle" of the data, with roughly equal numbers of polls above and below the trend line.
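The idea behind a local regression can be sketched in a few lines of code. This is a minimal illustration, not the exact estimator used for the figures: it fits a tricube-weighted straight line to the polls near each date and takes the fitted value there as the trend. The poll data below are hypothetical, standing in for a real (day, approval) series.

```python
# Minimal sketch of loess-style local regression on hypothetical poll data.
# At each target day, nearby polls get tricube weights and a weighted
# straight line is fit; its value at the target day is the trend estimate.

def tricube(u):
    """Tricube kernel: heavy weight near 0, zero beyond |u| >= 1."""
    return (1 - abs(u) ** 3) ** 3 if abs(u) < 1 else 0.0

def loess_point(xs, ys, x0, bandwidth):
    """Weighted least-squares line through nearby points, evaluated at x0."""
    w = [tricube((x - x0) / bandwidth) for x in xs]
    sw = sum(w)
    if sw == 0:
        raise ValueError("no points within bandwidth")
    mx = sum(wi * xi for wi, xi in zip(w, xs)) / sw
    my = sum(wi * yi for wi, yi in zip(w, ys)) / sw
    sxx = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, xs))
    sxy = sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, xs, ys))
    slope = sxy / sxx if sxx > 0 else 0.0
    return my + slope * (x0 - mx)

# Hypothetical polls: day-of-year vs. approval percentage.
days = [5, 12, 20, 27, 33, 40, 48, 55, 61, 68]
approval = [36, 34, 35, 33, 34, 32, 33, 31, 33, 30]
trend = [loess_point(days, approval, d, bandwidth=30) for d in days]
```

Because each fit uses only nearby polls, the trend can bend to follow local "bumps and wiggles" that a single straight line would miss, while still running through the middle of the data.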

To see the most recent polls in context, I'm going to start posting the graph below: a plot of the six latest polls, showing each poll's data alongside the blue trend estimate.

This plot allows easy comparison of each poll with the trend, revealing polls that tend to run above or below the trend, an example of "house effects" which reflect persistent differences among polling organizations. These differences can be due to question wording, the proportion of "don't know" responses, sampling frames (adults versus likely voters, for example), question order and a variety of other causes. The plots above make clear both the size of such effects and the tendency of all quality polls to move up and down with the trend line regardless of house effects. In the plot above, Fox tends to produce results a bit above the trend, while CBS/NYT tend to be a bit below the trend line. But both polls move up and down with the trend, demonstrating that they are responding to the same changes in approval, even if the house effects produce persistent shifts from the trend.

This "six poll plot" also makes it clear how much variation there has been over recent polling so whatever the latest poll is, it can be seen in relation to both the trend estimate and to five other polls and past results of the same poll. If a poll is "out of line" with others, the six poll plot will make that immediately clear.

The next question about polls is whether they remain mostly within a reasonable interval of the trend estimate. When a poll falls beyond the range we expect due to random sampling plus non-sampling errors, it should be clear that the result is "unusual". This could be due to a sudden change in approval, but is more often just a random fluke, in which case the next poll by that organization usually returns to the range of other polling.

This reasonable interval is taken here to be a 95% confidence interval. I've estimated this for the 2005-present polling. Using all polls since 2002 produces little difference. The impact of 9/11 makes the 2001 data more variable than more recent years, so I exclude that year from calculations of variability. The estimate I use includes the effects of non-sampling errors as well as sampling error. The typical poll here has a sampling error of about +/- 3.0% to +/- 3.5%. The actual 95% confidence interval is around +/- 5%. That increase from around 3% to around 5% reflects house effects, question wording, and everything else that increases poll variability beyond what is due to sampling alone. This is a more realistic estimate of the variability of poll results.
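A back-of-the-envelope version of this comparison: for a typical national sample of about n = 1,000, sampling error alone gives roughly +/- 3 points, while the empirical spread of residuals around the trend is wider. The residual values below are hypothetical, standing in for the 2005-present residual series.

```python
# Sampling-only margin of error vs. an empirical 95% interval estimated
# from residuals around the trend. Residual data here are hypothetical.
import math

def sampling_moe(p, n, z=1.96):
    """95% margin of error from sampling alone, for proportion p."""
    return z * math.sqrt(p * (1 - p) / n)

moe = 100 * sampling_moe(0.5, 1000)   # worst case, p = 0.5: about 3.1 points

# Hypothetical residuals (poll minus trend), in percentage points.
residuals = [-4, 3, -1, 2, -4, 1, 4, -2, 0, 3, -3, 2, -3, 1, 4, -1, 2, -2]
mean = sum(residuals) / len(residuals)
sd = math.sqrt(sum((r - mean) ** 2 for r in residuals) / (len(residuals) - 1))
empirical_95 = 1.96 * sd              # wider than sampling alone
```

The gap between `moe` and `empirical_95` is exactly the point made above: house effects and other non-sampling errors push the realistic interval from about +/- 3 out to about +/- 5.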

Plotting the residuals (the observed approval minus that predicted by the trend line) over time makes clear how variable polls are, and indicates which ones fall outside the 95% confidence interval. Here I highlight and label the last 10 polls to provide context. The plot below shows this view of current polling.
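The residual calculation itself is simple, and a sketch makes the outlier test concrete. The poll labels, observed values, and trend predictions below are all hypothetical; the point is only the mechanics of flagging anything beyond the +/- 5 point interval.

```python
# Residual = observed approval minus the trend's prediction at that date.
# Anything beyond the +/- 5 point interval is flagged for a second look.
# All data below are hypothetical.

polls = [("Poll A", 29, 33.0), ("Poll B", 36, 33.0), ("Poll C", 35, 32.5),
         ("Poll D", 33, 32.8), ("Poll E", 31, 33.2), ("Poll F", 27, 33.0)]

INTERVAL = 5.0  # the +/- 5 point 95% interval described above

residuals = {name: obs - pred for name, obs, pred in polls}
outliers = [name for name, r in residuals.items() if abs(r) > INTERVAL]
```

Here only Poll F, six points below its predicted value, would land outside the band; Poll A and Poll B, though four and three points from the trend, stay inside it, which is the point made above about the 29% and 36% results.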

When a poll falls outside of this interval, it is further away from the trend estimate than we would expect 95% of the time. However, this doesn't mean the poll is "bad" and especially doesn't mean the survey organization is of poor quality. By definition, 5% of polls will fall outside this interval, so condemning polls or pollsters that occasionally produce these "outliers" is a bit harsh. We might wish to discount such polls until they are supported by more evidence, or we might worry if we see a persistent pattern of outliers from one organization. But an occasional outlier is inevitable. This plot lets us spot such outliers immediately.

The plot also lets us see the variability in the last ten polls relative to the trend estimate. This is another way to see results in context. In the current plot, it is clear that while four polls have fallen well below the trend, three of the last 10 polls sit about as far above the trend estimate. This range of +/- 5 points puts the extreme polls in the context of all the variability we have seen recently.

Just as it is important to look at variability across polls, it is also important to examine the variability in the trend estimate itself. My local regression estimate has some advantages over simple rolling averages, but it is not without its own uncertainties. For example, it takes some 10-12 polls before a change in trend is clearly identified by the estimator. The polls that happen to be "latest" at any moment also have significant influence on the estimate: CBS/NYT at 29% pulled the estimate down when it was the latest, while ABC/WP at 36% pulled the estimate up. This means the current estimate can vary depending on which polls happen to be most recent. There is also uncertainty in the trend due to which polls are observed and when they are taken.

To get a look at this uncertainty, I create a "bootstrap" estimate based on 20,000 replications of the approval series. Each bootstrap sample draws a random selection from all the polls, allowing a poll to be selected more than once or not at all, with each poll equally likely to be sampled. The full trend is then estimated for each sample. In the end we have 20,000 estimates of the trend, which vary due to the random draws of polls included. This gives a good estimate of the variability the blue trend estimate might show if a different set of polls had been observed.
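The resampling step can be sketched compactly. This version is scaled down to 2,000 replications for speed, uses hypothetical polls, and substitutes a simple distance-weighted mean at the endpoint for the full local regression, but the bootstrap logic, resampling polls with replacement and re-estimating each time, is the same.

```python
# Bootstrap sketch: resample hypothetical polls with replacement and
# re-estimate the trend at the endpoint each time. The spread of the
# estimates approximates the uncertainty in the trend.
import random

random.seed(1)

# Hypothetical polls: (day, approval).
polls = [(5, 36), (12, 34), (20, 35), (27, 33), (33, 34),
         (40, 32), (48, 33), (55, 31), (61, 33), (68, 30)]

def endpoint_trend(sample, x0=68, bandwidth=30):
    """Distance-weighted (tricube) mean of approval near day x0."""
    num = den = 0.0
    for day, appr in sample:
        u = abs(day - x0) / bandwidth
        w = (1 - u ** 3) ** 3 if u < 1 else 0.0
        num += w * appr
        den += w
    return num / den if den else None

estimates = []
for _ in range(2000):
    boot = random.choices(polls, k=len(polls))   # resample with replacement
    est = endpoint_trend(boot)
    if est is not None:        # skip rare draws with no polls near the end
        estimates.append(est)

low, high = min(estimates), max(estimates)
```

The range from `low` to `high` is the analogue of the gray band in the figure: the trend values that could plausibly have been observed had a different set of polls been fielded.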

The result of this bootstrap estimation is presented below. The gray area is the full range of 20,000 samples, while the blue line is the estimate based on the actual polls we have.

The range of estimates is generally fairly close to the trend estimate, and certainly much closer than the range of actual polls around the trend. But it is also clear that the variability at the end of the series is a bit larger than it is in the interior of the series. This reflects the influence of late observations. This influence is reduced when there are polls on both sides to stabilize the trend estimate.

Finally, the estimate of the "current" approval level varies up and down with each new poll. To see how sensitive this estimate is, I add a new plot below. It shows the "current" estimate as of the moment each of the last 20 polls was the latest. The variability of these points around the blue line shows how much uncertainty we should have about the current estimate. The blue line is always our best estimate of the trend (and of current approval), but new data will change it a bit. The red points below remind us how much the estimate has varied recently.
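This sensitivity check can also be sketched: re-estimate the endpoint as if each of the last several polls were the most recent one, and compare. As before, the polls are hypothetical and a tricube-weighted mean stands in for the full local regression.

```python
# "Latest poll" sensitivity: estimate current approval as of the moment
# each of the last five (hypothetical) polls was the most recent one.

def tricube_weighted_mean(sample, bandwidth=30):
    """Trend estimate at the date of the latest poll in the sample."""
    x0 = max(day for day, _ in sample)
    num = den = 0.0
    for day, appr in sample:
        u = abs(day - x0) / bandwidth
        w = (1 - u ** 3) ** 3 if u < 1 else 0.0
        num += w * appr
        den += w
    return num / den

# Hypothetical polls: (day, approval).
polls = [(5, 36), (12, 34), (20, 35), (27, 33), (33, 34),
         (40, 32), (48, 33), (55, 31), (61, 33), (68, 30)]

# Drop the last k polls, so each of the last five in turn is "latest".
estimates = [tricube_weighted_mean(polls[:len(polls) - k]) for k in range(5)]
spread = max(estimates) - min(estimates)
```

The `spread` of these five estimates is the quantity the red points display: how much the "current" reading has wobbled simply because of which poll happened to arrive last.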

All polls are subject to random variability. By looking at polls in context, compared to the recent past, to other polls, and to themselves over time, we can gain a clear understanding of the state of presidential approval and its dynamics over the course of an administration. The graphs above provide a consistent and systematic look at the context of polling and at the variability of the trend estimate, which is our best estimate of where approval stands at any moment.

*Cross-posted at Political Arithmetik.*