HuffPost Pollster FAQ

Frequently Asked Questions

Topic: Trend estimates

How do you compute your poll averages?

Unlike many other poll aggregators, our charts currently report estimates of candidate support (or of responses to other poll questions) based on local regression "trend estimates." So, strictly speaking, we do not currently report "averages," with two exceptions: in 2006 our charts did report a last-5-poll average, and when we have fewer than 5 polls available we report a simple polling average on our summary map.

What is a trend estimate?

Every chart includes a current estimate of support for each candidate (or of responses to an issue or rating question) based on all the available data. We call this a "trend estimate." It is NOT a simple average of a certain number of recent polls but a smoothed estimate of support as of the most recent poll. It is based on local regression, a statistical procedure that fits a smooth line through scattered data points. So simply averaging the most recent polls will not replicate this estimate.

Our standard estimator, which appears as the default on our charts, is designed to resist "chasing" noise in poll results (which, as Charles explains, may be the result of random sampling error or "house effects" and differences in methodology rather than real movement in the polls) but still be sensitive enough to detect real movement. However, at times it may appear to be too conservative or too sensitive. In particular, this somewhat conservative estimator may be slow to pick up a new trend, which means it can be slow to accept that public opinion is actually changing. We feel strongly that we should not make subjective judgment calls on this, so we use the same default estimator for all of our charts, and generally speaking our default smoother does a good job of filtering out random noise in the polling data. If you wish, you can set the smoothing function to be more or less sensitive using the "smoothing" button under "tools" in our interactive charts.

For interested parties, Charles has posted R code similar to what our smoothing functions use here.
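To make the idea of a local regression estimate concrete, here is a minimal Python sketch. It is not the R code mentioned above and it is not our production estimator. The function names, the tricube weighting, and the `span` default are all assumptions chosen for illustration. It fits a weighted straight line to the polls nearest a target day and reads off the fitted value at that day.

```python
def tricube(u):
    """Tricube kernel: weight falls smoothly from 1 at u=0 to 0 at |u|>=1."""
    u = abs(u)
    return (1 - u ** 3) ** 3 if u < 1 else 0.0


def local_trend_estimate(days, values, target_day, span=0.7):
    """Weighted straight-line fit around target_day, evaluated at target_day.

    span is the fraction of polls given nonzero weight (an illustrative
    default, not a documented setting).
    """
    n = len(days)
    k = max(2, int(round(span * n)))  # number of nearest polls to use
    # Bandwidth: distance to the k-th nearest poll, padded slightly so that
    # poll keeps a small positive weight. Falls back to 1.0 if it is zero.
    dists = sorted(abs(d - target_day) for d in days)
    h = dists[k - 1] * 1.0001 or 1.0
    w = [tricube((d - target_day) / h) for d in days]

    # Weighted least squares for value = a + b * (day - target_day).
    sw = sum(w)
    mx = sum(wi * (d - target_day) for wi, d in zip(w, days)) / sw
    my = sum(wi * v for wi, v in zip(w, values)) / sw
    sxx = sum(wi * (d - target_day - mx) ** 2 for wi, d in zip(w, days))
    sxy = sum(wi * (d - target_day - mx) * (v - my)
              for wi, d, v in zip(w, days, values))
    slope = sxy / sxx if sxx > 0 else 0.0
    return my - slope * mx  # intercept, i.e. the fitted value at target_day


# Usage: ten polls on days 0..9, estimate support as of the latest poll.
days = list(range(10))
values = [40 + 0.5 * d for d in days]
estimate_today = local_trend_estimate(days, values, target_day=days[-1])
```

A smaller `span` uses fewer nearby polls and so reacts faster to recent movement, while a larger one smooths more heavily. That is the same trade-off as the more and less sensitive settings described above.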

Why don't you use a simple rolling average like other sites?

Charles Franklin, who created the statistical routines that plot our trend lines, provided the following explanation of one of the primary advantages of trend estimates over rolling averages:

"Here is a way to think about this: suppose the last 5 polls in a race are 25, 27, 29, 31 and 33. Which is a better estimate of where the race stands today? 29 (the mean) or 33 (the local trend)? Since support has risen by 2 points in each successive poll, our estimator will say the trend is currently 33%, not the 29% the polls averaged over the past 2 or 3 weeks during which the last 5 polls were taken. Of course real data are more noisy than my example, so we have to fit the trend in a more complicated way than the example, but the logic is the same. Our trend estimates are local regression predictions, not simple averaging. If the data have been flat for a while, the trend and the mean will be quite close to each other. But if the polls are moving consistently either up or down, the trend estimate will be a better estimate of opinion as of today while the simple average will be an estimate of where the race was some 3 polls ago (for a 5 poll average-- longer ago as more polls are included in the average.) And that's why we estimate the trends the way we do."
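The arithmetic in Charles's example can be checked with a few lines of Python. This sketch uses an ordinary straight-line fit through the five polls as a stand-in for the more complicated local regression, and it treats the polls as equally spaced, which is an assumption made for illustration.

```python
polls = [25, 27, 29, 31, 33]           # the five results from the example
x = list(range(len(polls)))            # poll order: 0 (oldest) .. 4 (newest)

mean_x = sum(x) / len(x)
mean_y = sum(polls) / len(polls)       # simple 5-poll average

# Ordinary least-squares slope and intercept.
slope = (sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, polls))
         / sum((xi - mean_x) ** 2 for xi in x))
intercept = mean_y - slope * mean_x
trend_now = intercept + slope * x[-1]  # fitted value at the newest poll

print(mean_y)     # 29.0
print(trend_now)  # 33.0
```

The average lands on the middle poll, while the fitted line evaluated at the newest poll gives the current value of the trend.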

At the same time, a rolling average is much more dependent on which polls have been released recently, especially in cases where different pollsters have large house effects. See here for charts showing the differences between several different estimators, including a rolling average estimator, during the early 2008 primary season.

Trend estimates are especially good at showing the direction of a trend, but even an estimate that combines all of the available polls should not be construed as a "perfect" reflection of the state of public opinion.

Why do some charts include straight line estimates and other charts include curved estimates?

In cases where we have very little poll data, it is difficult for our standard estimator to filter out the noise in polling data due to random variation as well as other methodological differences that make up a pollster's "house effect." For this reason, when we have very little data available (8 polls or fewer), we plot a straight line (linear regression) rather than our standard fitted curve (localized regression).
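The rule above amounts to a simple threshold on the number of polls. A minimal Python sketch follows, in which the function and constant names are hypothetical:

```python
STRAIGHT_LINE_MAX_POLLS = 8  # "8 polls or fewer" per the text above


def estimator_for(n_polls):
    """Return the kind of fit described above for a chart with n_polls polls."""
    if n_polls <= STRAIGHT_LINE_MAX_POLLS:
        return "linear regression"   # straight line
    return "local regression"        # standard fitted curve
```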

A few days ago, the trend estimate for the most recent polls showed a pattern that is no longer there. What happened to that pattern?

Because we are always adding new polls and the regression trend estimate takes into account all polling data, the "nose" of the trend line, covering the last few weeks, will move up or down as new polls become available, and apparent patterns in this nose may change or disappear. It could be that what initially appeared to be a pattern gets "smoothed" over as contrary evidence is added at a later date, or it could be that pollsters have released polls that were in the field earlier than polls already on the chart. Our estimates place a poll based on the dates it was in the field, not the date when it was released, so if a poll with an older field period is released it may affect the trend estimate at points surrounding that date.

You added a new poll result and your estimate changed in the opposite direction of the new poll. What happened?

As Mark explains here, the fact that we use trend estimates rather than rolling averages helps explain these seemingly contradictory results. In his words:

"The key difference between trend estimates and rolling averages is that an average produces a new estimate for each combination of polls included in the average at any point in time. The regression line produces a trend line - a line, rather than a point - with a particular slope that is either moving up, down or staying level at any point in time."

In addition, our formula is designed to be somewhat resistant to the results of any single poll, which may be an outlier rather than representing a change in public opinion - or in this case, a change in the direction that public opinion is moving. So, if our trend estimate is moving in a certain direction based on the results of several polls, it may not be "convinced" that the slope of the line has actually changed (or at least changed direction). Adding a new poll changes the end date and therefore continues the upward or downward slope, which sometimes has the effect of moving our estimate in the opposite direction of what might be expected based on the latest poll.
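A small numeric illustration of this effect, using made-up poll numbers and an ordinary straight-line fit as a simplified stand-in for our local regression (the function name is hypothetical):

```python
def fit_at_latest(days, values):
    """Least-squares straight line through the polls, evaluated at the last day."""
    n = len(days)
    mx = sum(days) / n
    my = sum(values) / n
    slope = (sum((d - mx) * (v - my) for d, v in zip(days, values))
             / sum((d - mx) ** 2 for d in days))
    return my + slope * (days[-1] - mx)


# Four polls rising steadily: the estimate as of day 3 is 46.
before = fit_at_latest([0, 1, 2, 3], [40, 42, 44, 46])

# A new poll on day 4 comes in at 45, below the previous estimate.
after = fit_at_latest([0, 1, 2, 3, 4], [40, 42, 44, 46, 45])

print(before, after)
```

The new poll is lower than the previous estimate, yet the estimate rises slightly (to roughly 46.2 in this toy case). The fitted slope flattens a little but remains positive, and the end of the line has moved one day further along it.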

Topic: The Polls

What polls are included in your charts and trend estimates?

We include all publicly available polls that claim to provide representative samples of the population or electorate. This includes partisan polls and those sponsored by particular campaigns. In addition, we use polls conducted using any mode (live interviewer, IVR or automated, and internet panel).

As Charles Franklin explains,

"The first rule for Pollster is that we don't cherry pick. We make every effort to include every poll, even if it sometimes hurts. So even when we see a poll way out of line with other polls and what we "know" has to be true, we keep that poll in our data and in our trend estimates. There are two reasons. First, once you start cherry picking you never know when to stop. Second, we designed our trend estimator to be pretty resistant to the effect of any one poll (though when there are few polls this can't always be true.) That rule has served us pretty well. Whatever else may be wrong with Pollster, we are never guilty of including just the polls (or pollsters) we like."

For many of our trends, we have found that individual pollsters have a relatively small effect on our estimates. In some cases, especially for less heavily polled questions, divergent poll results may have a stronger effect; however, in such cases it is impossible to know which polls are "right" (or more right than others), and it may be even more damaging to throw out data for the least polled questions. Furthermore, in such cases where data is relatively scant, it is even more important to take stock of any data that is available. Ultimately, we will never exclude polls just because they show results that we do not believe (or want) to be true because we do not and cannot know the true state of public opinion - that's what polling is for!

For some of our trends, such as President Obama's job approval and party identification, we have included separate charts for polls that use different sample populations, because of diverging poll numbers that seem to be partially related to differences in the sampled population. In addition, a few of our trends rely heavily on one or two pollsters, and the house effects of those pollsters, combined with the sporadic release of surveys with different house effects, may give the appearance of a trend where there is none, a problem that Mark has addressed here. If you want to see what our trends would look like without a particular pollster or polling mode, you can use our interactive charts to filter out the polls you don't want to see. You can even embed your customized chart on your own site, and it will update as our own charts do.

How do you track rolling average polls that include overlapping samples? / Why do you skip some releases for daily tracking polls?

Daily tracking polls are composed of rolling samples. For example, both Gallup's and Rasmussen's daily tracking polls use rolling three-day samples. For this reason, the results for any three days in a row for those trackers will include results from overlapping days. To avoid including data from the same day more than once in our own trend estimate, at any given time our charts and data tables do not include data from tracking polls whose field periods overlap. So, for Gallup and Rasmussen, every third day's poll is included in our charts, counting back from the most recent poll, which is always included. As a result, the data from any given day's release for the Gallup and Rasmussen tracking polls will only show up every third day. All of the data for the tracking polls is entered into our databases and will appear in our charts when it does not overlap with other poll data. All the available data will show up at some point in our charts, rotating in and out based on the field dates of the most recent poll. This ensures that we include the most recent available data without double-counting overlapping field periods.
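The selection rule for trackers can be sketched in a few lines of Python. The function name is hypothetical, and release days are represented as plain integers for simplicity:

```python
def non_overlapping_releases(release_days, window=3):
    """Pick releases whose rolling field periods do not overlap.

    release_days: integer days on which a daily tracker released a rolling
    `window`-day sample. The most recent release is always kept, then every
    `window`-th day counting back from it.
    """
    latest = max(release_days)
    return sorted(d for d in release_days if (latest - d) % window == 0)


print(non_overlapping_releases(range(1, 11)))  # [1, 4, 7, 10]
print(non_overlapping_releases(range(1, 12)))  # [2, 5, 8, 11]
```

The two calls show the rotation described above: once the day-11 release arrives, the charted set shifts from days 1, 4, 7, 10 to days 2, 5, 8, 11.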

When surveys release results from more than one sample population (e.g., registered voters and likely voters), how do you choose which sample to include in your charts?

Generally speaking, most public polls do not release results for more than one population - that is, they release results for samples of adults, registered voters, or likely voters but not more than one. When a pollster provides results for more than one sample population (e.g., both adults and registered voters), we apply the following rules:

  1. For all vote preference questions, we use the sample that represents the pollster's best attempt to approximate the likely electorate at any given time. So if pollsters report two or more sets of results, our rule is to give preference to the narrower population. Thus, we give preference to "likely voter" results over those for all registered voters, and to registered voters over all adults. The one exception is when pollsters release results for likely voters but also include results for a "very likely" or "definite" voter subgroup; in that case we do not give preference to the narrower subgroup.
  2. For issue and ratings questions, we take a different approach and include the broadest available sample for our charts in order to best represent the population as a whole.
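The two rules above can be expressed as a short Python sketch. The names are hypothetical. The sketch leaves "very likely" or "definite" voter subgroups out of the ranking, which is one reading of the exception noted in the first rule.

```python
# Ordered from broadest to narrowest population.
POPULATIONS = ["adults", "registered voters", "likely voters"]


def choose_sample(question_type, available):
    """Pick which reported population to chart.

    question_type: "vote" for vote preference; anything else is treated as
    an issue or ratings question.
    available: the populations the pollster reported results for.
    """
    ranked = [p for p in POPULATIONS if p in available]
    if not ranked:
        raise ValueError("no recognized sample population")
    # Narrowest for vote preference, broadest for issues and ratings.
    return ranked[-1] if question_type == "vote" else ranked[0]


print(choose_sample("vote", ["registered voters", "likely voters"]))
print(choose_sample("rating", ["registered voters", "adults"]))
```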

As Mark has pointed out, some pollsters will release two sets of numbers, but give greater emphasis to the broader population (e.g. adults or registered voters) in their own analysis. As such, our policy of using the narrower population for vote preference questions can be problematic. Nevertheless, we believe it is essential to maintain a simple, objective rule for which sample population to choose if we are given a choice, so as to avoid cherry-picking and choosing the number that looks right. As Charles notes, our rule takes into account the best professional judgment of the pollster, who by releasing more than one set of results implies that one of those populations better represents the likely electorate than the other.

We include links to articles or documents with full poll data in the table that appears below each chart. Just click on the name of the poll to be taken to that pollster's results page.

Topic: The charts

How do I customize and embed your charts?

As Mark has written, you can use our interactive charts to:

  • Select or limit the polls used to draw trend lines and calculate polling estimates with the "filter" tool. If you don't like a particular pollster, just un-click and take them out.
  • Toggle between the display of the default trend line and alternatives that are more or less sensitive using the "smoothing" tool -- these are essentially the same as the "steady blue" and "ready red" trend lines often used by Charles Franklin.
  • Hold your mouse over any data point to display details about that poll.
  • Click the mouse on any data point to "connect the dots" between all polls fielded by that pollster.
  • Modify the date range (x-axis) and percentage range (y-axis) by clicking on either axis directly or with forms found on the "tools" menu.
  • Select the candidates you want to see displayed on the chart with the "choices" tool.
  • Toggle the display of data points, trend lines and grid lines on or off with the "plot" tools.
  • Copy the code necessary to bookmark your customized chart or share it via email with the "URL" tool.
  • Get the code necessary to place a small version of the customized chart on your own blog or web site with the "Embed" tool.

Why aren't all candidates listed?

Our charts and maps are based entirely on public polling data collected and released by other organizations. We have no control over the candidate choices they offer or any other aspect of the design of their surveys. We include a candidate on the map when that candidate has been offered as a choice on at least half of the last six polls and the candidate's trend estimate is at least three percent of the total.
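As a rough Python sketch of this inclusion rule (the function name and data layout are hypothetical, and the handling of races with fewer than six polls is an assumption):

```python
def show_on_map(candidate, recent_polls, trend_estimate):
    """Decide whether a candidate meets the inclusion rule described above.

    recent_polls: list of sets of candidate names offered, oldest first.
    trend_estimate: the candidate's current trend estimate, in percent.
    """
    last_six = recent_polls[-6:]
    if not last_six:
        return False
    offered = sum(1 for choices in last_six if candidate in choices)
    # Assumption: with fewer than six polls, "half" is taken of those available.
    return offered * 2 >= len(last_six) and trend_estimate >= 3.0


polls = [{"A", "B"}, {"A", "B", "C"}, {"A", "B"},
         {"A", "B", "C"}, {"A", "B", "C"}, {"A", "B"}]
print(show_on_map("C", polls, 4.0))   # offered on 3 of 6 polls, estimate 4%
print(show_on_map("C", polls, 2.5))   # estimate below 3%
```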

Topic: Questions on the Election Dashboard and NEW Trend Estimates for 2010