The Demographic Composition of the PA Polls

04/21/2008 03:39 pm ET | Updated May 25, 2011

Just before the March 4 primaries, I did posts on the demographic composition of the polls in both Texas and Ohio. With the ever-valuable Eric Dienstfrey away on vacation this week, I am doing this post in a bit of a rush (so apologies in advance for typos). I would strongly recommend reviewing my post on the Texas demographics as a companion to this piece.

I have broken the available results into two tables below. Most come from documents posted on the web. Quinnipiac provided results on request, and the Zogby numbers were shared with my colleagues at the National Journal.

The racial mix of the Pennsylvania polls is not quite as critical to the level of candidate support as in Texas, since the share of black and Latino voters is smaller. Still, since Obama typically does better among African-Americans, men, younger voters and those with college degrees or higher incomes, while Clinton does better with whites, women, older voters and those with lower incomes or without a college degree, the demographic composition of the electorate will play a role in determining the outcome of the race.

[Table image: gender, race and age composition (04-21_demo_genderraceage.jpg)]

The surveys show more variation on some characteristics than others. Most, for example, show the percentage of women as somewhere between 55% and 58%, and most show the African-American percentage as somewhere in the mid-teens. Of course, with Barack Obama expected to receive 80% to 90% of the black vote, the difference between an African-American composition of 13% and 18% can alter Obama's vote total by 3 to 4 points.
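For readers who want to check that kind of arithmetic themselves, here is a minimal sketch in Python. It treats Obama's overall share as a weighted average of his support among black voters and among everyone else. The function names and the support level among other voters are my own illustrative assumptions, not figures from any of the polls above; the net shift depends heavily on that second number.

```python
def overall_share(black_share, black_support, other_support):
    """Overall support as a weighted average across two groups.

    black_share: fraction of the sample that is African-American
    black_support: candidate's support among African-American voters
    other_support: candidate's support among all other voters (hypothetical)
    """
    return black_share * black_support + (1 - black_share) * other_support


def composition_shift(low_share, high_share, black_support, other_support):
    """Change in overall support, in percentage points, when the
    African-American share of the sample moves from low_share to high_share."""
    low = overall_share(low_share, black_support, other_support)
    high = overall_share(high_share, black_support, other_support)
    return 100 * (high - low)


if __name__ == "__main__":
    # The 13% and 18% compositions and the 80%-90% range come from the text;
    # the support levels among other voters are purely hypothetical.
    for other_support in (0.20, 0.30, 0.40):
        for black_support in (0.80, 0.90):
            shift = composition_shift(0.13, 0.18, black_support, other_support)
            print(f"black support {black_support:.0%}, "
                  f"other support {other_support:.0%}: {shift:+.2f} points")
```

The shift is simply the change in composition multiplied by the gap in support between the two groups, so a lower assumed level of support among other voters produces a larger swing.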

On the other hand, we see quite a bit of difference in age. Unfortunately, the pollsters do not all use the same categories to ask about and report respondent age. Still, the variation is clear, particularly in the percentages in the 18-to-29, 18-to-35 and 18-to-44 categories. Depending on the pollster, 18-to-29-year-olds are anywhere from 4% to 16% of the sample, and 18-to-44/45-year-olds are anywhere from 22% to 43%. Given that Obama typically does much better among younger voters, and that Clinton does much better among retirees, this variation is obviously critical. [Update: Brian Schaffner also blogged on this issue today].

[Table image: education and income composition (04-21_demos_educationincome.jpg)]

Socio-economic status is another critical characteristic in the Obama-Clinton race, especially in Pennsylvania (and something that I have written about often). Unfortunately, quite a few pollsters either ask or report nothing about the self-reported education or income of their samples. Still, we see considerable variation. The percentage of respondents with college degrees varies from 29% to 44%. I should point out that education and especially income are subject to more measurement error than other demographic items, especially if the text of the question and the number of categories differ.

Finally, since readers asked for it the last time, I have also posted one more table that includes all of the data above, plus the vote preference results. You will need to click on the graphic below to see a larger, readable version.

It is important to remember that pollsters come to these composition statistics through different paths. Some interview samples of adults, weight those demographically to match census estimates of Pennsylvania's adults, then select "likely voters" and let their demographics fall where they may. Others will weight their "likely voter" samples directly to pre-determined demographic targets. Some pollsters will not set weights or quotas for demographics, but will set such weights or quotas for geographic regions (based on past turnout and their assumptions about what might be different this time).
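To make the idea of weighting to demographic targets a bit more concrete, here is a minimal sketch of the simplest version: one weight per category, equal to the target share divided by the sample share. This is only an illustration under my own assumptions. The age categories, counts and targets below are hypothetical, and real pollsters typically weight on several variables at once using procedures they do not always disclose.

```python
def category_weights(sample_counts, target_shares):
    """Return one weight per category so the weighted sample matches the targets.

    sample_counts: dict of category -> number of respondents interviewed
    target_shares: dict of category -> target proportion (should sum to 1)
    """
    total = sum(sample_counts.values())
    weights = {}
    for category, count in sample_counts.items():
        if count == 0:
            # A category with no respondents cannot be weighted up.
            raise ValueError(f"no respondents in category {category!r}")
        sample_share = count / total
        weights[category] = target_shares[category] / sample_share
    return weights


def weighted_shares(sample_counts, weights):
    """Composition of the sample after the weights are applied."""
    weighted = {c: n * weights[c] for c, n in sample_counts.items()}
    total = sum(weighted.values())
    return {c: w / total for c, w in weighted.items()}


if __name__ == "__main__":
    # Hypothetical numbers, not taken from any poll discussed here.
    counts = {"18-44": 150, "45-64": 250, "65+": 200}
    targets = {"18-44": 0.35, "45-64": 0.40, "65+": 0.25}

    weights = category_weights(counts, targets)
    for category, share in weighted_shares(counts, weights).items():
        print(f"{category}: weight {weights[category]:.2f}, "
              f"weighted share {share:.0%}")
```

A pollster who weights adults to census estimates and then screens for likely voters would apply a step like this before the screen, so the likely voter composition is an outcome. A pollster who weights likely voters directly applies it after the screen, so the composition is an input.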

Trying to discern the differences in these methods is beyond our capacity today. The important thing to remember is that different pollsters conceive of "likely voters" in different ways, and the "likely voters" they report on are not identical.

Update: Poblano at FiveThirtyEight.com blogged some worthy thoughts about differences in likely voter models today.

Please note that, given the crunch of time, I have probably not proofed the tables as well as I should have. If you catch a typo, please do not hesitate to send an email so I can correct it.