11/20/2012 09:23 pm ET Updated Dec 06, 2017

Speak of the Rival: Opponent References in the Republican Primary Debates

There has been much discussion of the interactions between the presidential candidates during the recent general-election debates. Those exchanges are interesting, but with only two candidates on the stage they are fairly predictable. Perhaps more interesting are the interactions between primary candidates, who are often more numerous. Do weaker candidates focus on a frontrunner who strives to stay "above the fray"? Do candidates form impromptu alliances and "gang up" on vulnerable opponents? Can the number of times a candidate is mentioned affect his or her prospects, or is it the other way around? We sought to examine some of these questions by studying opponent references -- when a candidate uses the name of one of his or her opponents, either indirectly through describing an action or policy position or through directly addressing an opponent -- in the 2012 Republican primary debates.

As I have mentioned before, one of the objectives of this project was to see if we could find something that correlated with the candidates' poll standing. In addition to subject matter, another avenue we explored was the connection between poll standing and references in a debate: do mentions in a debate help a candidate in the polls, or do rising poll numbers perhaps lead others to refer to the candidate, engaging (or attacking) his or her positions?

We began by isolating the candidates' statements which contain references to their fellow candidates. Simply tagging each use of a candidate's name is insufficient. The candidates refer to each other by first name as well as last, and also by title (only Herman Cain has not held an elected post). These complications are not insurmountable, but there are more challenging ones. Two candidates -- Governor Rick Perry and former Senator Rick Santorum -- share the same first name, and four -- Rick Perry, Jon Huntsman, Tim Pawlenty, and Mitt Romney -- are current or former governors. For many of these cases there is no way to automatically determine the subject of the reference, so they had to be tagged by hand.
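
The tagging step described above can be sketched roughly as follows. This is only an illustration, not the code used in the project: the name tables are hypothetical and cover only the candidates named in this post, and the function `tag_references` is an assumed helper. A title or first name followed by a known last name is resolved automatically; a bare "Rick" or "Governor" is set aside for hand tagging.

```python
import re

# Hypothetical lookup tables (assumptions for illustration; not the full field).
LAST_NAMES = {
    "romney": "Romney", "santorum": "Santorum", "perry": "Perry",
    "gingrich": "Gingrich", "cain": "Cain",
    "huntsman": "Huntsman", "pawlenty": "Pawlenty",
}
FIRST_NAMES = {
    "mitt": "Romney", "newt": "Gingrich", "herman": "Cain",
    "jon": "Huntsman", "tim": "Pawlenty",
}
# Forms shared by more than one candidate: two Ricks, four governors.
AMBIGUOUS = {"rick", "governor"}


def tag_references(statement, speaker):
    """Return (resolved, needs_review) for one statement by `speaker`."""
    tokens = re.findall(r"[a-z]+", statement.lower())
    resolved, needs_review = [], []
    for i, tok in enumerate(tokens):
        nxt = tokens[i + 1] if i + 1 < len(tokens) else None
        if (tok in AMBIGUOUS or tok in FIRST_NAMES) and nxt in LAST_NAMES:
            # "Governor Perry", "Rick Santorum", "Mitt Romney": the last
            # name on the next token carries the reference, so skip this one.
            continue
        if tok in AMBIGUOUS:
            needs_review.append(tok)  # cannot be resolved automatically
            continue
        name = LAST_NAMES.get(tok) or FIRST_NAMES.get(tok)
        if name and name != speaker:  # self-references are excluded
            resolved.append(name)
    return resolved, needs_review


resolved, needs_review = tag_references(
    "Governor Perry and Rick Santorum disagree with Mitt, but Rick, you said otherwise.",
    speaker="Gingrich",
)
```

In this example the first "Rick" is resolved through the last name that follows it, while the second lands in `needs_review`, which mirrors the cases we had to tag by hand.
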
Below is a table of the candidates' references to each other. Self-references, when candidates refer to themselves by name, are not included, as we consider them to serve a different purpose.


Table of opponent references. Rows represent speakers and columns the person spoken of; for example, Mr. Santorum referred to Mr. Romney by name on 56 occasions.

A few interesting observations can be made from this chart alone. The more serious candidates are mentioned more than those who were long shots, with eventual nominee Mitt Romney receiving the most references. He also mentions his opponents more than any other candidate does, although this is less surprising given that he has the most speaking time in the debates. Newt Gingrich and Rick Santorum also refer to each other many times, which is perhaps not surprising either, given that they were considered to be competing for the title of "conservative alternative" to Mr. Romney.

There are two possible relationships between opponent references and poll data that we considered. The first is that the number of times a candidate is addressed by his or her peers in a debate has an influence on his or her poll standing. The second is that how a candidate is faring in the polls affects how frequently other candidates address him or her. To test for these relationships, we correlated the number of times a candidate is mentioned in a debate with the change in that candidate's standing in the polls. To test the former relationship, we used the change between the day of the debate and seven days after, and for the latter between seven days before the debate and the day of. We used aggregated poll data from, which can be found here.
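
As a rough sketch of that procedure, assuming a Pearson correlation and a simple day-indexed poll series (both are assumptions; the helper names and every number below are made up for illustration and are not our data):

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def change_before(polls, debate_day, window=7):
    """Poll change from `window` days before the debate to debate day."""
    return polls[debate_day] - polls[debate_day - window]


def change_after(polls, debate_day, window=7):
    """Poll change from debate day to `window` days after the debate."""
    return polls[debate_day + window] - polls[debate_day]


# Placeholder values only: day index -> poll standing (percent),
# and debate day -> number of times the candidate was mentioned.
polls = {0: 10.0, 7: 12.0, 14: 16.0, 21: 22.0, 28: 21.0}
mentions = {7: 2, 14: 4, 21: 6}

days = sorted(mentions)
counts = [mentions[d] for d in days]

# Do rising polls draw more references? (change leading up to the debate)
r_before = pearson_r(counts, [change_before(polls, d) for d in days])
# Do references move the polls? (change in the week after the debate)
r_after = pearson_r(counts, [change_after(polls, d) for d in days])

# For a simple linear fit, R-squared is the square of r.
r_squared_before = r_before ** 2
```

The two windows are kept as separate functions because each one tests a different direction of influence, even though the mention counts are the same in both cases.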

Overall, the results show a very weak or nonexistent relationship. However, there were two correlations that were strong enough to merit attention and that lend support to common interpretations of the debates. The chart below shows the correlation between change in poll standing leading up to the debates and references to Mr. Cain, who enjoyed a brief period as the frontrunner last fall. The R-squared value is a measure of the strength of the correlation; while statisticians would prefer a higher value, 0.753 is acceptable for social science work. As well, because we expect that other factors affect the poll standing, we can tolerate a weaker relationship.


What this chart shows is that in the debates that occurred during his rise in the polls, Mr. Cain received more references than he did when his standing was stable or declining. This makes sense intuitively -- his rise in the polls generated media attention, leading his opponents to engage with him in a way that they otherwise would not have.

The other relationship I want to highlight bears on the other suggested causal link. The correlation of Mr. Perry's change in standing after the debates with his mentions within them shows a fairly weak negative relationship. It is, however, the strongest negative relationship we found.


This is the opposite of what we originally hypothesized (that more references lead the public to take the candidate more seriously, raising poll standing). However, more references are not necessarily good. Governor Perry is widely believed to have done poorly in the debates, and his opponents attacked him rather strongly.

This post concludes my summary of our research on the 2012 Republican primary debates. While our results do not suggest anything particularly surprising, we did find evidence to support many common assessments of the debates. As computational methods improve and natural language toolkits expand, the future for applying these techniques to the political arena is promising. If you found these posts interesting, or have any suggestions for further research, please leave a comment below or on one of our other posts. Be sure as well to check out Dr. Benjamin Knoll's post analyzing the psychology of the presidential contenders using data from debate transcripts.