10/17/2016 11:36 am ET Updated Oct 18, 2017

Words Matter: A Linguistic Analysis of the Presidential Debates

"Word matter when you run for president," as Hillary Clinton reminded her opponent during the first presidential debate. Clinton was clearly admonishing Donald Trump for a season of off-the-cuff remarks and tweets which have been routinely misleading, false, hateful, derogatory, inflammatory, juvenile, and--most recently--"lewd." Trump's counter, at once boastful and inscrutable, is that he has ""the best words." Does he? Let's put aside the fact-checking for the moment and put this linguistic claim to the test.

Using the transcripts supplied by NPR, we first considered the candidates' ability to follow the social conventions of discourse. The most basic of these is that we take turns speaking. One would expect, in a relatively structured context like a debate, particularly one with a moderator, that these turns would be evenly distributed. However, in the first debate, Trump took 96 turns to Clinton's 71. This imbalance was exaggerated in the super-heated exchanges of the second debate, with Trump taking 70 turns to Clinton's 39.

Appropriate turn-taking also means not interrupting other speakers. Across the two debates, Clinton was interrupted almost twice as often as Trump was. The vast majority of the time (81% of interruptions), Clinton was interrupted by Trump, whereas Trump was interrupted by Clinton only 11% of the time (when Trump was interrupted, it was usually the moderator attempting to redirect or clarify Trump's answers).

This raises a third conversational convention: that when asked a question, our responses are relevant and informative. Based on candidates' responses to direct questions from the moderators or the audience, Clinton was almost always on topic (88% of responses), while Trump replied with relevant information less than half (47%) of the time. These numbers provide a clue to Trump's plaintive question to Martha Raddatz during the second debate, "Why don't you interrupt her?"

Next, we examined the candidates' ability to form sentences. On average, Clinton produced longer sentences (~15 words/sentence) than Trump did (~10 words/sentence). Using readability indices, this put Clinton's speech at about the 8th grade level, and Trump's at about the 6th grade level. Size isn't everything, though--Trump should not be criticized for having small sentences.
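
For readers who want to try the words-per-sentence figure on a transcript of their own, here is a minimal sketch in Python. It is not the tooling behind the numbers above: the function name `mean_sentence_length`, the naive rule for splitting sentences at ., !, or ?, and the sample text are all assumptions made for illustration, and it does not compute any readability index.

```python
import re

def mean_sentence_length(text: str) -> float:
    """Average number of words per sentence in `text` (hypothetical helper)."""
    # Assumption: a sentence ends at '.', '!' or '?' followed by whitespace.
    # Real transcripts need more care (abbreviations, interruptions, crosstalk).
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    if not sentences:
        return 0.0
    # Assumption: a word is a run of letters, digits, or apostrophes.
    word_counts = [len(re.findall(r"[A-Za-z0-9']+", s)) for s in sentences]
    return sum(word_counts) / len(sentences)

# Invented sample text, not taken from either candidate.
sample = "The cat sat on the mat. It was warm. Nobody minded at all."
print(round(mean_sentence_length(sample), 2))
```

Running the function separately on each speaker's turns would give per-speaker averages comparable in spirit to the ~15 and ~10 reported here, though the exact values depend on how sentences and words are delimited.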

What about grammatical accuracy? In the first debate, Clinton produced 10 sentence fragments (not counting those which were interrupted), and Trump produced 83 fragments. Most of these were complex sentences that began with a subordinate clause ("When you look at what's happening in Mexico..."; "As far as my tax returns...") which Trump abandoned in mid-stream. His fragmented language is consistent with an attention span that his ghostwriter likened to "a kindergartner who can't sit still in a classroom."

Finally, we assessed the words that were used by the two candidates. Here, Trump lagged on several measures. On average, he used words that were shorter, more common, and less varied than Clinton's. Befitting a narcissist (or a kindergartner), he used first person singular pronouns (I, me, myself) and second person pronouns (you, your) twice as often as Clinton, while she was more likely to use first person plural pronouns (we, us). Trump used twice as many empty words (e.g. anybody, everybody, nothing, thing) as Clinton.

To explore the ideas brought up most often by the candidates, we excluded names, numbers, empty words, and grammatical words (she, it, the, in, and). The most frequent nouns produced by each candidate were people (Clinton) and country (Trump). The most frequent verbs were think (Clinton) and say (Trump)--I think that says it all.

Because many of the ideas were discussed by both candidates, we filtered the most common 50 words to find those words spoken by only one of the candidates. These are listed below, in order of frequency. Trump's words are often polarizing (right--wrong, win--lose), while Clinton's words are more often goal-oriented (try, propose, build, hope).

  • Clinton's most common unique nouns: fact, police, state, economy, plan, debt, information, problem, family, gun, home, justice, income.

  • Trump's most common unique nouns: company, dollars, money, war, law, politician, audit, city, trade.

  • Clinton's most common unique verbs: try, support, call, use, face, propose, build, hope.

  • Trump's most common unique verbs: leave, tell, like, happen, agree, stop, release, defend, lose, win.

  • Clinton's most common unique adjectives: nuclear, important, wealthy.

  • Trump's most common unique adjectives: bad, great, right, better, tremendous, wrong.
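
The filtering step behind these lists can be sketched roughly as follows. This is an illustration under stated assumptions, not the procedure actually used: the function name `unique_top_words` is hypothetical, the word lists are assumed to be already lowercased with names, numbers, empty words, and grammatical words removed, and "spoken by only one of the candidates" is read as "absent from the other speaker's word list entirely" (comparing the two top-50 lists against each other would be another reasonable reading).

```python
from collections import Counter

def unique_top_words(words_a, words_b, top_n=50):
    """Return (only_a, only_b): each speaker's top_n words, in order of
    frequency, that the other speaker never used (hypothetical helper)."""
    counts_a, counts_b = Counter(words_a), Counter(words_b)
    # most_common() sorts by descending count; ties keep first-seen order.
    top_a = [word for word, _ in counts_a.most_common(top_n)]
    top_b = [word for word, _ in counts_b.most_common(top_n)]
    only_a = [word for word in top_a if word not in counts_b]
    only_b = [word for word in top_b if word not in counts_a]
    return only_a, only_b

# Toy word lists invented for illustration; not real transcript counts.
speaker_a = ["plan", "plan", "plan", "people", "people", "hope"]
speaker_b = ["money", "money", "people", "win"]
print(unique_top_words(speaker_a, speaker_b))
```

In the toy data, "people" is dropped from both results because both speakers use it, which mirrors why shared words do not appear in the lists above.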

Some of these themes were also identified by the Linguistic Inquiry and Word Count (LIWC) program, which categorizes words into meaning categories. According to LIWC, Clinton focuses more on home, family, religion, and work, while Trump focuses more on death and numbers. In the "drives" category, Clinton's words demonstrate a greater desire for affiliation and achievement, while Trump's words suggest a stronger drive for risk. Clinton produced more words associated with cognitive processes, notably in the insight category. Clinton's words display slightly more positive emotion, and Trump's significantly more negative emotion, especially in the second debate.

If words matter, then Trump has his work cut out for him as we approach the third and final debate. He has one more opportunity to use his best words--those which focus on the issues rather than himself and his opponent--to put them into complete and relevant sentences, to pay attention, take turns, and use his inside voice.