How Could IBM's Watson Think That Toronto Is a U.S. City?

02/16/2011 08:09 am ET | Updated May 25, 2011
  • Stephen Baker Author of The Numerati and Final Jeopardy, former BusinessWeek tech writer

IBM's Jeopardy machine was so dominant -- until the very end of the second day of the man-machine match, when it made what looked like a clueless mistake. It suggested Toronto as a "US city." And now instead of bowing before the new model of machine intelligence, masses of Jeopardy fans are ridiculing it on Twitter and elsewhere.

Just what IBM hoped to avoid. As I write in Final Jeopardy, the team building the machine actually had a group -- the so-called "dumb team" -- to try to steer Watson away from embarrassing gaffes. These were most likely to occur in Final Jeopardy, where the clues are more complex and Watson is compelled to respond, even if it has low confidence, as it did with Toronto. The team even considered programming the computer to throw up its hands when puzzled, and just say it didn't know. But instead, they let it guess.

Read the comments on Twitter, and there's lots of misunderstanding about how Watson works. Watson doesn't have lists of things it "knows." Every clue is a research project, and it comes up with the statistically most promising answer.

Here's a summary of the issues:

1) Watson can never be sure of anything. Is it possible that the old rock star Alice Cooper is a man? If Watson finds enough evidence, it will bet on it--even though the name "Alice" is sure to create a lot of doubt. This flexibility in its thinking can save Watson from gaffes--but also lead to a few.

2) Category titles cannot be trusted. I blogged about this earlier, in a post called How Watson Thinks. Watson has learned through exhaustive statistical analysis that many clues do not jibe with their categories. A category about US novelists, for example, can ask about J.D. Salinger's masterpiece. Catcher in the Rye is a novel, not a novelist! These things happen time and again, and Watson notices. So it pays scant attention to the categories.

3) If this had been a normal Jeopardy clue, Watson would not have buzzed. It had only 14% confidence in Toronto (whose Pearson airport is named for a man who was active in World War One), and 11% in Chicago. Watson simply did not come up with the answer, and Toronto was its guess. (It communicated its low confidence by adding a lot of question marks.)
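
For readers who think in code, here is a rough sketch of the decision described in point 3. It is not IBM's implementation. The 50% buzz threshold, the function name and the number of question marks are all assumptions made up for illustration; only the 14% and 11% confidence figures come from the game.

```python
from typing import Dict, Optional

# Hypothetical threshold: the real value Watson used is not given here.
BUZZ_THRESHOLD = 0.50


def choose_response(candidates: Dict[str, float], must_answer: bool) -> Optional[str]:
    """Pick the highest-confidence candidate, or return None to stay silent."""
    best, confidence = max(candidates.items(), key=lambda kv: kv[1])
    if must_answer:
        # Final Jeopardy: a response is required, so low confidence is
        # only flagged (here with question marks), not suppressed.
        return best + "?????" if confidence < BUZZ_THRESHOLD else best
    # Regular clue: stay silent unless confidence clears the threshold.
    return best if confidence >= BUZZ_THRESHOLD else None


# Confidence figures reported for the Final Jeopardy clue.
candidates = {"Toronto": 0.14, "Chicago": 0.11}
regular = choose_response(candidates, must_answer=False)
final = choose_response(candidates, must_answer=True)
```

With these numbers, `regular` is `None` (no buzz), while `final` is the flagged guess `"Toronto?????"`.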

Even so, how could it guess that Toronto was an American city? Here we come to the weakness of statistical analysis. While searching through data, Watson notices that the United States is often called America. Toronto is a North American city. Its baseball team, the Blue Jays, plays in the American League. If Watson happened to study the itinerary of my book tour for The Numerati, it would have seen a host of American cities, from Philadelphia and Pittsburgh to Seattle, San Francisco, and Toronto. In documents like that, people often don't stop to note for inquiring computers that Toronto actually shouldn't be placed in the group.
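
A toy example shows how this kind of evidence can mislead. The sketch below simply counts how many text snippets mention a city alongside the word "American." The snippets are invented for illustration from the examples above, and the counting function is a deliberately naive stand-in, not a description of how Watson actually weighs evidence.

```python
from typing import List

# Made-up snippets based on the examples in the text above.
snippets = [
    "Toronto is a North American city",
    "The Toronto Blue Jays play in the American League",
    "Book tour of American cities: Philadelphia, Pittsburgh, "
    "Seattle, San Francisco, Toronto",
]


def association_count(city: str, term: str, texts: List[str]) -> int:
    """Count the snippets that mention both the city and the term."""
    city, term = city.lower(), term.lower()
    return sum(1 for t in texts if city in t.lower() and term in t.lower())


toronto_score = association_count("Toronto", "American", snippets)
seattle_score = association_count("Seattle", "American", snippets)
```

On this tiny sample, Toronto co-occurs with "American" in all three snippets and Seattle in only one. A system relying on counts like these alone would find Toronto looking more "American" than a city that really is in the United States.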

Long story short: Watson screwed up on the clue. It comes up with a clunker or two in nearly every game. But it also gets lots of clues right -- and is close to being the greatest Jeopardy player ever.