My Jeopardy Battle Against IBM's Watson

"Help me Obi-Wan Kenobi! You're my only hope." Watson's voice echoed through the improvised Jeopardy studio at IBM Research, just north of New York City. It was a mid-summer practice round for IBM's Jeopardy computer. But the game was halted while technicians made adjustments to Watson's sound system. Like a recording on a loop, the machine kept pronouncing the same sentence: "Help me Obi-Wan Kenobi..."

For me, this technical glitch came at a bad time. I was just getting comfortable with the buzzer. Next thing I knew, the computer was on the fritz and repeating the same asinine phrase 25 or 30 times. It was distracting. This type of technical delay, I realized, might pose yet another complication for human champions, Brad Rutter and Ken Jennings, in their million-dollar showdown with the machine (to be televised Feb. 14, 15 and 16). Unlike humans, Watson has eternal patience. It's utterly nerveless and will never suffer from a sore back.

This game was my golden opportunity. While researching my book, Final Jeopardy, I had been studying the machine for months, interviewing its creators and watching scores of practice rounds. I thought I knew Watson inside-out. And I planned to use all of this knowledge to defeat it.

I should be clear at the outset that this was not a formal sparring round, and would not be recorded in Watson's scientific record. The team building Watson's visual display, or avatar, simply wanted to see how the machine's avatar responded in game conditions. So they drafted a couple of humans, me and an IBM employee named Richard. This was not a serious game. But, as I told myself more than once, Watson didn't know that.

Here was my strategy going in. I knew that Watson aced "factoid" clues, bits of history or science tied to specific dates and names. I'd steer clear of such hard facts and push toward the foggier realm of humor and innuendo. Watson was also slow on one- or two-word clues. The machine needed two or three seconds to process its answers, and it only took the host about half a second to read them. So if I could find clues whose only words were single names or movie titles, I'd feast on them. And when it came to betting, I'd be bold. I'd seen a Jeopardy ace named Greg Lindsay destroy Watson by betting the farm on every Daily Double. If I got the chance, I'd follow Greg's lead.

A lot in Jeopardy boils down to luck. I thought it was smiling on me when I saw the middle category on the big board: Mexican Food. I'd worked as a correspondent in Mexico City for five years, and before that wrote for a paper on the border, in El Paso, Texas. So I called for the $1,000 clue in the category. (Daily Doubles, I'd learned, were more likely to be hidden behind high-dollar clues. Watson also "knew" this.) The clue: "Touted as a hangover cure, this hearty soup made with tripe is popular on New Year's morning." That was an easy one for me, and Watson didn't buzz. With my response -- "What is menudo?" -- I rocketed into an early lead.

I was standing to Watson's right -- the spot Ken Jennings would occupy in the big match. So I was only a couple of feet away from the machine's "hand," the contraption it used to press the Jeopardy buzzer. On the next clue, about an "appetizer of tortilla chips topped with cheese and jalapenos," I heard a staccato rapping to my left. It was Watson hitting its button three times, and beating me to the buzzer. It answered, "What is nachos?" (pronouncing it as "natch" instead of "notch"). It didn't matter that I knew more than this machine about Mexican food, I realized, if it beat me to the buzzer. This happened often.

Still, it wasn't invincible. While it beat Richard and me to the buzzer on clues about Nathanael West, Evelyn Waugh, buzz saws and the astronaut Buzz Aldrin ("Who is Edwin Eugene 'Buzz' Aldrin, Jr.?" the computer said), it also screwed up. It responded to one clue about Virginia Woolf by naming her husband, Leonard. The computer's biggest flop came in a Daily Double. It was asked about the two famous comedians' noses imprinted on the pavement at "Grauman's Chinese." Watson responded, "Who is Jimmy Durante?" When the host asked for the other name, Watson said, "Sorry, all I know is 'Who is Jimmy Durante?'" (The other was Bob Hope.) That cost it $4,400, dropping it below $20,000.

I had a mere $5,200 when I landed on a Daily Double in the Spy vs. Spy category. I bet $5,000, and faced this clue: "This playwright was slain under mysterious circumstances possibly related to spying for Elizabeth I." I identified Christopher Marlowe, and clawed my way up to Richard's level.

We were both losing, of course. But if we could end Double Jeopardy with at least half of Watson's score, victory was still within reach. Perhaps the computer's greatest weakness, I knew, was in Final Jeopardy. That last clue in the game, in which contestants could bet every dollar they had, tended to be more complex. It often involved several levels of reasoning. On a number of occasions, I'd seen Watson blow a lead on Final Jeopardy. (On one of them, it confused a 19th-century literature clue about Dickens' Oliver Twist for the 1990s techno band, The Pet Shop Boys.)

The category for Final Jeopardy was Naval Heroes. Watson had $19,300. I trailed with $11,400. Richard had $11,000. It was time to bet. The calculation of wagers is where Watson has a big advantage. For all the confusion it suffers in the realm of words, it's a superstar in math. I puzzled for a minute or so on a calculation that would take Watson a nanosecond. I figured that the computer would bet at least $3,500 -- enough to reach my highest possible score, $22,800. So I bet $8,000 -- enough to beat it if it missed.

Here was the clue: "When he was killed in battle in 1805, he was wearing a uniform coat with sewn-on replicas of his 4 orders of chivalry."

Richard bet everything he had and didn't know the answer. I wrote, "Who is Nelson?" This was correct. I had the lead, $100 ahead of Watson. The attention turned to the computer. Like me, it identified Nelson. Oddly enough, though, its bet was only $300, placing it a scant $200 ahead of me. If I had bet everything I had, I would have beaten the machine.
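For readers who want to check the wagering arithmetic, it can be laid out as a short Python sketch. The function and variable names here are illustrative assumptions and have nothing to do with Watson's actual software; the dollar figures are the ones from this game.

```python
def final_score(score, wager, correct):
    """Score after Final Jeopardy: add the wager if right, subtract it if wrong."""
    return score + wager if correct else score - wager


# Scores going into Final Jeopardy (from the game described here)
watson_before = 19_300
me_before = 11_400

# The smallest Watson bet that would match my best possible total,
# which is double my score if I bet everything and answer correctly.
my_best_total = 2 * me_before                    # 22,800
cover_bet = my_best_total - watson_before        # 3,500

# What actually happened: both of us answered correctly.
my_total = final_score(me_before, 8_000, True)           # 19,400
watson_total = final_score(watson_before, 300, True)     # 19,600

# The road not taken: betting everything I had.
all_in_total = final_score(me_before, me_before, True)   # 22,800

# The case I was betting on: Watson covers my best total and misses.
watson_if_missed = final_score(watson_before, cover_bet, False)  # 15,800

print("Watson's margin over me:", watson_total - my_total)
print("All-in would have won:", all_in_total > watson_total)
```

The last two lines make the point of the story: an $8,000 bet left me $200 short, while an all-in bet would have finished ahead of Watson's $19,600.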

Naturally, I was feeling pretty good about myself. A few minutes later I ran into J. Michael Loughran. He's the press officer who has been with the Jeopardy project from day one. He congratulated me, and then just as quickly dismissed my achievement. The Watson I was playing, he informed me, was probably an old version of the software "taken off someone's laptop."

My soaring ego came down with a thud. For all I know, I was playing a 2009 version of the program, or maybe even a relic from its kindergarten days, in 2008. In any case, I still lost.

