"Help me Obi-wan Kenobi! You're my only hope." Watson's voice echoed through the improvised Jeopardy studio at IBM Research, just north of New York City. It was a mid-summer practice round for IBM's Jeopardy computer. But the game was halted while technicians made adjustments to Watson's sound system. Like a recording on a loop, the machine kept pronouncing the same sentence: "Help me Obi-Wan Kenobi..."
For me, this technical glitch came at a bad time. I was just getting comfortable with the buzzer. Next thing I knew, the computer was on the fritz and repeating the same asinine phrase 25 or 30 times. It was distracting. This type of technical delay, I realized, might pose yet another complication for human champions, Brad Rutter and Ken Jennings, in their million-dollar showdown with the machine (to be televised Feb. 14, 15 and 16). Unlike humans, Watson has eternal patience. It's utterly nerveless and will never suffer from a sore back.
This game was my golden opportunity. While researching my book, Final Jeopardy, I had been studying the machine for months, interviewing its creators and watching scores of practice rounds. I thought I knew Watson inside-out. And I planned to use all of this knowledge to defeat it.
I should be clear at the outset that this was not a formal sparring round, and would not be recorded in Watson's scientific record. The team building Watson's visual display, or avatar, simply wanted to see how the machine's avatar responded in game conditions. So they drafted a couple of humans, me and an IBM employee named Richard. This was not a serious game. But, as I told myself more than once, Watson didn't know that.
Here was my strategy going in. I knew that Watson aced "factoid" clues, bits of history or science tied to specific dates and names. I'd steer clear of such hard facts and push toward the foggier realm of humor and innuendo. Watson was also slow on one- or two-word clues. The machine needed two or three seconds to process its answers, and it only took the host about half a second to read them. So if I could find clues whose only words were single names or movie titles, I'd feast on them. And when it came to betting, I'd be bold. I'd seen a Jeopardy ace named Greg Lindsay destroy Watson by betting the farm on every Daily Double. If I got the chance, I'd follow Greg's lead.
A lot in Jeopardy boils down to luck. I thought it was smiling on me when I saw the middle category on the big board: Mexican Food. I'd worked as a correspondent in Mexico City for five years, and before that wrote for a paper on the border, in El Paso, Texas. So I called for the $1,000 clue in the category. (Daily Doubles, I'd learned, were more likely to be hidden behind high-dollar clues. Watson also "knew" this.) The clue: "Touted as a hangover cure, this hearty soup made with tripe is popular on New Year's morning." That was an easy one for me, and Watson didn't buzz. With my response -- "What is menudo?" -- I rocketed into an early lead.
I was standing to Watson's right -- the spot Ken Jennings would occupy in the big match. So I was only a couple of feet away from the machine's "hand," the contraption it used to press the Jeopardy buzzer. On the next clue, about an "appetizer of tortilla chips topped with cheese and jalapenos," I heard a staccato rapping to my left. It was Watson hitting its button three times, and beating me to the buzzer. It answered, "What is nachos?" (pronouncing it as "natch" instead of "notch"). It didn't matter that I knew more than this machine about Mexican food, I realized, if it beat me to the buzzer. This happened often.
Still, it wasn't invincible. While it beat Richard and me to the buzzer on clues about Nathanael West, Evelyn Waugh, buzz saws and the astronaut Buzz Aldrin ("Who is Edwin Eugene 'Buzz' Aldrin, Jr.?" the computer said), it also screwed up. It responded to one clue about Virginia Woolf by naming her husband, Leonard. The computer's biggest flop came in a Daily Double. It was asked about the two famous comedians' noses imprinted on the pavement at "Grauman's Chinese." Watson responded, "Who is Jimmy Durante?" When the host asked for the other name, Watson said, "Sorry, all I know is 'Who is Jimmy Durante?'" (The other was Bob Hope.) That cost it $4,400, dropping it below $20,000.
I had a mere $5,200 when I landed on a Daily Double in the Spy vs. Spy category. I bet $5,000, and faced this clue: "This playwright was slain under mysterious circumstances possibly related to spying for Elizabeth I." I identified Christopher Marlowe, and clawed my way up to Richard's level.
We were both losing, of course. But if we could end Double Jeopardy with at least half of Watson's score, victory was still within reach. Perhaps the computer's greatest weakness, I knew, was in Final Jeopardy. That last clue in the game, in which contestants could bet every dollar they had, tended to be more complex. It often involved several levels of reasoning. On a number of occasions, I'd seen Watson blow a lead on Final Jeopardy. (On one of them, it confused a 19th-century literature clue about Dickens' Oliver Twist with the 1990s techno band, The Pet Shop Boys.)
The category for Final Jeopardy was Naval Heroes. Watson had $19,300. I trailed with $11,400. Richard had $11,000. It was time to bet. The calculation of wagers is where Watson has a big advantage. For all the confusion it suffers in the realm of words, it's a superstar in math. I puzzled for a minute or so on a calculation that would take Watson a nanosecond. I figured that the computer would bet at least $3,500 -- enough to reach my highest possible score, $22,800. So I bet $8,000 -- enough to beat it if it missed.
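For readers who want to check the wagering arithmetic, here is a minimal Python sketch of the numbers in this game. It is only an illustration: the helper names `cover_bet` and `final_score` are hypothetical, and nothing here is meant to represent how Watson actually computes its bets.

```python
def cover_bet(leader: int, challenger: int) -> int:
    """Smallest wager that lets the leader reach the challenger's doubled score.

    Hypothetical helper for illustration; not Watson's wagering logic.
    """
    return max(0, 2 * challenger - leader)


def final_score(score: int, wager: int, correct: bool) -> int:
    """Score after Final Jeopardy: the wager is added if right, subtracted if wrong."""
    return score + wager if correct else score - wager


watson, me = 19_300, 11_400

# The bet I expected from Watson: enough to reach my maximum of $22,800.
expected_watson_bet = cover_bet(watson, me)
print(expected_watson_bet)  # 3500

# My $8,000 wager, if I answer correctly.
print(final_score(me, 8_000, True))  # 19400

# Watson's score if it made that bet and missed.
print(final_score(watson, expected_watson_bet, False))  # 15800
```

Under those assumptions, a correct answer from me and a miss from Watson would have left me ahead, $19,400 to $15,800.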
Here was the clue: "When he was killed in battle in 1805, he was wearing a uniform coat with sewn-on replicas of his 4 orders of chivalry."
Richard bet everything he had and didn't know the answer. I wrote, "Who is Nelson?" This was correct. I had the lead, $100 ahead of Watson. The attention turned to the computer. Like me, it identified Nelson. Oddly enough, though, its bet was only $300, placing it a scant $200 ahead of me. If I had bet everything I had, I would have beaten the machine.
Naturally, I was feeling pretty good about myself. A few minutes later I ran into J. Michael Loughran. He's the press officer who has been with the Jeopardy project from day one. He congratulated me, and then just as quickly dismissed my achievement. The Watson I was playing, he informed me, was probably an old version of the software "taken off someone's laptop."
My soaring ego came down with a thud. For all I know, I was playing a 2009 version of the program, or maybe even a relic from its kindergarten days, in 2008. In any case, I still lost.
Follow Stephen Baker on Twitter: www.twitter.com/stevebaker
Watson uses the answers that the humans have given to help it figure out what a category means - the example given on NOVA was related to the months of the year.
To use this against Watson, it makes sense for the two humans to ALWAYS choose questions from the MOST VALUABLE down to the least valuable. That way, even if it only takes Watson one correct answer to figure out the meaning of a particularly inscrutable category name, the humans have removed the highest dollar value answer from play, reducing the amount available for Watson to win in that category once it figures out what the category is about.
Anyway, can't wait for the big show!
I love your "Final Jeopardy" book and I'm looking forward to reading the final chapter.
Seriously, they give it a body to hit a buzzer, and next thing you know, BAM! Robot Overlords!
Your sarcasm is misplaced. There is no irony. And there is a MASSIVE difference between a fast computer and one that understands language. If it understands English well enough to win Jeopardy, imagine what you will get when it's applied to translations, research, and even just daily computer usage. I'll take understanding over speed any day.
True, the Chinese have the world's fastest supercomputer, but it's not doing anything as scary or exciting as taking the lead in high-speed rail. Instead, it's working on petroleum exploration, aircraft simulation, and leased work for a fee (http://en.wikipedia.org/wiki/Tianhe-I). The US still has 5 of the top 10 supercomputers, including those working on modeling atomic reactions. Don't fret the American position on supercomputers.
As for Watson, dismissing it as a crank project is a mistake. One of the greatest barriers in AI research is that computers simply don't think like humans - they don't get puns, they don't understand metaphors, they just can't relate to human thought! Watson is an attempt to change all of that and, in characteristic IBM fashion, make a big splash at the same time. As with IBM's famous chess-playing computers, the spin-offs from the Watson project could be incredible for IBM and are difficult to quantify or imagine. Plus, the amount of free publicity is much more valuable than mere advertising.
Then they denigrated ASIMO when it had problems with simple obstacles.
Then they denigrated ASIMO when it had problems mastering staircases.
And yet ASIMO keeps on with its journey, just as Watson is doing.
Interesting and yes, scary.