Stephen Baker

Posted: February 8, 2011 10:45 AM

"Help me Obi-Wan Kenobi! You're my only hope." Watson's voice echoed through the improvised Jeopardy studio at IBM Research, just north of New York City. It was a mid-summer practice round for IBM's Jeopardy computer. But the game was halted while technicians made adjustments to Watson's sound system. Like a recording on a loop, the machine kept pronouncing the same sentence: "Help me Obi-Wan Kenobi..."

For me, this technical glitch came at a bad time. I was just getting comfortable with the buzzer. Next thing I knew, the computer was on the fritz and repeating the same asinine phrase 25 or 30 times. It was distracting. This type of technical delay, I realized, might pose yet another complication for human champions, Brad Rutter and Ken Jennings, in their million-dollar showdown with the machine (to be televised Feb. 14, 15 and 16). Unlike humans, Watson has eternal patience. It's utterly nerveless and will never suffer from a sore back.

This game was my golden opportunity. While researching my book, Final Jeopardy, I had been studying the machine for months, interviewing its creators and watching scores of practice rounds. I thought I knew Watson inside-out. And I planned to use all of this knowledge to defeat it.

I should be clear at the outset that this was not a formal sparring round, and would not be recorded in Watson's scientific record. The team building Watson's visual display, or avatar, simply wanted to see how the machine's avatar responded in game conditions. So they drafted a couple of humans, me and an IBM employee named Richard. This was not a serious game. But, as I told myself more than once, Watson didn't know that.

Here was my strategy going in. I knew that Watson aced "factoid" clues, bits of history or science tied to specific dates and names. I'd steer clear of such hard facts and push toward the foggier realm of humor and innuendo. Watson was also slow on one- or two-word clues. The machine needed two or three seconds to process its answers, and it only took the host about half a second to read them. So if I could find clues whose only words were single names or movie titles, I'd feast on them. And when it came to betting, I'd be bold. I'd seen a Jeopardy ace named Greg Lindsay destroy Watson by betting the farm on every Daily Double. If I got the chance, I'd follow Greg's lead.

A lot in Jeopardy boils down to luck. I thought it was smiling on me when I saw the middle category on the big board: Mexican Food. I'd worked as a correspondent in Mexico City for five years, and before that wrote for a paper on the border, in El Paso, Texas. So I called for the $1,000 clue in the category. (Daily Doubles, I'd learned, were more likely to be hidden behind high-dollar clues. Watson also "knew" this.) The clue: "Touted as a hangover cure, this hearty soup made with tripe is popular on New Year's morning." That was an easy one for me, and Watson didn't buzz. With my response -- "What is menudo?" -- I rocketed into an early lead.

I was standing to Watson's right -- the spot Ken Jennings would occupy in the big match. So I was only a couple of feet away from the machine's "hand," the contraption it used to press the Jeopardy buzzer. On the next clue, about an "appetizer of tortilla chips topped with cheese and jalapenos," I heard a staccato rapping to my left. It was Watson hitting its button three times, and beating me to the buzzer. It answered, "What is nachos?" (pronouncing it as "natch" instead of "notch"). It didn't matter that I knew more than this machine about Mexican food, I realized, if it beat me to the buzzer. This happened often.

Still, it wasn't invincible. While it beat Richard and me to the buzzer on clues about Nathanael West, Evelyn Waugh, buzz saws and the astronaut Buzz Aldrin ("Who is Edwin Eugene 'Buzz' Aldrin, Jr.?" the computer said), it also screwed up. It responded to one clue about Virginia Woolf by naming her husband, Leonard. The computer's biggest flop came in a Daily Double. It was asked about the two famous comedians' noses imprinted on the pavement at "Grauman's Chinese." Watson responded, "Who is Jimmy Durante?" When the host asked for the other name, Watson said, "Sorry, all I know is 'Who is Jimmy Durante?'" (The other was Bob Hope.) That cost it $4,400, dropping it below $20,000.

I had a mere $5,200 when I landed on a Daily Double in the Spy vs. Spy category. I bet $5,000, and faced this clue: "This playwright was slain under mysterious circumstances possibly related to spying for Elizabeth I." I identified Christopher Marlowe, and clawed my way up to Richard's level.

We were both losing, of course. But if we could end Double Jeopardy with at least half of Watson's score, victory was still within reach. Perhaps the computer's greatest weakness, I knew, was in Final Jeopardy. That last clue in the game, in which contestants could bet every dollar they had, tended to be more complex. It often involved several levels of reasoning. On a number of occasions, I'd seen Watson blow a lead on Final Jeopardy. (On one of them, it confused a 19th-century literature clue about Dickens' Oliver Twist with the 1990s techno band, The Pet Shop Boys.)

The category for Final Jeopardy was Naval Heroes. Watson had $19,300. I trailed with $11,400. Richard had $11,000. It was time to bet. The calculation of wagers is where Watson has a big advantage. For all the confusion it suffers in the realm of words, it's a superstar in math. I puzzled for a minute or so on a calculation that would take Watson a nanosecond. I figured that the computer would bet at least $3,500 -- enough to reach my highest possible score, $22,800. So I bet $8,000 -- enough to beat it if it missed.

Here was the clue: "When he was killed in battle in 1805, he was wearing a uniform coat with sewn-on replicas of his 4 orders of chivalry."

Richard bet everything he had and didn't know the answer. I wrote, "Who is Nelson?" This was correct. I had the lead, $100 ahead of Watson. The attention turned to the computer. Like me, it identified Nelson. Oddly enough, though, its bet was only $300, placing it a scant $200 ahead of me. If I had bet everything I had, I would have beaten the machine.
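
For readers who want to check the wagering arithmetic, here is a minimal sketch in Python. The helper name `final_scores` and the variable names are my own illustrative choices, not anything from Watson's software; the dollar figures are the ones quoted in this account.

```python
def final_scores(score, wager):
    """Return (total if correct, total if wrong) for a Final Jeopardy wager."""
    return score + wager, score - wager


# Scores going into Final Jeopardy, as reported in the article.
watson = 19_300
me = 11_400

# My best case is doubling up; Watson's "cover" bet is what it needs to match that.
my_max = 2 * me
watson_cover = my_max - watson

# What actually happened: I bet $8,000, Watson bet $300, and both were right.
my_right, my_wrong = final_scores(me, 8_000)
watson_right, watson_wrong = final_scores(watson, 300)

# The road not taken: betting everything and answering correctly.
all_in_right, _ = final_scores(me, me)

print(f"Watson's cover bet: ${watson_cover:,}")
print(f"My total after an $8,000 bet: ${my_right:,}")
print(f"Watson's total after a $300 bet: ${watson_right:,}")
print(f"All-in total would have been: ${all_in_right:,}")
```

The numbers line up with the story: a $3,500 bet would have let Watson match my maximum of $22,800, my $8,000 bet put me $100 past Watson's pre-wager score, and Watson's $300 bet left it $200 ahead of me. Only an all-in wager would have cleared its final total.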

Naturally, I was feeling pretty good about myself. A few minutes later I ran into J. Michael Loughran. He's the press officer who has been with the Jeopardy project from day one. He congratulated me, and then just as quickly dismissed my achievement. The Watson I was playing, he informed me, was probably an old version of the software "taken off someone's laptop."

My soaring ego came down with a thud. For all I know, I was playing a 2009 version of the program, or maybe even a relic from its kindergarten days, in 2008. In any case, I still lost.

 

Follow Stephen Baker on Twitter: www.twitter.com/stevebaker

Comments (28)
HUFFPOST SUPER USER
Robert SF
07:36 PM on 02/13/2011
Here's what not many have noticed. You don't need to be Jeopardy-smart to answer phones at a call center. Within five years, a commercial version of Watson will cost under $100k and will be able to handle 100 calls simultaneously, 24 hours a day, with no days off. At that point, call centers staffed by humans no longer make sense, and unemployment will rise even more.
04:31 PM on 02/10/2011
The NOVA documentary revealed a very obvious weakness in Watson. Hopefully the two humans picked up on it during the test matches and were able to use it to collude against the computer.

Watson uses the answers that the humans have given to help it figure out what a category means - the example given on NOVA was related to the months of the year.

To use this against Watson, it makes sense for the two humans to ALWAYS choose questions from the MOST VALUABLE down to the least valuable. That way, even if it only takes Watson one correct answer to figure out the meaning of a particularly inscrutable category name, the humans have removed the highest dollar value answer from play, reducing the amount available for Watson to win in that category once it figures out what the category is about.
HUFFPOST SUPER USER
Infostream
04:49 AM on 02/10/2011
I want to know more about how Watson hits the buzzer; it seems it could easily have an unfair advantage there. Often people who know the answers don't win because their reflexes on the buzzer aren't good, whereas a machine could hit it 100 times per second.

Anyway, can't wait for the big show!
10:44 AM on 02/09/2011
Will Watson have that annoying habit that many contestants have of raising the pitch on the final word of their answer, making that answer sound like a question? Technically it is one, I suppose, but it's as irritating as someone who answers questions with questions.
06:27 AM on 02/09/2011
I met Tom Watson once when I was an undergraduate, not in the computer lab which I never saw, but on the ski slopes in Vermont. I believe he was the second generation at IBM. He was the kind of guy who would have appreciated the name.
HUFFPOST SUPER USER
notdarkyet
End the Drug War.
11:08 PM on 02/08/2011
I saw the special on PBS on the making of Watson, and at the end he was kicking butt, making over forty grand. Should be fun, but I'd bet on Watson.
HUFFPOST BLOGGER
Bill Swadley
Writer, finance exec, dad
08:05 PM on 02/08/2011
They've been advertising the heck out of this on the show. Can't wait.
07:53 PM on 02/08/2011
It's true that Watson's opponents can sometimes predict the categories in which he is likely to be weaker or stronger, but why was your strategy to steer clear of Watson's strong areas? All of the clues will be played sooner or later anyway, so why not play defense and try to keep Watson from maximizing points in his best categories? I played several matches against the final version of Watson last month, and it seemed to me that my best hope was to find an early Daily Double, especially in a category that Watson liked, before he had a chance to find it and use it against me.

I love your "Final Jeopardy" book and I'm looking forward to reading the final chapter.
HUFFPOST BLOGGER
Stephen Baker
author of The Numerati and Final Jeopardy
09:33 AM on 02/09/2011
Ed, thanks for your comments, and I'm thrilled you like the book. (If you could put a review on the Amazon site, I'd be in your debt.) As far as your question, I think we agree that the best hope was to get daily doubles. And the only way to do that is to control the board. And if you go into categories where Watson's strong, you cede control to it.
07:46 PM on 02/08/2011
just unplug the dang thing. You win.
07:18 PM on 02/08/2011
That was a fun read. Sounds like we need some Jedi to stop the droid uprising.
HUFFPOST SUPER USER
Demitasse
Ars longa, vita brevis
05:11 PM on 02/08/2011
Some future version of Watson (say 9.0) in a quantum computer & we could be looking at a machine that could not only understand us but also mimic us.
HUFFPOST SUPER USER
gurukalehuru
cwtc7
03:45 PM on 02/08/2011
I am looking forward to this game as much as most Americans were looking forward to the Super Bowl.
04:17 AM on 02/09/2011
Me, too! (I didn't watch the Super Bowl.) In this case, I'm almost as interested in which of the two humans comes in second...er, I mean, which human beats the other human. (Not to be disloyal to my species, but as a career-long IBM employee, I kinda gotta root for Research's "kid".)
HUFFPOST SUPER USER
gurukalehuru
cwtc7
08:52 AM on 02/09/2011
Oh, I'm absolutely rooting for Watson.
T-Haight
What was wrong with federalism?
03:04 PM on 02/08/2011
Get back to work pleasing your robot masters, you petty human!

Seriously, they give it a body to hit a buzzer, and next thing you know, BAM! Robot Overlords!
HUFFPOST BLOGGER
Stephen Baker
author of The Numerati and Final Jeopardy
12:52 PM on 02/08/2011
Nosybear, I actually think that developing computers to understand language and search for answers is an important use of technology. But I think if you look at the iPhone app store, you'll find plenty of technology development to back your case.
HUFFPOST SUPER USER
Nosybear
Liar, damn liar, statistician and brewer
12:40 PM on 02/08/2011
I have to love the irony: The Chinese are using their (faster) supercomputers to design 400 mph trains and systems to put men on the moon - non-English speakers, BTW - while we use our (slower) supercomputers to play Jeopardy. Brilliant.
HUFFPOST SUPER USER
Valkyrie Ice
Writer for H+ Magazine, and commenter at random
01:57 PM on 02/08/2011
Yes, it is brilliant, because a computer capable of understanding language well enough to win Jeopardy will also understand language well enough to read the entire law library of the US and win every case in court. It will be able to sort through every medical library ever written and find who knows what connections that might lead to a cure for cancer. It could read every science book ever written and be able to teach a class better than the best professor.

Your sarcasm is misplaced. There is no irony. And there is a MASSIVE difference between a fast computer and one that understands language. If it understands English well enough to win Jeopardy, imagine what you will get when it's applied to translations, research, and even just daily computer usage. I'll take understanding over speed any day.
HUFFPOST COMMUNITY MODERATOR
Witkacy
05:21 PM on 02/08/2011
Exactly! Fanned
HUFFPOST SUPER USER
notdarkyet
End the Drug War.
11:11 PM on 02/08/2011
The creators talked about all the difficulties of language involved in Jeopardy and making Watson. The clues can be puns or go into obscure connections, it's hard to explain, but PBS had a show on it and how they did it.
T-Haight
What was wrong with federalism?
03:16 PM on 02/08/2011
Wow, what a downer of a comment. I'd be depressed if it were true.

True, the Chinese have the world's fastest supercomputer, but it's not doing anything as scary or exciting as taking the lead in high-speed rail. Instead, it's working on petroleum exploration, aircraft simulation, and leased work for a fee (http://en.wikipedia.org/wiki/Tianhe-I). The US still has 5 of the top 10 supercomputers, including those working on modeling atomic reactions. Don't fret the American position on supercomputers.

As for Watson, dismissing it as a crank project is a mistake. One of the greatest barriers in AI research is that computers simply don't think like humans - they don't get puns, they don't understand metaphors, they just can't relate to human thought! Watson is an attempt to change all of that and, in characteristic IBM fashion, make a big splash at the same time. As with IBM's famous chess-playing computers, the spin-offs from the Watson project could be incredible for IBM and are difficult to quantify or imagine. Plus, the amount of free publicity is much more valuable than mere advertising.
07:18 PM on 02/08/2011
People denigrated ASIMO when it had problems with "simple" bipedal movement.
Then they denigrated ASIMO when it had problems with simple obstacles.
Then they denigrated ASIMO when it had problems mastering staircases.
And yet ASIMO keeps on with its journey, just as Watson is doing.
Interesting and yes, scary.