Reading as Construction in Humans and Machines - Part III: Machines

03/17/2015 01:32 pm ET | Updated May 17, 2015

When we wound up last time, we had arrived at the conclusion that, whether or not we humans draw on our background knowledge to construct accurate models of the entities and events we read about, it is essential to their purpose that reading machines do so.

The question is: what sort of models will serve this purpose?

To Script or Not to Script?

Take, for instance, the case of the knowledge structure known as the "script" -- essentially, a set of rules governing a stereotypical situation, such as, for example, going to a restaurant.

Roger Schank and colleagues began promoting scripts at Yale back in the mid-1970s [Schank & Abelson 1977], as a way to furnish computers with enough common sense to "understand" (very simple) stories, like the following [adapted from Dyer 1989, p. 16]:

STORY 1: John went to the restaurant. The waiter brought John a menu. John ordered lobster. When he'd finished eating, John paid the bill and left a big tip.

By referencing a sufficiently well-elaborated restaurant script, it was held, a computer could answer such questions about STORY 1 as: "What did John eat?"

Of course, confronted with this question, any human reader will immediately respond "Lobster!" But look again -- that's not actually stated anywhere in the verbatim transcript of the story. Rather, "Lobster!" is an inference, based on our background knowledge about restaurants -- which includes, among other things, the fact that restaurant patrons typically eat what they've ordered. And, absent something like a script, "Lobster!" is an inference that a computer is not equipped to make.

Schank claimed that humans have internalized vast numbers of such scripts, exhaustively detailing what to expect in a wide variety of everyday scenarios. And, further, that loading the equivalent knowledge structure into a computer would enable it to fill in the blanks and constructively understand the restaurant story.
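To make the "fill in the blanks" idea concrete, here is a minimal toy sketch in Python. It is not Schank's actual program; the event format, the role names, and the functions bind_roles, instantiate, and answer are all assumptions invented for illustration. The story supplies only the events it states; the script supplies the rest, including the unstated "eat" step.

```python
# Hypothetical toy restaurant script: an ordered list of stereotyped events,
# each written as (actor_role, action, object_role). All names are illustrative.
RESTAURANT_SCRIPT = [
    ("customer", "go", "restaurant"),
    ("waiter", "bring", "menu"),
    ("customer", "order", "food"),
    ("waiter", "serve", "food"),
    ("customer", "eat", "food"),
    ("customer", "pay", "bill"),
    ("customer", "leave", "restaurant"),
]

# STORY 1, hand-coded as the events the text actually states.
# Note that no "eat" event appears here.
story = [
    ("John", "go", "restaurant"),
    ("waiter", "bring", "menu"),
    ("John", "order", "lobster"),
    ("John", "pay", "bill"),
]


def bind_roles(script, story_events):
    """Match stated events to script steps by action and collect role bindings."""
    bindings = {}
    for actor, action, obj in story_events:
        for role_actor, script_action, role_obj in script:
            if action == script_action:
                # Keep the first binding seen for each role.
                bindings.setdefault(role_actor, actor)
                bindings.setdefault(role_obj, obj)
                break
    return bindings


def instantiate(script, bindings):
    """Fill every script step with the bound roles, stated or not."""
    return [
        (bindings.get(actor, actor), action, bindings.get(obj, obj))
        for actor, action, obj in script
    ]


def answer(events, actor, action):
    """Return the object of the first matching event, or None for 'I don't know'."""
    for a, act, obj in events:
        if a == actor and act == action:
            return obj
    return None


full = instantiate(RESTAURANT_SCRIPT, bind_roles(RESTAURANT_SCRIPT, story))

print(answer(story, "John", "eat"))  # None: the story never says what John ate
print(answer(full, "John", "eat"))   # lobster: inferred from the script
```

Note what the sketch cannot do: any question that falls outside the steps of the script, such as answer(full, "waiter", "wear"), still comes back as None.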

Schank's claims that this alone would suffice were, however, vigorously contested by Hubert Dreyfus in his What Computers Can't Do [1972/79, p. 43]:

[T]he program has not understood a restaurant story the way people in our culture do, until it can answer such simple questions as: When the waitress came to the table did she wear clothes? Did she walk forward or backward? Did the customer eat his food with his mouth or his ear? If the program answers, "I don't know," we feel that all of its right answers were tricks or lucky guesses and that it has not understood anything of our everyday restaurant behavior.

... And, at times, Schank appeared to agree with him [Dreyfus 1972/79, p. 311]:

In a talk at the University of California at Berkeley (October 19, 1977) Schank agreed ... that to understand a visit to a restaurant the computer needs more than a script; it needs to know everything that people know. He added that he is unhappy that as it stands his program cannot distinguish "degrees of weirdness." Indeed, for the program it is equally "weird" for the restaurant to be out of food as it is for the customer to respond by devouring the chef.

Thus, Schank seems to agree that without some understanding of degree of deviation from the norm, the program does not understand a story even when in that story events follow a completely normal stereotyped script. It follows that although scripts capture a necessary condition of everyday understanding, they do not provide a sufficient condition.

One passage in particular bears repeating: to understand the restaurant story, or indeed any story,

... the computer needs more than a script; it needs to know everything that people know ...

But ...

If Not Scripts, Then What?

Then nothing, according to Dreyfus, who has for more than four decades staunchly maintained that the task of simulating human reasoning powers (and hence, human reading abilities) via the manipulation of symbols is impossible on its face. Impossible, because our human reasoning has nothing to do with manipulating symbols and everything to do with experiencing and reacting to our physical embodiment in an immediate situation.

For three of those four decades, Dreyfus's counsel of despair has been doggedly opposed by Douglas B. Lenat, founder of the Cyc project. Lenat freely admits that earlier attempts to replicate human intelligence have come to naught -- that "understanding even the easiest passages in common English... is far beyond the capabilities of present-day computer programs" [Lenat 1984, p. 204]. However, he locates the source of the problem not in any putative logical impossibility, but in a failure to take the bull by the horns. Instead of knuckling down to the "hard work" of "building the needed KB manually, one piece at a time," previous AI projects busied themselves with a fruitless search for shortcuts and free lunches [Lenat & Guha 1990, p. 26]. It's pretty clear that Lenat & Co. view Schankian scripts as yet another quest for a free lunch.

As his alternative to all that, in August 1984 Lenat secured funding from the industry-backed Microelectronics and Computer Technology Corporation and launched what has become arguably the largest, and certainly the longest-lived, attempt to encode the world's knowledge in a single system. Called "Cyc" -- short for encyclopedia -- the project began with the stated goal of capturing (some non-trivial portion of) the information contained in a "one-volume desk encyclopedia" [Lenat, Prakash, & Shepherd 1985, pp. 75 & 76]:

Our plan is to carefully represent approximately 400 articles (about 1000 paragraphs worth of material) from a one-volume desk encyclopedia. These are chosen to span the encyclopedia and to be as mutually distinct types of articles as possible.

Six years into the project, however, things had changed radically, and Lenat was taking pains to distance himself from what he now derided as "one of the unfortunate myths about Cyc" -- namely, "that its aim is to be a sort of electronic encyclopedia" [Guha & Lenat 1990, p. 34]. Instead, the newly proclaimed objective was one that readers of this blog series might find more familiar:

If anything, Cyc is the complement of an encyclopedia. The aim is that one day Cyc ought to contain enough commonsense knowledge to support natural language understanding capabilities that enable it to read through and assimilate any encyclopedia article.

The goal, in other words, had become one of formalizing precisely the sort of background knowledge that a human reader could be expected to apply to the task of understanding an encyclopedia article, or pretty much any other reading exercise. And for much the same purpose: to enable Cyc to "one day" assimilate new knowledge as humans do -- by reading.

In the eyes of many outside observers, all this was a bridge too far [Copeland 1997]:

Lenat's prediction that he will produce 'a system with human-level breadth and depth of knowledge' by the early years of next [i.e., the twenty-first] century betrays a failure to appreciate the sheer difficulty of the ontological, logical and epistemological problems that he has taken on.

And early hands-on experiences with Cyc itself did little to dissuade the critics. When Stanford's Vaughan Pratt was given a demo of the knowledge base's then-current capabilities by project co-leader Ramanathan Guha in April 1994, the results were less than awe-inspiring: Cyc knew that the earth has a sky, but not what color that sky was; that earth is larger than Venus, but not what its diameter is; that people eat food, but not that they'll die of starvation if deprived of it for long enough, etc. As Pratt summed it up [Pratt 1994]:

It was clear that the bulk of my questions were going to be well beyond CYC's present grasp.

Nor, tellingly, were such concerns limited to those outside the project looking in. In the same year as Pratt's visit, Guha himself dropped out of the Cyc effort, complaining that "We were killing ourselves trying to create a pale shadow of what had been promised" [Stipp 1995].

Still, Lenat himself has soldiered on, periodically emitting upbeat progress reports and predictions -- most recently just last year [Love 2014]. But my favorite projection in this regard, which paints an admittedly best-case scenario in breathtakingly optimistic hues, dates from early on [Guha & Lenat 1990, p. 57]:

No one in 2015 would dream of buying a machine without common sense, any more than anyone today would buy a personal computer that couldn't run spreadsheets, word processing programs, communications software, and so on.

At the risk of belaboring the obvious, here it is the target year already and I'm still waiting on my common-sense computer. Not to mention my flying car!

Next time, we'll try to gauge just how far Cyc has advanced toward the former goal.

REFERENCES

Copeland, Jack (1993), Artificial Intelligence: A Philosophical Introduction, Blackwell.

Copeland, Jack (1997), "CYC: A Case Study in Ontological Engineering," http://ejap.louisiana.edu/EJAP/1997.spring/copeland976.2.html

Dreyfus, Hubert L. (1972/79), What Computers Can't Do: The Limits of Artificial Intelligence, rev. ed., New York NY: Harper & Row.

Dyer, Michael G. (1989), "Knowledge Interactions and Integrated Parsing for Narrative Comprehension," in David Waltz, ed., Semantic Structures: Advances in Natural Language Processing, Hillsdale NJ: Erlbaum.

Guha, R. V. and Douglas B. Lenat (1990), "Cyc: A Midterm Report," AI Magazine (Fall). http://web.media.mit.edu/~push/cyc-midterm.pdf.

Lenat, Douglas B. (1984), "Computer Software for Intelligent Systems," Scientific American, September. http://www.scientificamerican.com/article/computer-software-for-intelligent-s/

Lenat, Douglas B., Mayank Prakash, and Mary Shepherd (1985), "Cyc: Using Common Sense Knowledge to Overcome Brittleness and Knowledge Acquisition Bottlenecks," AI Magazine, vol. 6, no. 4. http://www.aaai.org/ojs/index.php/aimagazine/article/view/510/446.

Lenat, Douglas B. and R. V. Guha (1990), Building Large Knowledge-Based Systems: Representation and Inference in the Cyc Project, Reading MA: Addison-Wesley.

Love, Dylan (2014), "The Most Ambitious Artificial Intelligence Project In The World Has Been Operating In Near Secrecy For 30 Years," Business Insider, July 2. http://www.businessinsider.com/cycorp-ai-2014-7.

Pratt, Vaughan (1994), "CYC Report," Stanford University. http://boole.stanford.edu/cyc.html.

Schank, Roger and Robert Abelson (1977), Scripts, Plans, Goals, and Understanding: An Inquiry into Human Knowledge Structures, Hillsdale NJ: Lawrence Erlbaum.

Stipp, David (1995), "2001 is just around the corner. Where's HAL?" Fortune (November). http://archive.fortune.com/magazines/fortune/fortune_archive/1995/11/13/207673/index.htm