02/10/2015 04:54 pm ET Updated Apr 12, 2015

Reading as Construction in Humans and Machines -- Part I: Humans

Do stories exist only in our heads?

Put the question another way: Is it possible that the "overt story" -- the surface text physically presented to us, whether on parchment or computer screen -- is really only a string of cues intended to guide the reader in reconstructing the full narrative? The answer will have implications for the likelihood of success in such perennial artificial intelligence pursuits as the automated acquisition of knowledge from textual corpora.

In these days of fear and trembling over the imminence of AI apocalypse on the one hand, and rapturous anticipation of a technological singularity on the other, it perhaps behooves us to ask whether current efforts in the knowledge acquisition field are marching inexorably toward some sort of CYClopean culmination, or simply tracing a more circuitous route down into the abyss of yet another "AI-winter" debacle.

The underlying issue, it seems to me, is whether, and to what degree, reading and understanding can be seen as processes of active construction -- of filling in the blanks, so to speak -- as opposed to mere passive assimilation. And, to the extent that they are, what can be done to automate them?

First things first, though. We need to begin by looking at --

The Human Side of the Story

In his "Pierre Menard, Author of the Quixote" [Borges 1939], Jorge Luis Borges imagines an ideal reader, one so devoted to a literary work as to aspire to make it his own in the most radical sense -- by writing it:

He did not want to compose another Quixote -- which is easy -- but the Quixote itself. Needless to say, he never contemplated a mechanical transcription of the original; he did not propose to copy it. His admirable intention was to produce a few pages which would coincide, word for word and line for line, with those of Miguel de Cervantes.

... Nor would he seek to imitate Cervantes by converting to Catholicism, fighting the Moors, endeavoring to forget all of European history after 1602, etc.:

To be, in some way, Cervantes and reach the Quixote seemed less arduous to him -- and, consequently, less interesting -- than to go on being Pierre Menard and reach the Quixote through the experiences of Pierre Menard.

Enacting this extended metaphor of reader-as-creator, Menard produces a few fragments of text which, though "verbally identical" to the original, are "almost infinitely richer":

It is a revelation to compare the Don Quixote of Menard with that of Cervantes.

The latter, for instance, wrote (Part One, Chapter Nine): "...truth, whose mother is history, who is the rival of time, depository of deeds, witness of the past, example and lesson to the present, and warning to the future."

Written in the 17th century, written by the "ingenious layman" Cervantes, this enumeration is a mere rhetorical eulogy of history.

Menard, on the other hand, writes "... truth, whose mother is history, who is the rival of time, depository of deeds, witness of the past, example and lesson to the present, and warning to the future."

History, the mother of truth! The idea is astounding. Menard, a contemporary of William James, does not define history as an investigation of reality, but as its origin.

The difference is not in the "verbally identical" texts, but in the effort which Menard the reader must expend to recover the meaning embedded in Cervantes' words -- an effort "almost infinitely" greater, Borges suggests, than that which it took to write them in the first place.

(Don Quixote [Cervantes 1605], not incidentally, offers an especially apt target for the fictional Menard's obsession, since Cervantes' woebegone knight is himself an avid reader, who "buried himself in his books" on chivalry, until "from little sleep and too much reading, his brain dried up, and he lost his wits" and sallied forth to reconstruct reality along the more congenial lines of literature.)

Be that as it may, the more perceptive writers have always acknowledged their readers as partners in the literary enterprise. As early as the eighteenth century, Laurence Sterne was cautioning his fellow authors against trying to shoulder all the work. Rather [Sterne 1760, vol. II, p. 68] --

[T]he truest respect which you can pay the reader's understanding, is to halve this matter amicably, and leave him something to imagine, in his turn, as well as yourself.

But even this fifty-fifty division of labor may not go far enough, as phenomenologist Wolfgang Iser's commentary on the above passage unintentionally reveals [Iser 1980, p. 51]:

Sterne's conception of a literary work is that it is something like an arena in which reader and author participate in a game of the imagination. If the reader were given the whole story, and there were nothing left for him to do, then his imagination would never enter the field; the result would be the boredom which inevitably arises when everything is laid out cut and dried before us.

Iser's assumption here seems to be that somehow everything can be "laid out cut and dried before us" -- that it is even possible to give the reader "the whole story" in a work of finite length.

It is just this assumption which cognitive science -- more particularly, that branch of it known as discourse psychology -- calls into question. In contradistinction to this "cut and dried" passive-reception model of reading, most discourse psychologists maintain that the "coherent interpretation" of a text is "strongly dependent upon the reader's additional knowledge" of the world [Iza & Ezquerro 2000, p. 227].

Take, for instance, the signature opening sentence of Thornton Wilder's The Bridge of San Luis Rey [Wilder 1927]:

On Friday noon, July the twentieth, 1714, the finest bridge in all Peru broke and precipitated five travelers into the gulf below.

Now imagine if the novelist, intent upon giving us the "whole story," had continued in the following vein:

...The reason those unfortunate travelers fell is that they were caught in the grip of gravity, which is a force of nature acting to accelerate any unsupported object in the direction of the earth's center of mass at a rate of approximately 9.8 meters per second per second. ...

Fortunately for his Pulitzer Prize, Wilder could omit this and other clarifications, and count instead on his readers to invoke unbidden the appropriate "additional knowledge" as regards the dangers of crossing a collapsing bridge in an earth-normal gravitational field.

Still, it remains controversial to what degree, and at what level of consciousness, readers actually call upon such background knowledge in the normal course of affairs. The early 1990s witnessed a battle royal between the "constructivists," who held that conscious elaboration of a mental model was an essential aspect of reading, and the "minimalists" led by McKoon and Ratcliff [1992], who, as the name of their school suggests, proposed that readers usually got along with only a bare minimum of reasoning about what they were reading.

In support of their position, the minimalists could marshal at least some experimental evidence, including one finding that seems directly at odds with our Bridge of San Luis Rey example [Keenan 1992]:

The data show that many of the most obvious inferences do not seem to be drawn. For example, most instrument inference studies find that when college students read ... that "someone fell from the roof of a 14-story building," the concept "dead" is no more active than it is after reading a sentence that has nothing to do with injury or possible death.

So, do readers actively grapple with the text or not?

The answer seems to turn on the relative accessibility of the background knowledge that they are called upon to apply. Noordman and Vonk [1993] showed that, for difficult texts dealing with unfamiliar subject matter, "readers did not make the inferences during reading" but instead "made the inference when they had to verify the information after reading the text." On the other hand, for "text dealing with familiar topics... both the reading times and the verification times indicated that in this case the inferences were made during reading."

The harder the text, then, the less inferencing and/or mental modeling occurs while reading it. Instead, readers tend to defer the real work of interpretation until push comes to shove -- that is, until they are interrogated about what they have read. Till then they adopt the strategy used by Linus, of Peanuts fame, to get through Dostoyevsky's Crime and Punishment: when you come to a word you don't understand, just "bleep" right through it.

All well and good, but what does this say about machines? What kind of reading strategy can and should an AI adopt?

We'll try taking that topic up in Part II.


Borges, Jorge Luis (1939), "Pierre Menard, Author of the Quixote," trans. James E. Irby, in Labyrinths: Selected Stories and Other Writings, New York NY: New Directions, 1964, pp. 36-44.

Cervantes Saavedra, Miguel de (1605), El Ingenioso Hidalgo Don Quixote de la Mancha, Madrid.

Iser, Wolfgang (1980), "The Reading Process: A Phenomenological Approach," in Jane P. Tompkins, ed., Reader-Response Criticism: from Formalism to Post-Structuralism, Baltimore MD: Johns Hopkins, 1980.

Iza, Mauricio and Jesus Ezquerro (2000), "Elaborative Inferences," Anales de psicologia, vol. 16, no. 2, pp. 227-249.

Keenan, Janice M. (1992), "Thoughts about the Minimalist Hypothesis: Commentary on Garnham on Reading-Inference," Psycoloquy, psycoloquy.93.4.2.reading-inference.3.keenan, February 10, 1993.

McKoon, G. and R. Ratcliff (1992), "Inference during Reading," Psychological Review, vol. 99, no. 3, pp. 440-466.

Noordman, Leo G. M. and Wietske Vonk (1993), "A More Parsimonious Version of Minimalism in Inferences," Psycoloquy: 4(08) Reading Inference (9).

Sterne, Laurence (1760), The Life and Opinions of Tristram Shandy, Gentleman, London UK: R & J Dodsley.

Wilder, Thornton (1927), The Bridge of San Luis Rey, New York NY: Harper.