Before Big Data took over the world, someone wrote a roman à clef about the 1992 Democratic primary. "Roman à clef," in case you don't know, is French for "novel with a key." It's based on real-life people but doesn't name them directly.
The novel in question was entitled Primary Colors and published by Random House under the author name Anonymous. It spent nine weeks on the New York Times bestseller list, and the circumstances of its publication, as well as its subsequent success, fed speculation about the identity of the author.
The author turned out to be political columnist Joe Klein. While some, including Clinton speechwriter David Kusnet, had guessed that it was Klein, the most convincing argument was put forth by Donald Wayne Foster, a professor of English at Vassar College who subjected the book to textual analysis, comparing his results to known non-fiction writings of Klein's.
Foster published his results in an article in New York magazine, and -- combined with handwriting analysis of notes on a manuscript -- the case was made and Klein was soon outed as the author.
What I found most interesting at the time was how Foster made a kind of scientific case, fingerprinting Klein's work, in a manner of speaking, by quantifying his use of certain words and phrases. Foster, in fact, has used similar tools to assist criminal investigations, including the case of Unabomber Ted Kaczynski.
I thought of the Primary Colors story when I recently read a Times article entitled "Solving the Equation of a Hit Film Script, With Data." It informs us that Hollywood studios have begun using the services of a wonk named Vinny Bruzzese. According to the article, "Mr. Bruzzese and a team of analysts compare the story structure and genre of a draft script with those of released movies, looking for clues to box-office success." The script gets adjusted accordingly.
This news came a few years after an article about professors who had applied computing power to titles by 19th-century English authors in order to "offer fresh insight into the minds of the Victorians."
The two approaches have Big Data in common, but the key difference lies between using such analysis to look backwards (as with solving the Victorian "mind" or the Primary Colors author mystery) and using it as a guide to create future hits, as Bruzzese presumes to do. The former is a matter of literary criticism. But the latter suggests the automation of creative product.
Naturally, the first reaction a writer has to this is to question whether what he does can ever be done by a computer. That's also our second and third reaction. But do we have an elevated view of our self-worth?
I like to think not, but then again, I would, right? I majored in English, not computer science. If androids are taking everyone else's job, why not mine?
The answer, I think, lies in the distinction between form and formula -- the difference between a writer who moves people and a hack. The script expert Robert McKee notes that "All notions of paradigms and foolproof story models for commercial success are nonsense." His famous book Story largely deals with matters of structure -- of form -- but he takes great pains to distinguish that essential element of storytelling from formula.
In support of that view, award-winning screenwriter and teacher Jacob Krueger (whom I've used as a consultant for some of my work) notes that "you don't see a lot of paint-by-numbers paintings hanging on museum walls." That's because, he says, we are most interested in the "weirdness" of others -- the things that make them different from everyone else. Successful writers, he says, must tap "the things that are inherently unique about the way they see the world."
This gets to the heart of the matter. A computer only "knows" what it's been told, not what it experiences. As such, it is incapable of having a unique worldview, and therefore it can't be "original."
But, you say, doesn't every writer have a hard time being original? Indeed. Even many of the books and movies that defy the odds to get made demonstrate a lack of originality. And yet the true gems -- the stories that captivate us -- manage to show us a fresh glimpse of the human condition. The difference between a pat comedy and Little Miss Sunshine or between a formula drama and Breaking Bad is easily discerned by most people. Not in any technical sense but in the way they make us feel.
In Do Androids Dream of Electric Sheep? (which formed the basis for the film Blade Runner) Philip K. Dick imagined a world in which androids are barely distinguishable from humans and the only way to tell the difference is to administer an empathy test.
Empathy, not coincidentally, is the bridge that a great book or movie takes its audience across. It may be the one thing in our modern world that won't yield to numbers.
"I think," Krueger says, "that audiences can sense manipulation from a mile away. But everyone can connect to something that feels true."
He doesn't mean factually true -- he means emotionally true. Computers can count all the phrases and calculate all the formulas that they want. But, for my money, human is as human does.