During the 2011 NCAA Tournament, a budding sports website called StatSheet received an email from tournament officials thanking it for its coverage. They asked where to send press passes so StatSheet writers could attend the games for free. Humans, however, didn't write the articles; a computer did. The NCAA representatives just couldn't tell the difference.
StatSheet, a two-year-old subsidiary of Automated Insights, uses algorithms to turn numerical data into narrative articles. It comprises 418 websites that provide game previews, recaps and other features for every Division I college basketball, NFL, and MLB team, as well as analysis of each team's top players.
StatSheet touts the work of its cyber authors as indistinguishable from that of human journalists. Programmers can infuse the prose with different tones and styles, which, as NCAA officials discovered, creates the illusion of personality.
"Vocabulary and style are programmable," says Scott Frederick, chief operating officer of five-year-old Automated Insights. "We could make it sound like Dick Vitale had the stamina to cover every game in an entire year."
Like mathematicians plotting the processes necessary to solve an equation, computer scientists can now program software to replicate the mental steps a journalist might use to write an article. Though automation has historically displaced manufacturing jobs, it now threatens cognitive careers, like journalism, as well.
And not just sportswriters: Software developers, sometimes working hand in hand with journalists, have created programs that can produce stories about any data-heavy subject, from finance to real estate, more efficiently than any human.
To cover a baseball game, for example, a program uses the line score to determine the "winner" and "loser," the play-by-play data to figure out which plays most affected the game's outcome, and the aggregate information to answer a set of pre-programmed questions: Did a team establish a lead in the beginning and hold on to it? Lose its lead near the end of the game? Win because of one player's performance?
Analyzing the answers, a computer can determine the most noteworthy angle and construct a narrative that descends from the most important action to the least. Unlike a journalist crafting an "inverted pyramid," though, the computer can do this in a fraction of a second.
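The steps described above can be illustrated with a minimal Python sketch. It is not either company's actual software: the `LineScore` class, the function names, the inning thresholds, the angle labels and the sentence templates are all hypothetical, chosen only to show how a line score might be turned into a winner, an angle and a lead sentence. It answers just two of the pre-programmed questions (an early lead that held, a lead lost late) and skips the play-by-play and individual-player analysis.

```python
from dataclasses import dataclass
from itertools import accumulate, zip_longest
from typing import List, Tuple


@dataclass
class LineScore:
    """Hypothetical container for a baseball line score."""
    home: str
    away: str
    home_runs: List[int]  # runs the home team scored in each inning
    away_runs: List[int]  # runs the away team scored in each inning


def winner_and_loser(score: LineScore) -> Tuple[str, str]:
    """Pick the winner from total runs (assumes the game did not end tied)."""
    if sum(score.home_runs) > sum(score.away_runs):
        return score.home, score.away
    return score.away, score.home


def winner_margins(score: LineScore) -> List[int]:
    """Winner's running lead (negative = trailing) after each inning."""
    # The home team may not bat in the last inning, so pad with zeros.
    innings = zip_longest(score.home_runs, score.away_runs, fillvalue=0)
    diffs = [h - a for h, a in innings]
    if sum(diffs) < 0:  # away team won: flip to the winner's point of view
        diffs = [-d for d in diffs]
    return list(accumulate(diffs))


def choose_angle(score: LineScore) -> str:
    """Answer the pre-programmed questions in order of newsworthiness."""
    margins = winner_margins(score)
    # Assumed threshold: "early" means ahead from the third inning onward.
    if all(m > 0 for m in margins[2:]):
        return "early_lead_held"
    # Assumed threshold: "late" means the loser still led after the seventh.
    if any(m < 0 for m in margins[6:]):
        return "late_comeback"
    return "default"


# Hypothetical templates; a real system would vary vocabulary and tone.
TEMPLATES = {
    "early_lead_held": "{winner} jumped ahead early and never let {loser} back in, winning {w}-{l}.",
    "late_comeback": "{winner} rallied late to beat {loser} {w}-{l}.",
    "default": "{winner} beat {loser} {w}-{l}.",
}


def write_lead(score: LineScore) -> str:
    """Build the top of the inverted pyramid: the most important fact first."""
    winner, loser = winner_and_loser(score)
    high, low = sorted([sum(score.home_runs), sum(score.away_runs)], reverse=True)
    return TEMPLATES[choose_angle(score)].format(winner=winner, loser=loser, w=high, l=low)


game = LineScore("Hawks", "Owls", [2, 0, 1, 0, 0, 0, 0, 0], [0, 0, 0, 1, 0, 0, 0, 0, 0])
print(write_lead(game))
```

Ordering the questions from most to least newsworthy is one simple way to let the program "determine the most noteworthy angle"; a fuller system would presumably score many more angles against one another.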
Narrative Science, Automated Insights' main competitor, gets game data after each quarter of Big Ten Conference games and writes a recap before the end of the commercial break, says Kristian Hammond, the company's chief technology officer and a professor of journalism at Northwestern University.
Hammond formed Narrative Science in 2010 after he challenged four students to program a computer to create stories from numerical data. The company began 2011 with two clients and ended with 35, collaborating with Builder Online, Forbes, and the Wall Street Journal, to name a few.
Most journalists believe that no matter how much these companies grow, their technology will have only a marginal effect on the news industry. Nicholas Lemann, dean of Columbia University's Graduate School of Journalism, says computers act as "re-write men," retooling preexisting information without independent research. Though he doesn't see automation jeopardizing the foundations of media organizations, he does think some journalists could be displaced if they don't have diversified skills.
"We've got to raise the bar on what skills people get in the education system, so journalists can differentiate themselves from machines," says Lemann. "If a journalist's job is utterly routine, then it probably will be automated."
Though Frederick and Hammond insist their intention is not to do to journalists what Johannes Gutenberg did to scribes, there is often an inadvertent causal relationship between advancing technology and displacement. Gutenberg didn't invent the printing press as an act of malice against those who hand-lettered documents, but the former's efficiency still undercut demand for the latter's services.
In Herman Melville's short story "Bartleby, the Scrivener," Bartleby refuses to work, saying he "would prefer not to." That refusal points to one reason machines can make more dependable workers than humans: robots are immune to inefficiencies like indolence and illness, and they cannot object to working.
Machines are also faster data processors. Sawbuck Realty, a three-year-old online real estate company with listings nationwide, recently hired Automated Insights to generate 10,000 to 20,000 articles per week for its website, says Guy Wolcott, Sawbuck's CEO. Frederick has worked with Wolcott for six months to create a program that combines in-house and publicly listed data to spot a trend (like rising housing prices or declining sales), generate a story, and post it online. The stories will start appearing this month.
"If I was going to pay five dollars an article and write 10,000 articles a week -- that's $50,000 a week!" says Wolcott. "I would never do that because I don't think the payoff would be worth it."
Though Automated Insights won't disclose how much it charges, Wolcott says it's under $10 per article. (Frederick says he generally charges clients $2 to $10 per article.) With economies of scale, Wolcott says these prices might eventually decrease.
Because software can write articles more quickly and cheaply than journalists, some argue it could cause displacement. History has shown, as with the scrivener, that once technology can do a task more efficiently than a human, employers will cut jobs to invest in it.
"I'm sure there will always be some talented sports columnists, but it seems like there will eventually be a lot of boiler-plate sports coverage that's churned out by a program," says David Autor, an MIT economist who studies labor markets.
If automation does destroy some journalism jobs, openings created elsewhere could prevent aggregate unemployment. For example, because Narrative Science and Automated Insights make it financially possible for media companies to cover amateur sports, entrepreneurs might leverage that technology to expand their businesses.
Ted Sullivan, a former Duke University pitcher, co-founded the software company GameChanger Media Inc. three years ago. The company, which has 20 staffers, distributes amateur baseball and softball news to subscribers through its mobile application and website, giving coaches a template to upload game results. Sullivan hired Narrative Science to generate game-recap stories using this data.
"We cover millions of games each year, so I would need an army of sports writers to write with the scale of the technology," says Sullivan, adding that 30,000 teams used the product last year. He expects that number to more than double this year. "I don't know what it would cost to pay journalists to write that many articles, but I doubt we could afford it."
Companies like Automated Insights and Narrative Science were born from the proliferation of data that computer culture has enabled. There were 1.8 trillion gigabytes (1.8 zettabytes) of data stored in the world in 2011, up from 161 billion gigabytes in 2006, an increase of 1,018 percent, according to EMC's Digital Universe Study.
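For readers who want to check the arithmetic behind that growth figure, here is a short Python sketch. The two data volumes come from the figures cited above; the function and constant names are illustrative only, and the sketch assumes decimal units (one gigabyte as 10^9 bytes, one zettabyte as 10^21 bytes).

```python
def percent_increase(old: float, new: float) -> float:
    """Growth from old to new, expressed as a percentage of old."""
    return (new - old) / old * 100


# Figures cited from EMC's Digital Universe Study, in gigabytes.
GB_2006 = 161 * 10**9   # 161 billion gigabytes
GB_2011 = 18 * 10**11   # 1.8 trillion gigabytes

# Assumes decimal units: 1 gigabyte = 10**9 bytes, 1 zettabyte = 10**21 bytes.
zettabytes_2011 = GB_2011 * 10**9 / 10**21

print(round(percent_increase(GB_2006, GB_2011)))  # 1018
print(zettabytes_2011)
```

Put another way, the 2011 total is a little more than 11 times the 2006 total.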
"That more data is coming online means that more data needs to be analyzed, which means more analysis needs to be communicated," says Hammond. "The bridge between data and a human understanding of it is what we do."
The two companies also work with businesses that use the software to synthesize their data into less anesthetizing, more useful content. Narrative Science sends weekly reports to a major fast food company's franchisees, for instance, offering data-based advice on how to improve each local bottom line. For another client, it provides high school students with individual feedback, using data from their test results to explain what they should study to improve.
"Companies come to us with enormous data assets that are underleveraged, and we are able to generate narratives, charts and graphs that give them a better understanding of their information," says Frederick.
Though neither Automated Insights nor Narrative Science has yet fostered a lasting relationship with a newspaper, Hammond says their technology could automate 20 percent of a paper's content -- the financial, sports, and real estate sections plus some entertainment stories based on box office numbers -- for a fraction of the payroll cost.
Hammond believes this technology will eventually be common in newsrooms, and Frederick foresees a hybrid model where computers interpret data and journalists add their subjective analysis.
"I'm sure a journalist could do a better job writing an article than a machine," says Wolcott. "But what I'm looking for is quantity at a certain quality."
Follow Buster Brown on Twitter: www.twitter.com/o_b_wan