Buster Brown


Robo-Journos Put Jobs In Jeopardy

Posted: 07/19/2012 1:00 pm

During the 2011 NCAA Tournament, a budding sports website called StatSheet received an email from tournament officials thanking it for its coverage. They asked where to send press passes so StatSheet writers could attend the games for free. Humans, however, didn't write the articles; a computer did. The NCAA representatives just couldn't tell the difference.

StatSheet, a two-year-old subsidiary of Automated Insights, uses algorithms to turn numerical data into narrative articles. It comprises 418 websites that provide game previews, recaps and other features for every Division I college basketball, NFL, and MLB team, as well as analysis of each team's top players.

StatSheet extols the work of its cyber authors as indistinguishable from human journalists'. Programmers can infuse the prose with different tones and styles, which, as NCAA officials discovered, creates the illusion of personality.

"Vocabulary and style are programmable," says Scott Frederick, chief operating officer of five-year-old Automated Insights. "We could make it sound like Dick Vitale had the stamina to cover every game in an entire year."

Like mathematicians plotting the processes necessary to solve an equation, computer scientists can now program software to replicate the mental steps a journalist might use to write an article. Though automation has historically displaced manufacturing jobs, it now threatens cognitive careers, like journalism, as well.

And not just sportswriters: Software developers, sometimes working hand-in-hand with journalists, have created programs that can produce stories about any data-heavy subject, from finance to real estate, more efficiently than any human.

To cover a baseball game, for example, a program uses the line score to determine the "winner" and "loser," the play-by-play data to figure out which plays most affected the game's outcome, and the aggregate information to answer a set of pre-programmed questions: Did a team establish a lead in the beginning and hold on to it? Lose its lead near the end of the game? Win because of one player's performance?

Analyzing the answers, a computer can determine the most noteworthy angle and construct a narrative that descends from the most important action to the least. Unlike a journalist crafting an "inverted pyramid," though, the computer can do this in a fraction of a second.
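The steps described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the code used by Automated Insights or Narrative Science: the `Play` record, the `recap` function, the order in which the questions are checked and the sentence templates are all hypothetical.

```python
from collections import Counter
from dataclasses import dataclass
from itertools import accumulate


@dataclass
class Play:
    # Hypothetical play-by-play record; real data feeds will differ.
    inning: int
    team: str
    player: str
    runs: int  # runs scored on this play


def recap(line_scores: dict[str, list[int]], plays: list[Play]) -> str:
    """Turn a two-team line score and a play list into a short recap."""
    # Step 1: the line score decides winner and loser (ties are not handled).
    (team_a, runs_a), (team_b, runs_b) = line_scores.items()
    winner, loser = (team_a, team_b) if sum(runs_a) > sum(runs_b) else (team_b, team_a)
    w_total, l_total = sum(line_scores[winner]), sum(line_scores[loser])
    score = f"{w_total}-{l_total}"

    # Step 2: the winner's margin after each inning.
    margins = [
        w - l
        for w, l in zip(accumulate(line_scores[winner]), accumulate(line_scores[loser]))
    ]

    # Step 3: runs credited to each player on the winning team.
    credited = Counter()
    for p in plays:
        if p.team == winner:
            credited[p.player] += p.runs
    star, star_runs = credited.most_common(1)[0] if credited else (None, 0)

    # Step 4: answer the pre-programmed questions in an assumed priority order.
    if len(margins) > 1 and margins[-2] < 0:
        # The winner trailed going into the last inning.
        lead = f"{winner} rallied in the final inning to beat {loser} {score}."
    elif star is not None and star_runs * 2 >= w_total:
        # One player accounted for at least half of the winner's runs.
        lead = f"{star} carried {winner} past {loser} {score}."
    elif all(m > 0 for m in margins):
        # The winner led after every inning.
        lead = f"{winner} led from the first inning and beat {loser} {score}."
    else:
        lead = f"{winner} beat {loser} {score}."

    # Step 5: inverted pyramid, biggest scoring plays first.
    scoring = sorted(
        (p for p in plays if p.team == winner and p.runs > 0),
        key=lambda p: p.runs,
        reverse=True,
    )
    details = [f"{p.player} accounted for {p.runs} run(s) in inning {p.inning}." for p in scoring]
    return " ".join([lead, *details])


if __name__ == "__main__":
    # Made-up game data, for illustration only.
    game = {"Hawks": [2, 0, 0, 1, 0, 0, 0, 0, 0], "Owls": [0, 0, 0, 0, 0, 1, 0, 0, 0]}
    log = [Play(1, "Hawks", "Lee", 2), Play(4, "Hawks", "Ortiz", 1), Play(6, "Owls", "Kim", 1)]
    print(recap(game, log))
```

Changing the sentence templates is all it would take to give the same facts a different voice, which is roughly what Frederick means when he calls vocabulary and style "programmable."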

Narrative Science, Automated Insights' main competitor, gets game data after each quarter of Big Ten Conference games and writes a recap before the end of the commercial break, says Kristian Hammond, the company's chief technology officer and a professor of journalism at Northwestern University.

Hammond formed Narrative Science in 2010 after he challenged four students to program a computer to create stories from numerical data. The company began 2011 with two clients and ended with 35, collaborating with Builder Online, Forbes, and the Wall Street Journal, to name a few.

Most journalists believe that, no matter how much these companies grow, their technology will have only a marginal effect on the news industry. Nicholas Lemann, dean of Columbia University's Graduate School of Journalism, says computers act as "re-write men," retooling preexisting information without independent research. Though he doesn't see automation jeopardizing the foundations of media organizations, he does think some journalists could be displaced if they don't have diversified skills.

"We've got to raise the bar on what skills people get in the education system, so journalists can differentiate themselves from machines," says Lemann. "If a journalist's job is utterly routine, then it probably will be automated."

Though Frederick and Hammond insist their intention is not to do to journalists what Johannes Gutenberg did to scribes, there is often an inadvertent causal relationship between advancing technology and displacement. Gutenberg didn't invent the printing press as an act of malice against those who hand-lettered documents, but the former's efficiency still undercut demand for the latter's services.

In Herman Melville's short story "Bartleby, the Scrivener," Bartleby refuses to work. He says he "would prefer not to" -- one reason machines make more competent workers than humans. Robots are impervious to inefficiencies like indolence and illness, unable to object to working.

Machines are also more expedient data processors. Sawbuck Realty, a three-year-old online real estate company with listings countrywide, recently hired Automated Insights to generate 10,000 to 20,000 articles per week for its website, says Guy Wolcott, Sawbuck's CEO. Frederick has worked with Wolcott for six months to create a program that combines in-house and publicly listed data to spot a trend (like rising housing prices or declining sales), generate a story and post it online. The stories will start appearing this month.

"If I was going to pay five dollars an article and write 10,000 articles a week -- that's $50,000 a week!" says Wolcott. "I would never do that because I don't think the payoff would be worth it."

Though Automated Insights won't disclose how much it charges, Wolcott says it's under $10 per article. (Frederick says he generally charges clients $2 to $10 per article.) With economy of scale, Wolcott says these prices might eventually decrease.

Because software can write articles more quickly and cheaply than journalists, some argue it could cause displacement. History has shown that once technology can do a task more efficiently than a human, as it did with the scrivener's, employers will cut jobs to invest in it.

"I'm sure there will always be some talented sports columnists, but it seems like there will eventually be a lot of boiler-plate sports coverage that's churned out by a program," says David Autor, an MIT economist who studies labor markets.

If automation does destroy some journalism jobs, openings created elsewhere could prevent aggregate unemployment. For example, because Narrative Science and Automated Insights make it financially possible for media companies to cover amateur sports, entrepreneurs might leverage that technology to expand their businesses.

Ted Sullivan, a former Duke University pitcher, co-founded the software company GameChanger Media Inc. three years ago. With 20 staffers, the company's mobile application and website distribute amateur baseball and softball news to subscribers by providing coaches with a template to upload game results. Sullivan hired Narrative Science to generate game-recap stories using this data.

"We cover millions of games each year, so I would need an army of sports writers to write with the scale of the technology," says Sullivan, adding that 30,000 teams used the product last year. He expects that number to more than double this year. "I don't know what it would cost to pay journalists to write that many articles, but I doubt we could afford it."

Companies like Automated Insights and Narrative Science were born from the proliferation of data that computer culture has enabled. There were 1.8 trillion gigabytes (1.8 zettabytes) of data stored in the world in 2011, up from 161 billion gigabytes in 2006, according to EMC's Digital Universe Study, an increase of 1018 percent.
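The growth figure can be checked with a short calculation. This is a minimal sketch using only the two totals quoted from the EMC study, both expressed in billions of gigabytes; the function name `percent_increase` is just an illustrative choice.

```python
def percent_increase(old: float, new: float) -> float:
    """Percentage growth from old to new."""
    return (new - old) / old * 100


# 161 billion gigabytes in 2006; 1.8 trillion (1,800 billion) gigabytes in 2011.
growth = percent_increase(161, 1800)
print(f"{growth:.0f} percent")
```

Put another way, the 2011 total was a little more than 11 times the 2006 total.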

"That more data is coming online means that more data needs to be analyzed, which means more analysis needs to be communicated," says Hammond. "The bridge between data and a human understanding of it is what we do."

The two companies also work with businesses that use the software to synthesize their data into less anesthetizing, more useful content. Narrative Science sends weekly reports to a major fast food company's franchisees, for instance, offering data-based advice on how to improve each local bottom line. For another client, it provides high school students with individual feedback, using data from their test results to explain what they should study to improve.

"Companies come to us with enormous data assets that are underleveraged, and we are able to generate narratives, charts and graphs that give them a better understanding of their information," says Frederick.

Though neither Automated Insights nor Narrative Science has yet fostered a lasting relationship with a newspaper, Hammond says their technology could automate 20 percent of a paper's content -- the financial, sports, and real estate sections plus some entertainment stories based on box office numbers -- for a fraction of the payroll cost.

Hammond believes this technology will eventually be common in newsrooms, and Frederick foresees a hybrid model where computers interpret data and journalists add their subjective analysis.

"I'm sure a journalist could do a better job writing an article than a machine," says Wolcott. "But what I'm looking for is quantity at a certain quality."

 
 
 

Follow Buster Brown on Twitter: www.twitter.com/BusterBrown125

Comments (18)

Comments are closed for this entry
Gary Amedee
Mea Culpa. Mea Maxima Cruenta Culpa
04:58 AM on 07/23/2012
I would have taken the free passes and kept my mouth shut.
12:02 PM on 07/20/2012
Fascinating read but I think the author fails to point out this technology only affects a small portion of journalism. Machines will never be able to interview people and they'll never be able to conduct investigations so, really, automation only affects a rote level of journalism. That said, if demand for journalists decreases because a cheaper alternative becomes available then perhaps the supply of journalists will decrease too...
HUFFPOST SUPER USER
LangstonA
Attempting to stand in the gap
10:23 PM on 07/19/2012
According to this article the machine "reads" the score to determine the winner or loser and also "reads" the play by play. That sounds to me as if the machine reads articles already written by humans and then snatches information from those articles. Unless the machine is sitting in the stands "watching" or listening to the announcer as the action is taking place and then formulating its sentences from what it sees and hears, all it is doing is taking human writing and rewriting it.
11:54 AM on 07/20/2012
My take was that the machine is just fed data -- as in, a computer is fed numbers which it then analyzes. You may be thinking about the word "read" too literally.
HUFFPOST SUPER USER
realitytrumpsbull
Two 'alves of coconut!
10:19 PM on 07/19/2012
I don't think the microchip designers have yet fully developed a 100% reliable B.S. generator, though. You set up a machine to start telling the news, and it's liable to be honest, following the who, what, when, where, how, why formula, and if you did enough of that, what would happen to the rest of the industry?
HUFFPOST SUPER USER
fiLthyLiberaLdotcom
Yes, it's a website for liberals.
09:20 PM on 07/19/2012
LOL! This is the end result of what the journalists have been doing for years now. Only a few do credit to their job descriptions; the others are formula pushers. Often they don't even bother to do basic research. A computer can probably do better than most of them, especially those at Fox.
jeffrey678
You don't happen to make it. You make it happen.
07:16 PM on 07/19/2012
Americans do not trust TV news. The level of trust in broadcasts has fallen to an historic low. According to the findings of Gallup Institute, only 21 percent of viewers trust what they see. Kommersant FM personal correspondent in Washington Natalia Suvorova told host Anna Kazakova the details.
The fact is that even last year this number — the number of people who trust broadcast news — was around 27 percent and within just one year it fell 6 percent. That means now only 21 percent of Americans answered that they fully, or at least in a somewhat serious way, trust broadcasts. In principle, this is historically low and for the past 20 years these numbers have fallen by half. In 1993, when the Gallup organization started these surveys, 46 percent already held this opinion.
It’s a little different in America. The fact is that this question has, for example, interesting particularities in part because of different ideologies.

http://watchingamerica.com/News/166798/americans-are-viewing-mass-media-more-and-more-negatively/
06:08 PM on 07/19/2012
Have to agree with you.

As an AI engineer I always thought the easiest one to replace is David Brooks.

And the most fun.

But in reality the nature of news has changed.

The days of sensationalized reporting are over.

And the beginning of analytical fact based journalism has just begun.
05:58 PM on 07/19/2012
As a former software developer, I found this article interesting. I would like to have a peek at their code to see just how they put this together. If it ends up spilling over into the political reporting arena, they would not have to go to the trouble of developing a data driven bias factor, just hard code it to the far left.
HUFFPOST SUPER USER
froggythegremlin
I'll never do it again, I promise.
05:22 PM on 07/19/2012
Computers are great at organizing data, but they never have an original thought. Unfortunately, we get much of the same from journalists today.
HUFFPOST SUPER USER
fiLthyLiberaLdotcom
Yes, it's a website for liberals.
09:21 PM on 07/19/2012
Bingo.
HUFFPOST SUPER USER
chemguy
Liberal, but not Democrat
04:22 PM on 07/19/2012
Journalists can join the club, along with retail salespeople, secretaries, bank tellers and travel agents.
04:04 PM on 07/19/2012
Given the competence of today's so-called journalists, including the arts of grammar and spelling, it is wholly conceivable that a robot could do as well if not better.
JoeyDee2
I know what just passed here
02:35 PM on 07/19/2012
“She had even (an infallible mark of good reputation) been picked out to work in Pornosec, the sub-section of the Fiction Department which turned out cheap pornography for distribution among the proles. It was nicknamed Muck House by the people who worked in it, she remarked. There she had remained for a year, helping to produce booklets in sealed packets…” (Orwell, Nineteen Eighty-Four).

If you can do it with journalism, why not do it with every form of written matter? You don’t have to pay machines or, as I understand, bloggers who write for HP.
HUFFPOST SUPER USER
nanoscare
02:28 PM on 07/19/2012
Journalists should probably have stopped acting like stenographers decades ago.
04:43 PM on 07/19/2012
Are you sure you know what a stenographer is? The article is about analyzing data and creating a narrative out of it -- not transcribing audio to print.
HUFFPOST SUPER USER
nanoscare
05:39 PM on 07/19/2012
If a program is going to write like a specific journalist and all journalists do is transcribe.... what is the outcome?
Auroran Bear
Poking the bear, never a good idea!
01:20 PM on 07/19/2012
Apparently they can also program a political candidate as is evidenced by Romney.