THE BLOG
01/02/2014 01:40 pm ET Updated Mar 04, 2014

Some High-Tech Firms Say Writers Gain From Seeing Readers' E-Book Files. They're Wrong.

The headline of David Streitfeld's article in the New York Times on Dec. 24, 2013, nicely summarizes the latest high-tech scheme to hit the book business: "As New Services Track Habits, the E-Books Are Reading You."

After reading what Streitfeld and others have to say about the business model of companies like Scribd, Oyster, and Entitle, it seems clear that these companies are making different pitches to different audiences. The pitch to readers is that they will be able to read as many e-books as they want for a flat monthly fee, while the pitch to authors is that they'll be able to see detailed analytics about those customers' reading habits. Both of these pitches have big problems.

What do readers get?

These e-book subscription services offer readers access to whatever books are listed on that particular service at any given moment. This is far less than all the books in print. There are several major publishers missing from each of these services, and even the publishers that are participating may hold back many important books. Customers may not be able to read what they want, because many of the books they've heard about will not be available. This is not made clear on the websites of these services. The reader may be paying for an all-you-can-eat buffet, but the selection on the table is pretty limited.

Nor are prospective readers told that their reading habits will be extensively data-mined. This sharing of "reader-analytics" is the inducement for publishers to participate in the first place. To be sure, reading an e-book always involves some loss of privacy -- any e-book seller can track the details of your reading habits, if it so chooses. But data-mining appears to be the main point behind this particular business model. These companies are apparently offering this data as a trade-off to publishers. Sign up your books, and we'll give you data that shows how our readers react to them. What kind of data? The Times article suggests this: "Did they skip or skim? Slow down or speed up when the end was in sight? Linger over the sex scenes?" Goodereader.com chimes in with this possibility: "When reading a steamy erotica, are you lingering over the sex scenes? Do readers ever finish the books they start, or skip right to the end?"

But how about the writers? Do they get anything of value? Not really -- they're more likely to end up with problems.

Start with the money. Authors' royalties are set by contract, but the payment per book is often drastically reduced when the publisher makes this type of arrangement with a subscription service. What can the author expect? Not much. If, for example, the subscription service has 10,000 customers who each read 10 titles a month out of a database of 100,000 titles, then the average number of readers per month for each book is -- one. If that reader is spending less than $10 per month to read those ten titles, then the amount of that fee that is allocated to each book is probably less than one dollar. When you divide that one dollar between the publisher, the subscription service, and the author, the amount the author gets is likely to be in the mid two-figures -- and both of them are on the right-hand side of the decimal point.
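For readers who want to check that arithmetic, here is a rough sketch in Python. The customer, title, and catalog counts come from the example above; the $10 fee is the upper bound from that example, and the one-third author share is purely an assumption for illustration (an even three-way split), not a figure from any actual contract. The function name and parameters are hypothetical.

```python
def per_book_economics(customers, titles_per_customer, catalog_size,
                       monthly_fee, author_share):
    """Return (readers per book per month, fee per read, author's cut per read)."""
    # Total reads across the whole service in one month.
    total_reads = customers * titles_per_customer
    # Spread those reads evenly over every title in the catalog.
    readers_per_book = total_reads / catalog_size
    # Each customer's fee is divided among the titles that customer read.
    fee_per_read = monthly_fee / titles_per_customer
    # The author's slice of that per-read fee (author_share is an assumption).
    author_cut = fee_per_read * author_share
    return readers_per_book, fee_per_read, author_cut


# Figures from the example above; fee capped at $10, author share assumed at 1/3.
readers, fee, cut = per_book_economics(
    customers=10_000,
    titles_per_customer=10,
    catalog_size=100_000,
    monthly_fee=10.00,
    author_share=1 / 3,
)
print(f"readers per book: {readers:.0f}")
print(f"fee per read: ${fee:.2f}")
print(f"author's cut per read: ${cut:.2f}")
```

Under those assumptions each book averages one reader a month, and the author's cut of that single read comes to a fraction of a dollar. Change the assumed share or fee and the cents move around, but not by much.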

Well, maybe authors aren't in this deal for the revenue. They're really looking for information. But will they get it? Maybe yes; maybe no. The subscription service's contract is normally with the publisher. Whether the publisher decides to share that information with the author is a matter of the publisher-author contract. But assuming the publisher shares the information, what do you, the author, get?

What you get is a statistician's nightmare. There's simply no way that information is representative of anything meaningful.

First, the information comes solely from e-book readers. Unless your book is published only as an e-book, your sample excludes close to 75 percent of all readers. This would be okay, if e-book readers were typical of book readers in general. But they're not. According to the Book Industry Study Group, the demographics of the e-book market are changing rapidly. And not only do e-book readers represent a different demographic, but a recent article in Scientific American argues that the physical experience of reading a print book is quite different from that of reading an e-book.

Second, when you're getting your data from an e-book subscription service, you're not even getting a representative sampling of e-book readers. The bulk of e-books are bought by readers who buy from Kobo, Nook, Zola, Kindle, and other e-book sellers. The sample you would be getting in this scenario includes only those who decided to join a particular subscription service. And that sample is further self-limited because it involves a group of readers that is willing to select books only from those books that the subscription service has to offer.

But even if the process is statistically flawed, wouldn't at least some of the data be useful? To test that, let's double the number of average readers for any given book -- from one to two -- and see what their reading habits prove. Reader #1 takes two weeks to finish a book that typical readers in that genre finish in five days. Does that prove that your book is slow-moving? Maybe, but it could just as easily show that the reader had other demands on her time and couldn't spend much time reading books. But let's say reader #2 did something more specific: he went back and re-read that "steamy sex scene" a second time. Now that is hard data that proves something about the book -- or does it? Did he go back and read it because it was so compelling he couldn't let it go? Or did he go back and read it because it was so confusing he didn't understand it the first time? Or, maybe, he fell asleep while he was reading and couldn't remember where he left off. The so-called reader-analytics in this instance either prove (a) the author is brilliant at writing sex scenes, (b) she's terrible at writing such scenes, or (c) the reader needs to get to bed earlier.

Good writers know that this is nonsense. They don't need to monitor a reader's viewing habits to tell them what to write. They develop their own vision and their own style, and they know that the most important thing they have to offer is their authenticity.

It's tempting to dismiss these so-called reader-analytics as simply a GIGO situation -- garbage in; garbage out. But it may be more than that. Even if you -- the author -- are sensible and sensitive enough to brush this stuff off as silly, this type of data can get in your publisher's file and take on a life of its own. You run the risk that it will brand you, type-cast your writing ability, and limit your future career. Getting false or misleading information out of someone else's files can often be painful and difficult. Would you subject yourself to a medical test that produced 50 percent or more false-positives?

After the Times story ran, the Times' readers responded in eloquent fashion. I'll give them the last word.

Here's writer Mark Slouka:

Art is a supremely individual expression. It doesn't ask permission; it doesn't take an exit poll and adjust accordingly. Artists say what they know, paint what they see -- they have no choice in the matter -- and it's our privilege to be brought into their world, so distinct from our own, and to be altered by that experience. Once artists start asking how many "likes" they've garnered, or listening to customer-satisfaction surveys to increase their sales, they're no longer making art; they're moving product.

And here's editor Bruce Joshua Miller:

It doesn't surprise me that many companies are eager to profit from easily acquired electronic data about reading habits or anything else, but what does surprise me is that writers would openly embrace the idea of tracking their readers as they read so that the writers might modify their writing in hopes of increasing sales. Writers such as these could be described as "tech savvy," or known by an adjective that predates the digital age: hacks.