08/05/2013 10:48 am ET Updated Oct 05, 2013

Are Some Liberals the New Luddites on Big Data Science and the NSA?

When you find yourself as a progressive liberal on the same side as Rand Paul and a whole lot of Tea Party activists and a large part of the Republican House caucus, the words "think again" might come to mind.

This is especially true when progressives have made an effective, major issue of the right wing's obdurate denial of climate science, with its leader Rush Limbaugh claiming repeatedly that climate change has now been proven to be a "hoax" perpetrated by liberals and their media friends in order to justify more intrusive government controls of personal lives and decisions. Progressives call these climate science deniers Luddites with good reason. But they should be careful that the term does not boomerang against them as they take on the NSA's metadata telephone record collection program.

First, it's not clear that NSA program critics know what the term "metadata" means. Terms like "spying on citizens" and "monitoring our phone calls" and "seizing our personal data" are thrown around with utterly no basis in reality. Nobody's phone calls are being tapped or listened to in any way under the NSA metadata collection program. Period. Nobody's calls are being "monitored" in the common-sense meaning of that word. Anybody who finished grade school knows what a "hall monitor" does -- they watch and listen. That's not what the NSA does when it just collects the computerized records of all U.S. phone calls from the telephone service providers.

And, by the way, such records are not "your" data in any event. They belong to the telephone company that is going to bill you for its services. It's your phone, but it's their record because the call is made using their service. They keep it to complete the deal you make with them to pay for the service.

And the NSA program really doesn't care, when it comes to collecting these gazillions of records, who "you" are anyway. Yes, the agency can look that up if a call pattern makes it suspicious, but until then, "your" records are just a bunch of numbers. At bottom, the agency isn't spying on "you" because "you" are irrelevant at the collection phase.

The program's analytical phase is where the real data analytic science story takes over, and where the critics of the program seem to veer off into Luddite territory. This starts with the well-known right-wing "don't bother me with the facts" attitude: a refusal even to learn what metadata collection actually involves.

A good start on such education would be to read the recent book Big Data -- an Amazon best seller -- which outlines the new science of data analytics. This new field is one of the fastest-growing professions in the American economy. Big data science turns traditional scientific and statistical analysis on its head in ways that have already materially and irreversibly changed the fields of marketing, medicine, advertising and even Amazon's business model.

The core of the traditional scientific method involved hypotheses tested against well-designed samples or "focus groups." Trial and error and retrial prevailed because that was the best we could do. In terms of intelligence gathering, we spied on people based on suspicions that prompted us to check them out. We looked for the needle in the haystack as best we could -- essentially, on our hands and knees.

But now digital science has progressed to the point where we can actually capture the "whole haystack" in its most minute parts. When we do that, the needle just sticks out like a sore thumb. We can see the whole chess board and all the possible moves at once. That's why IBM's Deep Blue could match the world's best grandmasters. As "Big Data" points out, for a variety of socially useful (or just plain commercially useful) purposes, we can now sometimes bypass the samples and focus groups and look at the whole field of inquiry and single out the behavior patterns that interest us. Walmart, for example, could look at all customer purchases in hurricane watch areas and learn that, in addition to water and batteries, one of the most popular pre-storm buys turned out to be Pop-Tarts. So it put the Pop-Tarts next to the water and batteries and sold even more. Or the CDC could look at the behaviors that correlated with regional outbreaks of flu and thus predict with more accuracy where flu outbreaks would emerge.

And the key word here is correlation -- not causality. We can guess but not scientifically know why hurricane forecasts bring out our taste for Pop-Tarts, but that doesn't matter. It just correlates, and for marketing purposes, that's enough. Same with the flu. Same with, as it happens, phone calls involving terrorists (not you or me).

Critics of the NSA program all say, "Why doesn't the Agency just start with the phones of those it already suspects to be terrorists?" Of course they also say, whenever terror strikes, "Why didn't we have that guy on our list?" But the tough part is getting the guy we don't already know about, isn't it? If we stay with the old method of actual telephone spying, we can hit on the phone calls of known suspects. But what if the terrorist is not in our phone system -- say, in Saudi Arabia? What good is our warrant there?

But if we can now look at the record of all U.S. phone calls, we can see all the calls to and from that Saudi Arabian number, and also see which of the numbers that called that number also called each other, etc. That's what metadata is; that's the tool it gives us to replace actual spying and monitoring until we see reason, in suspicious patterns in the data, to go get a warrant and proceed to real "monitoring," wiretapping, listening in, etc. We can find the needle in the haystack, and even "connect the needles." It's much better than spying, and less intrusive, not more, on "you" and "me."
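For readers who want to see what "connecting the needles" might look like in practice, here is a minimal sketch in Python. It is an illustration only, not a description of how the NSA program actually works: the call records, the "SEED" label for the number of interest, and the function names are all hypothetical. It assumes each record is nothing more than a (caller, callee) pair, with no names and no call content.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical call records: (caller, callee) pairs only.
# No names, no audio, no content. "SEED" stands in for the number of interest.
CALLS = [
    ("SEED", "A"),
    ("B", "SEED"),
    ("A", "B"),
    ("C", "SEED"),
    ("C", "D"),
    ("E", "F"),
]

def build_graph(calls):
    """Map each number to the set of numbers it exchanged calls with."""
    graph = defaultdict(set)
    for caller, callee in calls:
        graph[caller].add(callee)
        graph[callee].add(caller)
    return graph

def contacts_of(graph, number):
    """All numbers that called, or were called by, the given number."""
    return set(graph.get(number, set()))

def linked_contacts(graph, number):
    """Pairs of the number's contacts that also called each other."""
    contacts = sorted(contacts_of(graph, number))
    return [(a, b) for a, b in combinations(contacts, 2) if b in graph[a]]

graph = build_graph(CALLS)
seed_contacts = contacts_of(graph, "SEED")
seed_links = linked_contacts(graph, "SEED")

print("Contacts of SEED:", sorted(seed_contacts))
print("Contacts who also called each other:", seed_links)
```

In this toy data, numbers E and F never touch the number of interest, so they never show up in the result. The pattern is found from the numbers alone, which is the point of the argument above.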

Or at least no more so than Walmart, your favorite credit card company, Google, Facebook, and probably your favorite bank already are with "our" metadata. (Not to mention the phone company!) The NSA program need not have been kept secret. In today's world where information about money is more valuable than money, it's no "meta deal" if properly understood. And it should not be subject to constraints that have no basis in fact and that deny the benefits of new data analytic science -- not if you want to argue climate change with a straight face.