Who's Afraid of Big Bad Data?

06/03/2015 11:46 am ET | Updated Jun 03, 2016

Perhaps you've heard the saying that a tool is only as useful as the person who wields it. I believe that's true for a lot of things -- including high-speed data analytics.

The term "high-speed data analytics" refers to taking raw data and converting it into actionable knowledge and insights. The most prized data comes from customers and from the products and services customers use, data that businesses have begun to accumulate in unprecedented amounts in order to improve and innovate in ways that better suit customer needs. This accumulated data -- big data, as we've come to know it -- is captured by a rapidly increasing number of connected sensors and machines, which make up what we are calling the Internet of Things (IoT), and is stored in servers.

High-speed data analytics have garnered a lot of digital ink in the last half-decade or so. And that makes perfect sense. This technology has already become a game changer in many industries. Companies are increasingly able to gather and exploit insights into their customer bases at a scale that was unheard of a few years ago.

But all that media praise runs the risk of curdling into hype, giving business leaders unrealistic expectations and often obscuring just how challenging it can be to separate good data from bad data and then actually convert it into successful results.

As big data continues to build up and high-speed data analytics evolve proportionally, the major challenge will be refining all this data. It has become increasingly difficult to separate the "good," useful, pertinent data from the "bad," unimportant, unusable data -- the signal from the noise, as it were.

High-speed data analytics have to meet this challenge not only by separating the good data from the bad, but also by accounting for how the bad data might impact your overall conclusions. Even if your data is 90 percent good and 10 percent bad, your actual results might look drastically different from your desired ones.
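To make the 90/10 point concrete, here is a minimal sketch in Python. The numbers are invented purely for illustration: 90 hypothetical customers who each spent $50, plus 10 corrupted records logged as $5,000. The names `good`, `bad` and `summarize` are assumptions of this example, not part of any particular analytics product.

```python
from statistics import mean, median

# Hypothetical data, invented for illustration only.
good = [50.0] * 90    # 90 "good" records: each customer spent $50
bad = [5000.0] * 10   # 10 "bad" records: e.g. amounts logged in the wrong unit

def summarize(values):
    """Return the (mean, median) of a list of numbers."""
    return mean(values), median(values)

clean_mean, clean_median = summarize(good)
mixed_mean, mixed_median = summarize(good + bad)

print(f"clean data: mean={clean_mean}, median={clean_median}")
print(f"mixed data: mean={mixed_mean}, median={mixed_median}")
```

In this made-up case, the average spend jumps from $50 to $545 even though only one record in ten is bad, while the median stays at $50. Which summary you rely on is one small example of accounting for how bad data might affect your conclusions.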

Perfecting this type of data analysis has become a science -- but it's hardly perfect. Whenever you use high-speed data analytics to draw insights from big data, no matter how advanced the data analysis is, your conclusions might still be off by sole virtue of the fact that you're dealing with such huge, unwieldy amounts of both structured and unstructured data.

When you record a consumer's every click, mouse movement, purchase, email opening and search query, you're going to be able to build a fairly reliable portrait of that customer and serve them that much better. But if your high-speed data analytics technology is not capable of separating the good data from the bad before developing refined results -- or if you misread the data or fail to understand how the bad data you've collected has affected the good data -- you might end up with an incorrect assessment of your customer and, in the end, damage your bottom line.

Okay. So what if you show someone the wrong targeted ad? Retail isn't life or death.

But this becomes a much bigger problem when big data mismanagement rears its ugly head in fields such as the health care and insurance sectors. Insurance providers have been using high-speed data analytics to gather broader and deeper insights into every aspect of policyholders' lives, behaviors and health in order to assess risk and price premiums with near-atomic precision. But as data builds up and the good becomes mixed with the bad, the insurance and health-care fields will feel the impact accordingly.

In the past, if you received a health record from a doctor visit that was inconsistent with other records or simply incorrect, it was a relatively easy matter to get the record revised and resubmitted to your insurer.

Today, the massive size of big data can make the recording and alteration of health information highly error-prone. I visited my doctor just the other week for a routine checkup, during which I happened to glance over at his computer. Something read wrong: My records indicated a health condition I didn't actually have. I noticed and was able to have him correct the mistake -- but what if I hadn't? How might an erroneous diagnosis have affected my insurance policy?

I got lucky in noticing this mistake in my medical records. How many people won't be so fortunate? How can we be sure of our health care when a simple error might go unnoticed and affect us in huge ways, and how can we solve these problems when we're so far removed from the technology that collects and analyzes our data?

Part of the problem is that the tool is only as good as the one who uses it; the other part is that the tool itself will invariably need to keep evolving.

It's a hard trend that big data will continue to see massive, exponential increases. The big data of today will look like small data within the next couple of years.

On the first day of its release, a million Apple Watches were sold -- these watches are particularly adept at collecting health information, like your heart rate, how many steps you have taken, and even your blood oxygen level, contributing countless terabytes to the ever-growing nexus of big data.

The simple future fact here is that big data will keep getting bigger, the amount of faulty or "bad" data collected will invariably grow as well, and sorting the good data from the bad and refining the results will become increasingly difficult and important.

By remaining aware of how errors in the collection and analysis of information can occur and building in safeguards to make sure you have only good data collected, you can get -- and stay -- in front of the looming big data wave.

©2015 Burrus Research, Inc. All Rights Reserved.