Surviving the Data Deluge: a Call to Action for the 21st Century

03/18/2011 02:55 pm ET | Updated May 25, 2011

In August of last year, an IT-centric research blog called Wikibon released a stunning infographic representing the total amount of data stored online in 2010. The astronomical figure cited -- 1.2 zettabytes, or 1.2 billion terabytes -- could have filled a stack of 16GB iPads (75 billion, to be precise) that would nearly reach the peak of Mount Everest. Other estimates have put the annual quantity of total information transacted (i.e., not necessarily stored) closer to 3 or 4 zettabytes. And those figures are growing by leaps and bounds each day.
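For readers who want to check the iPad arithmetic themselves, here is a minimal sketch in Python. It assumes decimal (SI) units, in which a gigabyte is 10^9 bytes and a zettabyte is 10^21 bytes, and it assumes each iPad is filled to its full nominal 16GB. The helper name devices_needed is made up for this illustration and does not come from Wikibon.

```python
# Decimal (SI) byte units; this is an assumption, since the infographic
# does not say whether it used decimal or binary units.
GB = 10**9
TB = 10**12
ZB = 10**21


def devices_needed(total_bytes: int, device_bytes: int) -> int:
    """Return how many devices of a given capacity hold total_bytes, rounded up."""
    return -(-total_bytes // device_bytes)  # ceiling division on integers


# 1.2 zettabytes, written as 12/10 of a zettabyte to keep the math exact.
total_bytes = 12 * ZB // 10

ipads = devices_needed(total_bytes, 16 * GB)
terabytes = total_bytes // TB

print(f"{ipads:,} iPads")          # 75,000,000,000 iPads
print(f"{terabytes:,} terabytes")  # 1,200,000,000 terabytes
```

Under those assumptions the result matches the 75 billion iPads cited in the infographic, and 1.2 zettabytes works out to about 1.2 billion terabytes.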

In the pre-digital era, storing information was costly and, not surprisingly, usually reserved for only what was perceived as the most critical data (e.g. astronomical observations, census calculations, commodity prices). The entire notion of publicly storing images, thoughts and records of the trivial moments in our lives (the raison d'être of Facebook, Twitter, Flickr, and the like) would have been incomprehensible just a few generations ago.

Thanks to both the ever-plummeting cost of storing data and the growing number of platforms for gathering data, we are facing a new, challenging conundrum: making sense of this exponentially growing data deluge.

So how do we better prepare ourselves for a future governed by data?

Make data available & accessible

One of the most fascinating evolutionary developments of our brains is the ability to deal with numbers -- not only in a physical, concrete manner (e.g. comparing the quantity of apples in two baskets) but also in a theoretical, conceptual manner (e.g. recognizing that 3 + 3 = 6). And yet, despite the indoctrination of mathematics throughout primary and secondary education, we are by and large afraid of data.

Take tax returns, for example. Most of us get a general sense of whether or not a particular year was a good year, financially speaking, by looking at our income and our expenses. But start throwing in deductions, write-offs, penalties, tax brackets, and filing statuses, and it's time to turn it all over to a CPA.

One solution for better understanding data is to convert it into forms that our minds can better comprehend. The Wikibon graphic referenced above is a perfect example of such a feat. Whereas figures like terabytes and zettabytes are essentially meaningless in isolation, the comparison of a stack of iPads to a mountain is immediately visceral.

Thanks to open-license and open-source platforms, a plethora of innovative visualization tools are finally becoming accessible to individuals who aren't career statisticians or math geniuses.

Non-profit companies like Ushahidi are empowering individuals with frameworks to not only collect data but also organize, analyze and create stunning visualizations. Within just a few hours of the catastrophic Japan earthquake, volunteers working at Tufts University launched a program based on the Ushahidi platform that enabled individuals on the ground in Japan to post information about available supplies, danger zones, and the locations of trapped individuals in need of assistance.

Even Google has jumped into the field with products such as its Public Data Explorer, a small but growing collection of publicly available data sets that can be customized and presented in a variety of graphical forms. Public Data Explorer also enables users to transform any uploaded data set into awesome visualizations with the help of a bit of XML code.

In the bygone era of punch cards and mainframes, data remained abstruse and distant. Thanks to the growing number of readily available visualization tools, data is finally becoming understandable and even -- gasp! -- fun.

But tools themselves are not enough. We need to instill a cultural appreciation for information.

Teaching the science -- and art -- of data

From a conceptual standpoint, data is like a language; the earlier you study it, the more likely you will achieve fluency.

To enable future generations to handle the surging importance of data in all fields, we must begin to shape academic curricula today. Innovative courses that merge statistics with the broader role of data will help shed the stigma of boredom and uselessness that often plagues the study of statistics.

We must also break down the walls that surround data analysis by emphasizing the growing roles that non-mathematicians play in making data useful: graphic designers, history and civics majors, software engineers, and medical practitioners are just a few of the many individuals whose careers will be closely intertwined with data in the years to come.

By encouraging both the development of accessible data analysis tools and data-centric education, we can lay the proper foundation for the societies of the future.