Big Data* is creating lots of buzz these days -- especially in the humanitarian sector. In the last few years, governments, businesses, humanitarian organizations and citizens have been using Big Data to accomplish feats ranging from analyzing Google search queries to predict flu outbreaks, to helping the U.S. government better understand the needs of people impacted by natural disasters, like Hurricane Sandy.
In this regard, I was fortunate to participate in a recent discussion at George Washington University in D.C. hosted by the World Economic Forum's Global Agenda Councils on Catastrophic Risks, Future of the Internet and Data Driven Development, and the United Nations' Office for the Coordination of Humanitarian Affairs (UN OCHA), which together examined: "The Role of Big Data In Catastrophic Resilience."
"Big Data" by Pawel Dewulit
Participants at this event explored the use of data available today and how it could help decision makers prevent, prepare for and recover from catastrophic events.
Three key trends emerged from these discussions:
1) The Relevance of Data
In his opening talk of the day, former Secretary of Homeland Security, The Honorable Tom Ridge, asked: "Are we to be data rich and knowledge poor?" In a world where zettabytes** of information are being produced from our cell phones, credit cards, computers, homes -- and even the sensor-equipped cars, trains, and buildings that make up our cities -- the problem is not the quantity of data, but its relevance.
Dr. Erik Wetter, co-founder of the public health not-for-profit organization Flowminder, explored this notion further in his talk. He described what he calls the "magic potion" fallacy: the belief that 'big' data is necessarily 'good' data that will answer all of the world's problems. For Big Data to be relevant, we need to be able to aggregate and harmonize data in a way that makes sense. Without the proper context, it's easy to draw false correlations about what we think is happening in any given situation.
2) Structuring Previously Unstructured Data
With all the information out there, we face the unavoidable challenge of finding the information that improves situational awareness and supports effective decisions, such as which gas stations are open and have fuel. In his talk, Brian Forde, Senior Advisor to the U.S. Chief Technology Officer at the White House, spoke of the need to harness social media and open government data to give policy makers, survivors and first responders actionable information. Even more fundamental is the need to structure unstructured data.
Forde believes one of the best ways to do this is to publish standardized hashtags through major media outlets like The Weather Channel. Imagine that, during an emergency, you could collect social media posts from people who need help simply by curating the posts that carry those hashtags. For example, earlier this year the White House, the Federal Emergency Management Agency, and the Department of Energy, in coordination with The Weather Channel, launched standardized hashtags (#PowerLineDown, #NoFuel and #GotFuel) that let citizens report important emergency information, such as downed power lines or whether a gas station has fuel, across social media platforms during disasters.
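As a rough illustration of how standardized hashtags can turn free-form posts into structured reports, here is a minimal Python sketch. The three hashtags come from the launch described above; the category labels, the function name `structure_posts` and the sample posts are assumptions invented for illustration, not part of any official tool.

```python
import re
from collections import defaultdict

# Hashtags named in the article. The category labels on the right are
# illustrative assumptions, not an official schema.
STANDARD_HASHTAGS = {
    "#powerlinedown": "power_line_down",
    "#nofuel": "no_fuel",
    "#gotfuel": "got_fuel",
}

HASHTAG_PATTERN = re.compile(r"#\w+")


def structure_posts(posts):
    """Group free-text posts by the standardized hashtags they contain."""
    reports = defaultdict(list)
    for post in posts:
        for tag in HASHTAG_PATTERN.findall(post):
            # Match case-insensitively, since people type hashtags inconsistently.
            category = STANDARD_HASHTAGS.get(tag.lower())
            # Skip unknown hashtags and avoid listing a post twice per category.
            if category is not None and post not in reports[category]:
                reports[category].append(post)
    return dict(reports)


# Hypothetical posts, made up for this example.
sample_posts = [
    "Station on Main St is open #GotFuel",
    "Wires across Elm Ave #PowerLineDown",
    "Nice weather today #sunny",
    "Pumps are dry #NoFuel #nofuel",
]

for category, matched in structure_posts(sample_posts).items():
    print(category, len(matched))
```

A real curation pipeline would also need location, timestamps and some way to verify reports; this sketch only shows the structuring step.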
3) Turning Knowledge into Action
While there are many new ways to collect and decipher Big Data, one of the key challenges will be to turn that knowledge into action. According to Mark Dalton, the officer-in-charge of the Information Services Branch at UN OCHA, we need an iterative, agile way to coordinate the testing and piloting of the applications that we intend to use before crisis situations.
This notion was further reinforced by Scott Aronson, the Senior Director of National Security Policy at the Edison Electric Institute, who argued that the real issue is not information sharing in crisis situations, but rather the technical, legal and cultural gaps in how we actually coordinate using that information. To be more effective, industry and government need to knit together their tools and technologies, and coordinate closely to get the lights back on in crisis situations. This is not just a matter of practicality; it is their joint responsibility.
Ultimately, Big Data is not a "magic potion" that will solve the planet's problems. Nor is it a standalone solution to support decision-making. Rather, Big Data will only be as meaningful as we make it. This requires sustained effort from the public and private sectors to structure data and make reasonable sense of it for use by decision-makers. Only then can Big Data lead to concerted action in crisis situations to help prevent catastrophes, save lives, rebuild cities and improve the state of the world.
*Big Data is defined here as bridging traditional quantitative data sets -- like census data collected by our governments -- with previously unquantifiable, qualitative information -- like social media updates -- produced by masses of people interacting with one another across various technologically enabled tools.
**A petabyte is the equivalent of 1,000 terabytes, or a quadrillion bytes. One terabyte is a thousand gigabytes. One gigabyte is made up of a thousand megabytes. There are a thousand thousand -- i.e., a million -- petabytes in a zettabyte. Source: "Why Big Data is a Big Deal" in Harvard Magazine by Jonathan Shaw (March-April 2014).
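The unit arithmetic in this footnote can be checked with a few lines of Python. This is a small sketch that assumes the decimal (powers of 1,000) definitions the footnote uses; the helper name `units_in` is made up for illustration.

```python
# Decimal byte units, following the footnote's definitions.
BYTES_PER_UNIT = {
    "megabyte": 10**6,
    "gigabyte": 10**9,
    "terabyte": 10**12,
    "petabyte": 10**15,   # a quadrillion bytes
    "zettabyte": 10**21,
}


def units_in(larger, smaller):
    """Return how many of the smaller unit fit in one of the larger unit."""
    return BYTES_PER_UNIT[larger] // BYTES_PER_UNIT[smaller]


print(units_in("petabyte", "terabyte"))   # 1000
print(units_in("zettabyte", "petabyte"))  # 1000000
```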