The world of (official) data and statistics: not yet dead!
Promising news coming from the first UN World Data Forum
Recently, it has become popular to question how relevant "statistics" are for making sense of today's evolving world. A recent article in the Guardian started off with the claim that "rather than diffusing controversy and polarisation, it seems as if statistics are actually stoking them". The credibility of "official statistics" on GDP, poverty or migration flows is questioned because they do not always seem to be in line with what citizens experience in their daily lives, nor do they seem able to capture how a large part of the population "feels" about societal developments. Policy makers have started to complain that official numbers come in too late and are not granular enough, making it difficult to use them for decision making. So, a good question to ask is: why should we actually care about official data and statistics in a world that, thanks to the data revolution, seems awash with numbers from multiple sources going far beyond traditional administrative, census and household survey data, when official statistics seem unable to adapt or relate to the new realities of fast-evolving societies around the world?
It was hence very timely that the community of official statisticians from all over the world "stepped up, forward and on the gas," as John Pullinger, the UK National Statistician, put it, by organising last week's first ever UN World Data Forum. With more than 1,400 participants from more than 100 countries and with very different backgrounds - official statisticians, data journalists, academics, private sector and civil society representatives - this event was a truly global and inclusive exercise that allowed for a wide exchange of ideas under the auspices of the UN. The forum covered a variety of data-related issues, from technical ones such as interoperability when using data coming from multiple sources to more fun aspects of data such as refereeing, data poetry, cartoons and very engaging data visualisation techniques.
The first World Data Forum was clearly dominated by one main topic: the implementation of the Sustainable Development Agenda and how data and statistics can be produced for the different goals and targets. A key issue that emerged was the idea of strengthening the capacities of data producers, users and donors, together with a complete re-thinking of how capacity development should be done in this emerging data world. The call was made to complement the "Industry 4.0" agenda with a "Capacity 4.0" agenda that shifts the focus from mainly supply-side considerations (how to produce data) towards demand-side aspects (the use and impact of data). Capacity 4.0 should push statisticians and data scientists to become storytellers who can better relate their empirical evidence to the reality of citizens' feelings and emotions, and to move away from theoretical statements such as "on average, income has increased by x amount".
The sheer number of sessions at the UN World Data Forum, with topics and debates in various formats, makes it difficult to select key findings but, for me, three things stood out:
First, there is a lot of positive energy and excitement among both (official) data producers and users to engage in a debate about the role of data and statistics in the world. The questions range from how to produce the data needed to monitor the Sustainable Development Goals, through open data, which is still a new subject for the official data community, to the legitimacy of public statistics, which are seen as "elitist and technocratic" and unable to relate to the everyday concerns of citizens. The Forum was a great opportunity to mingle, engage and listen to different points of view, and to learn about the abundance of data-focused initiatives such as the World Cities Alliance, the DataRepublica Project from CEPEI, and the development of open algorithms and application programming interfaces (APIs).
Second, things are messy and complicated, and can require informed choices on trade-offs. As an example, the 2030 development agenda has as its credo "leave no one behind", which means reaching out to the disenfranchised part of the population - for instance the "ultra-poor", people with disabilities or the elderly. To make targeted policy interventions for these groups, data will need to be disaggregated by sex, age, location etc. Producing these data requires a considerable effort in data collection techniques and, at the same time, a guarantee of privacy and confidentiality; indeed, if such data fall into the wrong hands, the door is wide open for misuse. The data deluge and easy access to all sorts of data in our daily lives also require a new skill set to handle and manage this constant data flow - the call for a global program on data literacy points to one possible way of better equipping citizens with key numerical skills and helping them cope with information overflow.
Third, for societies to embrace the data revolution, we need to build trust and partnerships. Managing the identified trade-offs between the data deluge on the one hand and data scarcity on the other - in particular when it comes to lower income countries and fragile states - requires data producers and users to engage in an open dialogue about topics ranging from privacy and confidentiality to the openness, use and impact of data, as well as the funding modalities of a public good such as official statistics. Access to new data sources requires managing incentives and handling risks, which can only be done in a mutually trusting and inclusive environment; in this sense the Forum was a promising example.
What next? While the first Forum has only just ended, preparations for the second one, which will take place in two years' time in Dubai, need to start quickly in order to build upon this positive momentum. Looking forward, we will need to bring more policy makers, in particular those at the highest levels, to the party - a crowd that was largely absent last week in Cape Town. Clearly, as was shown repeatedly at the Forum, statisticians and data crunchers are able to get out of their comfort zone and remain open and ready to engage in trusted dialogue - this open invitation should now be accepted and embraced by those people who can make a real difference, and by everybody else too.