Rigorous Impact Evaluation Is Not a Luxury: Scrutinizing the Millennium Villages

10/15/2010 04:27 pm ET | Updated May 25, 2011

Back in 2004, a major new development project started in Bar-Sauri, Kenya. This Millennium Village Project (MVP) seeks to break individual village clusters free from poverty with an intense, combined aid package for agriculture, education, health, and infrastructure. The United Nations and Columbia University began the pilot phase in Bar-Sauri and have extended it to numerous village clusters in nine other countries. They hope to scale up the approach across much of Africa.

But wait: Before we consider blanketing a continent with any aid intervention, we have to know whether it works. We have to know whether outcomes in Bar-Sauri have differed from those in nearby Uranga, which the project did not touch. And we have to know whether any differences will last. This matters because aid money is scarce, and the tens of millions slated for the MVP are tens of millions that won't be spent on other efforts.

I ask and seek to answer these questions in a new research paper that I wrote with Gabriel Demombynes of the World Bank. In the paper (and a detailed summary here), we show how easy it can be to get the wrong idea about the MVP's impacts when careful, scientific impact evaluation methods are not used. And we detail how the impact evaluation could be done better, at low cost.

In the paper, we compare trends inside the Millennium Villages to trends outside them. In June 2010, the MVP issued its midterm evaluation, which a leader of the MVP called a "major scientific report" here on HuffPost. That report shows positive trends within the Millennium Villages on development indicators such as access to sanitation, water, and cell phones. The project claims responsibility for those changes by calling them "impacts" of the project. But those trends alone don't reveal the project's impact, because some or all of those changes might have happened even if the project hadn't.

For example, anyone who has recently spent time in Africa knows that the continent is undergoing a mobile phone revolution. For two of the three intervention sites we study, cell phone ownership has been growing just as fast in the large regions around the Millennium Villages as it has been growing inside the Millennium Villages. Just stating that increases in mobile phone ownership constitute an "impact" of the project, as the mid-term evaluation does, cannot be a careful measurement of the project's impact. The same is true for several other development outcomes reported by the MVP.
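The arithmetic behind this argument can be made concrete with a small sketch. The numbers below are entirely made up for illustration; the idea is simply that the project's impact is not the change inside a village, but that change minus the change that happened anyway in the surrounding region (a basic difference-in-differences comparison).

```python
# Difference-in-differences sketch with illustrative, made-up numbers.
# The "naive" figure is the change inside the village; the adjusted
# figure subtracts the change in the comparison region over the same period.

def diff_in_diff(village_before, village_after, region_before, region_after):
    """Change in the village minus the change in the comparison region."""
    return (village_after - village_before) - (region_after - region_before)

# Hypothetical cell-phone ownership rates (fractions of households):
naive_impact = 0.50 - 0.10                            # village change alone
adjusted_impact = diff_in_diff(0.10, 0.50, 0.12, 0.48)

print(round(naive_impact, 2))     # the report's "impact": 0.40
print(round(adjusted_impact, 2))  # after netting out the region-wide trend: 0.04
```

With these invented numbers, nearly all of the apparent 40-point gain would have happened without the project, which is exactly the pattern the paper documents for mobile phones at two of the three sites.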

Our analysis shows that estimates of the project's impact depend heavily on how the evaluation is done. In our paper, we detail exactly how a proper impact evaluation could be conducted, at a per-village cost not much higher than that of the current evaluation, and in a way that remedies these weaknesses. The proposal is to randomly assign treatment within about 20 matched pairs of village clusters, with both members of each pair monitored over 15 years.
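The mechanics of such a design are simple, as this sketch shows. The cluster names and pairings here are hypothetical; the point is that within each matched pair, a coin flip decides which cluster receives the intervention and which serves as its comparison.

```python
# Sketch of matched-pair random assignment, in the spirit of the proposed
# design (~20 matched pairs followed over 15 years). Cluster names and
# pairings are hypothetical.
import random

def assign_matched_pairs(pairs, seed=None):
    """For each matched pair of village clusters, randomly pick one for
    treatment; its partner becomes the comparison cluster."""
    rng = random.Random(seed)
    assignment = {}
    for a, b in pairs:
        treated, control = (a, b) if rng.random() < 0.5 else (b, a)
        assignment[treated] = "treatment"
        assignment[control] = "comparison"
    return assignment

pairs = [("Cluster A1", "Cluster A2"), ("Cluster B1", "Cluster B2")]
print(assign_matched_pairs(pairs, seed=42))
```

Because clusters within a pair are matched on observable characteristics before the coin flip, any later divergence between them can credibly be attributed to the intervention rather than to pre-existing differences or regional trends.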

The paper has two ultimate purposes. One is to highlight the general need for rigorous impact evaluation when it is feasible. The second is to argue for such an evaluation of the MVP going forward. (Although it's too late for the current crop of MVP sites, there is no obstacle to undertaking a rigorous evaluation for the next 20 sites.) The paper deliberately draws no conclusion about the wisdom or effectiveness of the intervention itself, except for the modest claim that the intervention shouldn't be massively scaled up until its effects have been reliably estimated. To me that's uncontroversial; Africans have urgent needs, but they have urgent needs for things that work, and many of them have been disappointed by well-intended outsiders in the past.