THE BLOG
12/20/2012 03:47 pm ET Updated Feb 19, 2013

Do School 'Reformers' Need to Keep Two or Three Sets of Books?

My old joke about No Child Left Behind (NCLB) always prompted groans. The only way that NCLB could work was if districts kept two sets of books -- one for accountability reports and a private, accurate and meaningful set of data for decision-making purposes. The worst-case scenario, however, would occur when districts convinced themselves that their NCLB numbers weren't meaningless.

Sure enough, systems took advantage of the law's loopholes to exclude test scores of highly mobile and other more challenging students when issuing the reports that would be read by the public. NCLB also encouraged bogus "credit recovery" programs and other tricks to make attendance and student performance data look good. Before long, NCLB produced the "bubble" where states claimed miraculous growth on their tests, even though the more reliable NAEP scores for secondary schools were largely flat.

Veteran educators knew how education's culture of compliance would result in fearful systems "juking the stats." Especially in districts that lacked the capacity to approach their utopian targets, job #1 would be making numbers look good. But "reformers" seemed to be shocked! Their anger at those inevitable statistical tricks contributed to the next wave of blood-in-their-eyes "reforms."

The Obama administration doubled down on the worst of NCLB when targeting teachers and principals for test-driven accountability. Not surprisingly, central offices were less committed to protecting individuals in schools from invalid metrics. For instance, New York demands that teachers be accountable for raising test scores for chronically absent students. Even though it had been obvious that schools should not be scapegoated for mobile and truant students, the even more absurd policy of holding teachers accountable for students whom they don't see in class has become the new normal.

Now, in Florida and elsewhere, accountability reports are being required to do multiple tasks that are incompatible with each other. Oklahoma's troubled effort to copy Jeb Bush's A-F Report Card raises a new question. Do we now need three sets of books in order to present data for accountability, as well as decision-making?

The Tulsa World filed a Freedom of Information Act request and documented the private discord prompted by the new accountability rubric. I will not get into the weeds of the methodological dispute. Suffice it to say that the first conflict was caused by an attempt to combine the information that parents want with the data that some policy-makers (for a reason that I don't understand) think they need for sanctions. None of this bitter conflict would have occurred if the state had merely issued a straightforward Report Card #1, which would be a Consumer Reports-type metric to tell parents how well their schools are doing in terms of student achievement.

The first problem is that such a metric is not valid for evaluating schools and educators. Also, the policy-makers' punitive measures were said to require an estimate of how much achievement was raised by classroom instruction. Then, the state combined those contradictory tasks into one report card.

So, systems have devised experimental growth measures that could be valuable -- if used properly. As long as no stakes were attached to those numbers, a Report Card #2 on school-level growth could also be made public. If estimates of how much schools were increasing test scores were used to punish, however, this potentially invaluable data would be corrupted in the same way that NCLB statistics were.

We will always need a third type of metric -- the private evaluation of each educator's performance. Report #3 must have stakes and it must not be public. And, we should never allow management, alone, to interpret whether the failure of an educator to meet a growth target was the fault of the individual's shortcomings, a statistical model's flaws, or the system's ineffectiveness. In other words, persons who set policies should never be allowed to determine whether it was their decisions or an educator's performance that made it impossible to meet statistical targets.

I'm joking, of course, when I propose that school systems should keep three sets of books. The far better approach would be to return to the traditional method of evaluating schools and educators, and holding them accountable for what they actually do. Educators should be rewarded or punished based on their behavior and on the way they deal with circumstances that are under their control. At the same time, we should use data and experimental data models in a transparent, diagnostic way to enhance decision-making.

But, perhaps there is a simpler method of holding the "reformers" accountable. The World, the New Jersey Star-Ledger and the Chicago Tribune have recently published confidential documents that provide glimpses of the motives of these accountability hawks. All three sets of documents reveal their deep suspicion of the public and of public schools, and provide evidence that the "reform" movement is morphing into an effort to privatize schools.

Rather than getting into that issue, however, I am limiting my speculation to whether we could benefit from a completely different Report #4, which evaluates "reformers'" efforts in implementing their policies. Perhaps we should give up on a data-driven process and merely publish all of their emails! What quantitative report would give us a better picture of the process of implementing Oklahoma's A-F Report Card than the candid qualitative analysis of the state Secretary of Education? The World reported that, "she was 'embarrassed it has gone badly, not just bumps in the road,' particularly because the Oklahoma Business and Education Coalition were backers of the reform and the governor had arranged an earlier visit to Oklahoma by Jeb Bush."

An email acknowledged, "a large part of the problem is in the handling of people at every level. Alienating people is never a good long term strategy." It explained, "the art of consensus building and compromise is either missing, or has been a hollow process ... The public surely must be losing confidence that the reform is soundly implemented, which will make grades less meaningful."

Read beyond the headlines and the email is full of the type of wisdom that is supposed to be derived from accountability reports. It includes a thoughtful discussion of the "cookie cutter" methods being imposed by the federal government and national reform organizations. It discusses capacity problems that exist "even in states like TN with gobs of RttT money." It recounts the role of fear and mixed signals, and why reformers need to understand how "compliance minded enforcers" react to top-down mandates.

Seriously, this veteran educator's emails contain wisdom that was lost on the architects of NCLB and today's test-driven "reformers." Education is a people process. Impose crude data-driven evaluations on people and they will react predictably. As long as policy wonks try to design people-proof metrics, systems will find ways to outlast those efforts. We must stop trying to devise a Swiss Army knife-style report card that combines numbers for a full range of diagnostic purposes with the myriad ways that "reformers" want to punish schools and educators. In fact, if we really want to help kids, we should craft a fair and private system of performance evaluations and build a firewall between that rubric and other statistical systems. We should then concentrate on public data systems to help schools improve.