08/03/2012 12:53 pm ET Updated Oct 03, 2012

Bumps in the Road? Or Signs of Derailment?

This blog is co-authored by Noelle Ellerson, Assistant Director for Policy Analysis & Advocacy at the American Association of School Administrators.

Elaine Weiss:

Recent news out of Tennessee about "bumps" in the teacher evaluation component of its Race to the Top education reform efforts is emblematic of larger, systemic concerns about RttT generally. Lessons learned, therefore, should be timely and useful for other grantee states, for districts looking to apply in the next round, and for states granted waivers under NCLB that have promised to implement similar reform measures.

Developing evaluation measures that make logical sense and accurately reflect teacher effectiveness is difficult, particularly when stakes are high. Like other high-stakes systems, Tennessee's provides incentives for teachers to game it so they appear more effective than they might otherwise. For example, in the absence of other measures, the state initially suggested that teachers in subjects that do not employ standardized tests could choose to apply tests taken by students in other subjects and/or other grades. The result was mass selection of classes that did well in a particular subject and would thus inflate a teacher's grade, but had nothing to do with that teacher's instruction. A newer proposal might let high-scoring teachers choose to have their entire evaluation based on the score, which would eliminate observations for no reason while artificially increasing the gap between high- and low-scorers.

These poor decisions regarding system-building may reflect lack of capacity and insufficient time to implement changes appropriately. As the Tennessean reported, "Some educators criticized the system as being unfair, time-consuming and rushed into place, and they unsuccessfully pushed for the first year's results to be considered a trial run."

Perhaps most fundamental, however, is the reality that any time we assume the effectiveness of an unproven "reform," we are bound to witness the kind of cognitive dissonance taking place now in Tennessee when such success fails to materialize. If any state were to implement an effective Value-Added system for evaluating teachers, surely Tennessee would be it. It was one of only two states to win top marks for its application two years ago, and Professor William Sanders is the foremost pioneer of the Value-Added methodology, having developed the original Tennessee Value-Added Assessment System (TVAAS). Yet it has been clear for some time that the state is having real trouble getting accurate, reliable measures of teacher effectiveness out of student test scores.

This should not be surprising; half the state's evaluations are based on those scores, a system that numerous scholars have called inappropriate due to an "imprecise" set of estimates and an "insufficient" research base. Members of the Board on Testing and Assessment of the National Research Council of the National Academy of Sciences caution that "VAM estimates of teacher effectiveness should not be used to make operational decisions because such estimates are far too unstable to be considered fair or reliable." Rather than admit that scores might be off, however, when principal observations failed to find as many teachers lacking as did Value-Added scores, the assumption was that principals don't know how to assess teachers' ability.

It may well be that some principals are giving teachers higher marks than they deserve, and it is also likely that, given the substantial added burden shouldered by principals who may now have to observe dozens of teachers, their capacity to do so comprehensively is compromised. Without knowing whether either of those is the case, however, we should not discount an altogether different explanation, posed by Carol Shmook of the Tennessee Teachers Association: "It just could be that all the processes we have in place of preparing teachers may be working." The state Department of Education may believe that Value-Added scores and principal evaluations should be better aligned, but the research base does not back that up, nor does it suggest that the principals are necessarily wrong.

Noelle Ellerson:

In her opening paragraph, Elaine described how the "bumps" of RttT implementation come with lessons that states and districts can learn from as they apply for and receive ESEA waivers and additional RttT grants, respectively. I think the lessons are relevant to an even wider audience: As we watch federal education policy shift its weight behind student data-driven teacher evaluation, it becomes crucial that education stakeholders at the federal level pay close attention to what is going on in the states already implementing teacher evaluation systems. There are strong implications for codification within federal statute: the ESEA reauthorization bills from the House and Senate both include language requiring teacher evaluation systems. Federal policy-makers would be well served by paying attention to the opportunities and obstacles states and districts face in implementing these RttT reforms.

The lessons can be good or bad; the concept is the same: there is no need to reinvent the wheel, whether it works or not. Just as we don't want states and districts to struggle with the same problems again and again, we don't need them to rebuild a working approach from scratch every time, either. The current fiscal climate puts unique funding pressures on state and local budgets, and if collaboration or sharing can support implementation of stronger policies and programs, then states and districts should be able to share their models or form partnerships.

Elaine and Noelle:

The "bumps" may bruise, but they ultimately also inform, and it is critical that state and district leaders, as well as federal policy-makers, pay attention to the lessons -- lessons about reliability, capacity, and time, among others -- from RttT. Moreover, when promised improvements fail to happen, all of those leaders should reflect on the need to think broader and bolder about what constitutes reform.

Since 2007, Noelle Ellerson has worked on public policy for AASA, where she focuses on both policy and advocacy. Noelle handles research/analysis supporting AASA's advocacy work for public education, including AASA policy-related surveys and research to help school administrators better understand federal policy and inform federal education policy decisions. She also represents AASA's advocacy priorities on Capitol Hill, including funding and appropriations, ESEA, child nutrition, rural, and charters/vouchers.