10/01/2011 12:37 pm ET Updated Dec 01, 2011

Good Evaluation Requires Data -- And Sound Judgment

The superintendents and school district human resource officers I spoke with this week said they are struggling with the political and technical aspects of linking student achievement data to teacher evaluation. The policy analysts and labor economists who discussed incorporating data into teacher evaluation at PIE-Net this week admitted that it's a challenge.

Teacher effectiveness is the dominant edreform frame these days, which makes teacher evaluation a timely topic. States applying for giant Race to the Top grants were required to outline how they'd overhaul evaluation. And, most recently, states applying for waivers from the widely despised No Child Left Behind must install new evaluation systems that use test data to measure teacher effectiveness by the 2014-15 school year.

The NYTimes said in today's lead editorial, "It seems imprudent to rush the states into bringing these complex new evaluations systems and high-quality tests on line by 2014, given that they will also be expected to adopt new core curriculums."

EdWeek's Michele McNeil explains that peer reviewers will attempt to verify state commitment to using data to improve teacher evaluations by asking questions including:

Is student growth a significant enough part of the new evaluation system to differentiate among teachers who have made "significantly different contributions" (emphasis added) to student growth or closing achievement gaps?

Will evaluations be frequent enough?

Is there a plan for differentiated professional development based on evaluations?

Will the state's plan ensure that local school districts will actually be able to put these new evaluation systems into place by 2013-14 (as a pilot), and 2014-15 (full implementation)?

There is little guidance from the feds to reviewers or states about what it means to have a new evaluation system and how it will impact personnel decisions. That uncertainty compounds the already challenging political and technical aspects of using year-end standardized tests to measure the effectiveness of individual teachers.

I have a different concern: I'm afraid that most of these efforts intend to apply outdated psychometric tools to obsolete school models. The problem is that the sector is still approaching this topic from a data poverty mindset. About the time labor economists optimize approaches for using cheap multiple-choice tests to evaluate individual teachers, there will be a flood of data informing teams of educators working in environments that blend online and onsite learning.

Blended learning environments will often leverage the leadership of lead teachers across hundreds of students and several junior staff members. Their work will be informed by rich student profiles. It will be much easier to track student growth in each subject area but more difficult to attribute causality to individual contributors.

It will remain necessary to apply judgment using a variety of observations and outcomes in order to evaluate educator effectiveness. Because staffing patterns and assessment techniques will evolve over the next few years, states and districts should plan on a series of temporary agreements incorporating all available data.

It's obviously a good thing to consider all forms of evidence when providing performance feedback to employees. But teaching is not like golf; you can't reduce it to a single score for one contributor. That's been true, and it will increasingly be the case as learning environments blend a variety of strategies and technologies. Good evaluations will always apply sound judgment.