01/27/2014 10:26 am ET Updated Mar 29, 2014

Teachers Evaluate Their Evaluations and Evaluators


By a unanimous vote, the governing board of New York State United Teachers (NYSUT), the umbrella organization of local unions that represents teachers in New York State, called on the state Board of Regents to remove the Education Commissioner. In a statement issued by NYSUT President Dick Iannuzzi, he accused the education commissioner, John King, of pursuing "policies that repeatedly ignore the voices of parents and educators who have identified problems and called on him to move more thoughtfully."

The NYSUT board also withdrew its support for the Common Core learning standards and demanded a three-year moratorium on the high-stakes testing of students and the use of student test scores in the evaluation of teacher performance. These positions taken by NYSUT represent a major new development in the nationwide battle over Common Core, high-stakes testing, Race to the Top, and newly mandated teacher evaluation systems.

In a June 2013 Huffington Post column I asked, "Who is Charlotte Danielson and Why Does She Decide How Teachers are Evaluated?" It was one of my most widely circulated posts, with 7,942 "Likes," 1,526 Facebook "shares," 361 "Emails," and 163 "Tweets." The Danielson Framework for Teaching is a teacher evaluation system imposed on New York City to satisfy the requirements of New York State's federal Race to the Top grant. It is one of the Annual Professional Performance Review (APPR) check-off systems now being used across the United States.

A September 2013 report issued by the United States Government Accountability Office (GAO) found that by the 2012-13 school year, "6 of 12 Race to The Top (RTT) states" had "fully implemented their evaluation systems (i.e., for all teachers and principals in all RTT districts)." While results varied from state to state and from district to district, the problems with APPR implementation were fairly uniform.

"Officials in most RTT states cited challenges related to developing and using evaluation measures, addressing teacher concerns, and building capacity and sustainability. State officials said it was difficult to design and implement rigorous student learning objectives--an alternate measure of student academic growth. In 6 states, officials said they had difficulty ensuring that principals conducted evaluations consistently. Officials in 11 states said teacher concerns about the scale of change, such as the use of student academic growth data and consequences attached to evaluations, challenged state efforts. State and district officials also discussed capacity challenges, such as too few staff or limited staff expertise and prioritizing evaluation reform amid multiple educational initiatives. Officials in 10 states had concerns about sustaining their evaluation systems."

According to the GAO report (p. 2), its authors interviewed state and district officials and "officials from unions or organizations representing teachers or principals in Maryland, New York, and North Carolina to obtain their perspectives on design and implementation challenges," but the views of three groups intimately involved in the education process were conspicuously absent from the report: parents, students, and, most surprisingly, working classroom teachers.

In my June post, I urged teachers "to use Huffington Post to document what is going on with teacher evaluations in their schools." Teachers have consistently raised the concern that the evaluation system, as implemented across schools, is not differentiated to reflect the student population of a school or classroom, which creates a built-in injustice. A relatively inexperienced teacher at a top academic high school will have a much easier time meeting the standards than a veteran teacher working with less academically oriented students in a school where many students have not mastered basic life skills such as coming to school on time, being prepared, paying attention, and concentrating, let alone the more academic skills required under the new standards. It is also not clear that administrators in different districts view the criteria for evaluating teachers in the same way.

Using the Danielson rubric, teachers are evaluated in twenty-two areas on a four-point scale: highly effective, effective, developing, and ineffective (HEDI). I have heard reports from urban districts that the default grade in their schools is "developing" and a teacher must provide evidence that they are "effective." Meanwhile, in some suburban districts, evaluators start from the assumption that teachers are "effective" unless they see evidence that indicates otherwise.

In this post I am publishing what I consider representative comments sent to me by four veteran teachers from different parts of the United States who want their negative experiences with new teacher evaluation systems publicized, both to support changes in the evaluation process and to support other teachers who feel isolated and threatened by the way they are being graded. One teaches pre-school, one teaches elementary school, one teaches middle school, and one teaches high school. All are veteran teachers from minority communities who are highly regarded by their colleagues and school-based administrators. Each requested anonymity so they could not be identified by school and district administrators. I carefully edited what they sent me to remove references to specific schools and districts; otherwise, the comments are entirely theirs.

The underlying themes in all four teacher reports are that the APPR systems do not identify best practice in teaching; that it is impossible to follow these guidelines in real classrooms, even when a committed teacher tries; that sincere school-based administrators and curriculum specialists are just as flummoxed by the requirements as the teachers are; and that the mandated teacher evaluations are harming students in their classes.

From an early childhood teacher: I was recently observed teaching math. The feedback I got represents mammoth issues with Charlotte Danielson's teacher rating system. There is no dialogue there; there is no "trickle up" of the great ideas teachers craft from stale curricula and make real. It is now too risky to individualize, craft, and squeeze out the subtleties, as keeping closely to the published curriculum is safest, even when all stakeholders recognize that the publisher has not responded to curricular holes in its product. When a teacher modifies and adapts curricular material, she takes a risk. She may enrich the lesson in various ways that would be inscrutable to any outsider coming in abruptly without any pre-planning together. When supervisors pinpoint normal early-grade behavior in their reports, such as children playing with shoes or asking to be excused to go to the bathroom, the conversation about education shuts down. There is no place where a teacher can describe the subtle shift in how children arrive at an answer if all the supervisor focuses on in the comments is the volume of the teacher's voice. There is apparently some rule that administrators must never give in to a teacher's request to reset the conversation when an administrator has simply missed the point. Now I avoid interaction with administrators at all costs. Is this in any way what Charlotte Danielson envisioned? I especially object to the way my students are questioned by my supervisors, who take notes and leave the child wondering why this adult seems to be writing about him. Danielson's rating system is just not appropriate for the early grades. I would like to know how many hours Danielson has spent in her adult life interacting with five-year-olds. The only thing I have gotten out of my feedback is a heightened desire to make clear to my youngest learners that I love them and I will help.

From an elementary school teacher: What looks great on paper does not always work in practice. In my district, students face many obstacles in their lives that impact their ability to perform well, yet there are often more than thirty students in a class, so they cannot receive the support that they need. Math coordinators came into our classrooms and we were instructed NOT to DEVIATE from the script of the EngageNY Math Module. The Application Problem at the start of lessons is allotted three minutes. Students are supposed to collaborate to solve the problem and then we are supposed to go over it as a class. But it is impossible to complete the Application Problem in the time allotted. Often the concept taught during the lesson is based on the Application Problem, so if students do not understand the application, they cannot understand the concept. Although we are required to differentiate instruction, the modules do not give teachers the flexibility they need to assist students who need extra help or to have them work in small groups when this would be advantageous. When teachers in my school raised our concerns about the modules, the math coordinator came in to do a model lesson. Even with three teachers and a teaching assistant circulating around the room helping the students, the coordinator, following the script, could not complete the lesson within the allotted time frame. Another of our concerns is that the modules do not appear to be in correct mathematical sequence. The third-grade Module 1 starts with multiplication. In the future, if the students have the necessary skills, that could possibly work, but right now it does not. In ELA we followed Common Core last year and it was a disaster. This year we are using an older reading program as our base. However, even though the teachers are familiar with this reading program, we are being forced to follow rigid timelines and scripts.

From a middle school teacher: While I was only observed for the last 15 to 20 minutes of a class period, I was evaluated on all the components of the lesson, including the parts that were not observed. Since the evaluator did not actually observe the first half of the lesson, they evaluated it based on my lesson plan and responded negatively when they felt I did not specify enough about what I was going to do. The evaluator could have asked me to explain how I group students, how this lesson connects to previous or future lessons, how I assess student learning, and how I differentiate instruction, or even asked the students what they were doing, what they were learning, and why. In my case it was not the lesson but the lesson plan that was being evaluated. When I got my write-up, I had been rated on twelve of the twenty-two components across the four Danielson domains. I received the following scores.

Five Highly Effective: 2a. Creating an environment of respect and rapport; 2b. Establishing a culture for learning; 2c. Managing classroom procedures; 2d. Managing Student behavior; and 2e. Organizing physical space.

Four Effective: 1a. Demonstrating knowledge of content and pedagogy; 1e. Designing coherent instruction; 3b. Using questioning and discussion techniques; and 3c. Engaging students in learning.

Three Developing: 1b. Demonstrating knowledge of students; 1c. Setting instructional outcomes; 1f. Designing student assessments.

Although this is basically a positive evaluation, the Danielson framework really contradicts what I do in the classroom. Primarily, I was evaluated on how I write a lesson plan. The list of what has to be included in the lesson plan format keeps growing, and it is nearly impossible to get through it all in a 40-plus-minute class period. These evaluations may make life easier for middle-management-style administrators, many with little or no classroom experience, because all they have to do is check off what matches their checklist. But that is not good teaching or supervision. After the observation, I also received these comments.

1. "Parts of the lesson plan you submitted were unclear or general. The connection and link of the lesson to the previous day was unclear. The connection must connect or tie back to the previous lesson." (1a-Effective).
2. "It was unclear as to how the groups of students were assembled since the lesson plan you submitted only cited 'Guided Readings' it is unclear HOW the sources were distributed to each group." (1b-Developing)
3. "Your lesson plan cited that the lesson was differentiated with guided readings but did not further explain...the instruction should reflect several different types of learning..." (1c-Developing)
4. "The quick check component (monitoring student understanding PRIOR to moving on into the work period) is not included in your lesson plan." (1f-Developing)
5. "While you used some low-level challenged students to justify their thinking and engaged most of the students in the discussion. You used open-ended questions." (3b.-Effective).

My question is: Why observe us at all? Why not just take our lesson plans and save time?

From a high school teacher: I was observed during the first day of an eight-day project that culminated with student groups presenting TV shows about the Constitution, with each performance followed by student-to-student feedback and discussion. The first days of an extended project are "set-up" days, however, where students are presented with information in a more teacher-centered fashion and also work in groups planning their creation. I was observed during the first "set-up" day for 15 minutes. During almost all of that time, students worked in groups and I shuttled from group to group to answer questions. The AP who observed me told me in the post-observation conference that on the part of the Danielson rubric devoted to students taking leadership roles in whole-class discussion, I would get a "developing." Of course, he was using the rubric and had accurately recorded what took place. So, despite the fact that the entire Constitution TV show project (and my practice as a whole) is predicated upon student voice, student leadership, and student-to-student discussion, one fourth of my evaluation as a teacher indicates I am "developing" in that area! The new evaluation system is geared toward teaching that takes place in discrete, period-long units, with the expectation that the wide range of Danielson outcomes should be achieved within that period-long framework. This flies in the face of the wisdom and knowledge possessed by those who actually teach inner-city children, up to 34 in a room, and sometimes, as in my school, with students sitting in groups of four or five around tables that encourage them to focus and socialize within the group and to ignore the larger class setting. Meaningful student-to-student, whole-class discussion requires a great deal of preparation and structure, and MUST be done in more extended fashion across multiple class periods.

Post-It Note: There is some question whether the New York City local of the teachers' union, the United Federation of Teachers (UFT), which represents approximately 40% of the teachers in the state, will support NYSUT's challenge to Common Core, high-stakes testing, and the evaluation of teachers based on student test scores. At this time, the leadership of the UFT is supporting an opposition slate that is challenging the current officers of NYSUT. Members of the UFT need to make their opinions known to the local union leaders.