
For New Federal Law, We Should Be Asking Why and How We Test, Not Just How Often

04/13/2015 08:09 pm ET | Updated Jun 13, 2015

For more than a decade, the Congress has been unable to reauthorize the increasingly unpopular No Child Left Behind Act (NCLB), passed in 2002 in an act of unity after the trauma of 9/11. The bill set noble goals for educational improvement and equity. However, its methods led to criticism of federal overreach, as it sought to guide school reform by setting annual targets and sanctions for schools attached to students' standardized test scores.

In pursuit of the unattainable aim of "100 percent proficiency" for all students by 2014, nearly every public school in the United States has been deemed "in need of improvement" or "failing," and is meant to be in some form of intervention. Other USDOE initiatives have further expanded testing by requiring states that sought relief from the law to implement teacher evaluation based on student test scores -- raising the total number of tests for students into the hundreds in some districts, and triggering an "opt out" movement by parents.

Despite the intense focus on testing tied to consequences, achievement gains have slowed in the NCLB era, and achievement gaps have remained stubbornly large. By far, the largest gains for African American and Latino students on the National Assessment of Educational Progress (NAEP) - and the greatest reduction of the gaps in reading and math - occurred before NCLB, in the 1970s and early '80s, when ESEA, and the nation, focused on investing rather than testing. Gains were also larger in the 1990s than they have been since. For example, the NAEP long-term trend report finds the following:

  • Black 9-year-old students gained 19 points in reading between 1971 and 1980, and 15 points between 1994 and 2004, but only 9 points between 2004 and 2012.
  • Hispanic 9-year-olds gained 19 points in reading between 1994 and 2004 and only 9 points between 2004 and 2012.
  • Virtually all the progress for 13-year-olds and every bit of the progress for 17-year-olds happened between 1971 and 1988 as the children from the Great Society years made their way through school.
  • The greatest gains and the greatest gap closing for Black and Hispanic students in math also occurred before the NCLB era - generally in the 1970s and into the '80s as those who were young children in the 1970s made their way through school.

Furthermore, between 2000 and 2012, U.S. scores on the international PISA tests, which measure higher order thinking skills and applications of knowledge, declined in math, reading, and science. The NCLB strategy, which led to a focus on low-level tests of basic skills, has clearly not worked to raise achievement or close the gaps.

Nonetheless, despite repeated attempts, the Congress has been unable to reach agreement to redesign the law. This week the Senate may take an important step toward changing that, as Senate HELP committee chairman Lamar Alexander (R-TN) and his colleague Patty Murray (D-WA) bring a bipartisan bill to markup that has a real chance of success.

In recent weeks, one of the hottest debates has been about how often federally-mandated testing should occur. Some argue, with good reason, that annual assessments are needed to ensure that students are making progress and steps are taken when they are not. Others argue that the "every child, every year" testing requirement has reduced test quality and focused the curriculum too narrowly on low-level skills that can be measured with multiple-choice items. They prefer a return to state testing once in each grade span, as was true in the 1990s.

But so far, this debate is missing the most important question: What kinds of assessments should be used when, how, and for what purposes if we want high-quality learning to occur?

As policymakers debate the role of testing in ESEA, it will be vital that these discussions envision the end-game we're aiming for: classrooms that engage all students in meaningful learning that prepares them for college and careers in our complex modern world. Students need to learn how to be critical thinkers, problem solvers, collaborators, and life-long learners. They also need to be supported by teaching that addresses their specific learning needs.

Of course parents and teachers need information every year about student learning and progress. And states regularly need information about how schools and districts are doing in attaining important curriculum objectives and closing achievement gaps.

Unfortunately, these needs are not well-served by the same tests. Most state tests offer only a single number to describe a student's learning in a given area - rather than rich descriptions of what a student knows and can do with respect to a range of skills. And because competitive comparisons have been emphasized by the law, the tests must be given in the same time window at the end of the year under highly restricted conditions, rather than allowing students to take them throughout the year so that teachers can adapt instruction as they move along.

Furthermore, state tests, as they have been shaped by recent federal mandates, are restricted to assessing grade-level standards, so they cannot measure achievement or growth for the large share of students who are above or below grade level. (This also means that teachers often feel compelled or are instructed to teach only the grade level standards, even when some students could move ahead in their learning and others need instruction that will help them catch up to their peers.)

Even the new assessments from the Smarter Balanced Assessment Consortium and the Partnership for Assessment of Readiness for College and Careers (PARCC), which offer more open-ended items and tasks, are affected by these restrictions. States participating in these consortia (around 30) could do much more to modernize these tests and to embed them in productive systems if ESEA would let them.

In short, we are stuck with a set of old ideas about testing that are holding back much more productive approaches. During the 1990s, states developed innovative assessments of student performance, including computer-adaptive technology-based tests that students could take anytime, portfolios of written work, and performance assessments measuring research, investigation, collaboration, and communication skills. All of these advances were ended by NCLB.

While other countries have been moving ahead with new approaches, NCLB pushed U.S. testing back to the 1950s, when multiple-choice Scantron tests were considered a modern technology. While our children are bubbling in answers at the end of each year to questions with five pre-determined choices, young people in Singapore are conducting collaborative projects that are part of the examination system; students in Australia are designing and completing science investigations; children in England and New Zealand are evaluated on a set of authentic reading, writing, speaking and listening tasks that provide extensive information about their developing literacy skills; and those in Hong Kong are demonstrating their understanding of physics problems in hands-on tasks as well as extended essays.

In these other countries, externally-administered tests are less frequent (usually once or twice before high school, plus examinations at the end of high school to inform college and career decisions), but much deeper than in the U.S. These open-ended exams, which feature essay questions and complex problems, often include project-based components completed during the school year and scored by teachers who are trained to ensure consistency. Rather than treating tests as black boxes to use as hammers for sanctions, these countries understand that assessments of, as, and for learning should be integrated into instruction and support better teaching.

With the right mix of assessments, students can be engaged in exciting learning that will prepare them for their futures without being over-tested. Teachers can have data that informs them about how students are learning as well as what they know. Districts and states can have data about how different groups of students are doing in different areas of the curriculum, so they can invest wisely in curriculum development, professional learning, and instructional supports.

To achieve this vision, several changes in federal education law are needed:

  • Assessment results should be reported and used for information and improvement, rather than for labeling schools or administering sanctions, a purpose for which they were never intended.
  • Federal law should no longer prescribe technical features of tests - how they are designed and administered - in ways that prevent innovation and change.
  • States should be invited to create integrated systems of state- and locally-administered assessments that provide information for the multiple purposes they need to serve, combining rich assessments to describe annual student learning and progress in ways that can inform teaching, complemented by less time-intensive samples for large-scale reporting, so that the end result is an instructionally-useful, cost-effective system.
  • ESEA should encourage accountability systems based on multiple measures of student success, as well as students' opportunities to learn. That will encourage states and districts to close the gaps in students' access to resources and high-quality curriculum offerings as the most important means to improving outcomes.
  • New accountability systems should also gauge student learning by measures that extend beyond tests, including successful completion of challenging courses of study, such as International Baccalaureate, Early College, Advanced Placement, and Linked Learning, and portfolios that assess coursework and performance like those offered by the New York Performance Standards Consortium - all of which do a better job predicting college and career success than a one-day, make-or-break test.

Testing should never be the be-all and end-all of an accountability system, but thoughtful assessments can play an important role. Allowing states to create intelligent systems of assessment will in the long run better support student learning than the one-size-fits-all model we've struggled with for the last decade.

    Linda Darling-Hammond is Charles E. Ducommun Professor of Education at Stanford University and Faculty Director of the Stanford Center for Opportunity Policy in Education. Her most recent book is Beyond the Bubble Test: How Performance Assessments Support 21st Century Learning.