Huffpost Education

Tom Vander Ark

How Intelligent Scoring Will Create an Intelligent System

I want kids to write a lot every day. But many high school classes require only three writing assignments in a semester. With a big class load, it's extremely time-consuming for a teacher to grade hundreds of papers each week.

State tests don't typically demand much writing because it is difficult, expensive, and time-consuming to score. When state tests focus on recall rather than writing, they reinforce coverage over competence.

Here's the problem: the standards movement was initially animated by a rich vision of competency-based learning, authentic assessment, performance demonstrations, and student portfolios. That got expensive and unwieldy fast, so we ended up with bubble-sheet, multiple-choice, end-of-year exams. And because we haven't had much data, we've tried to use the same cheap tests to improve instruction, evaluate teachers, manage matriculation, and hold schools accountable.

Problem number two: every state has its own standards and tests. They are not comparable, and many of them don't reflect real college- and work-ready expectations. Common Core State Standards and the tests the two state consortia (PARCC and SBAC) are building will go a long way toward solving this second problem -- they will set real college- and work-ready standards and make results more comparable.

The real question is, how good will these new tests be? Will they reinforce what kids really need to know and do? Administering them online will make them less expensive. The results will be available quickly. But will they better reflect what we want kids to know and be able to do?

It would be easy and cheap to administer online multiple-choice tests. But if we take seriously the demands of the idea economy and the associated expectations of the Common Core, we must do better.

What if state tests required students to write essays, answer tough questions, and compare difficult passages of literature? What if tests provided feedback quickly? What if the marginal cost was close to zero? What if the same capability to provide performance feedback on student writing was available to support everyday classroom learning?

The William and Flora Hewlett Foundation is sponsoring a competition that will demonstrate that automated essay scoring is already pretty good -- on most traits, it is as good as expert human graders. This prize competition will make it better. The reason Hewlett is sponsoring the competition is that they want to promote deeper learning -- mastery of core academic content, critical reasoning and problem solving, working collaboratively, communicating effectively and learning how to learn independently.

"Better tests support better learning," said Barbara Chow, Education Program Director at the Hewlett Foundation. "Rapid and accurate automated essay scoring will encourage states to include more writing in their state assessments. And the more we can use essays to assess what students have learned, the greater the likelihood they'll master important academic content, critical thinking and effective communication."

The competition kicks off today with a demonstration of capabilities of current testing vendors. They will spend the next two weeks using almost 14,000 essays (which were gathered from state testing departments) to train their scoring algorithms. On Jan. 23 they will receive another batch of more than 5,000 essays and will have two days to score them. The competition hosts will report back to the testing consortia in February with a description of current scoring capabilities.

Also launching today is an open competition with $100,000 in prize money. Computer scientists worldwide are invited to join this competition and attempt to best well-known testing firms including AIR, CTB/McGraw-Hill, ETS, and Pearson. To give the upstarts more time to attack the data, the open competition runs through April.

The competition has the opportunity to influence the extent to which state tests incorporate authentic assessment rather than rely solely on inexpensive multiple-choice items. To a great extent, state assessments influence the quality and focus of classroom instruction. This competition has the potential to improve the quality of state assessments and, as a result, classroom instruction in this country for the next decade.

The academic adviser for the competition, Mark Shermis, notes:

In the area of high-stakes assessment, it will take a few years before the technology is 'trusted' enough to make assessment decisions without at least one human grader in place. Most current high-stakes implementations use one human grader and one machine grader to evaluate an essay.

Lower stakes tests are likely to just use automated scoring -- like entrance exams for medical school, law school, and business school.

In his recent book on the subject, Dr. Shermis suggests that automated essay scoring will give states attractive options, including the use of a portfolio of scored classroom essays to supplement or replace an end-of-course or end-of-year exam.

"Currently the technology can tell you if you are on topic, how much you are on topic, whether you have a writing structure, and whether you are doing a good job on the general mechanics of writing," said Dr. Shermis. "It cannot determine whether you have made a good or sufficient argument or whether you have mastered a particular topic." That would be really intelligent scoring and is probably a few years off.
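To make "how much you are on topic" a little more concrete, here is a toy sketch of one way such a signal could be computed: compare the words in an essay with the words in the prompt. This is an illustration only, not how any vendor in the competition actually scores essays. The function names (`word_counts`, `on_topic_score`) and the choice of cosine similarity over raw word counts are assumptions made for this example.

```python
import math
import re
from collections import Counter


def word_counts(text):
    """Lowercase the text and count its word tokens (letters and apostrophes)."""
    return Counter(re.findall(r"[a-z']+", text.lower()))


def on_topic_score(prompt, essay):
    """Hypothetical on-topic signal: cosine similarity of word-count vectors.

    Returns a value between 0.0 (no shared words) and 1.0 (identical word mix).
    This is an assumed stand-in for illustration, not a real scoring engine.
    """
    p, e = word_counts(prompt), word_counts(essay)
    # Dot product over the prompt's vocabulary; missing essay words count as 0.
    dot = sum(p[w] * e[w] for w in p)
    norm = math.sqrt(sum(v * v for v in p.values())) * math.sqrt(
        sum(v * v for v in e.values())
    )
    return dot / norm if norm else 0.0


prompt = "Should schools require students to write every day?"
on_topic = "Students who write every day in schools improve quickly."
off_topic = "My favorite pizza topping is mushrooms."

print(on_topic_score(prompt, on_topic) > on_topic_score(prompt, off_topic))
```

A measure like this only notices shared vocabulary. It says nothing about whether the argument is good, which is exactly the gap Dr. Shermis describes.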

In addition to better tests and consistently high expectations, the requirement to administer the new tests online will accelerate the transition to personal digital learning. While boosting computer access for testing, most states and districts will find it logical to shift from print to digital instructional resources. And most digital content will include embedded assessment.

Online essay scoring will improve the quality of state testing, but the real benefit will be the weekly use in classrooms. Teachers across the curriculum will be able to assign 1,500 words a week -- not 1,500 words a semester -- and know that students will receive frequent automated feedback as well as the all-important and incisive teacher feedback.

The Hewlett-sponsored Automated Student Assessment Prize (ASAP) will help states make informed decisions about testing. It will also accelerate progress toward a more intelligent education system that benefits teachers and students.