Todd Farley

Numbers Game

Posted: 05/17/2012 10:48 am

The scores for the writing portion of this year's FCAT (Florida Comprehensive Assessment Test) plummeted so precipitously that it isn't the abilities of the Sunshine State's student writers being called into question but the very validity of those scoring statistics. While I don't want to say "I told you so" regarding the dubiousness of those statistics, I did tell you so, as my 2009 book highlighted in great detail all the ways the numbers produced by the for-profit testing industry cannot be trusted.

Especially the stats produced at Pearson scoring centers around the country, where I worked for some 15 years.

On the first project I worked scoring student essays, I had to pass a qualifying exam to stay on the job. When I failed that qualifying exam (twice), I was unceremoniously fired. So were half the original hundred scorers who'd also failed the tests. Of course, when Pearson realized the next morning they no longer had enough scorers to complete the project on time, they simply lowered the "passing" grade on the qualifying test and put us flunkies right back on the job.

Yes, those of us considered unable to score student essays 12 hours before were welcomed back into the scoring center with open arms, deemed "qualified" after all.

Such duplicity was not an aberration in my experience, as for a decade and a half I saw every sort of corporate chicanery and statistical tomfoolery. The test-scoring industry seemed focused on getting deadlines met, projects completed, and scores put on tests; only after that was any thought given to whether those scores were meaningful.

I regularly saw unqualified people (myself included, apparently) keep their jobs scoring student responses even when they were altogether no good at the job, either when the acceptable qualifying grades were dropped so low that anyone could meet them, or when the correct answers to the qualifying exams were simply handed out before the tests were taken. I regularly saw statistics get doctored to make group reliability numbers (agreement between the scorers) look better than they really were, as high reliability stats were necessary to convince customers how standardized a job was being done and how "valid" the work really was. I regularly saw distribution numbers fixed to make score results look however a client might have wanted.

Once I attended a rangefinding meeting in Princeton with various test-scoring managers and English professors from around the country, the group having convened to figure out how to score writing samples for a national test. After that bunch of experts had finally hammered out a consensus regarding the writing rubric and writing samples we'd been reviewing, we were told we were scoring "wrong." We test-scoring pros and writing teachers were told our scoring wasn't matching the predictions of the omniscient psychometricians (statisticians/testing gurus), and we were told we had to match those predictions even though the psychometricians had never actually seen the student essays.

When the next year I read in the New York Times that student writing scores had ended up exactly in the middle of the psychometricians' predictions, I can't say I was surprised: We'd made sure they did.

And that's the thing: In my experience, the for-profit test-scoring industry could produce results on demand. There was no statistic that couldn't be doctored, no number that couldn't be fudged, no figure that couldn't be bent to our collective will. Once, when a state Department of Education didn't like the distribution of essay scores we'd been producing over the first two weeks of a project, we simply followed their instruction to give more higher scores. "More 3's!" became our battle cry on that project, even if randomly giving more 3's was fundamentally unfair to the students whose essays had been assessed differently before.

I guess I'm saying no one really need worry too much about this year's falling FCAT scores, because they're only a number. If it's different numbers that state is after next year, they should just ask. I'm sure Pearson can just make more.

 
Comments
03:48 AM on 06/08/2012
To paraphrase Groucho Marx: "If you don't like my scores... I have others!"
C Karen Stopford
07:54 AM on 06/03/2012
Precisely why "no child left behind" and "racing to the top" are political sleight-of-hand. Judging the quality of education through the use of standardized tests leads to multiple abuses and ALWAYS results in reduction in the quality of education. Take the time to read through the many comments on any politically charged article, and you will see the results of a system driven by the numbers. I am appalled at the devolution of public education; at a wee bit over the age of 50, I'm ever so grateful that I received mine at a time when teachers were allowed to teach and curriculum design took student interests and developmental needs into consideration. There is no such thing as a "well-rounded" curriculum anymore, unless you look at some of the more exclusive private schools - out of reach for the average citizen.
10:54 PM on 05/21/2012
Todd's excellent book, "Making the Grades: My Misadventures in the Standardized Testing Industry," is a must-read for educators and school administrators. It's actually a very funny read. Todd is a good writer. I worked for a short time in testing and realized very quickly that there are some serious problems with the testing industry. In some ways it hurts the smarter, more imaginative students. I left the job in tears because I realized what a horror the whole testing thing is large-scale. What people don't realize is that unqualified people are reading these kids' essays, and they're rewarded based on how quickly they can zip through an essay and slap a score on it. Grading is not a careful, thoughtful process, I assure you. Buy Todd's book!!!!!!
05:30 PM on 05/21/2012
I wish this article weren't buried far below other, less urgent educational articles. This is an indictment of corrupt for-profit test-making corporations whose phony practices are determining educational policy in our country and ruining good, hard-working people's lives. If test scores can be doctored up, they can be doctored down as well, especially if the money for the desired results is coming from people interested in laying off, closing down, privatizing schools.
blindjester
English and ESL teacher
10:10 PM on 05/17/2012
Excellent last line. Well written. Sorry it's true.