How Biased Measures Lead to False Conclusions

One hopeful development in evidence-based reform in education is the improvement in the quality of evaluations of educational programs. Because of policies and funding provided by the Institute of Education Sciences (IES) and Investing in Innovation (i3) in the U.S. and by the Education Endowment Foundation (EEF) in the U.K., most evaluations of educational programs today use far better procedures than was true as recently as five years ago. Experiments are likely to be large, to use random assignment or careful matching, and to be carried out by third-party evaluators, all of which give (or should give) educators and policy makers greater confidence that evaluations are unbiased and that their findings are meaningful.

Despite these positive developments, there remain serious problems in some evaluations. One of these relates to measures that give the experimental group an unfair advantage.

There are several ways in which measures can unfairly favor the experimental group. The most common occurs when measures are created by the developer of the program and precisely aligned with the curriculum taught in the experimental group but not the control group. For example, a developer might reason that a new curriculum represents what students should be taught in, say, science or math, so it's all right to use a measure aligned with the experimental program. However, such measures give a huge advantage to the experimental group. In an article published in the Journal of Research on Educational Effectiveness, Nancy Madden and I looked at effect sizes for such over-aligned measures among studies accepted by the What Works Clearinghouse (WWC). In reading, we found an average effect size of +0.51 for over-aligned measures, compared to an average of +0.06 for measures that were fair to the content taught in both the experimental and control groups. In math, the difference was +0.45 for over-aligned measures versus -0.03 for fair ones. These are huge differences.
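For readers unfamiliar with the statistic, an effect size here is a standardized mean difference: the experimental-control difference in average scores divided by the pooled standard deviation. The minimal Python sketch below, using entirely hypothetical scores invented for illustration (not data from our article), shows how the same students can produce a small effect size on a fair measure and a much larger one on an over-aligned measure.

```python
# Illustrative sketch only: computing an effect size (Cohen's d) as a
# standardized mean difference. All scores below are hypothetical, chosen
# to roughly echo the fair vs. over-aligned contrast described above.

import statistics

def effect_size(treatment, control):
    """Cohen's d: difference in group means divided by the pooled standard deviation."""
    n_t, n_c = len(treatment), len(control)
    mean_t, mean_c = statistics.mean(treatment), statistics.mean(control)
    var_t, var_c = statistics.variance(treatment), statistics.variance(control)
    pooled_sd = (((n_t - 1) * var_t + (n_c - 1) * var_c) / (n_t + n_c - 2)) ** 0.5
    return (mean_t - mean_c) / pooled_sd

# Hypothetical scores on a fair test covering content taught to both groups...
fair_treatment = [66, 81, 56, 91, 71, 61, 86, 76]
fair_control   = [65, 80, 55, 90, 70, 60, 85, 75]

# ...and on an over-aligned test covering material only the treatment group saw.
aligned_treatment = [71, 86, 61, 96, 76, 66, 91, 81]
aligned_control   = [65, 80, 55, 90, 70, 60, 85, 75]

print(f"Fair measure:         d = {effect_size(fair_treatment, fair_control):+.2f}")
print(f"Over-aligned measure: d = {effect_size(aligned_treatment, aligned_control):+.2f}")
```

With these made-up numbers the fair measure yields an effect size of about +0.08 while the over-aligned measure yields about +0.49; the gap reflects what the test covers, not how effective the program is.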

A special case of over-alignment occurs when content is introduced earlier than usual in students' progression through school in the experimental group, but not the control group. For example, if students are taught first-grade math skills in kindergarten, they will of course do better on a first-grade test (taken in kindergarten) than will students who were not taught those skills in kindergarten. But will the students still be better off by the end of first grade, when all of them have been taught first-grade skills? It's unlikely.

One more special case of over-alignment occurs in relatively brief studies in which students are pre-tested, taught a given topic, and then post-tested, say, eight weeks later. The control group, however, might have been taught that topic before or after the eight-week study window, or might have spent much less than eight weeks on it. In a recent review of elementary science programs, we found many examples of this, including situations in which experimental groups were taught a topic such as electricity during the experiment while the control group was not taught about electricity at all during that period. Not surprisingly, these studies produce very large but meaningless effect sizes.

As evidence becomes more important in educational policy and practice, we researchers need to get our own house in order. Insisting on the use of measures that are not biased in favor of experimental groups is a major necessity in building a body of evidence that educators can rely on.
