Many years ago, I moved to Paris with only high school French to sustain me. Parisians have a reputation for -- shall we say -- brusqueness, and I had no shortage of embarrassing encounters that I interpreted as scorn for my weak French and wretchedly obvious American-ness. Over time, as my French improved and my wardrobe shifted to the local norm, I noticed that I was still encountering a heckuva lot of brusqueness. But as I developed the ability to understand the conversations going on around me, I figured something out: Parisians weren't singling me out for my accent or my clothes -- they were hard on everyone, including each other. Made me feel better, somehow.
I was reminded of those years recently when I read the editorial in the July 4, 2014, issue of Science magazine in which editor-in-chief Marcia McNutt announced the establishment of a new quality control measure -- a Statistical Board of Reviewing Editors, or SBoRE. Why would Science do such a thing?
To quote Dr. McNutt: "Readers must have confidence in the conclusions published in our journal." Why might Science think that readers are losing confidence? How about the following headlines?
- Why Has the Number of Scientific Retractions Increased?
- In Cancer Science, Many "Discoveries" Don't Hold Up
- Why Most Published Research Findings Are False
The studies described under these headlines identified many kinds of problems with published research results. One could be forgiven for drawing the conclusion that there is something rotten in the research enterprise. Personally, I don't think the problem is as dire as the headlines might lead you to believe. Indeed, I would argue that the very publication of such studies suggests that there is something very much right going on.
Here at the National Center for Science Education, we come across a lot of claims by anti-evolutionists and climate change deniers that the scientific establishment cannot be trusted because it refuses to countenance opposition. Data supporting the "alternate views" are hidden. Papers that question the status quo are summarily rejected, while those that support it sail through. Researchers who don't toe the line are denied research funding. The scientific enterprise is portrayed as a totalitarian monolith, brooking no dissent.
Well, I ask you. If the scientific enterprise is so paranoid and self-protective, how is it that there is such a lively debate among scientists about misconduct, reproducibility and reliability? I would argue that the answer is that scientists, like Parisians, are harder on themselves than any outside critic could be. They know their own credibility and the public's support for science depend on maintaining confidence in the quality of published research. Moreover, published results are the stepping stones that guide scientists to new concepts and new experiments, from which new knowledge emerges. If those results are wrong, scientists know that they will be wasting precious time and resources.
So when evidence builds up that there's a problem with the system, scientists do what scientists do: they examine the evidence, and if it's credible, they act on it. In this case, a theme that arose in many of the studies criticizing the reliability of published results was the inappropriate use of statistics. Using the wrong statistical test, over-interpreting results, inadequate sample size, poor study design -- these problems cropped up repeatedly. Why? I'm sure there are lots of reasons, from the occasional case of outright fraud to wishful thinking to incompetence to inexperience. Many biomedical researchers have fairly limited training in statistics and computational analysis, and peer reviewers with expertise in the topical area may be no more expert in data analysis. Thus Science's new board will, in McNutt's words, "provide better oversight of the interpretation of observational data." Papers flagged by SBoRE members will get additional scrutiny by experts in statistics to ensure that the conclusions drawn are justifiable.
No doubt there will be complaints about this new layer of scrutiny: it will slow publication of results and lead to demands for revisions with which authors may not agree. But over time, the evidence should show that fewer papers are retracted, more of them can be reproduced, and research conclusions are worthy of greater confidence. If such evidence does not materialize, I can confidently predict that other means will be brought to bear to ensure that published results are as reliable as possible. Because, you see, those of you who complain that science is dismissive of your evidence of "intelligent design" or a cooling planet, you're not being singled out: you are simply being held to the same standard to which scientists hold each other.