The way many researchers currently use p-values to draw conclusions about statistical significance is incorrect and unhelpful, three scientists argue in a Nature commentary published yesterday (March 20). The authors urge the research community to drop the concept of statistical significance altogether, and more than 800 statisticians and scientists have signed on to the idea.
Scientists have often misused p-values to make claims about what hypotheses their statistically significant or nonsignificant results “prove,” leading to hyped claims or artificial conflict between studies, the authors say. Because journals are biased toward publishing findings with p-values below 0.05, scientists may ignore interesting results that don’t meet the bar and may pick data or methods to try to get under the threshold.
The authors say they are not trying to ban p-values. Rather, “we are calling for a stop to the use of P values in the conventional, dichotomous way—to decide whether a result refutes or supports a scientific hypothesis,” they write.
P-values can force results into a binary context that doesn’t reflect the complexity of the world. The authors challenge the research community to embrace uncertainty. They suggest that “confidence intervals” should be renamed “compatibility intervals” and urge more thoughtfulness in interpreting data.
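To illustrate the kind of artificial conflict the authors describe, here is a minimal sketch in Python. The two studies and their numbers are hypothetical, invented for illustration and not taken from the commentary, and the calculation assumes a simple normal (z) approximation. Both studies report the same estimated effect and differ only in precision.

```python
import math

def two_sided_p(estimate, std_error):
    """Two-sided p-value for a z-test of 'true effect = 0' (normal approximation)."""
    z = abs(estimate / std_error)
    # For a standard normal, 2 * (1 - Phi(z)) equals erfc(z / sqrt(2)).
    return math.erfc(z / math.sqrt(2))

def interval_95(estimate, std_error):
    """Approximate 95% interval: estimate plus or minus 1.96 standard errors."""
    half_width = 1.96 * std_error
    return (estimate - half_width, estimate + half_width)

# Hypothetical studies (made-up numbers): same estimate, different precision.
studies = {"A": (0.20, 0.09), "B": (0.20, 0.12)}

results = {}
for name, (estimate, std_error) in studies.items():
    results[name] = {
        "p": two_sided_p(estimate, std_error),
        "interval": interval_95(estimate, std_error),
    }
    low, high = results[name]["interval"]
    print(f"Study {name}: estimate={estimate:.2f}, "
          f"p={results[name]['p']:.3f}, 95% interval=({low:.2f}, {high:.2f})")
```

Under these assumptions, study A comes out at a p-value of roughly 0.03 and study B at roughly 0.10, so a strict 0.05 cutoff would label one “significant” and the other not, even though the point estimates are identical and the intervals largely overlap. Reading the intervals as ranges of effect sizes compatible with each data set shows the two studies broadly agree.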
“Whatever the statistics show, it is fine to suggest reasons for your results, but discuss a range of potential explanations, not just favoured ones,” they write. “Factors such as background evidence, study design, data quality and understanding of underlying mechanisms are often more important than statistical measures such as P values or intervals.”
The American Statistical Association is also pushing for an end to statistical significance. “Regardless of whether it was ever useful, a declaration of ‘statistical significance’ has today become meaningless,” the statisticians write in an editorial published yesterday in a special issue of The American Statistician devoted to this debate. “[N]o p-value can reveal the plausibility, presence, truth, or importance of an association or effect. Therefore, a label of statistical significance does not mean or imply that an association or effect is highly probable, real, true, or important,” the authors of the editorial explain.
Still, not all scientists are ready to abandon statistical significance. “Banning the word ‘significance’ may well free researchers from being held accountable when they downplay negative results” and otherwise manipulate their findings, Deborah Mayo, a philosopher of science at Virginia Tech, tells NPR. “We should be very wary of giving up on something that allows us to hold researchers accountable,” she says.
Journals, too, are not yet ready to throw out the concept. Nature published an accompanying editorial acknowledging how ingrained statistical significance is in research and stating that it is “not seeking to change how it considers statistical analysis in evaluation of papers at this time.”