Many scholars, even those working in disciplines such as mathematics, do not properly understand commonly used statistical methods in science, a study suggests.
About 90 per cent of researchers and students surveyed for the study in China failed to correctly interpret p-values and confidence intervals, two of the most common statistical tools used to analyse scientific results.
Almost 1,500 people, from undergraduates to postdoctoral researchers, were given a series of false statements about the interpretation of p-values and confidence intervals and asked to judge if any were correct.
A total of 89 per cent of the participants made at least one error on p-values, and 93 per cent made at least one mistake when considering the correct interpretation of confidence intervals.
The proportion misinterpreting the two methods varied little across disciplines: even among those working in maths and statistics, 85 to 90 per cent failed to spot that all the statements were wrong.
Even when looking at only postgraduates and researchers, the proportion misunderstanding the methods remained high, although a slightly smaller share of those with a PhD made an error interpreting p-values.
Respondents to the survey were also asked to indicate how confident they were in making their decisions on a scale of one to five. Based on the results, the researchers and students were “generally confident about their (incorrect) judgements”.
“These results suggest that researchers generally do not have a good understanding of these common statistical indices,” the paper says. That, it goes on, might indicate that the embedded “ritual” of using such methods wrongly “is not limited to psychology or social science but also [extends] to the entire scientific community”.
The paper, published in the Journal of Pacific Rim Psychology, adds to the growing evidence about the problems of using tools such as p-values. Last year, statisticians issued a major call to stop using them to deem results “statistically significant”.
Chuan-Peng Hu, a postdoctoral researcher at the Leibniz Institute for Resilience Research in Germany, and co-author of the new study, said giving undergraduates better training in statistical inference would help to counter the problem, but there also needed to be “constant learning” among scholars at “all levels”.
In addition, he warned, incentives for researchers had to change. “The current system doesn’t care so much about being correct; instead, we are rewarded [for being] productive,” he said. “Changing the culture would be the long-term goal.”
The analysis did find that those whose highest degree had been obtained outside mainland China had a slightly lower error rate on interpreting p-values.
“The only available explanation for this scenario might be that the replication crisis was discussed more in the English media than in the Chinese media. Therefore, students who had studied overseas were more familiar with this topic than their local counterparts,” the paper says.