How do we know which scientific results to trust? Research published in peer-reviewed academic journals has typically been considered the gold standard, having been subjected to in-depth scrutiny – or so we once thought. In recent years, our faith in peer-reviewed research has been shaken by the revelation that many published findings don’t hold up when scholars try to reproduce them. The question of which science to trust no longer seems straightforward.
Concerns about scientific validity and reproducibility have been on the rise since John Ioannidis, a professor at Stanford School of Medicine, published his 2005 article, “Why Most Published Research Findings Are False.” Ioannidis pointed to several sources of bias in research, including the pressure to publish positive findings, small sample sizes, and selective reporting of results.
In the years since, a wave of scholars has dug deeper into these issues across a number of disciplines. Brian Nosek at the Center for Open Science and Elizabeth Iorns of Science Exchange spearheaded attempts to repeat past studies in their respective fields, psychology and cancer biology, with discouraging results.1 Economists, meanwhile, ran into trouble simply repeating the analyses reported in papers, even when using the original data and code.
By 2016, when Nature surveyed 1,500 scholars, over half expressed the view that there is a significant “reproducibility crisis” in science. This crisis comes at an uncomfortable time, when some skeptical voices question even well-grounded scientific claims such as the effectiveness of vaccines and humans’ role in climate change.
Given this hostility, there’s a concern that reproducibility issues may undermine public confidence in science or lead to diminished funding for research.2 What is clear is that we need a more nuanced message than “science works” or “science fails.” Scientific progress is real, but it can be hindered by shortcomings that diminish our confidence in some results and that need to be addressed.
There has been plenty of coverage of the reproducibility crisis and its implications (including debate over whether to call it a “crisis”), both in scientific publications and in mainstream outlets like the New York Times, the Atlantic, Slate, and FiveThirtyEight. But somewhat less attention has been paid to the question of how to move forward. To help chip away at this question, we’re publishing a series of articles from researchers leading initiatives to improve how academics are trained, how data are shared and reviewed, and how universities shape incentives for better research. After this essay, the rest of the series will appear on the Rethinking Research blog on Inside Higher Ed.
Why the Shaky Foundation?
The reproducibility problem is an epistemological one, in which reasons for doubt undermine the foundations of knowledge. One source of doubt is the lack of visibility into the nuts and bolts of the research process. The metaphor of “front stage” and “back stage” (borrowed from sociologist Erving Goffman, who used it in a different context) may be helpful here.
If the front stage is the paper summarizing the results, the back stage holds the details of the methodology, data, and statistical code used to calculate those results. All too often, the back stage is known only to the researchers, and other scholars cannot peer behind the curtain to see how the published findings were produced.
Another big issue is the flexibility scholars have in choosing how to understand and analyze their research. It’s often possible to draw many different conclusions from the same data, and the current system rewards novel, positive results. When this flexibility is combined with a lack of transparency, it can be difficult for others to know which results to trust, even if the vast majority of researchers are doing their work in good faith.
As Joseph Simmons, Leif D. Nelson, and Uri Simonsohn write in their article on “researcher degrees of freedom”: “It is common (and accepted practice) for researchers to explore various analytic alternatives, to search for a combination that yields ‘statistical significance,’ and to then report only what ‘worked,’… This exploratory behavior is not the by-product of malicious intent, but rather the result of two factors: (a) ambiguity in how best to make these decisions and (b) the researcher’s desire to find a statistically significant result.”
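To see why this flexibility matters, consider a small illustrative simulation. This is only a sketch of the general idea, not an analysis from Simmons and colleagues’ paper; the sample sizes, outcome measures, and decision rule below are arbitrary assumptions. Even when there is no real effect, a researcher who tries several reasonable-looking analyses and reports only the one that reaches p < .05 will “find” an effect far more often than the nominal 5 percent of the time.

```python
# Illustrative simulation (hypothetical setup, not from the article) of how
# trying several analytic alternatives on null data inflates false positives.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n_experiments = 5_000   # simulated studies in which the true effect is zero
n_per_group = 20        # participants per group (arbitrary choice)
alpha = 0.05

false_positives = 0
for _ in range(n_experiments):
    # Two groups, two outcome measures, no real difference between groups.
    group_a = rng.normal(size=(n_per_group, 2))
    group_b = rng.normal(size=(n_per_group, 2))

    # "Researcher degrees of freedom": test outcome 1, outcome 2, and their
    # average, then keep whichever comparison happens to reach p < .05.
    candidates = [
        (group_a[:, 0], group_b[:, 0]),
        (group_a[:, 1], group_b[:, 1]),
        (group_a.mean(axis=1), group_b.mean(axis=1)),
    ]
    p_values = [stats.ttest_ind(a, b).pvalue for a, b in candidates]
    if min(p_values) < alpha:
        false_positives += 1

print(f"Nominal false-positive rate: {alpha:.0%}")
print(f"Observed rate with flexible analysis: {false_positives / n_experiments:.1%}")
```

Running a sketch like this yields an observed false-positive rate well above the nominal 5 percent, even though every individual test is perfectly standard – a small taste of how undisclosed flexibility erodes the meaning of “statistically significant.”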
Given the potential for biased or flawed research, how can we encourage greater transparency and put the right incentives in place to promote reliable, reproducible research? Three big questions we’ll be looking at in this series are: How are researchers trained? What resources and support do they receive? How do institutions respond to and reward their work?
Training the Next Generation of Researchers
Lack of proper training in research methods and data management can contribute to reproducibility problems. Graduate students are sometimes left on their own to learn how to manage data and statistical code. As they merge datasets, clean data, and run analyses, they may not know how to do this work in an organized, reproducible fashion. The “back stage” can become extremely messy, making it hard to share their materials with others or even to double-check their own findings. As these students advance in their careers, they may not have the time (or the right incentives) to develop these skills.
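As a concrete, deliberately simplified illustration of what an organized “back stage” can look like, here is a sketch of a single analysis script; the file names, column names, and bootstrap analysis are hypothetical, and a real project would add version control and documentation on top. The point is that every step from raw data to reported numbers is recorded in code and can be rerun end to end.

```python
# A minimal sketch of a scripted, reproducible analysis pipeline.
# File paths, columns, and the analysis itself are hypothetical examples.
from pathlib import Path

import numpy as np
import pandas as pd

RAW = Path("data/raw/survey_responses.csv")    # raw data, never edited by hand
CLEAN = Path("data/clean/survey_clean.csv")    # derived data, regenerated by this script
RESULTS = Path("results/summary_stats.csv")    # numbers that appear in the paper
SEED = 20240101                                # fixed seed so resampling is repeatable


def clean(df: pd.DataFrame) -> pd.DataFrame:
    """Document every cleaning decision in one place."""
    df = df.dropna(subset=["age", "score"])    # drop incomplete responses
    df = df[df["age"].between(18, 99)]         # exclude implausible ages
    return df


def main() -> None:
    rng = np.random.default_rng(SEED)
    raw = pd.read_csv(RAW)
    tidy = clean(raw)

    CLEAN.parent.mkdir(parents=True, exist_ok=True)
    tidy.to_csv(CLEAN, index=False)

    # Example analysis: bootstrap confidence interval for the mean score.
    boots = [rng.choice(tidy["score"].to_numpy(), size=len(tidy), replace=True).mean()
             for _ in range(1_000)]
    summary = pd.DataFrame({
        "mean_score": [tidy["score"].mean()],
        "ci_low": [np.percentile(boots, 2.5)],
        "ci_high": [np.percentile(boots, 97.5)],
    })
    RESULTS.parent.mkdir(parents=True, exist_ok=True)
    summary.to_csv(RESULTS, index=False)


if __name__ == "__main__":
    main()
```

With a workflow like this, “how did you get that number?” has a checkable answer: rerun the script.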
In the 2016 survey conducted by Nature, researchers identified an improved understanding of statistics and better mentoring and supervision as the two most promising strategies for making research more reproducible.
A number of organizations are tackling this issue by offering workshops for graduate students and early-career researchers on how to conduct reproducible research, manage data and code, and track research workflow. Among them are Software Carpentry and Data Carpentry, the Center for Open Science, and the Berkeley Initiative for Transparency in the Social Sciences (BITSS). There are even courses available online from institutions such as Johns Hopkins.
Resources and Support for Better Science
While proper training is essential, researchers also need resources that support reproducibility and transparency. One critical piece of infrastructure is the data repository: an online platform that makes it easier for scholars to organize research materials and make them publicly available in a sustainable, consistent way.
Repositories like Dataverse, Figshare, ICPSR, and Open Science Framework provide a place for researchers to share data and code, allowing others to evaluate and reproduce their work. There are also repositories tailored to qualitative research, such as the Qualitative Data Repository.
Universities are also enhancing their services and support for reproducible research practices. For example, the Moore-Sloan Data Science Environments initiative offers resources to support data-driven research at three universities, including software tools and training programs. Dozens of universities also have statistical consulting centers that offer advice to students and researchers on research design and statistical analysis. Some disciplinary associations are also convening groups to develop guidelines and standards for reproducible research.
Creating Incentives for Reproducible Research
Researchers often face career and institutional incentives that do little to encourage reproducibility and transparency, and can even work against those goals at times. Academic achievements like winning grants and earning tenure are linked primarily to publishing numerous papers in highly ranked journals. There’s little professional reward for the time-consuming work of sharing data, investigating and replicating the work of others, or even ensuring one’s own research is reproducible.
Institutions are beginning to shift these incentives through policies and funding that encourage reproducible research and transparency, while reducing some of the flexibility that can allow biases to creep in. Funders such as the Arnold Foundation3 and the Dutch government have set aside money for scientists to conduct replications of important studies. Some have offered incentives for scientists to pre-register their studies, meaning they commit to a hypothesis, methodology, and data analysis plan ahead of data collection.
Increasingly, funding agencies and academic journals are adopting transparency policies that require data sharing, and many journals have endorsed the Transparency and Openness Promotion (TOP) guidelines, which serve as standards for improving research reliability.
In another interesting development, some journals have shifted to a model of “registered reports,” in which an article is accepted based on the research question and method, rather than the results. Recently, Cancer Research UK formed a partnership with the journal Nicotine and Tobacco Research to both fund and publish research based on the “registered reports” approach.
All of these initiatives are important, but the path to academic career advancement also needs to shift to reward research activities other than just publishing in prestigious journals. While change on this front has been slow, a few institutions like the University Medical Center Utrecht in the Netherlands have started to expand the criteria used in their tenure and promotion review process.
From Vision to Reality
The driving vision of these initiatives is a system that trains, supports, and rewards scientists for research that is transparent and reproducible, resulting in reliable scientific results. To learn how this vision is being put into practice, we’ve partnered with contributors on a series of articles about how they are working to improve research reliability in their fields.
None of these solutions is a silver bullet. Improving research reliability will depend on change in many parts of the academic ecosystem and on the efforts of many actors – researchers, funders, universities, journals, media, and the public. Taking the next steps will require openness to new ways of doing things and an attempt to discern what’s effective for improving research.
In many ways, we’re still in the early stages of realizing that there’s a problem and taking steps to improve. The good news is that there’s an ever-growing segment of the research community, and of the public, who are aware of the need for change and willing to take steps forward.
This article is part of a series on how scholars are addressing the “reproducibility crisis” by making research more transparent and rigorous. The series was produced by Footnote and Stephanie Wykstra with support from the Laura and John Arnold Foundation. It was published on Footnote and Inside Higher Ed.
Endnotes
- Hundreds of psychologists who collaborated on the Reproducibility Project: Psychology were able to reproduce fewer than half of the studies they analyzed. The Reproducibility Project: Cancer Biology has encountered similar difficulties in reproducing results, though a more recent release of two replications showed a higher rate of success (experiments are ongoing).
- This concern was raised by several speakers at a National Academy of Sciences conference on reproducibility in March 2017.
- This series was supported with a grant from the Arnold Foundation.