‘Preliminary research’ on COVID has been surprisingly solid

Preprint studies have been critical for fast-moving science and public accessibility during the pandemic—but not without criticism.
Preprints have been a key source of early COVID research. But do they hold up in the court of scientific peer-review? Deposit Photos

Before the COVID pandemic, peer-review was the beating heart of scientific publishing. In order for studies to enter the body of scientific knowledge, the expectation was that researchers would submit them to academic journals, which would send the papers out to other experts for edits and revisions before publishing.

But it’s a process that wasn’t well-suited to the urgency of the COVID pandemic, when early research could save lives. Peer-review often takes months, and it asks for huge amounts of unpaid labor on the part of the scientists who scrutinize papers. In early 2020, growing numbers of scientists began to post research on open-access databases, called preprint servers, before those preprints had been formally reviewed.

New research suggests that scientific norms are still operating on preprint servers. As physician and reporter Trisha Pasricha wrote in the Washington Post over the weekend, “when a group of authors puts any study in the public domain … they are placing their reputations on the line.”

Preprints, which are often referred to as “preliminary research” in the news, had already been gaining popularity in early-adopting fields like genomics and neuroscience, but the time pressures of the pandemic gave them a new primacy. Over the first year of the pandemic, preprint servers hosted 7,000 COVID papers, while journals published about 12,500 formal papers. (There was some overlap.) Unlike many journals, preprint servers are free for anyone to access, and researchers don’t have to pay to post on them. Many of those early papers do end up going through peer-review: The co-founder of two key preprint servers recently wrote on Twitter that half of all 2020 COVID preprints have now been formally published. Regardless, preprints have become central to the science of COVID and how it’s covered in the media.

That’s been a source of controversy. To critics of preprints, they’re a repository of questionable science. “The limitation is that any idiot can publish any idiotic stuff on a platform that doesn’t have pre-publication peer review,” as one former journal editor put it to a New York Times columnist last month. But according to two new analyses shared in the (peer-reviewed) journal PLOS Biology, preprints as a whole contain much of the same information and interpretations as peer-reviewed research.

In one paper, a computational biologist developed a tool to analyze thousands of pre-pandemic preprints and peer-reviewed work for linguistic differences. In the second, a group of scientists manually examined all 184 papers published both as preprints and with peer-review from December 2019 through April 2020.

Both analyses found that changes between preprints and reviewed publications rarely involved wholesale revisions of a paper’s conclusions. Most of the time, the researchers found only small grammatical edits. “I think what our findings do prompt is a reevaluation of the role of peer-review,” says Jonathon Coates, a postdoctoral researcher in immunology at the William Harvey Research Institute and an author on the second review. “Is the amount of time and money [both scientists and taxpayers] put into peer-review worth it?”

For the large-scale computational analysis, David Nicholson, a PhD candidate at the University of Pennsylvania’s School of Medicine, started with a broad question: “How do people use preprints?” But Nicholson’s team soon realized that the tool he’d developed could also be used to measure how peer-review affects scientific writing. “Peer-review is time consuming and long, but does it also equate to changes that we might see in papers?” he says.

Nicholson’s team compared 3 million articles in the National Institutes of Health’s open-access research library to the 98,000 articles published as of February 2020 on the preprint server BioRxiv. (BioRxiv and its medicine-specific spinoff MedRxiv host the vast majority of COVID preprints.) They also matched preprints with their published versions to analyze the changes.

“What David sees is that the things that changed are typesetting marks, like the plus or minus symbol, the em-dash, as well as words like ‘additional,’ ‘supplementary,’ and ‘file,’” says Casey Greene, a computational biologist at the University of Colorado School of Medicine and an author on the paper. “That suggests people probably aren’t dramatically changing the text as they’re publishing it, they’re adding additional support to key claims, and their stuff is getting typeset.”

Meanwhile, the second, more-granular review showed what those small changes looked like in practice. Coates’ team found only one instance where authors had reversed a conclusion during peer-review, although 17 percent of COVID papers and 7 percent of non-COVID papers had “major changes” in their conclusions. “One of the things we did notice between preprint and paper is not a difference in the conclusion, but how the conclusion was worded,” Coates says. For example, a noun would be swapped out, or the certainty would be dialed back. “The peer-review process is saying, yeah you’re right, but tone your language down a little bit,” Coates adds.

The takeaway, Coates and his co-authors argue, is not to suddenly trust the reliability of preprints. It’s to be equally willing to question peer-reviewed work. “You should trust the peer-reviewed literature as much as you trust a preprint,” he says. “That, I think, just comes down to common sense. Whenever you read a paper you should be sort of doing your own peer-review; you should be asking other people what they think.”


MedRxiv and BioRxiv do screen for plagiarized, non-scientific, and obviously false work. And projects like the Preprint Club, a group of early-career immunologists across universities, have sprung up to provide standardized reviews for work that hasn’t been formally published.

“The end goal is to make [publishing] a more collaborative project that will benefit us all as a scientific community,” says Ester Gea-Mallorquí, an immunologist at the University of Oxford, and a coordinator for the Preprint Club. “We aim to contribute to this independent and more transparent process of review, providing feedback, and constructive criticism.”

Gea-Mallorquí says that based on her experience, “most preprints are quite finalized when they are uploaded.” But, she says, to see such small changes was a surprise—and one that should encourage more openness in the publishing process.

Peer-review hasn’t always been a perfect shield, either. A now-debunked but widely cited paper on hydroxychloroquine working as a treatment for COVID was published in the International Journal of Antimicrobial Agents. The point of the process is to filter out plagiarism or faulty logic, not necessarily to settle on correct answers.

During the pandemic, scientific journals have adapted in some ways. Publication times for COVID-related research were shortened, and Coates says that reviewers were less likely to ask for additional experiments and results, in recognition that it was physically hard to get into a lab during lockdowns. The prominent journal publisher Taylor & Francis also began selling “accelerated publication” options to potential authors—charging $7,000 to publish articles in three to five weeks after submission.

Ultimately, preprint servers take decision-making power out of the hands of scientific journals and give it to researchers. “It just switches when research is shared,” says Coates. “Instead of waiting for an editor and some random peer reviewers to say this is acceptable to share, the scientist makes that decision.”

That’s particularly valuable to early-career researchers like himself. “[Preprint servers] cook down that year between having finished research and having it published, so we can apply for grants,” Coates says. What’s more, they make studies accessible to a wider audience. “It’s really good for the general public as well because if you’re interested in a bit of research, you can actually read it instead of paying up to $10,000 just to get access in a journal,” he explains.

Still, Gea-Mallorquí says the immediacy of preprints can have drawbacks: “It is also true that as experts on the field, when looking at preprints we do already filter the ones that [are] more significant.” Even though preprint servers have been quick to remove misleading studies (she raises the recent “deltacron” paper as an example), those studies can still make a splash in the media.

This pair of PLOS Biology papers is itself a demonstration of how preprints can change the nature of scientific dialogue. The separate teams posted their work as preprints on BioRxiv last spring. Rather than repeat each other’s work, they then decided to publish the bigger picture at the same time.

Without that collaboration, Coates says, it’s also likely that his group’s work would not have been formally peer-reviewed: They didn’t have the money for publication fees, which Greene’s lab ended up being willing to cover.

Update (February 3, 2022): This story has been updated with comments from a researcher who wasn’t an author on either of the new preprint analyses.