bioRxiv – Close Read

From an article entitled ‘The risks of swiftly spreading coronavirus research’ published by Reuters:

A Reuters analysis found that at least 153 studies – including epidemiological papers, genetic analyses and clinical reports – examining every aspect of the disease, now called COVID-19 – have been posted or published since the start of the outbreak. These involved 675 researchers from around the globe. …
Richard Horton, editor-in-chief of The Lancet group of science and medical journals, says he’s instituted “surge capacity” staffing to sift through a flood of 30 to 40 submissions of scientific research a day to his group alone.
… much of [this work] is raw. With most fresh science being posted online without being peer-reviewed, some of the material lacks scientific rigour, experts say, and some has already been exposed as flawed, or plain wrong, and has been withdrawn.
“The public will not benefit from early findings if they are flawed or hyped,” said Tom Sheldon, a science communications specialist at Britain’s non-profit Science Media Centre. …
Preprints allow their authors to contribute to the scientific debate and can foster collaboration, but they can also bring researchers almost instant, international media and public attention.
“Some of the material that’s been put out – on pre-print servers for example – clearly has been… unhelpful,” said The Lancet’s Horton.
“Whether it’s fake news or misinformation or rumour-mongering, it’s certainly contributed to fear and panic.” …
Magdalena Skipper, editor-in-chief of Nature, said her group of journals, like The Lancet’s, was working hard to “select and filter” submitted manuscripts. “We will never compromise the rigour of our peer review, and papers will only be accepted once … they have been thoroughly assessed,” she said.

When Horton or Sheldon say some of the preprints have been “unhelpful” and that they cause panic among the people – which people do they mean? No non-expert person is hitting up bioRxiv looking for COVID-19 papers. They mean some lazy journalists and some irresponsible scientists are spreading misinformation, and frankly it would be more responsible to fix their habits than to point fingers at preprints.

The Reuters analysis also says nothing about how well preprint repositories as well as scientists on social media platforms are conducting open peer-review, instead cherry-picking reasons to compose a lopsided argument against greater transparency in the knowledge economy. Indeed, crisis situations like the COVID-19 outbreak often seem to become ground zero for contemplating the need for preprints but really, no one seems to want to discuss “peer-reviewed” disasters like the one recently publicised by Elisabeth Bik. To quote from The Wire (emphasis added),

[Elisabeth] Bik, @SmutClyde, @mortenoxe and @TigerBB8 (all Twitter handles of unidentified persons), report – as written by Bik in a blog post – that “the Western blot bands in all 400+ papers are all very regularly spaced and have a smooth appearance in the shape of a dumbbell or tadpole, without any of the usual smudges or stains. All bands are placed on similar looking backgrounds, suggesting they were copy-pasted from other sources or computer generated.”
Bik also notes that most of the papers, though not all, were published in only six journals: Artificial Cells Nanomedicine and Biotechnology, Journal of Cellular Biochemistry, Biomedicine & Pharmacotherapy, Experimental and Molecular Pathology, Journal of Cellular Physiology, and Cellular Physiology and Biochemistry, all maintained by reputed publishers and – importantly – all of them peer-reviewed.

On February 1, Anand Ranganathan, the molecular biologist more popular as a columnist for Swarajya, amplified a new preprint paper from scientists at IIT Delhi that (purportedly) claims the Wuhan coronavirus’s (2019 nCoV’s) genome appears to contain some genes also found in the human immunodeficiency virus but not in any other coronaviruses. Ranganathan also chose to magnify the preprint paper’s claim that the sequences’ presence was “non-fortuitous”.

To be fair, the IIT Delhi group did not properly qualify what they meant by this term, but that doesn’t exculpate Ranganathan and the others who followed him. He first amplified, with alarmist language, a claim that did not deserve such treatment and then, once he discovered his mistake, wondered out loud whether such “non-peer reviewed studies” about “fast-moving, in-public-eye domains” should be published before scientific journals have subjected them to peer-review.

https://twitter.com/ARanganathan72/status/1223444298034630656

https://twitter.com/ARanganathan72/status/1223446546328326144

https://twitter.com/ARanganathan72/status/1223463647143505920

The more conservative scientist is likely to find ample room here to revive the claim that preprint papers only promote shoddy journalism, and that preprint papers that are part of the biomedical literature should be abolished entirely. This is bullshit.

The ‘print’ in ‘preprint’ refers to the act of a traditional journal printing a paper for publication after peer-review. A paper is designated ‘preprint’ if it hasn’t undergone peer-review yet, even though it may or may not have been submitted to a scientific journal for consideration. To quote from an article championing the use of preprints during a medical emergency, by three of the six cofounders of medRxiv, the preprints repository for the biomedical literature:

The advantages of preprints are that scientists can post them rapidly and receive feedback from their peers quickly, sometimes almost instantaneously. They also keep other scientists informed about what their colleagues are doing and build on that work. Preprints are archived in a way that they can be referenced and will always be available online. As the science evolves, newer versions of the paper can be posted, with older historical versions remaining available, including any associated comments made on them.

In this regard, first, Ranganathan’s ringing the alarm bells (with language like “oh my god”) when he first tweeted the link to the preprint paper, without sufficiently evaluating the attendant science, was his decision, and not prompted by the paper’s status as a preprint. Second, the bioRxiv preprint repository where the IIT Delhi document showed up has a comments section, and it was brimming with discussion within minutes of the paper being uploaded. More broadly, preprint repositories are equipped to accommodate peer-review. So if anyone had looked in the comments section before tweeting, they wouldn’t have had reason to jump the gun.

Third, and most important: peer-review is not fool-proof. Instead, it is a legacy method employed by scientific journals to filter legitimate from illegitimate research and, more recently, higher quality from lower quality research (using ‘quality’ from the journals’ oft-twisted points of view, not as an objective standard of any kind).

This framing supports three important takeaways from this little scandal.

A. Much like preprint repositories, peer-reviewed journals also regularly publish rubbish. (Axiomatically, just as conventional journals also regularly publish the outcomes of good science, so do preprint repositories; in the case of 2019 nCoV alone, bioRxiv, medRxiv and SSRN together published at least 30 legitimate and noteworthy research articles.) It is just that conventional scientific journals conduct the peer-review before publication and preprint repositories (and research-discussion platforms like PubPeer), after. And, in fact, conducting the review after publication allows it to be a continuous process that can respond to new information, and not a one-time event that culminates with the act of printing the paper.

But notably, preprint repositories can recreate journals’ ability to closely control the review process and ensure only experts’ comments are in the fray by enrolling a team of voluntary curators. The arXiv preprint server has been successfully using a similar team to carefully eliminate manuscripts advancing pseudoscientific claims. As such, it is easier to make sure people are familiar with the preprint and post-publication review paradigm than to take advantage of their confusion and call for preprint papers to be eliminated altogether.

B. Those who support the idea that preprint papers are dangerous, and argue that peer-review is a better way to protect against unsupported claims, are by proxy advocating for the persistence of a knowledge hegemony. Peer-review is opaque, sustained by unpaid and overworked labour, and performs the same function that an open discussion often does at larger scale and with greater transparency. Indeed, the transparency represents the most important difference: since peer-review has traditionally been the demesne of journals, supporting peer-review is tantamount to designating journals as the sole and unquestionable arbiters of what knowledge enters the public domain and what doesn’t.

(Here’s one example of how such gatekeeping can have tragic consequences for society.)

C. Given these safeguards and perspectives, and as I have written before, bad journalists and bad comments will be bad irrespective of the window through which an idea has presented itself in the public domain. There is a way to cover different types of stories, and the decision to abdicate one’s responsibility to think carefully about the implications of what one is writing can never have a causal relationship with the subject matter. The Times of India and the Daily Mail will continue to publicise every new paper discussing whatever coffee, chocolate and/or wine does to the heart, and The Hindu and The Wire Science will publicise research published in preprint papers because we know how to be careful and which risks to protect ourselves against.

By extension, ‘reputable’ scientific journals that use pre-publication peer-review will continue to publish many papers that will someday be retracted.

An ongoing scandal concerning spider biologist Jonathan Pruitt offers a useful parable – that journals don’t always publish bad science due to wilful negligence or poor peer-review alone but that such failures still do well to highlight the shortcomings of the latter. A string of papers for which Pruitt led the work was found to contain implausible data in support of some significant conclusions. Dan Bolnick, the editor of The American Naturalist, which became the first journal to retract Pruitt’s papers that it had published, wrote on his blog on January 30:

I want to emphasise that regardless of the root cause of the data problems (error or intent), these people are victims who have been harmed by trusting data that they themselves did not generate. Having spent days sifting through these data files I can also attest to the fact that the suspect patterns are often non-obvious, so we should not be blaming these victims for failing to see something that requires significant effort to uncover by examining the data in ways that are not standard for any of this. … The associate editor [who Bolnick tasked with checking more of Pruitt’s papers] went as far back as digging into some of Pruitt’s PhD work, when he was a student with Susan Riechert at the University of Tennessee Knoxville. Similar problems were identified in those data… Seeking an explanation, I [emailed and then called] his PhD mentor, Susan Riechert, to discuss the biology of the spiders, his data collection habits, and his integrity. She was shocked, and disturbed, and surprised. That someone who knew him so well for many years could be unaware of this problem (and its extent), highlights for me how reasonable it is that the rest of us could be caught unaware.

Why should we expect peer-review – or any kind of review, for that matter – to be better? The only thing we can do is be honest, transparent and reflexive.


Distracting from the peer-review problem

Another controversy, another round of blaming preprints