A gentle push over the cliff

From ‘Rotavirus vaccine: tortured data analyses raise false safety alarm’, The Hindu, June 22, 2024:

Slamming the recently published paper by Dr. Jacob Puliyel from the International Institute of Health Management Research, New Delhi, on rotavirus vaccine safety, microbiologist Dr. Gagandeep Kang says: “If you do 20 different analyses, one of them will appear significant. This is truly cherry picking data, cherry picking analysis, changing the data around, adjusting the data, not using the whole data in order to find something [that shows the vaccine is not safe].” Dr. Kang was the principal investigator of the rotavirus vaccine trials and the corresponding author of the 2020 paper in The New England Journal of Medicine, the data of which was used by Dr. Puliyel for his reanalysis.
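A quick aside, because Kang’s point about 20 analyses is worth making concrete: it’s the familiar multiple-comparisons problem. At the conventional significance threshold of 0.05, the chance that at least one of 20 independent tests of a true null hypothesis comes out ‘significant’ by luck alone is about 64%. Here’s a minimal, purely illustrative sketch in Python (invented numbers, nothing to do with either paper’s actual data):

```python
# Chance of at least one spurious "significant" result among 20 independent
# tests of a true null hypothesis, at the conventional alpha = 0.05.
import numpy as np

alpha, n_tests = 0.05, 20
print(1 - (1 - alpha) ** n_tests)  # ~0.64, analytically

# The same thing by simulation: under the null, p-values are uniform on [0, 1].
rng = np.random.default_rng(0)
p_values = rng.uniform(size=(100_000, n_tests))
print((p_values < alpha).any(axis=1).mean())  # also ~0.64
```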

This is an important rebuttal. I haven’t seen Puliyel’s study, but Bharat Biotech’s conduct during and since the COVID-19 pandemic, especially that of its executive chairman Krishna Ella, and its attitude towards public scrutiny of its Covaxin vaccine have rendered any criticism of the company or its products very believable, even when such criticism is unwarranted, misguided, or just nonsense.

Puliyel’s study itself is a case in point: a quick search on Twitter reveals many strongly worded tweets, which speak to the existence of a mass of people that wants something to be true and will seize on even feeble evidence at its first appearance. Of course, The Hindu article found the evidence to be not so much feeble as contrived. Bharat Biotech isn’t “hiding” anything; Puliyel et al. aren’t “whistleblowers”.

The article doesn’t mention the name of the journal that published Puliyel’s paper: International Journal of Risk and Safety in Medicine. It could have, because journals that don’t keep bad science out of the medical literature don’t just pollute the literature. By virtue of being journals, and in this case claiming to be peer-reviewed as well, they allow the claims they publish to be amplified by unsuspecting users on social media platforms.

We saw something similar earlier this year in the political sphere when members of the Indian National Congress party and its allies as well as members of civil society cast doubt on electronic voting machines with little evidence, thus only undermining trust in the electoral process.

To be sure, we’ve cried ourselves hoarse about the importance of every reader being sceptical of what appears in scientific journals (even peer-reviewed ones) as much as of news articles, but because this is a behavioural and cultural change it’s going to take time. Journals need to do their bit, too, yet they won’t, because who needs scruples when you can have profits?

The analytical methods Puliyel and his coauthor Brian Hooker reportedly employed in their new study are reminiscent of the work of Brian Wansink, who resigned from Cornell University five years ago this month after it concluded he’d committed scientific misconduct. In 2018, BuzzFeed published a deep-dive by Stephanie M. Lee on how the Wansink scandal was born. It gave the (well-referenced) impression that the scandal was a combination of a student’s relationship with a mentor renowned in her field of work and the mentor’s pursuit of headlines over science done properly. It’s hard to imagine Puliyel and Hooker were facing any kind of coercion, which leaves the headlines.

This isn’t hard to believe considering it’s the second study to have been published recently that took a shot at Bharat Biotech based on shoddy research. It sucks that it’s become so easy to push people over the cliff, and into the ravenous maw of a conspiracy theory, but it sucks more that some people will push others even when they know better.

Poonam Pandey and peer-review

One dubious but vigorous narrative that has emerged around Poonam Pandey’s “death” and subsequent return to life is that the mainstream media will publish “anything”.

To be sure, there were broadly two kinds of news reports after the post appeared on her Instagram handle claiming Pandey had died of cervical cancer: one said she’d died and quoted the Instagram post; the other said her management team had said she’d died. That is, the first kind stated her death as a truth and the other stated her team’s statement as a truth. News reports of the latter variety obviously ‘look’ better now that Pandey and her team said she lied (to raise awareness of cervical cancer). But judging the former news reports harshly isn’t fair.

This incident is evocative of the role of peer-review in scientific publishing. After scientists write up a manuscript describing an experiment and submit it to a journal to consider for publishing, the journal’s editors farm it out to a group of independent experts on the same topic and ask them if they think the paper is worth publishing. (Pre-publication) peer-review has many flaws, including the fact that peer-reviewers are expected to volunteer their time and expertise and that the process is often slow, inconsistent, biased, and opaque.

But for all these concerns, peer-review isn’t designed to reveal deliberately – and increasingly cleverly – concealed fraud. Granted, the journal could be held responsible for missing plagiarism, and the journal and peer-reviewers both for clearly duplicated images and entirely bullshit papers. However, pinning the blame on peer-review for, say, failing to double-check findings when the infrastructure to do so is hard to come by would be ridiculous.

Peer-review’s primary function, as far as I understand it, is to check whether the data presented in the study support the conclusions drawn from the study. It works best with some level of trust. Expecting it to respond perfectly to an activity that deliberately and precisely undermines that trust is ridiculous. A better response (to more advanced tools with which to attempt fraud but also to democratise access to scientific knowledge) would be to overhaul the ‘conventional’ publishing process, such as with transparent peer-review and/or paying for the requisite expertise and labour.

(I’m an admirer of the radical strategy eLife adopted in October 2022: to review preprint papers and publicise its reviewers’ findings along with the reviewers’ identities and the paper, share recommendations with the authors to improve it, but not accept or reject the paper per se.)

Equally importantly, we shouldn’t consider a published research paper to be the last word but a work in progress, with room for revision, correction or even retraction. Doing otherwise – as much as stigmatising retractions for reasons unrelated to misconduct or fraud, for that matter – may render peer-review suspect when people find mistakes in a published paper, even when the fault lies elsewhere.

Analogously, journalism is required to be sceptical, adversarial even – but of what? Not every claim is worthy of investigative and/or adversarial journalism. In particular, when a claim is publicised that someone has died and a group of people that manages that individual’s public profile “confirms” the claim is true, that’s the end of that. This is an important reason why these groups exist, so when they compromise that purpose, blaming journalists is misguided.

And unlike peer-review, the journalistic processes in place (in many but not all newsrooms) to check potentially problematic claims – for example, that “a high-powered committee” is required “for an extensive consideration of the challenges arising from fast population growth” – are perfectly functional, in part because their false-positive rate is lower when they don’t also have to investigate “confirmed” claims of a person’s death than when they do.

The journal’s part in a retraction

This is another Ranga Dias and superconductivity post, so please avert your gaze if you’re tired of it already.

According to a September 27 report in Science, the journal Nature plans to retract the latest Dias et al. paper, published in March 2023, claiming to have found evidence of near-room-temperature superconductivity in an unusual material, nitrogen-doped lutetium hydride (N-LuH). The heart of the matter seems to be, per Science, a plot showing a drop in N-LuH’s electric resistance below a particular temperature – a famous sign of superconductivity.

Dias (University of Rochester) and Ashkan Salamat (University of Nevada, Las Vegas), the other lead investigator in the study, measured the resistance in a noisy setting and then subtracted the noise – or what they claimed to be the noise. The problem, apparently, is that the subtracted plot in the published paper and the plot put together using the raw data Dias and Salamat submitted to Nature are different; the latter doesn’t show the resistance dropping to zero. That is, together with the noise, the paper’s authors subtracted some other information as well, and whatever was left behind suggested N-LuH had become superconducting.
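To see why such a subtraction is so contentious, here’s a purely illustrative sketch (invented numbers, not the published N-LuH measurements, and not a claim about what the authors actually did): if the curve you subtract happens to carry the low-temperature signal, raw data that never drop to zero can be made to look as though they do.

```python
# Illustrative only: how subtracting a conveniently chosen "background" can
# manufacture an apparent zero-resistance transition. Invented numbers, not
# the published N-LuH data or the authors' actual procedure.
import numpy as np

temperature = np.linspace(5, 300, 60)        # kelvin
raw = 0.02 + 1e-4 * temperature              # ordinary metallic resistance: no drop
noise = 0.002 * np.sin(temperature / 7)      # stand-in for measurement noise
measured = raw + noise

# A "background" that, below 250 K, quietly contains the entire signal.
background = np.where(temperature < 250, raw + noise, noise)

subtracted = measured - background
print(float(subtracted[temperature < 250].max()))  # ~0: looks superconducting
print(float(raw[temperature < 250].min()))         # but the raw data never drop to zero
```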

A little more than a month ago, Physical Review Letters officially retracted another paper of a study led by Dias and Salamat after publishing it last year – and notably after a similar dispute (and on both occasions Dias was opposed to having the papers retracted). But the narrative was more dramatic then, with Physical Review Letters accusing Salamat of obstructing its investigation by supplying some other data as the raw data for its independent probe.

Then again, even before Science‘s report, other scientists in the same field had said that they weren’t bothering with replicating the data in the N-LuH paper because they had already wasted time trying to replicate Dias’s previous work, in vain.

Now, in the last year alone, three of Dias’s superconductivity-related papers have been retracted. But as on previous occasions, the new report also raises questions about Nature‘s pre-publication peer-review process. To quote Science:

In response to [James Hamlin and Brad Ramshaw’s critique of the subtracted plot], Nature initiated a post-publication review process, soliciting feedback from four independent experts. In documents obtained by Science, all four referees expressed strong concerns about the credibility of the data. ‘I fail to understand why the authors … are not willing or able to provide clear and timely responses,’ wrote one of the anonymous referees. ‘Without such responses the credibility of the published results are in question.’ A second referee went further, writing: ‘I strongly recommend that the article by R. Dias and A. Salamat be retracted.’

What was the difference between this review process and the one that happened before the paper was published, in which Nature‘s editors would have written to independent experts asking them for their opinions on the submitted manuscript? Why didn’t they catch the problem with the electrical resistance plot?

One possible explanation is the sampling problem: when I write an article as a science journalist, the views expressed in it will be a function of the scientists I have sampled from within the scientific community. In order to obtain the consensus view, I need to sample a sufficiently large number of scientists (or a small number of representative scientists, such as those who I know are in touch with the pulse of the community). Otherwise, there’s a nontrivial risk of some view in my article being over- or under-represented.
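A toy calculation makes the point; the 70:30 split of views and the sample sizes below are assumptions for illustration, not real survey numbers. The smaller the sample of scientists I quote, the likelier my article misrepresents the majority view:

```python
# A toy version of the sampling problem: 70% of a (simulated) community holds
# view A; how often does a small sample of quoted scientists fail to reflect
# that majority? The 70:30 split and sample sizes are invented for illustration.
import numpy as np

rng = np.random.default_rng(42)
community = rng.random(10_000) < 0.7          # True = holds the majority view A

for sample_size in (3, 5, 20, 100):
    samples = rng.choice(community, size=(50_000, sample_size))
    # Fraction of hypothetical articles in which view A is not the majority view
    misrepresented = (samples.mean(axis=1) <= 0.5).mean()
    print(sample_size, round(float(misrepresented), 3))
```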

Similarly, during its pre-publication peer-review process, did Nature not sample the right set of reviewers? I’m unable to think of other explanations because the sampling problem accounts for many alternatives. Hamlin and Ramshaw also didn’t necessarily have access to more data than Dias et al. submitted to Nature because their criticism emerged in May 2023 itself, and was based on the published paper. Nature also hasn’t disclosed the pre-publication reviewers’ reports nor explained if there were any differences between its sampling process in the pre- and post-publication phases.

So short of there being a good explanation, as much as we have a scientist who’s seemingly been crying wolf about room-temperature superconductivity, we also have a journal whose peer-review process produced, on two separate occasions, two different results. Unless it can clarify why this isn’t so, Nature is also to blame for the paper’s fate.

What’s with superconductors and peer-review?

Throughout the time I’ve been a commissioning editor for science-related articles for news outlets, I’ve always sought and published articles about academic publishing. It’s the part of the scientific enterprise that seems to have been shaped the least by science’s democratic and introspective impulses. It’s also this long and tall wall erected around the field where scientists are labouring, offering ‘visitors’ guided tours for a hefty fee – or, in many cases, for ‘free’ if the scientists are willing to pay the hefty fees instead. Of late, I’ve spent more time thinking about peer-review, the practice of a journal distributing copies of a manuscript it’s considering for publication to independent experts on the same topic, for their technical inputs.

Most of the peer-review that happens today is voluntary: the scientists who do it aren’t paid. You must’ve come across several articles of late about whether peer-review works. It seems to me that it’s far from perfect. Studies (in July 1998, September 1998, and October 2008, e.g.) have shown that peer-reviewers often don’t catch critical problems in papers. In February 2023, a noted scientist said in a conversation that peer-reviewers go into a paper assuming that the data presented therein hasn’t been tampered with. This statement was eye-opening for me because I can’t think of a more important reason to include technical experts in the publishing process than to weed out problems that only technical experts can catch. Anyway, these flaws with the peer-review system aren’t generalisable, per se: many scientists have also told me that their papers benefited from peer-review, especially review that helped them improve their work.

I personally don’t know how ‘much’ peer-review is of the former variety and how much the latter, but it seems safe to state that when manuscripts are written in good faith by competent scientists and sent to the right journal, and the journal treats its peer-reviewers as well as its mandate well, peer-review works. Otherwise, it tends to not work. This heuristic, so to speak, allows for the fact that ‘prestige’ journals like Nature, Science, NEJM, and Cell – which have made a name for themselves by publishing papers that were milestones in their respective fields – have also published and then had to retract many papers that made exciting claims that were subsequently found to be untenable. These journals’ ‘prestige’ is closely related to their taste for sensational results.

All these thoughts were recently brought into focus by the ongoing hoopla, especially on Twitter, about the preprint papers from a South Korean research group claiming the discovery of a room-temperature superconductor in a material called LK-99 (this is the main paper). This work has caught the imagination of users on the platform unlike any other paper about room-temperature superconductivity in recent times. I believe this is because the preprints contain some charts and data that were absent in similar work in the past, and which strongly indicate the presence of a superconducting state at ambient temperature and pressure, and because the preprints include instructions on the material’s synthesis and composition, which means other scientists can produce the material and check for themselves. Personally, I’m holding the stance advised by Prof. Vijay B. Shenoy of IISc:

Many research groups around the world will attempt to reproduce these results; there are already some rumours that independent scientists have done so. We will have to wait for the results of their studies.

Curiously, the preprints have caught the attention of a not insignificant number of techbros, who, alongside the typically naïve displays of their newfound expertise, have also called for the peer-review system to be abolished because it’s too slow and opaque.

Peer-review has a storied relationship with superconductivity. In the early 2000s, a slew of papers coauthored by the German physicist Jan Hendrik Schön, working at a Bell Labs facility in the US, were retracted after independent investigations found that he had fabricated data to support claims that certain organic molecules, called fullerenes, were superconducting. The Guardian wrote in September 2002:

The Schön affair has besmirched the peer review process in physics as never before. Why didn’t the peer review system catch the discrepancies in his work? A referee in a new field doesn’t want to “be the bad guy on the block,” says Dutch physicist Teun Klapwijk, so he generally gives the author the benefit of the doubt. But physicists did become irritated after a while, says Klapwijk, “that Schön’s flurry of papers continued without increased detail, and with the same sloppiness and inconsistencies.”

Some critics hold the journals responsible. The editors of Science and Nature have stoutly defended their review process in interviews with the London Times Higher Education Supplement. Karl Ziemelis, one of Nature’s physical science editors, complained of scapegoating, while Donald Kennedy, who edits Science, asserted that “There is little journals can do about detecting scientific misconduct.”

Maybe not, responds Nobel prize-winning physicist Philip Anderson of Princeton, but the way that Science and Nature compete for cutting-edge work “compromised the review process in this instance.” These two industry-leading publications “decide for themselves what is good science – or good-selling science,” says Anderson (who is also a former Bell Labs director), and their market consciousness “encourages people to push into print with shoddy results.” Such urgency would presumably lead to hasty review practices. Klapwijk, a superconductivity specialist, said that he had raised objections to a Schön paper sent to him for review, but that it was published anyway.

A similar claim by a group at IISc in 2019 generated a lot of excitement then, but today almost no one has any idea what happened to it. It seems reasonable to assume that the findings didn’t pan out in further testing and/or that the peer-review, following the manuscript being submitted to Nature, found problems in the group’s data. Last month, the South Korean group uploaded its papers to the arXiv preprint repository and has presumably submitted them to a journal: for a finding this momentous, that seems like the obvious next step. And the journal is presumably conducting peer-review at this point.

But in both instances (IISc 2019 and today), the claims were also accompanied by independent attempts to replicate the data as well as journalistic articles that assimilated the various public narratives and their social relevance into a cogent whole. One of the first signs that there was a problem with the IISc preprint was another preprint by Brian Skinner, a physicist then with the Massachusetts Institute of Technology, who found the noise in two graphs plotting the results of two distinct tests to be the same – which is impossible. Independent scientists also told The Wire (where I worked then) that they lacked some information required to make sense of the results, and expressed concerns about the magnetic susceptibility data.
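Skinner’s check is easy to appreciate with a toy example (invented data, not the IISc group’s measurements): the residual noise left after subtracting a smooth trend from two genuinely independent measurements should be essentially uncorrelated, so residuals that match almost perfectly across two different tests are a red flag.

```python
# Toy version of the 'identical noise' check: compare residuals (measurement
# minus a smooth trend) across two tests. Independent measurements should give
# residuals with near-zero correlation; copied noise gives a correlation near 1.
# Invented data, not the IISc group's measurements.
import numpy as np

rng = np.random.default_rng(7)
x = np.linspace(0, 10, 200)
trend_a, trend_b = np.exp(-x / 3), 1 / (1 + x)     # two different underlying signals

shared_noise = rng.normal(scale=0.01, size=x.size)
measurement_a = trend_a + shared_noise
independent_b = trend_b + rng.normal(scale=0.01, size=x.size)
copied_b = trend_b + shared_noise                  # same noise pasted onto a second plot

def residual_correlation(a, trend_of_a, b, trend_of_b):
    return np.corrcoef(a - trend_of_a, b - trend_of_b)[0, 1]

print(residual_correlation(measurement_a, trend_a, independent_b, trend_b))  # ~0
print(residual_correlation(measurement_a, trend_a, copied_b, trend_b))       # ~1
```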

Peer-review may not be designed to check whether the experiments in question produced the data in question but whether the data in question supports the conclusions. For example, in March this year, Nature published a study led by Ranga P. Dias in which he and his team claimed that nitrogen-doped lutetium hydride becomes a room-temperature superconductor under a pressure of around 10,000 atm, considerably lower than the pressure required to produce a superconducting state in other similar materials. After it was published, many independent scientists raised concerns about some data and analytical methods presented in the paper – as well as its failure to specify how the material could be synthesised. These problems, it seems, didn’t prevent the paper from clearing peer-review. Yet on August 3, Martin M. Bauer, a particle physicist at Durham University, published a tweet defending peer-review in the context of the South Korean work.

The problem seems to me to be the belief – held by many pro- as well as anti-peer-review actors – that peer-review is the ultimate check capable of filtering out all forms of bad science. It just can’t, and maybe that’s okay. Contrary to what Dr. Bauer has said, and as the example of Dr. Dias’s paper suggests, peer-reviewers won’t attempt to replicate the South Korean study. That task, thanks to the level of detail in the South Korean preprint and the fact that preprints are freely accessible, is already being undertaken by a panoply of labs around the world, both inside and outside universities. So abolishing peer-review won’t be as bad as Dr. Bauer makes it sound. As I said, peer-review is, or ought to be, one of many checks.

It’s also the sole check that a journal undertakes, and maybe that’s the bigger problem. That is, scientific journals may well be a pit of papers of unpredictable quality without peer-review in the picture – but that would only be because journal editors and scientists are separate functional groups, rather than having a group of scientists take direct charge of the publishing process (akin to how arXiv currently operates). In the existing publishing model, peer-review is as important as it is because scientists aren’t involved in any other part of the publishing pipeline.

An alternative model comes to mind, one that closes the gaps of “isn’t designed to check whether the experiments in question produced the data in question” and “the sole check that a journal undertakes”: scientists conduct their experiments, write them up in a manuscript and upload them to a preprint repository; other scientists attempt to replicate the results; if the latter are successful, both groups update the preprint paper and submit that to a journal (with the lion’s share of the credit going to the former group); journal editors have this document peer-reviewed (to check whether the data presented supports the conclusions), edited, and polished[1]; and finally publish it.

Obviously this would require a significant reorganisation of incentives: for one, researchers will need to be able to apportion time and resources to replicate others’ experiments for less than half of the credit. A second problem is that this is a (probably non-novel) reimagination of the publishing workflow that doesn’t consider the business model – the other major problem in academic publishing. Third: I have in my mind only condensed-matter physics; I don’t know much about the challenges to replicating results in, say, genomics, computer science or astrophysics. My point overall is that if journals look like a car crash without peer-review, it’s only because the crashes were a matter of time and that peer-review was doing the bare minimum to keep them from happening. (And Twitter was always a car crash anyway.)


[1] I hope readers won’t underestimate the importance of the editorial and language assistance that a journal can provide. Last month, researchers in Australia, Germany, Nepal, Spain, the UK, and the US had a paper published in which they reported, based on surveys, that “non-native English speakers, especially early in their careers, spend more effort than native English speakers in conducting scientific activities, from reading and writing papers and preparing presentations in English, to disseminating research in multiple languages. Language barriers can also cause them not to attend, or give oral presentations at, international conferences conducted in English.”

The language in the South Korean group’s preprints indicates that its authors’ first language isn’t English. According to Springer, which later became Springer Nature, the publisher of the Nature journals, “Editorial reasons for rejection include … poor language quality such that it cannot be understood by readers”. An undated article on Elsevier’s ‘Author Services’ page has this line: “For Marco [Casola, managing editor of Water Research], poor language can indicate further issues with a paper. ‘Language errors can sometimes ring a bell as a link to quality. If a manuscript is written in poor English the science behind it may not be amazing. This isn’t always the case, but it can be an indication.'”

But instead of palming the responsibility off to scientists, journals have an opportunity to distinguish themselves by helping researchers write better papers.

A Kuhnian gap between research publishing and academic success

There is a gap in how research publishing relates to academic success. On the one hand, there are scientists complaining of low funds, staff shortages, low-quality or absent equipment, suboptimal employment/tenure terms, bureaucratic incompetence and political interference. On the other, there are scientists who describe their success within academia in terms of being published in XYZ journals (with impact factors of PQR), having high h-indices, having so many papers to their names, etc.

These two scenarios – both very real in India and, I imagine, in most countries – don’t straightforwardly lead from one to the other. They require a bridge, a systemic symptom that makes both of them possible even when they’re incompatible with each other. This bridge is those scientists’ attitudes about what it’s okay to do in order to keep the two façades in harmonious coexistence.

What is it okay to do? For starters, keep the research-publishing machinery running in a way that allows them to evaluate other scientists on matters other than their scientific work. This way, lack of resources for research can be decoupled from scientists’ output in journals. Clever, right?

According to a study published a month ago, manuscripts that include a Nobel laureate’s name among the coauthors are six times more likely to be accepted for publication than those without a laureate’s name. This finding piles onto other known problems with peer-review, including gender-related ones: women’s papers are accepted less often and men dominate the community of peer-reviewers. Nature News reported:

Knowledge of the high status of a paper’s author might justifiably influence a reviewer’s opinion: it could boost their willingness to accept a counterintuitive result, for example, on the basis of the author’s track record of rigour. But Palan’s study found that reviewers’ opinions changed across all six of the measures they were asked about, including the subject’s worthiness, the novelty of the information and whether the conclusions were supported. These things should not all be affected by knowledge of authorship, [Palan, one of the paper’s coauthors, said].

Palan also said the solution to this problem is for journals to adopt double-anonymised peer-review: the authors don’t know who the reviewers are and the reviewers don’t know who the authors are. The most common form of peer-review is the single-blind variety, where the reviewers know who the authors are but the authors don’t know who the reviewers are. FWIW, I prefer double-anonymised peer-review plus the journal publishing the peer-reviewers’ anonymised reports along with the paper.

Then again, modifying peer-review would still be localised to journals that are willing to adopt newer mechanisms, and thus be a stop-gap solution that doesn’t address the use of faulty peer-review mechanisms both inside journals and in academic settings. For example, given the resource-minimal context in which many Indian research institutes and universities function, hiring and promotion committees often decide whom to hire or promote based on which journals their papers have been published in and/or the number of times those papers have been cited.

Instead, what we need is systemic change that responds to all the problems with peer-review, instead of one problem at a time in piecemeal fashion, by improving transparency, resources and incentives. Specifically: a) make peer-review more transparent, b) give scientists the resources – including time and freedom – to evaluate each others’ work on factors localised to the context of their research (including the quality of their work and the challenges in their way), and c) incentivise scientists to do so in order to accelerate change and ensure compliance.

The scientometric numbers, originally invented to facilitate the large-scale computational analysis of the scientific literature, have come to subsume the purpose of the scientific enterprise itself: that is, scientists often want to have good numbers rather than to do good science. As a result, there is often an unusual delay – akin to magnetic hysteresis – between the resources for research being cut back and the resulting drop in productivity and quality showing up in the researchers’ output. Perhaps more fittingly, it’s a Kuhnian response to paradigm change.

Yes, scientific journals should publish political rebuttals

(The headline is partly click-bait, as I admit below, because some context is required.) From ‘Should scientific journals publish political debunkings?’, Science Fictions by Stuart Ritchie, August 27, 2022:

Earlier this week, the “news and analysis” section of the journal Science … published … a point-by-point rebuttal of a monologue a few days earlier from the Fox News show Tucker Carlson Tonight, where the eponymous host excoriated Dr. Anthony Fauci, of “seen everywhere during the pandemic” fame. … The Science piece noted that “[a]lmost everything Tucker Carlson said… was misleading or false”. That’s completely correct – so why did I have misgivings about the Science piece? It’s the kind of thing you see all the time on dedicated political fact-checking sites – but I’d never before seen it in a scientific journal. … I feel very conflicted on whether this is a sensible idea. And, instead of actually taking some time to think it through and work out a solid position, in true hand-wringing style I’m going to write down both sides of the argument in the form of a dialogue – with myself.

There’s one particular exchange between Ritchie and himself in his piece that threw me off the entire point of the article:

[Ritchie-in-favour-of-Science-doing-this]: Just a second. This wasn’t published in the peer-reviewed section of Science! This isn’t a refereed paper – it’s in the “News and Analysis” section. Wouldn’t you expect an “Analysis” article to, like, analyse things? Including statements made on Fox News?

[Ritchie-opposed-to-Science-doing-this]: To be honest, sometimes I wonder why scientific journals have a “News and Analysis” section at all – or, I wonder if it’s healthy in the long run. In any case, clearly there’s a big “halo” effect from the peer-reviewed part: people take the News and Analysis more seriously because it’s attached to the very esteemed journal. People are sharing it on social media because it’s “the journal Science debunking Tucker Carlson” – way fewer people would care if it was just published on some random news site. I don’t think you can have it both ways by saying it’s actually nothing to do with Science the peer-reviewed journal.

[Ritchie-in-favour]: I was just saying they were separate, rather than entirely unrelated, but fair enough.

Excuse me, but not at all fair enough! The essential problem is the tie-ins between what a journal does, why it does it, and what impressions those activities uphold in society.

First, Science’s ‘news and analysis’ section isn’t distinguished by its association with the peer-reviewed portion of the journal but by its own reportage and analyses, intended for scientists and non-scientists alike. (Mea culpa: the headline of this post answers the question in the headline of Ritchie’s post, while the body of this post makes clear that there’s a distinction between the journal and its ‘news and analysis’ section.) A very recent example was Charles Piller’s investigative report that uncovered evidence of image manipulation in a paper that had an outsized influence on the direction of Alzheimer’s research since it was published in 2006. When Ritchie writes that the peer-reviewed journal and the ‘news and analysis’ section are separate, he’s right – but when he suggests that the former’s prestige is responsible for the latter’s popularity, he couldn’t be more wrong.

Ritchie is a scientist and his position may reflect that of many other scientists. I recommend that he and others who agree with him consider the section from the point of view of a science journalist; they will immediately see, as we do, that it has broken many agenda-setting stories and has published several accomplished journalists and scientists (Derek Lowe’s column being a good example). Another impression that could change with this change of perspective is the relevance of peer-review itself, and the deceptively deleterious nature of an associated concept Ritchie repeatedly invokes, which could well be the pseudo-problem at the heart of his dilemma: prestige. To quote from a blog post, published in February this year, in which University of Regensburg neurogeneticist Björn Brembs analysed the novelty of results published by so-called ‘prestigious’ journals:

Taken together, despite the best efforts of the professional editors and best reviewers the planet has to offer, the input material that prestigious journals have to deal with appears to be the dominant factor for any ‘novelty’ signal in the stream of publications coming from these journals. Looking at all articles, the effect of all this expensive editorial and reviewer work amounts to probably not much more than a slightly biased random selection, dominated largely by the input and to probably only a very small degree by the filter properties. In this perspective, editors and reviewers appear helplessly overtaxed, being tasked with a job that is humanly impossible to perform correctly in the antiquated way it is organized now.

In sum:

Evidence suggests that the prestige signal in our current journals is noisy, expensive and flags unreliable science. There is a lack of evidence that the supposed filter function of prestigious journals is not just a biased random selection of already self-selected input material. As such, massive improvement along several variables can be expected from a more modern implementation of the prestige signal.

Take the ‘prestige’ away and one part of Ritchie’s dilemma – the journal Science‘s claim to being an “impartial authority” that stands at risk of being diluted by its ‘news and analysis’ section’s engagement with “grubby political debates” – evaporates. Journals, especially glamour journals like Science, haven’t historically been authorities on ‘good’ science, such as it is, but have served to obfuscate the fact that only scientists can be. But more broadly, the ‘news and analysis’ business has its own expensive economics, and publishers of scientific journals that can afford to set up such platforms should consider doing so, in my view, with a degree and type of separation between these businesses according to their mileage. The simple reasons are:

1. Reject the false balance: there’s no sensible way publishing a pro-democracy article (calling out cynical and potentially life-threatening untruths) could affect the journal’s ‘prestige’, however it may be defined. But if it does, would the journal be wary of a pro-Republican (and effectively anti-democratic) scientist refusing to publish on its pages? If so, why? The two-part answer is straightforward: because many other scientists as well as journal editors are still concerned with the titles that publish papers instead of the papers themselves, and because of the fundamental incentives of academic publishing – to publish the work of prestigious scientists and sensational work, as opposed to good work per se. In this sense, the knock-back is entirely acceptable in the hopes that it could dismantle the fixation on which journal publishes which paper.

2. Scientific journals already have access to expertise in various fields of study, as well as an incentive to participate in the creation of a sensible culture of science appreciation and criticism.

Featured image: Tucker Carlson at an event in West Palm Beach, Florida, December 19, 2020. Credit: Gage Skidmore/Wikimedia Commons, CC BY-SA 2.0.

Are preprints reliable?

To quote from a paper published yesterday in PLOS Biology:

Does the information shared in preprints typically withstand the scrutiny of peer review, or are conclusions likely to change in the version of record? We assessed preprints from bioRxiv and medRxiv that had been posted and subsequently published in a journal through April 30, 2020, representing the initial phase of the pandemic response. We utilised a combination of automatic and manual annotations to quantify how an article changed between the preprinted and published version. We found that the total number of figure panels and tables changed little between preprint and published articles. Moreover, the conclusions of 7.2% of non-COVID-19-related and 17.2% of COVID-19-related abstracts undergo a discrete change by the time of publication, but the majority of these changes do not qualitatively change the conclusions of the paper.

Later: “A major concern with expedited publishing is that it may impede the rigour of the peer review process.”

So far, according to this and one other paper published by PLOS Biology, it seems reasonable to ask not whether preprints are reliable but what peer-review brings to the table. (By this I mean the conventional/legacy variety of closed pre-publication review).

To the uninitiated: paralleling the growing popularity and usefulness of open-access publishing, particularly in the first year of the COVID-19 pandemic, some “selective” journals – to use wording from the PLOS Biology paper – and their hordes of scientist-supporters have sought to stress the importance of peer-review in language both familiar and based on an increasingly outdated outlook: that peer-review is important to prevent misinformation. I’ve found a subset of this argument, that peer-review is important for papers whose findings could save/end lives, to be more reasonable, and the rest just unreasonable and self-serving.

Funnily enough, two famously “selective” journals, The Lancet and the New England Journal of Medicine, retracted two papers related to COVID-19 care in the thick of the pandemic – invalidating their broader argument in favour of peer-review as well as the efficiency of their own peer-review processes vis-à-vis the subset argument.

Arguments in favour of peer-review are self-serving because it has more efficient, more transparent and more workable alternatives, yet many journals have failed to adopt them, and have instead used this repeatedly invalidated mode of reviewing papers to maintain their opaque style of functioning, which in turn – and together with the purported cost of printing papers on physical paper – they use to justify the exorbitant prices they charge readers (here’s one ludicrous example).

For example, one alternative is open review of preprints, in which scientists upload their paper to a preprint server, like arXiv, bioRxiv or medRxiv, and share the link with their peers and, say, on social media platforms. There, independent experts review the paper’s contents and share their comments. The paper’s authors can incorporate the necessary changes, with credit, as separate versions of the same paper on the server.

Further, unlike ‘conventional’ journals’ laughable expectation that journalists write about the papers they publish without fear of being wrong, journalists subject preprint papers to the same treatment that is due the average peer-reviewed paper: reasonable and courteous scepticism, and qualifying their claims and findings with comments from independent experts – with an added caveat, though I personally think it unnecessary, that the subject is a preprint paper.

(Some of you might remember that in 2018, Tom Sheldon argued in a Nature News & Views article that peer-review facilitates good journalism. I haven’t come across an argument more objectionable in favour of conventional peer-review.)

However, making this mode of reviewing and publishing more acceptable has been very hard, especially because of the need to repeatedly push back against scientists whose academic reputation depends on having published, and being able to publish, in “selective” journals and the scientometric culture they uphold, and against their hollow arguments about the virtues of conventional, opaque peer-review. (Making peer-review transparent could also help deal with reviewers who use the opportunity anonymity affords them to be sexist and racist.)

But with the two new PLOS Biology papers, we have an opportunity to flip these scientists’ and journals’ demand – that preprint papers ‘prove’ or ‘improve’ themselves – around, and ask what the legacy modes bring to the table. From the abstract of the second paper (emphasis added):

We sought to compare and contrast linguistic features within bioRxiv preprints to published biomedical text as a whole as this is an excellent opportunity to examine how peer review changes these documents. The most prevalent features that changed appear to be associated with typesetting and mentions of supporting information sections or additional files. In addition to text comparison, we created document embeddings derived from a preprint-trained word2vec model. We found that these embeddings are able to parse out different scientific approaches and concepts, link unannotated preprint-peer-reviewed article pairs, and identify journals that publish linguistically similar papers to a given preprint. We also used these embeddings to examine factors associated with the time elapsed between the posting of a first preprint and the appearance of a peer-reviewed publication. We found that preprints with more versions posted and more textual changes took longer to publish.
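For readers unfamiliar with the method the abstract describes, here’s a minimal sketch of the general idea: train word2vec on the text, average word vectors into a document vector, and link preprint-published pairs by cosine similarity. The ‘corpus’ below is a toy placeholder, not the paper’s data or code, and the gensim calls assume version 4 of the library.

```python
# A minimal sketch of the document-embedding idea described above: train
# word2vec on the text, average word vectors into a document vector, and link
# preprint-published pairs by cosine similarity. The "corpus" here is a toy
# placeholder, not the paper's data or code.
import numpy as np
from gensim.models import Word2Vec

preprints = {
    "preprint_A": "we report a candidate room temperature superconductor".split(),
    "preprint_B": "we compare linguistic features of preprints and papers".split(),
}
published = {
    "paper_1": "linguistic features of preprints compared with published papers".split(),
    "paper_2": "evidence for a room temperature superconducting phase".split(),
}

# Train word2vec on all available text (in practice: the full preprint corpus).
corpus = list(preprints.values()) + list(published.values())
model = Word2Vec(sentences=corpus, vector_size=50, window=5, min_count=1, seed=1)

def doc_vector(tokens):
    """Average the vectors of the tokens that are in the vocabulary."""
    return np.mean([model.wv[t] for t in tokens if t in model.wv], axis=0)

def cosine(u, v):
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

# Link each preprint to the published article it most resembles.
for pid, tokens in preprints.items():
    best = max(published, key=lambda j: cosine(doc_vector(tokens), doc_vector(published[j])))
    print(pid, "->", best)
```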

It seems to me reasonable to ask about the rigour to which supporters of conventional peer-review have staked claim when few papers appear to benefit from it. The process may be justified in those few cases where a paper is corrected in a significant way, and it may be difficult to identify those papers without peer-review – but open review of preprints has an equal chance of identifying the same errors (esp. if we increase the discoverability of preprints the way journal editors identify eminent experts in the same field to review papers, instead of relying solely on social-media interactions that less internet-savvy scientists may not be able to initiate).

In addition, it appears that in most cases in which preprints were uploaded to bioRxiv first and were then peer-reviewed and published by a journal, the papers’ authors clearly didn’t submit papers that required significant quality improvements – certainly not to the extent conventional peer-review’s supporters have alluded to in an effort to make such review seem necessary.

So, why must conventional peer-review, in the broader sense, persist?

PeerJ’s peer-review problem

Of all the scientific journals in the wild, there are a few I keep a closer eye on: they publish interesting results but more importantly they have been forward-thinking on matters of scientific publishing and they’ve also displayed a tendency to think out loud (through blog posts, say) and actively consider public feedback. Reading what they publish in these posts, and following the discussions that envelope them, has given me many useful insights into how scientific publishing works and, perhaps more importantly, how the perceptions surrounding this enterprise are shaped and play out.

One such journal is eLife. All their papers are open access, and they also publish the papers’ authors’ notes and reviewers’ comments with each paper. They also have a lively ‘magazine’ section in which they publish articles and essays by working scientists – especially younger ones – relating to the extended social environments in which knowledge-work happens. Now, for some reason, I’d cast PeerJ in a similarly progressive light, even though I hadn’t visited their website in a long time. But on August 16, PeerJ published a tweet announcing a change to how it will evaluate submissions.

It struck me as a weird decision (not that anyone cares). Since the article explaining the journal’s decision appears to be available under a Creative Commons Attribution license, I’m reproducing it here in full so that I can annotate my way through it.

Since our launch, PeerJ has worked towards the goal of publishing all “Sound Science”, as cost effectively as possible, for the benefit of the scientific community and society. As a result we have, until now, evaluated articles based only on an objective determination of scientific and methodological soundness, not on subjective determinations of impact, novelty or interest.

At the same time, at the core of our mission has been a promise to give researchers more influence over the publishing process and to listen to community feedback over how peer review should work and how research should be assessed.

Great.

In recent months we have been thinking long and hard about feedback, from both our Editorial Board and Reviewers, that certain articles should no longer be considered as valid candidates for peer review or formal publication: that whilst the science they present may be “sound”, it is not of enough value to either the scientific record, the scientific community, or society, to justify being peer-reviewed or be considered for publication in a peer-reviewed journal. Our Editorial Board Members have asked us that we do our best to identify such submissions before they enter peer review.

This is the confusing part. To the uninitiated: one common form of the scientific publishing process involves scientists writing up a paper and submitting it to a journal for consideration. An editor, or editors, at the journal checks the paper and then commissions a group of independent experts on the same topic to review it. These experts are expected to provide comments to help the journal decide whether it should publish the paper and, if yes, how the paper can be improved. Note that they are usually not paid for their work or time.

Now, if PeerJ’s usual reviewers are unhappy with how many papers the journal’s asking them to review, how does it make sense to impose a new, arbitrary and honestly counterproductive sort of “value” on submissions instead of increasing the number of reviewers the journal works with?

I find the journal’s decision troublesome because some important details are missing – details that encompass borderline-unethical activities by some other journals that have only undermined the integrity and usefulness of the scientific literature. For example, the “high impact factor” journal Nature has asked its reviewers in the past to prioritise sensational results, overlooking the fact that such results are also likelier to be wrong. For another example, the concept of pre-registration has become popular relatively recently simply because most journals used to refuse – and still refuse – to publish negative results. That is, if a group of scientists set out to check if something was true – and it’d be amazing if it was true – and found that it was false instead, they’d have a tough time finding a journal willing to publish their paper.

And third, preprint papers have started to become an acceptable way of publishing research only in the last few years, and that too only in a few branches of science (especially physics). Most grant-giving and research institutions still prefer papers being published in journals, instead of being uploaded on preprint repositories, not to mention a dominant research culture in many countries – including India – still favouring arbitrarily defined “prestigious journals” over others when it comes to picking scientists for promotions, etc.

For these reasons, any decision by a journal that says sound science and methodological rigour alone won’t suffice to ‘admit’ a paper into its pages risks reinforcing – directly or indirectly – a bias in the scientific record that many scientists are working hard to move away from. For example, if PeerJ rejects a solid paper, so to speak, because it ‘only’ confirms a previous discovery, improves its accuracy, etc. and doesn’t fill a knowledge gap, per se, in order to ease the burden on its reviewers, the scientific record still stands to lose out on an important submission. (It pays to review journals’ decisions assuming that each journal is the only one around – à la the categorical imperative – and that other journals don’t exist.)

So what are PeerJ‘s new criteria for rejecting papers?

As a result, we have been working with key stakeholders to develop new ways to evaluate submissions and are introducing new pre-review evaluation criteria, which we will initially apply to papers submitted to our new Medical Sections, followed soon after by all subject areas. These evaluation criteria will define clearer standards for the requirements of certain types of articles in those areas. For example, bioinformatic analyses of already published data sets will need to meet more stringent reporting and data analysis requirements, and will need to clearly demonstrate that they are addressing a meaningful knowledge gap in the literature.

We don’t know yet, it seems.

At some level, of course, this means that PeerJ is moving away from the concept of peer reviewing all sound science. To be absolutely clear, this does not mean we have an intention of becoming a highly-selective “glamour” journal publisher that publishes only the most novel breakthroughs. It also does not mean that we will stop publishing negative or null results. However, the feedback we have received is that the definition of what constitutes a valid candidate for publication needs to evolve.

To be honest, this is a laughable position. The journal admits in the first sentence of this paragraph that no matter where it goes from here, it will only recede from an ideal position. In the next sentence it denies (vehemently, considering that in the article on its website this sentence was in bold) that its decision will transform it into a “glamour” journal – like Nature, Science, NEJM, etc. have been – and, in the third sentence, it denies that it will stop publishing “negative or null results”. Now I’m even more curious what these heuristics could be which specify that a) submissions have to have “sound science”, b) “address a meaningful knowledge gap”, and c) don’t exclude negative/null results. It’s possible to see some overlap between these requirements that some papers will occupy – but it’s also possible to see many papers that won’t tick all three boxes yet still deserve to be published. To echo PeerJ itself, being a “glamour” journal is only one way to be bad.

We are being influenced by the researchers who peer review our research articles. We have heard from so many of our editorial board members and reviewers that they feel swamped by peer review requests and that they – and the system more widely – are close to breaking point. We most regularly hear this frustration when papers that they are reviewing do not, in their expert opinion, make a meaningful contribution to the record and are destined to be rejected; and should, in their view, have been filtered out much sooner in the process.

If you ask me (as an editor), the first sentence’s syntax seems to suggest PeerJ is being forced by its reviewers, and not influenced. More importantly, I haven’t seen these bespoke problematic papers that are “sound” but at the same time don’t make a meaningful contribution. An expert’s opinion that a paper on some topic should be rejected (even though, again, it’s “sound science”) could be rooted either in an “arrogant gatekeeper” attitude or in valid reasons, and PeerJ‘s rules should be good enough to be able to differentiate between the two without simultaneously allowing ‘bad reviewers’ to over-“influence” the selection process.

More broadly, I’m a science journalist looking into science from the outside, seeing a colossal knowledge-producing machine that’s situated on the same continuum on which I see myself to be located. If I receive too many submissions at The Wire Science, I don’t make presumptuous comments about what I think should and shouldn’t belong in the public domain. Instead, first, I pitch my boss about hiring one more person for my team; second, I’m honest with each submission’s author about why I’m rejecting it: “I’m sorry, I’m short on time.”

Such submissions, in turn, impact the peer review of articles that do make a very significant contribution to the literature, research and society – the congestion of the peer review process can mean assigning editors and finding peer reviewers takes more time, potentially delaying important additions to the scientific record.

Gatekeeping by another name?

Furthermore, because it can be difficult and in some cases impossible to assign an Academic Editor and/or reviewers, authors can be faced with frustratingly long waits only to receive the bad news that their article has been rejected or, in the worst cases, that we were unable to peer review their paper. We believe that by listening to this feedback from our communities and removing some of the congestion from the peer review process, we will provide a better, more efficient, experience for everyone.

Ultimately, it comes down to the rules by which PeerJ‘s editorial board is going to decide which papers are ‘worth it’ and which aren’t. And admittedly, without knowing these rules, it’s hard to judge PeerJ – except on one count: “sound science” is already a good enough rule by which to determine the quality of a scientist’s work. To say it doesn’t suffice for reasons unrelated to scientific publishing, and the publishing apparatus’s dangerous tendency to gatekeep based on factors that have little to do with science, sounds at least precarious.

India’s missing research papers

If you’re looking for a quantification (although you shouldn’t) of the extent to which science is being conducted by press releases in India at the moment, consider the following list of studies. The papers for none of them have been published – as preprints or ‘post-prints’ – even as the people behind them, including many government officials and corporate honchos, have issued press releases about the respective findings, which some sections of the media have publicised without question and which have quite likely gone on to inform government decisions about suitable control and mitigation strategies. The collective danger of this failure is only amplified by a deafening silence from many quarters, especially from the wider community of doctors and medical researchers – almost as if it’s normal to conduct studies and publish press releases in a hurry and take an inordinate amount of time to upload a preprint manuscript or conduct peer review, instead of the other way around. By the way, did you know India has three science academies?

  1. ICMR’s first seroprevalence survey (99% sure it isn’t out yet, but if I’m wrong, please let me know and link me to the paper?)
  2. Mumbai’s TIFR-NITI seroprevalence survey (100% sure. I asked TIFR when they plan to upload the paper, they said: “We are bound by BMC rules with respect to sharing data and hence we cannot give the raw data to anyone at least [until] we publish the paper. We will upload the preprint version soon.”)
  3. Biocon’s phase II Itolizumab trial (100% sure. More about irregularities here.)
  4. Delhi’s first seroprevalence survey (95% sure. Vinod Paul of NITI Aayog discussed the results but no paper has pinged my radar.)
  5. Delhi’s second seroprevalence survey (100% sure. Indian Express reported on August 8 that it has just wrapped up and the results will be available in 10 days. It didn’t mention a paper, however.)
  6. Bharat Biotech’s COVAXIN preclinical trials (90% sure)
  7. Papers of well-designed, well-powered studies establishing that HCQ, remdesivir, favipiravir and tocilizumab are efficacious against COVID-19 🙂

Aside from this, there have been many disease-transmission models whose results have been played up without discussing the specifics as well as numerous claims about transmission dynamics that have been largely inseparable from the steady stream of pseudoscience, obfuscation and carelessness. In one particularly egregious case, the Indian Council of Medical Research announced in a press release in May that Ahmedabad-based Zydus Cadila had manufactured an ELISA test kit for COVID-19 for ICMR’s use that was 100% specific and 98% sensitive. However, the paper describing the kit’s validation, published later, said it was 97.9% specific and 92.37% sensitive. If you know what these numbers mean, you’ll also know what a big difference this is, between the press release and the paper. After an investigation by Priyanka Pulla followed by multiple questions to different government officials, ICMR admitted it had made a booboo in the press release. I think this is a fair representation of how much the methods of science – which bridge first principles with the results – matter in India during the pandemic.
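For a sense of how big that difference is, here’s a back-of-the-envelope comparison; the 5% seroprevalence and 100,000 tests are assumptions for illustration, not figures from the validation study.

```python
# Back-of-the-envelope comparison of the press release's figures (100% specific,
# 98% sensitive) with the published figures (97.9% specific, 92.37% sensitive).
# The 5% seroprevalence and 100,000 tests are assumptions for illustration.
def outcomes(sensitivity, specificity, prevalence, n=100_000):
    infected = n * prevalence
    uninfected = n - infected
    true_pos = sensitivity * infected
    false_pos = (1 - specificity) * uninfected
    false_neg = (1 - sensitivity) * infected
    ppv = true_pos / (true_pos + false_pos)   # chance a positive result is real
    return round(false_pos), round(false_neg), round(ppv, 3)

print(outcomes(0.98, 1.0, 0.05))       # press release: 0 false positives, PPV = 1.0
print(outcomes(0.9237, 0.979, 0.05))   # paper: ~1995 false positives, PPV ~0.70
```

At that assumed prevalence, roughly three in every ten positive results would be false alarms under the published numbers, versus none under the press release’s.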

A non-self-correcting science

While I’m all for a bit of triumphalism when some component of conventional publication vis-à-vis scientific research – like pre-publication anonymous peer review – fails, and fails publicly, I spotted an article in The Conversation earlier today that I thought crossed a line (and not in the way you think). In this article, headlined ‘Retractions and controversies over coronavirus research show that the process of science is working as it should’, the author writes:

Some people are viewing the retractions [by The Lancet and the New England Journal of Medicine] as an indictment of the scientific process. Certainly, the overturning of these papers is bad news, and there is plenty of blame to go around. But despite these short-term setbacks, the scrutiny and subsequent correction of the papers actually show that science is working. Reporting of the pandemic is allowing people to see, many for the first time, the messy business of scientific progress.

The retraction of the hydroxychloroquine paper … drew immediate attention not only because it placed science in a bad light, but also because President Trump had touted the drug as an effective treatment for COVID-19 despite the lack of strong evidence. Responses in the media were harsh. … [Their] headlines may have [had] merit, but perspective is also needed. Retractions are rare – only about 0.04% of published papers are withdrawn – but scrutiny, update and correction are common. It is how science is supposed to work, and it is happening in all areas of research relating to SARS-CoV-2.

If you ask me, this is not science working as it should. This is the journals that published the papers discovering that the mechanisms they’d adopted – mechanisms they’d said would filter out fraudulent papers – were letting fraudulent papers slip through.

But by the author’s logic, “this is science working as it should” would encompass any mistake that’s later discovered, followed by suitable corrective action. This is neither here nor there – and more importantly it allows broken processes to be subsumed under the logic’s all-encompassing benevolence. If this is scientific publishing as it should be, we wouldn’t have to think deeply about how we can fix anonymous pre-publication peer-review because it wouldn’t be broken. However, we know in reality that it is.

If anything, by advancing his argument, the author has cleverly pressed an argumentative tack against supporters of more progressive scientific publishing models, in the service of preserving the status quo. Instead, we need to acknowledge that an important part of science, called science publishing, has evolved into a flawed creature – so that we can set about bending the moral arc towards fixing it. (We already know that if we don’t acknowledge it, we won’t fix it.)