A gentle push over the cliff

From ‘Rotavirus vaccine: tortured data analyses raise false safety alarm’, The Hindu, June 22, 2024:

Slamming the recently published paper by Dr. Jacob Puliyel from the International Institute of Health Management Research, New Delhi, on rotavirus vaccine safety, microbiologist Dr. Gagandeep Kang says: “If you do 20 different analyses, one of them will appear significant. This is truly cherry picking data, cherry picking analysis, changing the data around, adjusting the data, not using the whole data in order to find something [that shows the vaccine is not safe].” Dr. Kang was the principal investigator of the rotavirus vaccine trials and the corresponding author of the 2020 paper in The New England Journal of Medicine, the data of which was used by Dr. Puliyel for his reanalysis.

This is an important rebuttal. I haven’t seen Puliyel’s study, but Bharat Biotech’s conduct during and since the COVID-19 pandemic, especially that of its executive chairman Krishna Ella, and its attitude towards public scrutiny of its Covaxin vaccine, have rendered any criticism of the company or its products very believable, even when such criticism is unwarranted, misguided, or just nonsense.

Puliyel’s study itself is a case in point: a quick search on Twitter reveals many strongly worded tweets, speaking to the existence of a mass of people that wants something to be true and will seize on the first appearance of even feeble evidence. Of course, The Hindu article found the evidence to be not so much feeble as contrived. Bharat Biotech isn’t “hiding” anything; Puliyel et al. aren’t “whistleblowers”.

The article doesn’t mention the name of the journal that published Puliyel’s paper: International Journal of Risk and Safety in Medicine. It could have, because journals that don’t keep bad science out of the medical literature don’t just pollute the literature. By virtue of being journals, and in this case claiming to be peer-reviewed as well, they allow the claims they publish to be amplified by unsuspecting users on social media platforms.

We saw something similar earlier this year in the political sphere when members of the Indian National Congress party and its allies as well as members of civil society cast doubt on electronic voting machines with little evidence, thus only undermining trust in the electoral process.

To be sure, we’ve cried ourselves hoarse about the importance of every reader being sceptical of what appears in scientific journals (even peer-reviewed ones) as much as in news articles, but because this is a behavioural and cultural change, it’s going to take time. Journals need to do their bit, too, yet they won’t, because who needs scruples when you can have profits?

The analytical methods Puliyel and his coauthor Brian Hooker reportedly employed in their new study are reminiscent of the work of Brian Wansink, who resigned from Cornell University five years ago this month after it concluded he’d committed scientific misconduct. In 2018, BuzzFeed published a deep-dive by Stephanie M. Lee on how the Wansink scandal was born. It gave the (well-referenced) impression that the scandal grew out of a combination of a student’s relationship with a mentor renowned in her field of work and the mentor’s pursuit of headlines over science done properly. It’s hard to imagine Puliyel and Hooker were facing any kind of coercion, which leaves the headlines.

This isn’t hard to believe considering it’s the second study to have been published recently that took a shot at Bharat Biotech based on shoddy research. It sucks that it’s become so easy to push people over the cliff, and into the ravenous maw of a conspiracy theory, but it sucks more that some people will push others even when they know better.

The not-so-obvious obvious

If your job requires you to pore through a dozen or two scientific papers every month – as mine does – you’ll start to notice a few every now and then couching a somewhat well-known fact in study-speak. I don’t mean scientific-speak, largely because there’s nothing wrong with trying to understand natural phenomena in the formalised language of science. However, there seems to be something iffy – often with humorous effect – about a statement like the following: “cutting emissions of ozone-forming gases offers a ‘unique opportunity’ to create a ‘natural climate solution’”1 (source). Well… d’uh. This is study-speak: rephrasing mostly self-evident knowledge or truisms in unnecessarily formalised language, not infrequently in the style employed in research papers, without adding any new information but often including an element of doubt where there is likely to be none.

1. Caveat: These words were copied from a press release, so this could have been a case of the person composing the release being unaware of the study’s real significance. However, the words within single quotes are copied from the corresponding paper itself. That said, there have been some truly hilarious efforts to make sense of the obvious – consider, for example, many of the winners of the Ig Nobel Prizes.

Of course, it always pays to be cautious, but where do you draw the line between a finding that tells us something new and a scientific result that exists simply because one is required to initiate a new course of action? For example, the Univ. of Exeter study, whose accompanying press release discussed the effect of “ozone-forming gases” on the climate, recommends cutting emissions of substances that combine in the lower atmosphere to form ozone, a form of oxygen that is harmful to both humans and plants. But this is as non-“unique” an idea as the corresponding solution that arises (of letting plants live better) is “natural”.

However, it’s possible the study’s authors needed to quantify these emissions to understand the extent to which ambient ozone concentration interferes with our climatic goals, and to use their data to inform the design and implementation of corresponding interventions. Such outcomes aren’t always obvious but they are there – often because the necessarily incremental nature of most scientific research can cut both ways. The pursuit of the obvious isn’t always as straightforward as one might believe.

The Univ. of Exeter group may have accumulated sufficient and sufficiently significant evidence to support their conclusion, allowing themselves as well as others to build towards newer, and hopefully more novel, ideas. A ladder must have rungs at the bottom irrespective of how tall it is. But when the incremental sword cuts the other way, often due to perverse incentives that require scientists to publish as many papers as possible to secure professional success, things can get pretty nasty.

For example, the Cornell University consumer behaviour researcher Brian Wansink was known to advise his students to “slice” the data obtained from a few experiments in as many different ways as possible in search of interesting patterns. Many of the papers he published were later found to contain numerous irreproducible conclusions – i.e. Wansink had searched so hard for patterns that he’d found quite a few even when they really weren’t there. As the British economist Ronald Coase said, “If you torture the data long enough, it will confess to anything.”
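
To see how little torture it takes, here is a minimal sketch in Python – with entirely made-up data, not Wansink’s or anyone else’s – of what happens when a dataset containing no real effect is analysed twenty different ways. With 20 independent tests at the usual 5% threshold, the chance of at least one spurious “significant” result is 1 − 0.95^20, or roughly 64%.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

n_per_group = 50      # participants per arm in each hypothetical slice
n_tests = 20          # twenty ways of slicing / analysing the same experiment
alpha = 0.05
n_simulations = 10_000

runs_with_false_positive = 0
for _ in range(n_simulations):
    found_something = False
    for _ in range(n_tests):
        # Both groups are drawn from the same distribution: there is no real effect.
        treatment = rng.normal(0, 1, n_per_group)
        control = rng.normal(0, 1, n_per_group)
        _, p_value = stats.ttest_ind(treatment, control)
        if p_value < alpha:
            found_something = True
    runs_with_false_positive += found_something

print(f"Fraction of simulated studies with at least one 'significant' "
      f"finding: {runs_with_false_positive / n_simulations:.0%}")
# Expect about 64%, i.e. 1 - 0.95**20, even though nothing is there.
```

Real subgroup analyses are correlated rather than independent, which changes the exact figure but not the moral: search hard enough and something will “confess”.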

The dark side of incremental research, and the virtue of incremental research done right, stem from the fact that it is difficult – in ways that aren’t evident – to ascertain the truth of a finding when the strength of the effect is expected to be either so small that it really tests the notion of significance or so large, so pronounced, that it transcends intuitive comprehension.

For an example of the former: among particle physicists, a result qualifies as ‘fact’ only if the chances of it being a fluke are about 1 in 3.5 million – the so-called five-sigma threshold. So the Large Hadron Collider (LHC), which was built to discover the Higgs boson, had to perform a very large number of proton-proton collisions capable of producing a Higgs boson – collisions its detectors could observe and its computers could analyse – to attain this significance.
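
For the record, the 1-in-3.5-million figure is simply the one-sided tail probability of a fluctuation five standard deviations above the mean under a normal distribution, and can be checked in a couple of lines (a back-of-the-envelope sketch, not anything resembling an actual LHC analysis):

```python
from scipy.stats import norm

# One-sided probability of a fluctuation at least five standard deviations
# above the mean, assuming a normal distribution.
p_fluke = norm.sf(5)       # survival function, i.e. P(Z > 5)
print(p_fluke)             # ≈ 2.87e-07
print(round(1 / p_fluke))  # ≈ 3.5 million, hence "1 in 3.5 million"
```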

But while protons are available abundantly and the LHC can produce as many as a billion collisions per second, imagine undertaking an experiment that requires human participants to perform actions according to certain protocols. It’s never going to be possible to enrol billions of them for millions of hours to arrive at a rock-solid result. In such cases, researchers design experiments around very specific questions, with protocols that suppress, or even eliminate, interference, sources of doubt and confounding variables, and accentuate the effects of whatever action, decision or influence is being evaluated.

Such experiments often also require the use of sophisticated – but nonetheless well-understood – statistical methods to further eliminate the effects of undesirable phenomena from the data and, to the extent possible, leave behind information of good-enough quality to support or reject the hypotheses. In the course of navigating this winding path from observation to discovery, researchers are susceptible to, say, misapplying a technique, overlooking a confounder or – like Wansink – overanalysing the data so much that a weak effect masquerades as a strong one but only because it’s been submerged in a sea of even weaker effects.
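
As an illustration of the confounder problem alone – using entirely synthetic numbers and a hypothetical diet-and-health scenario, not any real study – here is how an apparently strong association can evaporate once the lurking variable is brought into the model:

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(1)
n = 2_000

# Hypothetical scenario: age (the confounder) drives both a dietary habit
# and the health outcome; the habit itself has no effect on the outcome.
age = rng.normal(50, 10, n)
habit = 0.05 * age + rng.normal(0, 1, n)       # exposure, influenced by age
outcome = 0.10 * age + rng.normal(0, 1, n)     # outcome, influenced by age only

# Naive model: outcome ~ habit (confounder omitted)
naive = sm.OLS(outcome, sm.add_constant(habit)).fit()

# Adjusted model: outcome ~ habit + age
adjusted = sm.OLS(outcome, sm.add_constant(np.column_stack([habit, age]))).fit()

print("Naive habit coefficient:   ", round(naive.params[1], 3))     # spuriously non-zero
print("Adjusted habit coefficient:", round(adjusted.params[1], 3))  # close to zero
```

The naive model attributes to the habit what is really the work of age; the adjusted one recovers, approximately, the true coefficient of zero.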

Similar problems arise in experiments that require the use of models based on very large datasets, where researchers need to determine the relative contribution of each of thousands of causes to a given effect. The Univ. of Exeter study, which estimated the ozone concentration in the lower atmosphere due to surface sources of different gases, contains an example. The authors write in their paper (emphasis added):

We have provided the first assessment of the quantitative benefits to global and regional land ecosystem health from halving air pollutant emissions in the major source sectors. … Future large-scale changes in land cover [such as] conversion of forests to crops and/or afforestation, would alter the results. While we provide an evaluation of uncertainty based on the low and high ozone sensitivity parameters, there are several other uncertainties in the ozone damage model when applied at large-scale. More observations across a wider range of ozone concentrations and plant species are needed to improve the robustness of the results.

In effect, their data could be modified in future to reflect new information and/or methods, but in the meantime, and far from being a silly attempt at translating a claim into jargon-laden language, the study eliminates doubt to the extent possible with existing data and modelling techniques to ascertain something. And even in cases where this something is well known or already well understood, the validation of its existence could also serve to validate the methods the researchers employed to (re)discover it and – as mentioned before – generate data that is more likely to motivate political action than, say, demands from non-experts.

In fact, the American mathematician Marc Abrahams, better known for founding and awarding the Ig Nobel Prizes, identified this purpose of research as one of three possible reasons why people might try to “quantify the obvious” (source). The other two are being unaware of the obvious and, of course, wanting to disprove it.