Another controversy, another round of blaming preprints

On February 1, Anand Ranganathan, a molecular biologist better known as a columnist for Swarajya, amplified a new preprint paper from scientists at IIT Delhi that purportedly claims the Wuhan coronavirus’s (2019 nCoV’s) genome contains some sequences also found in the human immunodeficiency virus but not in any other coronaviruses. Ranganathan also chose to magnify the preprint paper’s claim that the sequences’ presence was “non-fortuitous”.

To be fair, the IIT Delhi group did not properly qualify what they meant by this term, but that wouldn’t exculpate Ranganathan and others who followed him: first for amplifying, with alarmist language, a claim that did not deserve such treatment, and then, once he discovered his mistake, for wondering out loud whether such “non-peer reviewed studies” about “fast-moving, in-public-eye domains” should be published before scientific journals have subjected them to peer-review.

https://twitter.com/ARanganathan72/status/1223444298034630656
https://twitter.com/ARanganathan72/status/1223446546328326144
https://twitter.com/ARanganathan72/status/1223463647143505920

The more conservative scientist is likely to find ample room here to revive the claim that preprint papers only promote shoddy journalism, and that preprints in the biomedical literature should be abolished entirely. This is bullshit.

The ‘print’ in ‘preprint’ refers to the act of a traditional journal printing a paper for publication after peer-review. A paper is designated ‘preprint’ if it hasn’t undergone peer-review yet, even though it may or may not have been submitted to a scientific journal for consideration. To quote from an article championing the use of preprints during a medical emergency, by three of the six cofounders of medRxiv, the preprints repository for the biomedical literature:

The advantages of preprints are that scientists can post them rapidly and receive feedback from their peers quickly, sometimes almost instantaneously. They also keep other scientists informed about what their colleagues are doing and build on that work. Preprints are archived in a way that they can be referenced and will always be available online. As the science evolves, newer versions of the paper can be posted, with older historical versions remaining available, including any associated comments made on them.

In this regard, Ranganathan’s ringing the alarm bells (with language like “oh my god”) the first time he tweeted the link to the preprint paper without sufficiently evaluating the attendant science was his decision, and not prompted by the paper’s status as a preprint. Second, the bioRxiv preprint repository where the IIT Delhi document showed up has a comments section, and it was brimming with discussion within minutes of the paper being uploaded. More broadly, preprint repositories are equipped to accommodate peer-review. So if anyone had looked in the comments section before tweeting, they wouldn’t have had reason to jump the gun.

Third, and most important: peer-review is not fool-proof. Instead, it is a legacy method employed by scientific journals to filter legitimate from illegitimate research and, more recently, higher quality from lower quality research (using ‘quality’ from the journals’ oft-twisted points of view, not as an objective standard of any kind).

This framing supports three important takeaways from this little scandal.

A. Much like preprint repositories, peer-reviewed journals also regularly publish rubbish. (Axiomatically, just as conventional journals also regularly publish the outcomes of good science, so do preprint repositories; in the case of 2019 nCoV alone, bioRxiv, medRxiv and SSRN together published at least 30 legitimate and noteworthy research articles.) It is just that conventional scientific journals conduct the peer-review before publication and preprint repositories (and research-discussion platforms like PubPeer), after. And, in fact, conducting the review after allows it to be a continuous process able to respond to new information, and not a one-time event that culminates with the act of printing the paper.

But notably, preprint repositories can recreate journals’ ability to closely control the review process, and to ensure only experts’ comments are in the fray, by enrolling a team of voluntary curators. The arXiv preprint server has been successfully using such a team to carefully weed out manuscripts advancing pseudoscientific claims. As such, it is better to make sure people are familiar with the preprint and post-publication review paradigm than to take advantage of their confusion and call for preprints to be eliminated altogether.

B. Those who support the idea that preprint papers are dangerous, and argue that peer-review is a better way to protect against unsupported claims, are by proxy advocating for the persistence of a knowledge hegemony. Peer-review is opaque, sustained by unpaid and overworked labour, and performs the same function that an open discussion often does at larger scale and with greater transparency. Indeed, the transparency represents the most important difference: since peer-review has traditionally been the demesne of journals, supporting peer-review is tantamount to designating journals as the sole and unquestionable arbiters of what knowledge enters the public domain and what doesn’t.

(Here’s one example of how such gatekeeping can have tragic consequences for society.)

C. Given these safeguards and perspectives, and as I have written before, bad journalists and bad commentators will be bad irrespective of the window through which an idea has presented itself in the public domain. There is a way to cover different types of stories, and the decision to abdicate one’s responsibility to think carefully about the implications of what one is writing can never have a causal relationship with the subject matter. The Times of India and the Daily Mail will continue to publicise every new paper discussing whatever coffee, chocolate and/or wine does to the heart, and The Hindu and The Wire Science will publicise research published as preprints because we know how to be careful and which risks to protect ourselves against.

By extension, ‘reputable’ scientific journals that use pre-publication peer-review will continue to publish many papers that will someday be retracted.

An ongoing scandal concerning the spider biologist Jonathan Pruitt offers a useful parable: journals don’t always publish bad science out of wilful negligence or poor peer-review alone, but such failures highlight the shortcomings of the latter all the same. A string of papers based on work that Pruitt led was found to contain implausible data in support of some significant conclusions. Dan Bolnick, the editor of The American Naturalist, which became the first journal to retract Pruitt’s papers that it had published, wrote on his blog on January 30:

I want to emphasise that regardless of the root cause of the data problems (error or intent), these people are victims who have been harmed by trusting data that they themselves did not generate. Having spent days sifting through these data files I can also attest to the fact that the suspect patterns are often non-obvious, so we should not be blaming these victims for failing to see something that requires significant effort to uncover by examining the data in ways that are not standard for any of this. … The associate editor [who Bolnick tasked with checking more of Pruitt’s papers] went as far back as digging into some of Pruitt’s PhD work, when he was a student with Susan Riechert at the University of Tennessee Knoxville. Similar problems were identified in those data… Seeking an explanation, I [emailed and then called] his PhD mentor, Susan Riechert, to discuss the biology of the spiders, his data collection habits, and his integrity. She was shocked, and disturbed, and surprised. That someone who knew him so well for many years could be unaware of this problem (and its extent), highlights for me how reasonable it is that the rest of us could be caught unaware.

Why should we expect peer-review – or any kind of review, for that matter – to be better? The only thing we can do is be honest, transparent and reflexive.

Losing sight of the agricultural finish line

In The Guardian, Joanna Blythman pokes an important pin into the frustrating but unsurprisingly durable bubble of vegan cuisine and the low-hanging fruits of ethical eating:

These days it’s fashionable to eulogise plant foods as the secret for personal health and sound stewardship of our planet. But in the process of squaring up to the challenge of climate breakdown, we seem to have forgotten that plant foods too can be either badly or well produced. … As long as we demonise animal foods and eulogise plant foods, any prospect of a natural food supply is shattered. We are left to depend for sustenance on the tender mercies of the techno-food corporations that see a little green V and the word “plant” as a formula for spinning gold from straw through ultra-processing.

Hopefully – though I hope for far too much here! – her article will sufficiently puncture the global elite’s bloated righteousness over eating healthy, especially vegan and/or organic, in order to save the planet, when in fact it’s just another instance of doing the bare and suspiciously photogenic minimum to personally feel better.

My own grouse is directed at tech-driven agricultural targets that speak about the producer and the consumer as if there were nothing in between, such as R&D, processing, storage, supply, distribution and trade, all in turn resting on a wider substrate of political-economic issues. The defensive technologist and/or investor might say, “You have got to start somewhere,” but innovators frequently start by targeting a demographic for which the situation might never be too late, instead of the people for whom it already is. Even then, their rhetoric quickly forgets how misguided and off-target their ambitions are, and loses sight of the real problems in desperate need of resolution.

I do think vertical farms are an interesting idea, but I also think their wealthy investors and wealthy publicists have made a habit of horribly overestimating the extent to which these contraptions are going to be part of the solution – which in turn has contributed to a widespread sense of complacency among the elite and blinded them to the need for more radical changes to the status quo.

Sure, pesticides suck; I am also familiar with accounts that describe how the world produces enough food but wastes too much, and with the tactics of companies like Monsanto; and I recognise agriculture is arguably the oldest human activity contributing to global heating. However, most narratives that provide the counter-view, some of which also offer supplementary alternatives, gloss over important features of modern agriculture like scale and cost-effectiveness, enabled in turn by the various -icides, as well as the ways in which it is enmeshed in the economies of the developing world.

Ideas like indoor farming have become increasingly trendy of late: just two startups in the US raised $300 million as of last year but their products seem to cater only to upper-class westerners content with a salad-centric diet, seemingly mindless of the millions in third-world countries grossly underprepared to deal with climate change, water shortage, undernourishment and deepening economic inequality at the same time. (Not to mention: the more it costs to produce something, the more it is going to cost to buy without subsidies.)

For many – if not most – of India’s children, eggs are often the sole affordable source of protein. As an elite, upper-caste Indian, I have both privilege and responsibility to change my lifestyle to reduce my as well as others’ carbon footprints1; but in addition, to what extent could I be expected to fight against non-free-range egg production in the absence of guarantees about alternative sources – including lab-grown ones – when ultimately human welfare is our shared concern?

1. I can reduce others’ carbon footprints by reducing the amount of materials I consume to maintain my lifestyle.

The midday meal programme, for instance, feeds more than 100 million children, with the per-plate cooking cost ranging from Rs 4 to Rs 7; each plate in turn needs to have 12-20 grams of protein. We know pesticide-fed agriculture works because, together with government subsidies, it makes these costs possible – not because it does no damage to the world in other ways.

More broadly, there is a limit to which concerns for the climate have the leeway to supersede crop and cattle-meat production in India when the government will not sufficiently protect members of these sectors, often belonging to the more marginalised sections of society, from poverty, insolvency, suicide and death. Axiomatically, “breakthroughs in the development of food” will not move the climate-action needle until they provide alternate livelihoods, upgrade storage and distribution infrastructure, improve access to capital and insurance, and retool the public distribution system – a slew of upstream and downstream changes whose complexity towers over the technological options we currently have on offer.

Fighting climate change is, among other things, about replacing unsustainable practices with sustainable alternatives without sacrificing human development. However, the most popular media and business narratives have given this ambition a Malthusian twist to suggest it is about saving the planet at all costs – not out of desperation but sheer ignorance, albeit with the same consequences. The dietary movements that promote organic farming, anti-meat diets and, quite terribly, genetically modified foods among the rich are part of this rhetoric. The technologies they bank on are frequently riddled with hypocrisies, most of all concerning external costs, and their strategies are restricted to regimens with their own well-established economies of profitability – keto, paleo, detox and the like – rather than to the anaemic, the stunted and the malnourished.

The story here is quite similar to that of electric vehicles. If you are driving an electric scooter in India today, you are still far from helping cut emissions because coal is still the biggest source of power in the country. So without undertaking efforts to produce cleaner power (an endeavour fraught with its own problems), all you have done is translocated your share of the emissions away from the city where you are driving the scooter and to the faraway power plant where more coal is being burnt to provide the power you need. Your purchase may have been a step in the right direction but celebrating that would be as premature as getting to Kathmandu and tweeting you are on your way to the top of Mt Everest.

Claiming to be on the path to resolving the world’s food crisis by putting food on the plate of the already well-fed is similarly laughable.

Google Docs: A New Hope

I suspect the Google Docs grammar bot is the least useful bot there is. After hundreds of suggestions, I can think of only one instance in which it was right. Is its failure rate so high because it learns from how other people use English, instead of drawing from a basic ruleset?

I’m not saying my grammar is better than everyone else’s but if the bot is learning from how non-native users of the English language construct their sentences, I can see how it would make the suggestions it does, especially about the use of commas and singular/plural referents.

Then again, what I see as failure might be entirely invisible to someone not familiar with, or even interested in, punctuation pedantry. This is where Google Docs’s bot presents an interesting opportunity.

The rules of grammar and punctuation exist to assist the construction and inference of meaning, not to railroad them. However, this definition doesn’t say whether good grammar is simply what most people use and are familiar with or what is derived from a foundational set of rules and guidelines.

Thanks to colonialism, imperialism and industrialism, English has become the world’s official language, but thanks to their inherent political structures, English is also the language of the elite in postcolonial societies that exhibit significant economic inequality.

So those who wield English ‘properly’ – by deploying the rules of grammar and punctuation the way they’re ‘supposed’ to – are also those who have been able to afford a good education. Ergo, deferring to the fundamental ruleset is to flaunt one’s class privilege, and to expect others to do so could play out as a form of linguistic subjugation (think The New Yorker).

On the other hand, the problem with the populist ontology is that it encourages everyone to develop their own styles and patterns based on what they’ve read – after all, there is no one corpus of popular literature – that are very weakly guided by the same logic, if they’re guided by any logic at all. This could render individual pieces difficult to read (or edit).

Now, a question automatically arises: So what? What does each piece employing a different grammar and punctuation style matter as long as you understand what the author is saying? The answer, to me at least, depends on how the piece is going to find itself in the public domain and who is going to read it.

For example, I don’t think anyone would notice if I published such erratic pieces on my blog (although I don’t) – but people will if such pieces show up in a newspaper or a magazine, because newsrooms enforce certain grammatical styles for consistency. Such consistency ensures that:

  1. Insofar as grammar must assist inference, consistent patterns ensure a regular reader is familiar with the purpose the publication’s styleguide serves in the construction of sentences and paragraphs, which in turn renders the symbols more useful and invisible at the same time;
  2. The writers, while bringing to bear their own writing styles and voices, still use a ‘minimum common’ style unique to and associated with the publication (and which could ease decision-making for some writers); and
  3. The publication can reduce the amount of resources expended to train each new member of its copy-editing team.

Indeed, I imagine grammatical consistency matters to any professional publication because of the implicit superiority of perfect evenness. But where it gets over the top and unbearable is when its purpose is forgotten, or when it is effected as a display of awareness of, or affiliation to, some elite colonial practice.

Now, while we can agree that the populist definition is less problematic on average, we must also be able to recognise that the use of a ‘minimum common’ remains a good idea if only to protect against the complete dilution of grammatical rules with time. For example, despite the frequency with which it is abused, the comma still serves at least one specific purpose: to demarcate clauses.

In this regard, the Google Docs bot could help streamline the chaos. According to the service’s support documentation, the bot learns its spelling instead of banking exclusively on a dictionary; it’s not hard to extrapolate this behaviour to grammar and syntactic rules as well.

Further, every time you reject the bot’s suggested change, the doc displays the following message: “Thanks for submitting feedback! The suggestion has been automatically ignored.” This isn’t sufficient evidence to conclude that the bot doesn’t learn. For one, the doc doesn’t display a similar message when a suggestion is accepted. For another, Google tracks the following parameters when you’re editing a doc:

customer-type, customer-id, customer-name, storageProvider, isOwner, editable, commentable, isAnonymousUser, offlineOptedIn, serviceWorkerControlled, zoomFactor, wasZoomed, docLocale, locale, docsErrorFatal, isIntegrated, companion-guest-Keep-status, companion-guest-Keep-buildLabel, companion-guest-Tasks-status, companion-guest-Tasks-buildLabel, companion-guest-Calendar-status, companion-guest-Calendar-buildLabel, companion-expanded, companion-overlaying-host-content, spellGrammar, spellGrammarDetails, spellGrammarGroup, spellGrammarFingerprint

Of them, spellGrammar is set to true and I assume spellGrammarFingerprint corresponds to a unique ID.

So assuming further that it learns through individual feedback, the bot must be assimilating a dataset in the background within whose rows and columns an ‘average modal pattern’ could be taking shape. As more and more users accept or reject its suggestions, the mode could become correspondingly more significant and form more of the basis for the bot’s future suggestions.

There are three problems, however.

First, if individual preferences have diverged to such an extent as to disfavour the formation of a single most significant modal style, the bot is unlikely to become useful in a reasonable amount of time unless it combines user feedback with the preexisting rules of grammar and composition.

Second, Google could have designed each bot to personalise its suggestions according to each account-holder’s writing behaviour. This is quite possible because the more the bot is perceived to be helpful, the likelier its suggestions are to be accepted, and the likelier the user is to continue using Google Docs to compose their pieces.

However, I doubt the bot I encounter on my account is learning from my feedback alone, and it gives me… hope?

Third: if the bot learns only spelling but not grammar and punctuation use, it would be – as I first suspected – the least useful bot there is.

India’s Delhi-only air pollution problem

I woke up this morning to a PTI report telling me Delhi’s air quality had fallen to ‘very poor’ on Deepavali, ostensibly the Hindu festival of lights, with many people defying the Supreme Court’s direction to burst firecrackers only between 8 pm and 10 pm. This defiance is unsurprising: the Supreme Court’s order doesn’t seem to apply in Delhi because, and not even though, the response to the pollution has been Delhi-centric.

In fact, it’s probably only a problem because Delhi is having trouble breathing, despite the fact that the national capital is the eleventh-most polluted city in the world, behind eight other Indian ones.

The report also noted, “On Saturday, the Delhi government launched a four-day laser show to discourage residents from bursting firecrackers and celebrating Diwali with lights and music. During the show, laser lights were beamed in sync with patriotic songs and Ramayana narration.”

So the air pollution problem rang alarm bells and the government solved just that problem. Nothing else was a problem so it solved nothing else. The beams of light the Delhi government shot up into the sky would have caused light pollution, disturbing insects, birds and nocturnal creatures. The sound would no doubt have been loud, disturbing animals and people in the area. It’s a mystery why we don’t have familial, intimate celebrations.

There is a concept in environmental philosophy called the hyperobject: a dynamic super-entity that lots of people can measure and feel at the same time but not see or touch. Global warming is a famous hyperobject, described by certain attributes, including its prevalence and its shifting patterns. Delhi’s pollution has two hyperobjects. One is what the urban poor experience – a beast that gets in the way of daily life, that you can’t wish away (let alone fight), and which is invisible to everyone else. The other is the one in the news: stunted, inchoate and classist, it includes only air pollution because its effects have become unignorable; sound and light don’t feature in it – nor does anything even a degree removed from the singular sources of smoke and fumes.

For example, someone (considered smart) recently said to me, “The city should collect trash better to avoid roadside garbage fires in winter.” Then what about the people who set those fires for warmth because they don’t have warm shelter for the night? “They will find another way.”

The Delhi-centrism is also visible with the ‘green firecrackers’ business. According to the CSIR National Environmental Engineering Research Institute (NEERI), which developed the crackers, its scientists “developed new formulations for reduced emission light and sound emitting crackers”. But it turns out the reduction doesn’t apply to sound.

The ‘green’ crackers’ novel features include “matching performance in sound (100-120dBA) with commercial crackers”. Sound at 100-120 dBA is debilitating; the non-crazy crackers clock about 60-80 dBA. (dB stands for decibel, a logarithmic measure of sound pressure change; the ‘A’ denotes A-weighting, a scale that adjusts measurements to reflect how loud humans perceive sounds of different frequencies to be.)
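To put those ranges in perspective, decibel arithmetic is logarithmic: every 10 dB step is a tenfold increase in sound power. A minimal sketch (the function name is mine, for illustration):

```python
def power_ratio(db_difference):
    """Convert a difference in decibels to a ratio of sound powers.
    Each 10 dB step corresponds to a tenfold increase in power."""
    return 10 ** (db_difference / 10)

# A 120 dBA 'green' cracker vs an 80 dBA conventional one: 40 dB apart,
# i.e. 10,000 times the sound power.
print(power_ratio(120 - 80))  # 10000.0
```

So the top of the ‘green’ range carries ten thousand times the acoustic power of the bottom of the conventional range, which is why “matching performance in sound” is nothing to boast about.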

In 2014, during my neighbours’ spate of cracker-bursting, I “used an app to make 300 measurements over 5 minutes” from a distance of about 80 metres, and obtained the following readings:

Min: 41.51 dB(A)
Max: 83.88 dB(A)
Avg.: 66.41 dB(A)
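A caveat about that ‘Avg.’ figure: because decibels are logarithmic, sound levels can’t meaningfully be averaged arithmetically (which is presumably what the app did). The physically meaningful ‘energy average’ converts each reading back to relative power first. A sketch, with a helper name of my own and illustrative readings spanning the min and max above:

```python
import math

def energy_average_db(levels):
    """Average sound levels in dB(A) by converting each to relative
    power, averaging the powers, and converting back to decibels."""
    powers = [10 ** (level / 10) for level in levels]
    return 10 * math.log10(sum(powers) / len(powers))

# The energy average sits well above the arithmetic mean because
# the loudest moments dominate the total acoustic energy.
sample = [41.51, 55.0, 66.41, 75.0, 83.88]
print(round(energy_average_db(sample), 1))
```

For readings this spread out, the energy average lands much closer to the maximum than the arithmetic mean does, so the app’s 66.41 dB(A) figure, if arithmetic, understates the exposure.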

The Noise Pollution (Regulation and Control) Rules 2000 limit noise in the daytime (6 am to 10 pm) to 55 dB(A), and the fine for breaking the rules was just Rs 100, or about $1.5, before the Supreme Court stepped in, taking cognisance of the air pollution during Deepavali. This penalty is all the more laughable considering Delhi was ranked the world’s second-noisiest city in 2017. There’s only so much the Delhi police, including the traffic police, can do with the 15 noise meters they’ve been provided.

In February 2019, Romulus Whitaker, India’s ‘snake man’, expressed his anguish over a hotel next door to the Madras Crocodile Bank Trust blasting loud music that was “triggering aberrant behaviour” among the animals (to paraphrase the author). If animals don’t concern you: the 2014 Heinz Nixdorf Recall study found noise is a risk factor for atherosclerosis. Delhi’s residents also have the “maximum amount of hearing loss proportionate to their age”.

As Dr Deepak Natarajan, a Delhi-based cardiologist, wrote in 2015, “It is ironic that the people setting out to teach the world the salutatory effects of … quietness celebrate Yoga Day without a thought for the noise that we generate every day.”

Someone else tweeted yesterday, after purchasing some ‘green’ firecrackers, that science “as always” (or something similar) provided the solution. But science has no agency: like a car, people drive it. It doesn’t ask questions about where the driver wants to go or complain when he drives too rashly. And in the story of fixing Delhi’s air pollution, the government has driven the car like Salman Khan.