publishing – Close Read

Poonam Pandey and peer-review

One dubious but vigorous narrative that has emerged around Poonam Pandey’s “death” and subsequent return to life is that the mainstream media will publish “anything”.

To be sure, there were broadly two kinds of news reports after the post appeared on her Instagram handle claiming Pandey had died of cervical cancer: one said she’d died and quoted the Instagram post; the other said her management team had said she’d died. That is, the first kind stated her death as a truth and the other stated her team’s statement as a truth. News reports of the latter variety obviously ‘look’ better now that Pandey and her team said she lied (to raise awareness of cervical cancer). But judging the former news reports harshly isn’t fair.

This incident has been evocative of the role of peer-review in scientific publishing. After scientists write up a manuscript describing an experiment and submit it to a journal to consider for publishing, the journal editors farm it out to a group of independent experts on the same topic and ask them if they think the paper is worth publishing. (Pre-publishing) Peer-review has many flaws, including the fact that peer-reviewers are expected to volunteer their time and expertise and that the process is often slow, inconsistent, biased, and opaque.

But for all these concerns, peer-review isn’t designed to reveal deliberately – and increasingly cleverly – concealed fraud. Granted, the journal could be held responsible for missing plagiarism, and the journal and peer-reviewers both for clearly duplicated images and entirely bullshit papers. However, blaming peer-review for, say, failing to double-check findings when the infrastructure to do so is hard to come by would be ridiculous.

Peer-review’s primary function, as far as I understand it, is to check whether the data presented in the study support the conclusions drawn from the study. It works best with some level of trust. Expecting it to respond perfectly to an activity that deliberately and precisely undermines that trust is ridiculous. A better response (to more advanced tools with which to attempt fraud but also to democratise access to scientific knowledge) would be to overhaul the ‘conventional’ publishing process, such as with transparent peer-review and/or paying for the requisite expertise and labour.

(I’m an admirer of the radical strategy eLife adopted in October 2022: to review preprint papers and publicise its reviewers’ findings along with the reviewers’ identities and the paper, share recommendations with the authors to improve it, but not accept or reject the paper per se.)

Equally importantly, we shouldn’t consider a published research paper to be the last word but in fact a work in progress with room for revision, correction or even retraction. Doing otherwise – as much as stigmatising retractions for reasons not related to misconduct or fraud, for that matter – may render peer-review suspect when people find mistakes in a published paper, even when the fault lies elsewhere.

Analogously, journalism is required to be sceptical, adversarial even – but of what? Not every claim is worthy of investigative and/or adversarial journalism. In particular, when a claim is publicised that someone has died and a group of people that manages that individual’s public profile “confirms” the claim is true, that’s the end of that. This is an important reason why these groups exist, so when they compromise that purpose, blaming journalists is misguided.

And unlike peer-review, the journalistic processes in place (in many but not all newsrooms) to check potentially problematic claims – for example, that “a high-powered committee” is required “for an extensive consideration of the challenges arising from fast population growth” – are perfectly functional, in part because their false-positive rate is lower without having to investigate “confirmed” claims of a person’s death than with.

The overlay bias

I’m not very fond of some highly popular pieces of writing (I won’t name them because I’m nervous about backlash from authors and/or their supporters) because a part of their popularity is undeniably rooted in technological ‘solutions’ that asymmetrically promote work published in the solution’s country of origin.

My favourite example is Pocket, the app that allows users to save copies of articles to read later, offline if required. Not long ago, Pocket introduced an extension for the Google Chrome browser (which counts hundreds of millions of users) such that every time you opened a new tab, it would show you three articles that lots of other Pocket users have read and liked. It’s fairly brainless, ergo presumably non-malicious, and you’d expect the results to be drawn evenly from magazines, journals, etc. published around the world.

However, nine times out of ten – but often more – I’d find articles by NYT, The Atlantic, The Baffler, etc. there. I was reluctant to blame Pocket at first, considering their algorithm seemed too simple, but then I realised Pocket was just the last in a long line of other apps and algorithms that simply amplified existing biases.

Before Pocket, for example, there might have been Twitter, Facebook or some other platform that allowed stories from some domains (nytimes.com, thebaffler.com, etc.) to persist for longer on users’ feeds because they were more easily perceived to be legitimate than articles from other sources, say, a Venezuelan newspaper, a Kenyan blog, a Pakistani magazine or a Vietnamese journal. Or there might have been Nuzzle, which auto-compiles a digest of the articles that your friends on social media have shared most – likely unmindful of the fact that people quite often share headlines, or domains they’d like to be known to be reading, instead of the articles themselves.

This is a social magnification like the biological magnification in nature, whereby toxic substances pile up in greater quantities in the gizzards of animals higher up in the food chain. Here, perceptions of legitimacy and quality accumulate in greater quantities in the feeds and timelines of people who consume, or even glance through, the most information. And this way, a general consciousness of what’s considered desirable erects itself without anything drastic, with just the more fleeting and mindless actions of millions of people, into a giant wheel of information distribution that constantly feeds itself its own momentum.

As the wheel turns, and The Atlantic publishes an article, it doesn’t just publish a good article that draws hundreds of thousands of readers. It also rides a wheel set in motion by American readers, American companies, American developers, American interests and American dollars, with a dollop of historical imperialism, that quietly but surely brings the world a good article plus a good-natured reminder that The Atlantic is good and that readers needn’t go looking for anything else because The Atlantic has them covered.

As I wondered in 2017, and still do: “Will my peers in India have been farther along in their careers had there been an equally influential Indian for-publishers tech stack?” Then again, how much is one more amplifier, Pocket or anything else, going to change?

I went into this tirade because of this Twitter thread, which describes a similar issue with arXiv – the popular preprint repo for physical sciences, computer science and applied mathematics papers (don’t @ me to quibble over arXiv’s actual remit). As the tweeter Jia-Bin Huang writes, the manuscripts that were uploaded last – i.e. most recently – to arXiv are displayed on top of the output stack, and what’s displayed on top of the stack gets more citations and readership.

This is a very simple algorithm, quite like Pocket’s algorithm, but in both cases they’re algorithms overlaid on existing bias-amplifying architectures. In a sense, they’re akin to the people who might stand by and watch a lynching, neither egging the perpetrators on nor stopping them. If the metaphor is brutal, remember that the effects on any publication or scientist that can’t infiltrate or ‘hack’ social biases are brutal as well. While their contents and their ideas might deserve international readership, these publications and scientists will need to spend more – energy, resources, effort – to grab international attention again and again.

The example Jia-Bin Huang cites is of scientists in Asia, who – unlike their American counterparts – can’t upload a paper on arXiv just before the deadline so that their papers sit on top of the stack because 2 pm in New York is 3 am in Taipei.

As some replies to the thread indicated, the people maintaining arXiv can easily solve the problem by waiting for the deadline to pass, then randomising the order of papers displayed in its email blast – but as Jia-Bin Huang notes, doing that would mean negating the just-in-time advantage that arXiv’s American users enjoy. So here we are.
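The randomisation suggested in those replies can be sketched in a few lines. This is only an illustration under stated assumptions, not arXiv's actual code: the `Submission` record, its fields and the `announcement_order` function are hypothetical names invented here, and the real listing pipeline may work quite differently.

```python
import random
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class Submission:
    # Hypothetical record for illustration; not arXiv's real data model.
    paper_id: str
    submitted_at: datetime


def announcement_order(submissions, deadline, seed=None):
    """Return the papers submitted by the deadline, in random order.

    Shuffling after the deadline means the time of upload no longer
    decides which paper sits on top of the list.
    """
    eligible = [s for s in submissions if s.submitted_at <= deadline]
    rng = random.Random(seed)  # a seed makes the order reproducible
    rng.shuffle(eligible)
    return eligible


if __name__ == "__main__":
    deadline = datetime(2024, 1, 1, 19, 0, tzinfo=timezone.utc)
    papers = [
        Submission("early-upload", datetime(2024, 1, 1, 8, 0, tzinfo=timezone.utc)),
        Submission("last-minute", datetime(2024, 1, 1, 18, 59, tzinfo=timezone.utc)),
        Submission("too-late", datetime(2024, 1, 1, 19, 30, tzinfo=timezone.utc)),
    ]
    for paper in announcement_order(papers, deadline):
        print(paper.paper_id)
```

Under this scheme a paper uploaded a minute before the cut-off and one uploaded eleven hours earlier have the same chance of appearing first.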

It isn’t hard to see how we can extend the same suggestion to the world’s Pockets and Nuzzles. Pick your millions of users’ thousand most-read articles, mix up their order – even weigh down popular American publishers if necessary – and finally advertise the first ten items from this list. But ultimately, until technological solutions actively negate the biases they overlie, Pocket will lie on the same spectrum as the tools that produce the biases. I admit fact-checking in this paradigm could be labour-intensive, as could relevance-checking vis-à-vis arXiv, but I also think these would be better problems to solve.
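A rough sketch of that recipe, to make it concrete: take the most-read pool, draw ten items at random, and give over-represented publishers a smaller weight. Everything here is an assumption for illustration, including the `Article` record, the `pick_recommendations` function and the `weights` mapping; it says nothing about how Pocket actually selects articles. The draw uses a standard weighted sampling-without-replacement trick (each item gets the key `u ** (1 / weight)` for a uniform random `u`, and the largest keys win).

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Article:
    # Hypothetical record for illustration only.
    title: str
    publisher: str
    reads: int


def pick_recommendations(articles, pool_size=1000, k=10, weights=None, seed=None):
    """Pick k articles at random from the pool_size most-read ones.

    `weights` maps a publisher to a positive weight; publishers not
    listed get 1.0. A weight below 1.0 makes that publisher's articles
    less likely to be picked, without excluding them.
    """
    weights = weights or {}
    rng = random.Random(seed)

    # Step 1: the most-read articles form the candidate pool.
    pool = sorted(articles, key=lambda a: a.reads, reverse=True)[:pool_size]

    # Step 2: give every candidate a random key, biased by its weight.
    keyed = []
    for article in pool:
        weight = weights.get(article.publisher, 1.0)
        if weight <= 0:
            raise ValueError("weights must be positive")
        keyed.append((rng.random() ** (1.0 / weight), article))

    # Step 3: the k largest keys are the recommendations.
    keyed.sort(key=lambda pair: pair[0], reverse=True)
    return [article for _, article in keyed[:k]]


if __name__ == "__main__":
    catalogue = [
        Article(f"Story {i}", "Big US Magazine" if i % 2 else "Other Outlet", reads=i)
        for i in range(5000)
    ]
    picks = pick_recommendations(catalogue, weights={"Big US Magazine": 0.5})
    for article in picks:
        print(article.publisher, "-", article.title)
```

How far to weigh a publisher down is an editorial choice, not a technical one; the code only makes that choice explicit and adjustable.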

The post-reporter era

One of the foundation stones of journalism is the process of reporting. That there is a messenger working the gap between an event and a story provides for news to exist and exist with myriad nuances attached to it. There are ethical and moral issues, technical considerations, writing styles, and presentation formats to perfect. The entire news-publishing industry is centered on the activities of reporters and streamlining them.

What the reporter requires the most is… well, a few things. The first is a domain of events, from which he picks issues to talk about. The second is a domain of stories, into which he publishes his reports. The third is a platform using which he may incentivize this process for himself, and acquire the tools with which he may publish his stories efficiently and effectively. The last entity is more commonly understood in the form of a publishing house.

The reason I’ve broken the working of a reporter into these categories is to understand what makes a reporter at all. Today, a reporter is most commonly understood in terms of an individual who is employed with a publishing house and publishes stories for them. Ideally, however, everyone is a reporter: simply the creation of knowledge by people based on experiences around them should be qualification enough. This calls into question the role of a publishing house: is it a platform working with which reporters may function efficiently, or is it an employer of reporters?

If it’s an employer of reporters, then any publishing house wouldn’t have to worry about where the course of journalism is going to take the organization itself. Reporters will have to change the way they work – how they spot issues, how they evolve their writing styles to suit their audiences, and so forth – but the publishing house will retain ownership of the reporters themselves. As long as it’s not a platform which individuals use to function as reporters, things are going to be fine.

Now, let’s move to the post-reporter era, where everyone is a reporter (of course, that’s an idealized image, but even so). In this world, a reporter is not someone who works for a publishing house – that aspect of the word’s meaning is left behind in the age of the publishing house. In this world, a reporter is someone who works simply as a messenger between the domains of events and stories, where the role of the publishing house as the owner of reportage is absent.

The nature of such a world throws light on the valuation of information. When multiple reporters cover different events and return to HQ to file their stories, the house decides which stories make the cut and which don’t on the basis of a set of parameters. In other words, the house creates and assigns a particular value to each story, and then compares the values of different stories to determine their destiny.

In the post-reporter era, which is likely to be occupied by channels of individual presentation – ranging from word-of-mouth to full-scale websites – houses that thrive today on the valuation of information and the importance the houses’ readers place on it will steadily fade out. What exists will be an all-encompassing form of what is known as citizen journalism (CJ) today. Houses take to CJ because of the mutually beneficial relationship available therein: the CJ gets the coverage and the advantage of the issue pursued no longer being under wraps; the reporter gets a story that has both civic/criminal and human-interest angles to it.

However, when the CJ voids the relationship by refusing the intervention of a publishing/broadcasting house, and chooses to take his story straight to the people through a channel he finds effective enough, the house-level valuation of stories is replaced by a democratic institution that may or may not be guided by a paternalistic attitude.

Therefore, if a particular house has to survive into the post-reporter era, it must discard issue-valuation as an engine and instead rely on some other entity, such as one represented by a parameter whose efficiency is a maximizable quantity. This can be conceived as a fourth domain which, upon maximization, becomes the superset of which the three domains are subsets.

A counter-productive entity in this situation is that of property, which is accrued in great quantities by a high-achieving house in the present but which delays the onset of change in the future. Even when the house starts to experience slightly rougher weather, its first move will be to pump in more money, thereby offsetting change by some time. Only when the amount of property invested in delaying change is considerable will the house start to consider other alternatives, by which time other competing organizations will have moved into the future.