Replies to the government’s concerns with our criticism of the DNA Profiling Bill

In response to the piece ‘Modi Wants the DNA Profiling Bill Passed Right Away. Here’s Why It Shouldn’t Be’, published July 24, 2015, Dr. J. Gowrishankar, Director of the Centre for DNA Fingerprinting and Diagnostics, wrote a spirited response describing the benign intentions behind the Bill and why there is a real need for it in India, where the criminal justice system is known to be tardy.

I agree with large sections of his response, but am disappointed that it doesn’t address any specific points of failure – especially the lacklustre privacy and accountability safeguards. This is also why I don’t ask for the Bill to be shredded but that it be referred to a Parliamentary Standing Committee (at least) before it can be tabled. The following is an unnumbered ‘listicle’ of my replies to Gowrishankar’s response.

That a part of the Indian Bill’s strength lies in having borrowed parts of laws from other countries, where DNA profiling has been around for more than a decade

The text of India’s Human DNA Profiling Bill may in large part be based on that from the USA, UK, Canada, etc., but many of the problems that the Bill could exacerbate are unique to India – such as the many privacy and accountability concerns highlighted in my article. Those parts of the Bill can’t be compared to what’s happening in the West. In fact, the USA, UK and Canada also have legislation in place that explicitly specifies how the DNA profiles can be collected, the best practices for storing and indexing them, as well as who can access them, in what circumstances and how. The DNA Identification Act 1994 (USA) specifies that all federally supported DNA labs comply with operational standards for collection, storage and analysis set by the FBI. The Criminal Justice and Public Order Act 1994 does the same in the UK. The DNA Identification Act 1998 (Canada) also does the same and further requires a periodic review of itself every five years.

That DNA profiling has a steadfast record in being able to solve disputes and that my skepticism of it is misplaced

Yes, DNA-profiling has a fabulous track record in settling disputes. However, the drafting committee, as well as anyone interested in the Bill’s tabling, would do well to learn from the mistakes of those who have been systematically pressing DNA-profiling to the resolution of civil and criminal disputes in modern times. I am skeptical of the technique – as I’m skeptical of all techniques – so I’ve asked that the Bill be cognisant of the various statistical blips and prescribe best practices to eliminate them. As I write in my article: “This isn’t to say that a reliable [match] can never be arrived at, but only that the draft Bill does not have the commensurate depth required to identify and tackle the sort of statistically motivated mistakes in DNA profiling. In fact, it also abdicates itself from specifying any best practices for the collection, storage and analysis of DNA samples…”

That only identity-neutral information derived from a person’s DNA will be stored in the database

The Bill doesn’t say this. As far as the draft document is concerned, the contents of the database are profiles – not identity-neutral profiles, just profiles. I respect your attitude to privacy but I only ask that it be reflected fully in the Bill as well.

That a database of DNA profiles will only contain the profiles of offenders, missing persons, unidentified bodies and volunteers and that its regulation will, beyond the Bill’s sanctions, require judicial oversight

Bringing criminals to justice faster is a good aspiration to have, but it must be done not at the expense of anybody’s privacy and definitely while the government’s actions – in the form of the Board’s – are always accountable. On the question of retention: it’s understandable if you want to store the profiles of those who are repeat offenders – but why indefinitely? The law in the UK stipulates that profiles can be retained for a maximum of six years. And what’s the rationale behind storing the profiles of those who have been sentenced for life or to death?

That the Board has been given discretionary powers to empower them to keep up with advances in DNA profiling, and that the Board will be staffed by, for example, the Chairperson of the NHRC

Those staffing the Board may be upstanding folk but the Bill has a responsibility to account for the worst of times as well. I don’t want to have to keep a check on who’s on the Board and who’s not – I want the Bill to provide guarantees once and for all that things won’t go wrong. Please also note that the Bill is scheduled to be introduced at a time when the country’s leadership is unwilling to accept that the right to privacy is a fundamental right, at a time when the Central government insists on interfering in the management of highly regarded public institutions. I can only read the Bill’s intentions through the lens of the government that will enact and, ultimately, be responsible for enforcing it.

That the DNA profiles’ database will contain only digital information and not the physical samples from which the data has been derived

I have already stated that setting up the Indian database will incur a one-time cost of Rs.20 crore. And on the other hand, I would like you to explain who will pay for acquiring the DNA profiles at costs that could well run into thousands of crores. In fact, the Bill does not contain the word ‘cost’ in it and seems unconcerned about how its implementation will be funded.

Next, on the question of whether the DNA database will store the physical samples from which the profiles will be derived: Usha Ramanathan – a researcher and advocate who was a dissenting member of the Bill’s drafting committee – has revealed an email communication she had with Gowrishankar dated June 25, 2014, in which he states the following:

“On your question of destruction of DNA collected from the relatives, I wish to state that the CDFD has so far not destroyed any DNA sample received by it since its inception. These samples are being maintained in safe custody in the institute. Once again, it is my assessment that the policy on such destruction needs to be developed and evolved by the proposed DNA Profiling Board.”

As a result, could the costs be comparable to the NDNAD in the UK?

That my criticism has cherry-picked facts from the Bill

I have cherry-picked facts, but never out of context (that’s the reason the article runs into 4,000 words). I still want a Human DNA Profiling Bill to be passed and agree with you that it has benefits – but it gets to them at a great cost. That’s why I’d like to repeat my statement that the Bill be referred to a Parliamentary Standing Committee, and its niggling as well as substantial issues be resolved to everyone’s satisfaction, before it’s tabled.

The Wire
July 25, 2015

Featured image credit: stewdean/Flickr, CC BY 2.0.

Here’s why the Human DNA Profiling Bill shouldn’t be passed in its current form

The Human DNA Profiling Bill which the Narendra Modi government wants to pass in the current session of Parliament is one of the most intrusive enactments of its kind anywhere in the world, a measure that will render obsolete the national debate on privacy before it has even begun.

Drafted by the Department of Biotechnology (DBT) in the Ministry of Science & Technology, the Bill’s pithy title belies the ambitious, even disturbing, goals that its text envisions. To be sure, that it was drafted at the outset to expedite civil and criminal disputes where possible, to help identify the unclaimed dead, and to track down missing persons is a benign, even desirable, intention to have. Where it fails is in situating this agenda in an accountable and secure framework of rules.

Once passed, the law will set up a national DNA database, a DNA Profiling Board and a mechanism for the use of DNA profiles to resolve criminal and civil disputes with few safeguards to guard against the abuse of this information.

For example, in the Bill, a version of which The Wire was able to access, the Board gives itself wide-ranging discretionary powers about whose name gets into the database (sometimes without consent), who gets to access the DNA profiles, what the database could be used for (“population” studies), and who watches the watchers (in a word, nobody) – readying a potent cocktail of abuse.

The Bill is set to be tabled in the monsoon session of Parliament, which began on July 21. But that could be too soon given the scope and seriousness of the issues the draft raises. The proposed law’s failures broadly have four facets – reliability, costs, privacy and accountability – and, if passed in its current form, it could gravely jeopardise the integrity of sensitive biological information as well as poison the criminal justice system with a false conviction of judicial infallibility. In the absence of a reason to expedite its passing, the draft Bill could instead be referred to a Parliamentary Standing Committee before it’s tabled.

DNA profiling

Credit: johnnieb/Flickr, CC BY 2.0.

After human fingerprints were pressed into the service of criminal investigations in 1892, DNA profiles have been the only other biological marker discovered by scientists to be unique to each individual. Since fingerprints at a crime scene can be easily obfuscated, or not left behind at all, and it is almost impossible for a criminal to not leave behind a clue bearing his or her DNA, DNA profiling has assumed great importance in modern forensic science.

Every cell of the body contains a copy of the DNA molecule, a total of three billion base pairs of smaller molecules called nucleotides neatly arranged into structures called chromosomes. Consider this a giant word with three billion letters. Some 99.9% of those letters are identical for every individual – but that 0.1% difference amounts to three million letters that are arranged in a different configuration. Among them, there are parts that contain a short combination of letters repeated a few times. These are called short tandem repeats (STRs), and the frequency of their repetition differs from person to person so much so that no two (known) people have the same DNA overall – unless they’re identical twins or closely related. Identifying this difference forms the basis of DNA profiling, also known as DNA fingerprinting.

The idea of the Bill was first mooted by the DBT in 2003, during the National Democratic Alliance government of Atal Bihari Vajpayee. In 2007, the DNA Profiling Advisory Committee, which had been put together by the DBT, developed the Human DNA Profiling Bill 2007 that has seen changes between 2007 and 2012. In January 2013, a committee of experts was formed to scrutinise the 2012 draft: J. Gowrishankar, Director, CDFD; R.K. Gupta, adviser (C&I), Planning Commission; Jacob P. Koshy, science writer, Mint; Kamal Kumar, retd. IPS, retd. DGP of Hyderabad; C. Muralikrishna Kumar, senior adviser (ICT), Planning Commission; Usha Ramanathan, researcher and advocate; T.S. Rao, adviser, DBT; N. Madhusudan Reddy, staff scientist, CDFD; Raghbir Singh, fmr. Secy., Ministry of Law; Alka Sharma, Director, DBT.

Till late 2014, the committee continued to deliberate and make changes to the draft Bill. Then, it was circulated within the Ministry of Science & Technology for comments, which were then incorporated in the draft.

By January 2015, the revised document had wound its way to the Legislative Department of the Ministry of Law & Justice. According to DBT Secretary K. VijayRaghavan, the department has now finished drafting the Bill and “processed it further for the necessary approval”.

In the same period, 2003-2015, the Central and various state governments have toyed with the idea of collecting and storing DNA profiles. Notably, the Tamil Nadu government sought to amend the Prisoners Identification Act 1920 intending to set up a database of prisoners’ profiles. In 2012, the Uttar Pradesh government made it mandatory for the DNA profiles of dead persons to be saved along with the postmortem.

Although the draft Bill banks on an amendment to the Criminal Procedure Code made in 2005 – to allow DNA evidence to be admissible in a court – its principal and most problematic feature is the central repository it envisages of DNA profiles belonging to crime suspects, criminal offenders, missing persons, unknown deceased persons, and volunteers.

Its contents and operation will be managed by a DNA Profiling Board and a Databank Manager that the Board will appoint, who altogether have too many discretionary powers that drag the credible parts of the document down. These parts include useful mechanisms such as for post-conviction DNA-testing (where a conviction can be overturned by allowing the defendant to appeal for a DNA test).

Overall, the draft Bill has four major flaws:

  1. Reliability of DNA profiling
  2. Visible and hidden costs
  3. Privacy and anonymisation
  4. Power and sunset clauses

I. Reliability of DNA profiling

Credit: kaibara/Flickr, CC BY 2.0.

What are the chances you’ll be killed in an airline accident? There is a number ascribed to this high-cost enterprise, and it is calculated using statistics because it’s hard to estimate how the failure of one of thousands of the components constituting it will or won’t precipitate the failure of the overall entity. So, the chances that you’ll be killed in an airline accident are 1 in 4.7 million. That means if 4.7 million flights are undertaken, one of them will result in a fatal accident, right? Not exactly, because the chances of an accident could be significantly increased if certain components of an aircraft fail, and engineers are not aware of all such precipitant failures.

Analysing the DNA of an individual to look for clues about her/his identity is subject to similar stochastic caveats. This is because, despite the many unique properties of the DNA molecules in our bodies, our ability to preclude errors in indexing them isn’t perfect. The implication is that DNA profiling throws up fewer errors when validating or invalidating less systematic proof, but there are errors nonetheless that a law – and definitely a court interpreting that law – must be aware of.

Moreover, the proofs are also dependent on how rarely or often the STRs have been observed in the past. Estimates of their rarity are based on studying some preset locations on the DNA: the CODIS database of DNA profiles in the US looks at 13 locations, the NDNAD in the UK looks at 10, whereas Interpol analyses look at 12. The CDFD (Centre for DNA Fingerprinting and Diagnostics) – the nodal agency for DNA analysis in the country – plans to look at 17, according to Dr. J. Gowrishankar, its director. These locations were determined to be important in the early days of DNA forensics, and according to lawyers in the US and UK are overdue for a reexamination.

The Human DNA Profiling Bill, on the other hand, is dismissive of this aspect of the technique it is centred on, with its January 2015 draft saying in its introduction that DNA profiling can distinguish between any two people “without a doubt”. The words give the impression that the experts involved in drafting it have no reason to believe that DNA profiles could ever be fallacious. In fact, conspicuously missing from the document are the statistical procedures (performed on DNA information) that will be admissible as evidence in a court of law.

Speaking to The Wire, Gowrishankar clarified that the three words “without a doubt” had been removed from the draft Bill in a later iteration – but only because the Bill would be tabled without that part in Parliament. However, he also added that he would be able to defend the infallibility of the technique.

In 2009, New Scientist reported the case of Charles Richard Smith. Smith was convicted of a sexual assault on Mary Jackson (not her real name) in Sacramento, California, which took place in January 2006. Jackson was sitting in a parking lot when a stranger jumped into her truck and made her drive to a remote location before forcing her to perform oral sex on him. When police arrested Smith and took a swab of cells from his penis, they found a second person’s DNA mixed with his own.

Mark Henderson’s 2012 book The Geek Manifesto: Why Science Matters elaborates on what happened during Smith’s trial (p. 158):

… a forensic scientist testified that the chances that the sample did not come from Jackson were just 1 in 95,000. Smith was convicted and jailed for 25 years. Genetic evidence, however, can be analysed in multiple ways. The analyst who provided the 1 in 95,000 number was convinced that he saw reliable ‘peaks’, indicating matches, at most of the 13 places in the genome where American forensic scientists compare DNA. His supervisor, whose evidence was also presented, thought fewer of these matches were reliable, and so put the probability that the DNA wasn’t Jackson’s at 1 in 47. A subsequent review of the case used a different technique, based on a computer algorithm, to compare the likelihood of the different interpretations of the evidence advanced by the prosecution and the defence. This suggested that this pattern of evidence was only twice as likely if the DNA was Jackson’s than if it belonged to someone else.

This isn’t to say that a reliable estimate can never be arrived at, but only that the draft Bill does not have the commensurate depth required to identify and tackle the sort of statistically motivated mistakes in DNA profiling. In fact, it also abdicates itself from specifying any best practices for the collection, storage and analysis of DNA samples – while in countries like the UK and USA, a more matured approach to DNA profiling has been instituted through laws like the DNA Identification Act 1994 (USA), the Criminal Justice and Public Order Act 1994 (UK) and the DNA Identification Act 1998 (Canada).

According to Gowrishankar, “The Bill has been drafted keeping the future in mind, so we have not included the different ways in which the information can be analysed. We want to keep our options open,” and that it was up to the defence attorneys to refute findings.

The upper hand that DNA profiling claims in being able to identify a person is bifurcated: it simultaneously relies on being similar to one set of data and being dissimilar to another. And how much a profile is closer to one and farther from the other can be interpreted in many ways – all of them reliant on a control group, a reference point based on which the analyst can say how much similarity and dissimilarity a profile exhibits. This control group is defined by a sub-database that contains the DNA profiles of volunteers. Gowrishankar said that the significance of each match (or mismatch) will be determined relative to how unique the ‘letters’ in the profiles are. As a result, the size of the volunteers’ database plays a critical role in determining the outcome of cases.

In 2007, the noted legal experts Michael Saks and Jonathan Koehler presented a problem called the individualisation fallacy that arises when examiners confuse infrequency with uniqueness – a flaw that can be eliminated (to a certain extent) only by enlarging the control, i.e. volunteers’, database. For example, if an anomalous pattern in the DNA of a person has a one-in-a-quintillion chance of occurring (based on its frequency of occurrence among the volunteers), the examiner will assert that given the population of all the people on Earth only that person’s DNA has that pattern (absolute uniqueness). However, the examiner assumes wrongly that he/she is aware of all the sources of that anomaly in human genetics (relative uniqueness). A similar mix-up between the two kinds of uniqueness results in the prosecutor’s fallacy exemplified in the infamous Sally Clark case of 1999.
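The gap between a rare pattern and a unique one can be made concrete with a little arithmetic. The sketch below is only illustrative: the match probability and the population size are hypothetical numbers picked for this example, not figures from the draft Bill or the CDFD, and it assumes that anyone in the population is equally likely to be the source before the DNA is examined and that the true source always matches.

```python
# Illustrative sketch of the prosecutor's fallacy. All numbers are hypothetical
# assumptions for this example, not figures from the draft Bill or the CDFD.

def expected_coincidental_matches(match_probability, population):
    """Expected number of people other than the true source who match by chance."""
    return match_probability * (population - 1)


def probability_suspect_is_source(match_probability, population):
    """P(suspect is the source | profile matches), via Bayes' theorem.

    Assumes a uniform prior (anyone in the population could be the source)
    and that the true source always produces a match.
    """
    prior = 1 / population
    matched_and_source = prior * 1.0
    matched_and_not_source = (1 - prior) * match_probability
    return matched_and_source / (matched_and_source + matched_and_not_source)


random_match_probability = 1e-6   # hypothetical: a 1-in-a-million pattern
population = 1_250_000_000        # hypothetical: roughly India's population

others = expected_coincidental_matches(random_match_probability, population)
posterior = probability_suspect_is_source(random_match_probability, population)

print(f"Expected coincidental matches: {others:.0f}")
print(f"Chance the matching suspect is the source: {posterior:.4%}")
```

Under these assumptions a one-in-a-million pattern would still be shared by about 1,250 other people, so a match on its own leaves the chance that the suspect is the source at less than 0.1%. Reading "one in a million" as the chance that the suspect is innocent is the mistake the fallacy describes.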

Another issue that worsens reliability of results is that the draft bill doesn’t explicitly ask to regularly check if any samples have been contaminated, even if it goes to some length to talk about what will happen to those who are found damaging samples in any way. How credible those sanctions are is a different matter. In at least one high-profile human rights case, the murder of five Kashmiri civilians at Pathribal in 2000, DNA samples were tampered with in an attempt to absolve the security forces of the charge of murder. The police officer who orchestrated the tampering was never punished.

II. Visible and hidden costs

Credit: Wikimedia Commons

The CDFD charges Rs.5,000 for each blood sample or person and Rs.10,000 for each “forensic exhibit” – such as an item of clothing from a crime scene – and an additional 12.36% as service charge levied by the Government of India. Though the draft Bill proposes including the profiles of only those under the scanner of the criminal justice system, data from the National Crime Records Bureau shows that over 32.7 lakh people were arrested in 2012 alone on criminal charges (proven and unproven). And while Gowrishankar said the official estimates were Rs.5 crore a year for keeping the database updated, acquiring the DNA profiles alone would cost more than Rs.1,800 crore.
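The acquisition figure follows from back-of-the-envelope arithmetic. The sketch below assumes, purely for illustration, that every person arrested in 2012 is profiled once at the CDFD's per-sample rate plus the service charge; the function and variable names are mine and appear nowhere in the Bill.

```python
# Back-of-the-envelope estimate. Assumes one blood-sample profile per person
# arrested in 2012, which is an illustrative simplification.

LAKH = 100_000
CRORE = 10_000_000

arrests_2012 = 32.7 * LAKH   # NCRB figure quoted in the article
rate_per_sample = 5_000      # Rs., CDFD charge per blood sample or person
service_charge = 0.1236      # 12.36% levied on top of the base rate


def acquisition_cost_crore(people, rate, charge):
    """Total cost, in crore of rupees, of profiling `people` individuals once."""
    return people * rate * (1 + charge) / CRORE


cost = acquisition_cost_crore(arrests_2012, rate_per_sample, service_charge)
print(f"Rs.{cost:,.0f} crore")
```

On these assumptions the total comes to roughly Rs.1,837 crore, before any forensic exhibits (charged at twice the rate) are counted.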

The number of 32.7 lakh (even if only for reference) is too bloated for the database’s purposes because it also includes persons accused of minor crimes. Even if the size of the database has to be as big as possible to minimise the effects of the individualisation fallacy, its size becomes meaningless after a point, as the British government discovered in 2008. In that year, the number of profiles on the NDNAD jumped from 1.9 million to 4.1 million but the number of cases solved by the use of DNA profiles fell by 2,632 to 17,614. This was because the 2.2 million profiles were almost entirely of people who hadn’t been charged with any offences, making their DNA profiles irrelevant when it came to comparing those picked up from crime scenes. Similarly, the draft Bill would do well to include only the profiles of those charged with serious criminal offences – comparisons would be more efficient and costs would be lower.
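The British numbers can be restated as a yield per profile to show how little the extra entries contributed. This is a rough sketch using only the figures quoted above; the "cases solved per 1,000 profiles" measure is my own illustrative metric, not one reported by the British government.

```python
# Figures are those quoted in the article for the UK's NDNAD. The per-profile
# rate is an illustrative metric introduced for this example.

profiles_before = 1_900_000
profiles_after = 4_100_000
cases_solved_after = 17_614
drop_in_cases = 2_632

# The article gives the fall and the final count; recover the earlier count.
cases_solved_before = cases_solved_after + drop_in_cases


def solved_per_thousand_profiles(cases, profiles):
    """Cases solved with DNA evidence per 1,000 profiles held."""
    return cases / profiles * 1_000


rate_before = solved_per_thousand_profiles(cases_solved_before, profiles_before)
rate_after = solved_per_thousand_profiles(cases_solved_after, profiles_after)

print(f"Before: {cases_solved_before:,} cases, {rate_before:.1f} per 1,000 profiles")
print(f"After:  {cases_solved_after:,} cases, {rate_after:.1f} per 1,000 profiles")
```

By this measure the yield falls by more than half, from about 10.7 to about 4.3 cases per 1,000 profiles, even as the database more than doubles in size.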

Next, according to GeneWatch UK: “In 2010, putting someone’s DNA profile on the database in England and Wales was estimated to cost £30 to £40 and storing one person’s DNA sample was estimated to cost £1 a year.” The CDFD analysis rates are comparable to these numbers – so it must be noted that the capital costs of setting up the database in the UK were £300 million (Rs.3,000 crore approx.). Third, there is the operational cost – to maintain the communication and security infrastructure, and ensure it is compatible with indices like the CODIS. In fact, in September 2014, the FBI and the CDFD signed an agreement to install an instance of CODIS in CDFD’s Hyderabad office and train the personnel there. However, Gowrishankar said all of this would warrant only Rs.20 crore.

None of these expenses are mentioned in the draft Bill.

III. Privacy and anonymisation

Credit: home_of_chaos/Flickr, CC BY 2.0.

A person’s DNA profile contains information similar to a person’s password – however, it is more visceral. In the mammoth spatial configuration of the DNA’s atoms are encoded many of our characteristics and personal tendencies – including colour, race, behavioural features and susceptibility to some diseases. However, the few of the three million positions that the CODIS, NDNAD or the CDFD will be looking at are considered “neutral” – they don’t codify any of our features that might give our identities away, so it’s safe to store them without being anxious about what the government is finding out about us. That’s what Gowrishankar says, too, and that only information of those 17 positions that the CDFD will consider will be stored in the database.

However, this information is missing in the draft Bill, giving the impression that non-neutral information from people’s DNA profiles will be stored as well – and sans any safeguards beyond the Bill itself, like the USA has the Genetic Information Nondiscrimination Act 2008. Gowrishankar said that the Bill omitted this detail because some advancement in the future could require analysing more than 17 neutral positions, or fewer, or others altogether, and that if the Bill had been specific to that extent, it would have to be modified over and over again to keep up with the times. Be that as it may, the draft Bill in its current form neither withholds the database from holding distinctly personal information nor does it acknowledge that possibility.

In that context, the information should be accorded the same protections as information on the Internet, or anywhere else, if not more. First, a person should be able to appeal the inclusion of her/his DNA profile in the database – although Gowrishankar insisted no profile could mistakenly enter the database as it would require either a court order or an expression of consent to get there. Second, the person should be able to access her/his own DNA profile whenever the need arises through appropriate legal channels – which he said wouldn’t be possible at all. Third, the person whose profile is under scrutiny should be able to know how the information contained is being used and why, and to ascertain its deletion when due. These three rights are missing in the draft bill.

Moreover, in a separate note, the committee says,

The Expert Committee also discussed and emphasised that the Privacy Bill is being piloted separately by the Government. That Bill will override all the other provisions on privacy issues in the DNA Bill.

But even as the draft DNA-profiling bill seeks to deflect the responsibility of securing privacy to the Privacy Bill, the Report of the Group of Experts on Privacy, chaired by Justice A.P. Shah (former Chief Justice of the Delhi High Court), explicitly set out the missing privacy and security provisions in October 2012, and a majority of them remain unresolved or unaddressed. By neglecting them, the CDFD and the DNA Profiling Board run the risk of turning themselves opaque and, for all practical purposes, unaccountable. For example, the draft Bill does not:

  1. Provide a notice that DNA samples were collected from such-and-such areas of the body
  2. Inform anybody – particularly the individual – if and when her/his DNA is contaminated, misplaced or stolen
  3. Inform a person if a case involving her/his DNA is pending, ongoing or closed
  4. Inform the people when there are changes in how their DNA is going to be accessed, or if the way their DNA is being stored or used is changed
  5. Distinguish between when DNA can be collected with consent and when it can’t
  6. Say how volunteers can contribute their DNA to the database even though the draft Bill has a provision for voluntary submissions
  7. Provide any explicit guarantee that the collected DNA won’t be used for anything other than circumstances specified in the Bill
  8. Specify when doctors or the police can or can’t access DNA profiles

Without these protections, the DNA profiles could be collected for one purpose but end up being used for something else. Consider #7 – the draft Bill doesn’t aspire to be self-contained and leaves itself open to expanding in the future. At one point (Sec. 31(4)), it spells out the various indices according to which profiles in the database will be stored:

Every DNA Data Bank shall maintain following indices for various categories of data, namely:

(a) a crime scene index;
(b) a suspects’ index;
(c) an offenders’ index;
(d) a missing persons’ index;
(e) unknown deceased persons’ index;
(f) a volunteers’ index; and
(g) such other DNA indices as may be specified by Regulations.

Why bother to specify any of the indices at all if the committee has (g)? And without specifying what regulations those could be and who, apart from the DNA Profiling Board, has the authority to spell them out, the draft Bill signals it could just about bring anyone’s DNA profiles into the database.

Additionally, who will watch the watchmen? The DNA Profiling Board is tasked – rather tasks itself – with determining which DNA profiles enter the database, who gets to access them, and how the database will be organised and maintained, in effect establishing a low quality check over itself. Although Gowrishankar clarified that there would be a Parliamentary check on the Board’s activities and that Parliament would be the ultimate arbiter for all “major” issues arising due to the Bill, there is still a lack of supervision – and potential for abuse – in the day-to-day dispensation of duties. If the Human DNA Profiling bill has to be effective and honest, it must account for the privacy shortcomings described by the Group of Experts.

Another concern is anonymisation – the process through which information contained in DNA profiles can’t be used to retrace the individuals from whom they were acquired. There is no description of a form or application of any kind that the draft Bill expects to be submitted along with the materials containing human DNA. If the Bill expects to use the form currently being used by the CDFD, there is an anomaly: the CDFD form asks for the applicant to mention her caste. Even if the draft Bill doesn’t explicitly mention that the database will have a ‘caste’ column, being able to associate an application form with a sample – and therefore ‘its caste’ – is plausible, especially in the volunteers’ database.

More troublingly, Section 31(6)(a) states that a DNA profile in the database will bear the identity of its source if its source is an offender, and that (b) all other DNA profiles will be relatable with the case reference number. The problem is that the case reference is not anonymised with respect to the people involved in the case.

IV. Power and sunset clauses

Credit: manoftaste-de/Flickr, CC BY 2.0.

The DNA Profiling Board overseeing the implementation of the bill (when enacted) has given itself, and the bill, some conflicting rules and powers that together result in ambiguity about the scope of the bill and its accountability. Some examples:

Conflicts of interest – Section 12(k) states that the board is responsible for “making recommendations for maximising the use of DNA techniques and technologies in administration of justice”. Then, throughout the bill, the board’s powers are also detailed as extending to specifying the rules for how DNA information is collected and secured. Put them together and the board’s essentially saying, “We’ll try to use DNA evidence for as many things as possible, we’ll decide how the information is collected for those purposes, and we’ll decide how we’ll use it.”

Ex post facto implication – Section 13 states that any laboratory that wishes to undertake human DNA-profiling must get prior consent from the board. Then, Section 14(2) allows any DNA laboratory that’s in existence at the time the bill is enacted to perform human DNA profiling without prior approval from the board.

Use of profiles – Section 39(g) states that “Information relating to DNA profiles, DNA samples and records relating thereto shall be made available” to a slew of judicial and executive agencies as well as “for any other purposes, as may be prescribed”. However, those prescriptions have not been detailed in the Bill, and appear to be at the discretion of the DNA Profiling Board. In fact, Section 39(e) states that the profiles, and “samples and records relating thereto”, may be used for creating a “population statistics” database. This is to facilitate population-wide studies of genetic characteristics, and in the absence of perfect anonymisation, could potentially become associated with caste data.

Moreover, Section 35(2), which deals with the communication of DNA profiles to foreign states and institutions, doesn’t limit it to offenders and convicts but, by not discussing it in detail, allows for any profile in the database to be shared. Put this together with an individual’s inability to appeal the inclusion of her/his profile, and anyone’s profile – as long as it has wound its way into the database – can be shared with foreign entities. There are also no restrictions on whether the foreign agencies can index the profile in another database.

Legal recourse after three months – Someone who’s been wronged by any of the provisions of the bill can approach a court only if he/she approaches the board first and gives it three months to act on a complaint. In those three months or before that, Section 57(1) of the bill prevents anyone from approaching the courts except the central government or a member of the board itself.

Finally, there’s the absence of a sunset clause: the draft Bill doesn’t say when its provisions will expire, or whether there is a period after which a DNA profile will be removed from the database. On the latter, the draft Bill specifies that if a person has been acquitted in a case or if the case is set aside, the corresponding profile will be deleted, but nothing is said about the profiles of missing persons who have been identified, volunteers who have died, and other profiles that are likely to be collected at crime scenes. Moreover, no rationale is presented for retaining the profiles of those who are convicted of offences like rape or murder, who end up spending long years or a lifetime in prison. While Gowrishankar asserted that only the DNA profiles of the unidentified dead would be held forever, the draft Bill does not explicitly exclude the rest.

Given the scale of issues with the draft Bill, and its potentially disastrous sidelining of privacy concerns, its scheduled introduction in the monsoon session of the Lok Sabha seems hurried – despite having first been mooted more than a decade ago. Some of the issues may have escaped the drafting committee’s concerns by way of not having received appropriate feedback – such as the issue of hidden costs – but the committee must explain why there is a lack of access to data of the people by the people, why there are no sound anonymisation protocols, and why there are insufficient self-regulation and protection measures.

Download an annotated copy of the Human DNA Profiling Bill draft here (PDF).

The Wire
July 24, 2015

Petition asks why Aadhar is a must to unlock Modi’s DigiLocker

The Wire
July 4, 2015

New Delhi: Of all the schemes of the previous Manmohan Singh government, the Aadhar UID program is the one that Narendra Modi is most committed to. On July 1, he flagged off a ‘digital locker’ service for the country as part of his Digital India initiative. According to its website, DigiLocker provides each user with 10 MB of storage space on the web to store and share files, and serves as a hub from which to access various government documents.

The catch? A user can only sign up using an Aadhar UID number. Lawyers say this is a violation of the Supreme Court’s 2013 order prohibiting the government from making Aadhar compulsory for accessing any public service. Upset by the denial of the DigiLocker facility to those without a UID, Sudhir Yadav has filed a petition in the Supreme Court calling for “exemplary punishment” of those responsible for this. The DigiLocker service comes from the Department of Electronics and Information Technology, under the Ministry of Communications & IT. It offers 10 MB of free storage (with an upgrade to 1 GB hinted at), allows pdf, jpg, jpeg, png, bmp and gif file formats, and stipulates that no single file can be larger than 1 MB. This paltry storage offering is, however, overshadowed by a bigger concern.

While Modi has frequently used social media as part of his communication strategy, as well as exhibited some appreciation of technology in his governance, he has stayed away from pushing through legislation on the privacy of public data.

In the DigiLocker initiative, there is no clarity about whether the government can access the information stored in the lockers, even as a technical document accompanying the release states: “It is important to mandate use of Aadhaar number in all resident documents to strongly assert ownership”. Troublingly, the same document goes on to say, “… some document types may be available to ‘trusted’ requesters without electronic authentication and authorisation of the owner (a simple consent may suffice)”.

“Either the Modi government is kind of slow on the uptake and hasn’t yet understood that the Supreme Court has thrice – on September 23, 2013, March 24, 2014 and March 16, 2015 – said that services cannot be made incumbent on the UID, or it is telling the Supreme Court that it does not care what the court says and that it will act as it pleases,” says Usha Ramanathan, an independent researcher who has been investigating the UID project since 2009. “The court had also said that the government must change its forms and circulars to make that clear.”

From Orwell to Kafka, Markov to Doctorow: Understanding Big Data through metaphors

On March 20, I attended a short talk by Malavika Jayaram, a fellow at the Berkman Center for Internet & Society, titled ‘What we talk about when we talk about Big Data’ at the T.A.J. Residency in Bengaluru. It was something of an initiation into the social and political contexts of Big Data and its usage, and the important ethical conundrums assailing these contexts.

Even if it was a little slow during the first 15 minutes, Jayaram’s talk progressed rapidly later on as she quickly piled criticism after criticism upon the concept’s foundation, which was quickly being revealed to be immature. Perhaps those familiar with Jayaram’s past research did (or didn’t) find the contents of her talk to contain more nuances than she’s let on before, but to me it revealed an array of perspectives I’ve remained balefully ignorant of.

The first in line was about the metaphors used to describe Big Data – and how our use of metaphors at all betrays our inability to comprehend Big Data in its entirety. Jayaram quoted at length but loosely from an essay by Sara M. Watson, her colleague at Berkman, titled Data is the new “____”. It describes how the dominant metaphors are industrial, dealing with the data itself as if it were a natural resource and the process of analyzing it as if it were being mined or refined.

Data as a natural resource suggests that it has great value to be mined and refined but that it must be handled by experts and large-scale industrial processes. Data as a byproduct describes the transactional traces of digital interactions but suggests it is also wasteful, pollutive, and may not be meaningful without processing. Data has also been described as a fungible resource, as an asset class, suggesting that it can be traded, stored, and protected in a data vault. One programmatic advertising professional related to me that he thinks “data is the steel of the digital economy,” an image that avoids the negative connotations of oil while at the same time expressing concern about monopolizing forces of firms Google and Facebook.

Not Orwellian but Kafkaesque

There are two casualties of this perspective. The first is the people behind the data: those whose features, actions, choices, etc. have become numbers are forgotten even as the data they have given “birth” to becomes more important and valuable. The second is the data itself: the constant reminder that data is valuable, and large amounts of data more so, condemns it to a life where it can’t hope to be stagnant for long.

The dehumanization of Big Data, according to Jayaram, extends beyond analysts forgetting that the data belongs to faces and names, to the restriction of personal ownership. The people the data represents often don’t have access to it. This implies an existential anxiety quite unlike that found in George Orwell’s 1984 and more like the one in Franz Kafka’s The Trial. In Jayaram’s words,

You are in prison awaiting your trial. Suddenly you find out the trial has been postponed and you have no idea why or how. There seem to be people who know things that you never will. You don’t know what you can do to encourage their decisions to keep the trial permanently postponed. You don’t know what it was about you and you have no way of changing your behavior accordingly.

In 2013, American attorney John Whitehead popularized this comparison in an article titled Kafka’s America. Whitehead argues that the sentiments of Josef K., the protagonist of The Trial, are increasingly becoming the sentiments of a common American.

Josef K’s plight, one of bureaucratic lunacy and an inability to discover the identity of his accusers, is increasingly an American reality. We now live in a society in which a person can be accused of any number of crimes without knowing what exactly he has done. He might be apprehended in the middle of the night by a roving band of SWAT police. He might find himself on a no-fly list, unable to travel for reasons undisclosed. He might have his phones or internet tapped based upon a secret order handed down by a secret court, with no recourse to discover why he was targeted. Indeed, this is Kafka’s nightmare, and it is slowly becoming America’s reality.

Kafka-biographer Reiner Stach summed up these activities as well as the steadily unraveling realism of Kafka’s book as proof of “the extent to which power relies on the complicity of its victims” – and the ‘evil’ mechanism used to achieve this state is a concern that Jayaram places among the prime contemporary problems threatening civil liberties.

If your hard drive’s not in space…

There is an added complication. If the use of Big Data were predominantly suspect, it would have been easier to build consensus against its abuse. However, that isn’t the case: Big Data is more often than not used in ways that don’t harm our personal liberties, and the misfortune is that the collective beneficence of these uses has as yet been no match for the collective harm some of its misuses have achieved. Could this be because the potential for its misuse is almost everywhere?

Yes. An often overlooked facet of using Big Data is the idea that the responsible use of Big Data is not a black-and-white deal. Facebook is not all evil and academic ethnographers are not all benign. Zuckerberg’s social network may collect and store large amounts of information that it nefariously trades with advertisers – and may even comply with the NSA’s “requests” – but there is a systematicity, an orderliness, with which the data is being passed around. The complex’s existence alone presents a problem, no doubt, but that there is a complex at all makes it easier to attempt to fix the problem than if the orderliness were absent.

And this orderliness is often absent among academicians, scholars, journalists, etc., who may not think data is a dollar note but at the same time are processing prodigious amounts of it without being as careful as is necessary about how they are logging, storing and sharing it. Jayaram rightly believes that even if information is collected for benevolent purposes, the moment it becomes data it loses its memory and stays on the Internet as data; that if we are to be responsible data-scientists, being benevolent alone will be inadequate.

To drive the point home, she recalled a comment someone had made to her during a data workshop.

The Utopian way to secure data is to shoot your hard drive into space.

Every other recourse will only fall short.

Consent is not enough

This memoryless, Markovian character of the data-economy demands a redefinition of consent as well. The question “What is consent?” is dependent on what a person is consenting to. However, almost nobody knows how the data will be used, what for, or over what time-frames. Like a variable flowing through different parts of a computer, data can pass through a variety of contexts to each of which it provides value of varying quality. So, the same question of contextual integrity should retrospectively apply to the process of consent-giving as well: What are we consenting to when we’re consenting to something?

And when both the party asking for consent and the party asked for consent can’t know all the ways in which the data will be used, the typical way-out has been to seek consent that protects one against harm – either by ensuring that one’s civil liberties are safeguarded or by explicitly prohibiting choices that will impinge upon, again, one’s civil liberties. This has also been increasingly done in a one-size-fits-all manner that the average citizen doesn’t have the bargaining power to modify.

However, it’s become obvious by now that just protecting these liberties isn’t enough to ensure that data and consent are both promised a contextual integrity.

Why not? Because the statutes that enshrine many of these liberties are yet to be refashioned for the Internet age. In India, at least, the six fundamental rights are to equality, to freedom, against exploitation, to freedom of religion, cultural and educational rights, and to constitutional remedies. Between them, the promise of protecting against the misuse of not one’s person but one’s data is tenuous (although a recent document from the Telecom Regulatory Authority of India could soon fix this).

The Little Brothers

Anyway, an immediate consequence of this typical way-out has been that one needs to be harmed to get remedy, at a time when it remains difficult to define when one’s privacy has been harmed. And since privacy has been an enabler of human rights, even unobtrusive acts of tagging and monitoring that don’t violate the law can force compliance among the people. This is what hacker Andrew Huang talks about in his afterword to Cory Doctorow’s novel Little Brother (2008),

[In] January 2007, … Boston police found suspected explosive devices and shut down the city for a day. These devices turned out to be nothing more than circuit boards with flashing LEDs, promoting a show for the Cartoon Network. The artists who placed this urban graffiti were taken in as suspected terrorists and ultimately charged with felony; the network producers had to shell out a $2 million settlement, and the head of the Cartoon Network resigned over the fallout.

Huang’s example further weakens the Big Brother metaphor by implicating not one malevolent central authority but an epidemic, Kafkaesque paranoia that has “empowered” a multitude of Little Brothers all convinced that God is only in the detail.

While Watson’s essay (Data is the new “____”) is explicit about the power of metaphors to shape public thought, Doctorow’s book and Huang’s afterword take the next logical step in that direction and highlight the clear and present danger for what it is.

It’s not the abuse of power by one head of state but the evolution of statewide machines that (exhibit the potential to) exploit the unpreparedness of the times to coerce and compel, using as their fuel the mountainous entity – sometimes so gargantuan as to be formless, and sometimes equally absurd – called Big Data (I exaggerate – Jayaram was more measured in her assessments – but not much).

And even if Whitehead and Stach only draw parallels between The Trial and American society, the relevant, singular “flaw” of that society exists elsewhere in the world, too: the more we surveil others, the more we’ll be surveilled ourselves, and the longer we choose to stay ignorant of what’s happening to our data, the more our complicity in its misuse. It is a bitter pill to swallow.

Featured image credit: DARPA

Alibaba IPO – A vindication of China’s Internet?

This is a guest post contributed by Anuj Srivas, tech. journalist and blogger, until recently the author of Hypertext, The Hindu.

The differences between Jack Ma – the founder of Chinese e-commerce giant Alibaba – and an average Silicon Valley CEO are numerous and far-reaching. Mr. Ma’s knowledge of mathematics, for instance, was once so poor that it almost prevented him from attending college. Contrast this to the technological genius of Apple co-founder Steve Wozniak or the academic-based origins of Google’s search algorithm.

His background as an English teacher, who dabbled in a number of different sectors before being fascinated by the Internet industry, is more characteristic of the average American investor who was duped by the dot-com bubble than it is of a Bill Gates or a Mark Zuckerberg.

And yet, today, Alibaba stands shoulder-to-shoulder with much of Silicon Valley. Its recently launched initial public offering (IPO) raked in a little over $20 billion, turning it into the world’s biggest technology flotation.

Is this event an inflection point? To some, it may seem to be a natural course of affairs after Yahoo! threw Alibaba a lifeline back in 2005. But is there something else to take away from it other than the obvious comparisons with India’s fledgling Internet industry?

Foremost, it is enormously pleasing to see Jack Ma, like Lenovo’s YY, clearly avoid subscribing to the Silicon Valley ideology of ‘transparency through opacity’. The CEOs of Google, Yahoo!, Facebook and Microsoft paint a picture of openness, sharing, and transparency wherever they go. The world of the cloud seems to make life easier (“look, no wires!”) but in fact wraps its users in an opaque black box. We have no tools that allow us to track our information and data, let alone allow us to take charge.

Of course, Mr. Ma (who sticks to doling out life and management tips in his speeches) is clearly constrained by the circumstances that allowed Alibaba to become what it is today: namely, the way China views, approaches and governs its Internet. This brings us to one of the more interesting implications of Alibaba’s IPO.

For decades now, China has been the poster-boy for how the Internet would look if we stopped fighting for a transparent, open and censorship-free system. The Great Firewall of China has continued to stand, quite proudly, in the face of international criticism.

The country itself has managed to make more than one U.S. technology company come around to its way of thinking. As US government official Tom Lantos commented after Yahoo! actively helped China in its censorship efforts, “While technologically and financially you [Yahoo!] are giants, morally you are pygmies.”

What are we to take away from the fact that China is in the process of undergoing one of its harshest ever Internet censorship/crackdown periods since 2003 (when it started construction of its Firewall) while Alibaba may yet go down in history as the biggest technology IPO ever? China’s approach to the Internet is a deadly mixture of censorship, propaganda and protectionism. The victory of Alibaba at the New York Stock Exchange will prove to be fodder for three takeaways.

First, that China’s protectionism-censorship stance (there cannot be one without the other) works. Despite years of criticism and threatened sanctions, China currently houses three of the world’s ten most valuable technology companies. After Alibaba’s IPO, how can Beijing look at its Internet governance approach with anything but approval? This is a moment of triumph for the country’s Internet regulators.

Second, that investors do not, and will not ever, care about censorship.

Third: will other countries, already outraged by the NSA and the Snowden incident, be emboldened to take China-like steps when it comes to governing their local Internet industries? There is little doubt that most countries need to build their own digital infrastructure, but China and Russia have shown us that their version of digital sovereignty comes with a lack of privacy and the introduction of a censorship regime. Asian, African and Latin American countries will have to escape this trap; the success of Alibaba does not help this.

On the other hand, this will also prove to be the biggest challenge for China’s Internet. If the country wants its Internet firms to go international, it will find it tough to take refuge behind its current Internet governance policies. Companies like Huawei and ZTE, which are in the telecommunication business, have to constantly defend themselves every time they enter a new country. Alibaba, which of course will not be plagued with national security issues, will have to consciously and unconsciously defend the Chinese Internet wherever it goes.

It would be instructive to monitor Mr. Ma and whichever ideology he chooses to adopt and market in the near future. I have a feeling it will tell us quite a bit about the fate of China’s Internet.

More by Anuj Srivas:

And now, a tweet from our sponsor