Thursday, February 14, 2013

Techies, The NCDC Could Use A Hand

Yesterday, I paid a visit to the National Centre for Disease Control in Delhi on a personal errand. While waiting for one of the pathologists, I met an epidemiologist who has worked both on the policy side and in the lab. We had an engaging discussion, sparked by our mutual admiration for the amazing work of the very appropriately named Larry Brilliant.

(I encourage you to learn more about this unsung hero who has saved tens of thousands of Indian lives. Here's an interview I did with him three years ago). 

Coming back to the reason for this post, our discussion headed towards how the NCDC currently works to detect potential epidemics and stop them in their tracks. 

From what I gathered, the current system has two levels. The first is a network of monitoring centres located in each district. These are in touch with local clinics & hospitals. They report back to New Delhi via satellite if there is a sudden rise in hospitalisations or sick patients, etc.

The second level is what the NCDC calls its 'Media Scanning & Verification' team. This team is based in Delhi and 'scans' TV & print media across the country for information on potential outbreaks. The team's epidemiologist then gets in touch with local authorities and doctors to 'verify' details of the disease, its virulence and whether any help is required to control the spread. 

In an ideal scenario, the district teams would rapidly spot a potential epidemic and alert the NCDC, which would then swing into action. In case the district teams miss something, the NCDC would pick up signals from media reports and swing into action. 

This 'sort of' works but I personally find it inadequate in the age of Google Flu Trends and rapid mass transit. 

Our system relies on official reporting, which, as we all know, takes time. Several days may be lost between the time 'Patient Zero' develops symptoms and the time 'The System' realises 'Patient Zero' has travelled to another city via a major airport hub. 

Now, Google does report Dengue Trends for India. However, the NCDC has some 35 diseases on its watchlist.

So here's a call to techies and coders who'd be willing to volunteer time and work with the really passionate doctors at the NCDC. I personally believe an effort to develop tools like the ones Google offers, relevant to the Indian context, can save countless lives. I'd be happy to connect you to the team I met at NCDC. Leave me a comment or connect with me via Twitter. 

Wednesday, February 6, 2013

Rapes and the Indian Justice System: An experimental data visualisation

This weekend, I attended an excellent big data visualisation workshop organised by Hacks/Hackers Delhi. The idea was to get journalists and techies to collaborate on investigative, data-driven stories and tell them in intuitive ways. Numbers dull the reader's mind. A well-designed infographic can convey complex ideas in a single frame.

The group I was a part of included journalists Nasr ul Hadi, Rajan Zaveri, Aayush Soni and my colleague at ITG Rohan Venkat. Our 'techies', who did much of the heavy lifting, were Piyush Kumar and Konark Modi. We were also joined by Yuan Lei, a journalism student from Shantou University in Guangdong. Since it's been such an important story in recent weeks, we looked at how rape cases in India are treated by 'the system'.

Before I get to our findings, I want to add a short note. We had just three hours to find data, organise it, clean it, 'query' it, and generate the visualisations. Not a lot of time. I'm sure a team working with more resources (particularly time) will draw more impactful conclusions. Our intention - since this was not a formal editorial process - was to start the conversation. We focused on three questions based on the data we had immediately available.

1. Adjusted for population, which states have the highest incidence of rape? 
For brevity's sake, we called this 'rape probability'. In other words, the number of reported rapes per thousand people. (Total reported rape cases / State's population x 1000).

Some states such as Mizoram appear to have an unusually high 'rape probability'. This may simply be because more rapes are reported, and not necessarily because women are more at risk. The national average was about 0.03 rapes per 1000 people.

2. If a rape case is reported in a state, how often does it result in a formal chargesheet? 
We called this 'chargesheet probability'. (Number of cases where charges are framed / Total reported cases x 100).

The clear outlier here is Manipur with just 9% of reported rapes ending in chargesheets. Is that only because of AFSPA? We cannot draw that conclusion until we know who the suspects are in each reported case. I also noticed an oddity. Three states - Andaman & Nicobar Islands, Tripura and Goa - have 'chargesheet probabilities' higher than 100%. We didn't have time to find out why, so if someone out there could help in explaining that, I'd be grateful. The national average here was about 80%.

Update: As Twitter user Pramurto Mukhopadhyay explains here, one likely reason the 'chargesheet probability' for Andaman & Nicobar Islands, Tripura and Goa crosses 100% is that there may be more than one accused per case. This is possible in the case of gangrape, or if the main accused had accomplices.
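To see how that explanation could play out arithmetically, here is a minimal Python sketch. The numbers are entirely made up for illustration and are not NCRB figures, and the assumption that chargesheets are counted per accused rather than per case is exactly that, an assumption.

```python
def chargesheet_probability(chargesheet_count, reported_cases):
    """Chargesheets as a percentage of reported cases."""
    return chargesheet_count * 100 / reported_cases


# HYPOTHETICAL state: 10 reported cases, 8 of which reach the chargesheet stage.
reported_cases = 10

# Number of accused persons in each of the 8 chargesheeted cases.
# Two of them are assumed to involve three accused each.
accused_per_case = [1, 1, 1, 1, 1, 1, 3, 3]

# Counting one chargesheet per case keeps the ratio at or below 100%.
per_case = chargesheet_probability(len(accused_per_case), reported_cases)

# Counting one chargesheet per accused person can push it past 100%.
per_accused = chargesheet_probability(sum(accused_per_case), reported_cases)

if __name__ == "__main__":
    print(f"Counted per case:    {per_case:.0f}%")
    print(f"Counted per accused: {per_accused:.0f}%")
```

With these made-up inputs the per-case figure stays under 100% while the per-accused figure exceeds it, which is the pattern we saw for the three states.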

3. Finally, of the total reported cases, how many result in convictions? 
We called this 'conviction probability'. (No. of cases ending in convictions / No. of reported rapes x 100).

The outliers here are Nagaland and Sikkim with convictions secured in nearly 70% of cases that went to trial. Kerala surprised me with just a 2.7% conviction rate. The national average was about 18%.
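For anyone who wants to redo these calculations, the three ratios defined above can be sketched in a few lines of Python. This is not the code our team used at the workshop. The `StateRecord` class, its field names and the sample state are hypothetical, and returning `None` when a state has no reported cases is my own choice for handling missing data.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class StateRecord:
    """One row of cleaned data (hypothetical field names)."""
    name: str
    population: int
    reported: int        # total reported rape cases
    chargesheeted: int   # cases where charges were framed
    convicted: int       # cases ending in convictions


def rape_probability(r: StateRecord) -> float:
    """Reported cases per 1000 people."""
    return r.reported * 1000 / r.population


def chargesheet_probability(r: StateRecord) -> Optional[float]:
    """Chargesheets as a percentage of reported cases."""
    if r.reported == 0:
        return None  # no reported cases, so the ratio is undefined
    return r.chargesheeted * 100 / r.reported


def conviction_probability(r: StateRecord) -> Optional[float]:
    """Convictions as a percentage of reported cases."""
    if r.reported == 0:
        return None
    return r.convicted * 100 / r.reported


if __name__ == "__main__":
    # Made-up numbers for illustration only, not NCRB or Census data.
    sample = StateRecord("Example State", 2_000_000, 60, 48, 12)
    print(sample.name)
    print("  per 1000 people:", round(rape_probability(sample), 3))
    print("  chargesheet %:  ", chargesheet_probability(sample))
    print("  conviction %:   ", conviction_probability(sample))
```

Note that a state with zero reported cases gets `None` rather than 0 for the two percentages, so missing data is not mistaken for a genuinely low rate.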

Data sources: 2011 National Crime Records Bureau, National Census data

P.S. Even though we had the data, Google Fusion Tables would not generate visualisations for Jammu and Kashmir. It automatically marked the territory as 'disputed'. Oddly, while it marks Arunachal Pradesh with similar diagonal lines, we still get the data represented on a map. I have contacted the Help Team about this and will post an update if I get a reply. 

P.P.S. Several states and UTs show zero in their data fields. That's mostly because data was either unavailable or could not be reliably 'cleaned'. 

Saturday, February 2, 2013

When Journalism & Public Opinion Don't Mix

The Hindu's articles on past events at the Line of Control have set off a storm. Some readers dismiss the reports as a fabricated lie. Others dismiss them as Rawalpindi's propaganda. Both groups hold the opinion that the reports were designed to demoralise India's armed forces. Analysis on the South Asian Idea (SAI) blog, for example, portrays the stories as going against India's "national interest". It argues that the reports play into the hands of the enemy by diminishing our own forces' morale. 

This is an interestingly phrased argument. The author of the SAI blogpost does not actually deny Indian forces have committed atrocities. Instead, he or she argues that the act of revealing these atrocities is the real problem. I disagree wholeheartedly. 

Jaideep Prabhu has a very thoughtful take on this episode here. In a sort of 'preamble', he writes the following: 
"Unlike those offended by the article, I do not think that the events show the Indian Army in a bad light. Having studied conflicts over centuries, one accepts that tragedies occur when people with weapons under a lot of stress are put in extreme environments. This is not to impose an equality between India and Pakistan – the latter has acquired an international reputation for aiding and abetting terrorists while the former, us guys, may have problems but do not indulge in such activities. It is also incredibly obtuse to think that one side would not give as good as it gets, no matter what the orders are from HQ – unit cohesion would not last the week otherwise." 

I share Jaideep's opinions and tweeted something very similar last month. Now that you know where I stand on whether Pakistan gains any moral(e) advantage, let's move to the Hindu's coverage. 

I asked myself this question: If I had access to the UNMOGIP documents and I was able to verify their authenticity, would I write about them? My answer was 'Yes'. When I reached that answer, I realised it did not matter how I got the documents. It could have been someone at the "Media Facility" as indicated in Major Lucero's first email to Jaideep; or another contact at the UN Headquarters in New York, where the Major indicated these documents were sent. In fact, as the SAI blogpost author suggests, the source may well have been the ISI's New Delhi station chief himself. 

Frankly, though, once I could confirm the documents were indeed from UNMOGIP, where I got them from was no longer important. The only thing that would matter is that I reported their contents in a balanced and responsible manner. Here's how I would do it. 

Since Pakistan's complaints have not been investigated, I would present their version as claims and allegations. Here's how The Hindu did it: 
"...The allegations, laid out in confidential Pakistani complaints to the United Nations Military Observer Group in India and Pakistan..." 
"...The most savage cross-LoC violence Indian forces are alleged to have participated in..." 
"...The Pakistani military claimed to have recovered an Indian-made watch..." 
"...the Bandala massacre is alleged to have been carried out by irregulars backed by Indian special forces..." 
"...Indian troops, Pakistan alleged, killed a JCO... and three soldiers in a raid on a post in the Baroh sector..." 

There are at least four more instances where the word 'alleged' appears in the article. 

So here's an experiment: remove the words 'alleged' and 'claimed' each time they appear. As any practitioner of journalism, law or diplomacy will tell you, the meanings of the sentences would be dramatically altered. From one-sided allegations, they would become statements of fact. The Hindu has been very deliberate in not allowing this to happen. 

Approach the Indian government and armed forces for an official response. The Hindu approached the Ministry of Defence (which only responded after the report was published), the Ministry of External Affairs, a military spokesperson and an army officer who served in India's Northern Command when some of the incidents allegedly occurred. 

Here, it is important to explain how this business of 'official comment' works. One individual, who clearly has no reporting experience, reached some embarrassingly premature conclusions because of his lack of understanding of this process. Your source and the official spokesman are not always the same person. So, the source could well have been a disgruntled ministry staffer with access to the UNMOGIP files, while the spokesman, who has no knowledge of the exchange of information with your source, would deny the story.

Sometimes it gets more complicated. It isn't unheard of for the spokesman to be your source, and yet simultaneously issue a denial on behalf of the ministry!

Write the story presenting the various allegations and responses. This step is the easiest. The hard part of journalism is the stuff you do before you sit down to write. It is the coaxing, convincing and cajoling of officials to hand over information you shouldn't have. Strike that, it's actually the many hours you spend drinking tea with them before you ever approach them for information. It is the dozens upon dozens upon dozens of pages of research you go through to understand the contours of the story. It is the meetings with editors to shape and direct your focus. All of that happens before pen is ever put to paper. 

As far as I am concerned, there is no doubt about the authenticity of the UNMOGIP documents. Jaideep found two of them while conducting his own research. Others are possibly still classified and therefore unavailable on the UN website.

Still, what is in doubt is the veracity of Pakistan's complaints contained therein. For my money, the writer has made this explicitly clear. To some this may sound like a load of journalese, but as anyone who pays attention to linguistic details will tell you, that is where the devil lies.