The form of data

Lost in transit
Image Source: Evaluation Toolkit for Magnet School Programs

The classroom
As part of a high school statistics project, our teacher gave us a form to collect data about customers’ automobile preferences. We had to analyse the data collected and present it as a report.

While some students were honest enough to actually go out and get the forms filled in, quite a few had their classmates fill in dummy data.

A few years later
I was walking near a market when a lady holding a bunch of papers asked me if I could spare a few minutes to answer questions about potato chips. She filled in the fields of the survey form with my answers at great speed, a handy time-saving skill, no doubt.

However, when one of my answers seemed unfavourable, she said ‘Oh no! I can’t record that.’ And then, she changed my answer!

* * *

Statistics form the core of almost every article we read. But behind numbers like 83.7% and 4.8 million, there is data collected by field staff.

While statistical reports talk of error margins, how reliable is the data on which they are based? It is hard to tell. Can we improve its quality? Definitely.

One of the projects we have recently had the opportunity to work on addresses this very issue.

Before we jump to the solution, here’s a look at the problem in a little detail.
Data typically goes through several stages before becoming a meaningful number — capture, display, interaction and analysis.

The typical data chain

Data capture, more often than not, involves paper forms. And paper forms have several inherent problems.

The first is the time lag between when the data is captured and when it becomes available for analysis. The second is data integrity: forms filled in manually are susceptible to errors during data capture as well as during data transfer, as the two real scenarios described earlier illustrate. The third, and perhaps the most critical, is data authenticity: paper forms can very easily be used to generate false information.

Raw aggregated data, typically tabulated, is not user-friendly. It needs to be filtered before it is useful for decision-making and policy-making.

All this seems a lot like a game of Chinese Whispers. By the time the data can actually be analysed, it may already have lost its value.

Wouldn’t it be great if we could skip a few steps? As it happens, that is how technology can help. This was the subject on which our CEO, Mr. Sunil Malhotra, recently spoke at the 124A Bilateral Training Programme of the International Centre for Information Systems and Audit (organised by the Comptroller and Auditor General of India). While interacting with the delegates of FBSA, Republic of Iraq, during the session on Disease Surveillance and the Role of Technology, Mr. Malhotra emphasised the need to shorten the data collection timeline and to ensure the integrity of data through the use of mobile technology.

Here’s an excerpt from the companion presentation, explaining the common challenges involved in data collection, as well as how mobile technology can help solve them.

Stay tuned for the next post, detailing our own working solution!