Three ways to overcome the lack of right data in India

Analytics | Published January 28, 2015

The digital footprint of society is expanding the world over into fragmented mediums (blogs, tweets, reviews etc) and technologies (mobile, web, cloud/SaaS etc). Data generated from mobile devices and the internet of things are the main contributors to this data explosion. While this provides organizations with significant business opportunities, it also presents several challenges in harnessing these information sources.

India’s digital landscape may be evolving quickly, but overall penetration remains low, with only one in five Indians using the internet (as of July 2014). Enterprises and businesses do have access to a veritable wealth of information. Some larger organizations have made a start in harnessing it: telecom providers, online travel agencies and online retail stores are among the industries using big data analytics to engage customers to a certain extent. Most Indian companies, however, are still learning how to collect and store big data.

To put it simply, big data analytics is still in its infancy in India, and the collection of data sets is itself a challenge. Both past and current data are required to make the application of big data analytics really useful, but past data is scarce in both the public and private sectors in India. The lack of historical data can be traced to the following:

Late and slow computerization

Healthcare, economic and statistical data, in both private and public sectors in India, is yet to be fully computerized. The main reason for this is the late adoption of IT in India. Unlike in the West, most industries in India made the transition from manual records to computerized information systems only during the last decade. Over the years, the state and central ministries have made the move towards e-governance. Efforts to deliver public services and to make access to these services easier are being made as well. While this is still a work in progress, huge amounts of data across many government sectors are yet to be digitized.

Poor quality inputs

Beyond quantity, the quality of the data being crunched also determines the quality of the insights. If the signal-to-noise ratio is low, results drawn from such less-than-optimal data samples become unreliable. Publicly available social media information for most individuals in India lacks quality information about the users. Random facts and figures in individual profiles, sharing of spam content and fake social media accounts created for bots are very common in India.


Social media sites are becoming increasingly vulnerable to spam attacks. Time spent by a captive audience on social media sites opens up windows of opportunity for online threats and spammers. Social media spam, in turn, lowers the signal-to-noise ratio that defines the quality of big data, and this gets in the way of generating appropriate results.
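To make the idea of signal-to-noise concrete, here is a minimal sketch in Python. It assumes a hypothetical set of social media profiles with made-up `is_bot` and `spam_posts` flags (neither comes from any real platform or from the data discussed here) and simply divides the number of usable records by the number of noisy ones.

```python
def signal_to_noise(records, is_noise):
    """Return the ratio of usable records to noisy records.

    Returns float('inf') when no record is classed as noise.
    """
    noise = sum(1 for record in records if is_noise(record))
    signal = len(records) - noise
    return float("inf") if noise == 0 else signal / noise


# Hypothetical sample: the fields and values are invented for illustration.
profiles = [
    {"user": "a", "is_bot": False, "spam_posts": 0},
    {"user": "b", "is_bot": True, "spam_posts": 12},
    {"user": "c", "is_bot": False, "spam_posts": 0},
    {"user": "d", "is_bot": False, "spam_posts": 5},
    {"user": "e", "is_bot": False, "spam_posts": 0},
]


def looks_noisy(profile):
    # Assumed rule: bot accounts and accounts sharing spam count as noise.
    return profile["is_bot"] or profile["spam_posts"] > 0


ratio = signal_to_noise(profiles, looks_noisy)
print(ratio)
```

In practice, deciding what counts as noise is the hard part. The rule above is only a placeholder for whatever spam and bot detection an organization actually uses.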

Cultural and social influences

In most Western markets, insights generated through big data can be applied across a wide consumer base. But given the extensive cultural and linguistic variation across India, any insight generated for a consumer based, say, in Chandigarh will not be directly applicable to a consumer based in Chennai. This problem is made worse by the fact that a lot of local data lives in regional publications, in different languages and has limited online visibility.

Unstructured sources of data

Big data in India is not structured. Most transactional data in the healthcare and retail segments are stored purely for book-keeping purposes. In most developed countries, user data is rich enough to provide demographic or group level markers that can be used to generate customized insights while maintaining individual privacy. The absence of such standard identifiers in Indian consumer data is one of the biggest bottlenecks in mapping transactional and social records in India.

Handsets and internet connectivity

Even though smartphones are driving the new handset market in India, feature phones still dominate everyday usage. Most connections in India are pre-paid, and fewer than 10 percent of users have access to 3G networks. In addition, internet connection speed is among the lowest in Asia. As a result, consumer data, especially retail enterprise data, is limited.

As more people in India make the move to smartphones and internet connectivity improves, the amount of usable data generated will increase. That said, organizations need to make a huge effort to improve the quality of enterprise data. The good news is that the key contributors to the promise of big data analytics in India are steadily gaining ground. An increase in social media users, together with efforts by both public and private enterprises to optimize the collection and storage of transactional enterprise data, will contribute to better quality data sets and, in turn, to the improved application of big data analytics.

This article originally appeared here. Republished with permission from the author.