Open datasets relevant to India

Shraddha
Mar 31, 2018

In a recent attempt to find some data to explain statistical concepts, I did quite a bit of research (read: searching on Google) for data relevant to India. While a lot of data may be open, only a few of the sources can be used to analyse and derive insights. These are some of the links I came across:

1. Bhuvan - Indian Geo-Platform of ISRO

There is a lot going on on this website, and there is a lot of data here, though only image data. If you are interested in processing images taken from satellites, I’m sure you will find something here.

2. National Remote Sensing Centre

As mentioned on their website: The data from the satellites are used for several applications covering agriculture, water resources, urban planning, rural development, mineral prospecting, environment, forestry, ocean resources and disaster management. This is again a database of images. You would need to create an account to download data from there.

3. DataMeet

The data here would be more useful if you are building an application than if you are doing analysis.

4. Ministry of Statistics and Programme Implementation

It has interesting data: area and population distribution across variables, consumption numbers, expenditure numbers, etc. They can all be accessed from the above link. However, these are more like final numbers, the kind you may come up with after your analysis. You may still find a way to combine all that data to do some interesting analysis. The data opens up as an .xlsx in the browser, so you will have to copy and paste it to use it.
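If you do end up copying and pasting, the pasted text can be turned into rows with a few lines of Python. This is only a rough sketch: it assumes the copied cells arrive as tab-separated text (which is what spreadsheets usually put on the clipboard), and the column names and numbers below are made-up placeholders, not real figures from the ministry's website.

```python
import csv
import io

# Placeholder text standing in for cells pasted from a spreadsheet.
# The column names and numbers are made up for illustration only.
pasted = "State\tArea\tPopulation\nA\t100\t2000\nB\t250\t1000\n"


def parse_pasted_table(text):
    """Turn tab-separated text into a list of dicts keyed by the header row."""
    reader = csv.DictReader(io.StringIO(text), delimiter="\t")
    return [dict(row) for row in reader]


rows = parse_pasted_table(pasted)

# Values arrive as strings, so convert them before doing arithmetic.
density = {r["State"]: int(r["Population"]) / int(r["Area"]) for r in rows}
print(density)
```

Real tables often have merged header cells or footnote rows, so expect to clean those up by hand before parsing.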

5. World Bank

The World Bank has data for various countries, India included. It just gives the final value for the chosen metric: you select a metric on the website and see its value.

And finally (the most useful information usually comes at the end):

6. Open Governance India

I found this the most useful resource. There is data for electricity consumption and generation, inflation, population, water usage, etc. You can explore the website to find your data of interest. The data can be downloaded as a .csv, making it very convenient to analyse. It has some of the above websites, and more, as its data sources. You can also access the data from your R or Python application directly, and you can see the visualisations on the website before you download to get a feel for the data. All the downloads I have encountered here require you to select:

1. Regions

2. Variables of Interest

3. Time period

This website has all the data in one place and gives lots of options to view and download the data. Highly recommended!
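Once you have a .csv on disk, the same three choices (regions, variables, time period) are also a handy way to slice it further in Python. The sketch below is only an illustration: the column names `region`, `variable`, `year` and `value` and the sample rows are hypothetical, and a real download may be laid out differently.

```python
import csv
import io

# Made-up rows in a hypothetical long format; a real download may use
# different column names and a different layout.
SAMPLE_CSV = """region,variable,year,value
North,electricity_consumption,2010,10
North,electricity_consumption,2011,12
South,electricity_consumption,2011,7
North,population,2011,50
"""


def select_rows(csv_file, regions, variables, start_year, end_year):
    """Keep rows matching the chosen regions, variables and year range."""
    reader = csv.DictReader(csv_file)
    return [
        row
        for row in reader
        if row["region"] in regions
        and row["variable"] in variables
        and start_year <= int(row["year"]) <= end_year
    ]


selected = select_rows(
    io.StringIO(SAMPLE_CSV),
    regions={"North"},
    variables={"electricity_consumption"},
    start_year=2011,
    end_year=2012,
)
for row in selected:
    print(row["region"], row["variable"], row["year"], row["value"])
```

To use it on a real file, replace `io.StringIO(SAMPLE_CSV)` with `open("your_download.csv", newline="")` (the file name here is a placeholder) and adjust the column names to match the header of your download.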

If you know of any other open data sources pertaining to India, please mention them in the comments below. Also, let me know about your experience with the above websites :)

If you want to start exploring data science concepts, my first post in the series of Data science for layman can be found here.

Shraddha

A data scientist & researcher who enjoys painting, crafts, dancing and dreaming