Science Score: 44.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
    Found CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
  • Academic publication links
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (12.2%) to scientific vocabulary
Last synced: 6 months ago

Repository

Basic Info
  • Host: GitHub
  • Owner: SuhaanCoding
  • License: mit
  • Language: Jupyter Notebook
  • Default Branch: main
  • Size: 1.41 MB
Statistics
  • Stars: 0
  • Watchers: 0
  • Forks: 0
  • Open Issues: 0
  • Releases: 0
Created 8 months ago · Last pushed 7 months ago
Metadata Files
Readme License Citation

README.md

Digital Infrastructure & Economic Development in India

This is a personal research project looking at how telecom infrastructure relates to economic growth across Indian states. I've been working on it for a few months, using data from 2018-2023.

What I'm Working On

I got curious about whether building more cell towers and digital infrastructure actually leads to better economic outcomes. So I gathered data from various Indian states and started digging into the relationships between things like:

  • Telecom tower density
  • Wireless/wireline user numbers
  • GDP per capita
  • Digital payment adoption
  • School internet access

The results have been pretty interesting - there are some strong patterns emerging.

Files in This Repo

I've organized my analysis into different notebooks as I worked through various approaches:

notebooks/01_spatial_infrastructure_analysis.ipynb - Started here just calculating averages for neighboring states to see if there were any spatial patterns. Pretty basic stuff but helped me understand the data.
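
A minimal sketch of the neighbor-averaging idea, not the notebook's actual code. The adjacency map, the state labels (A-D), and the values are all made-up placeholders; `neighbor_average` is a hypothetical helper name.

```python
from statistics import mean

# Hypothetical adjacency map: which placeholder "states" share a border.
NEIGHBORS = {
    "A": ["B", "C"],
    "B": ["A", "C", "D"],
    "C": ["A", "B"],
    "D": ["B"],
}

# Made-up tower density values, for illustration only.
tower_density = {"A": 10.0, "B": 20.0, "C": 30.0, "D": 40.0}


def neighbor_average(values, neighbors):
    """Return, for each state, the mean of its neighbors' values."""
    return {
        state: mean(values[n] for n in adjacent)
        for state, adjacent in neighbors.items()
        if adjacent  # skip states with no listed neighbors
    }


spatial_lag = neighbor_average(tower_density, NEIGHBORS)
for state, avg in spatial_lag.items():
    print(f"{state}: own={tower_density[state]:.1f}, neighbor mean={avg:.1f}")
```

Comparing each state's own value with its neighbor mean is a quick first check for spatial clustering before trying anything more formal.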

notebooks/02_telecom_correlation_analysis.ipynb - Dove deeper into correlations between telecom variables. Found some surprising relationships, especially with porting requests.
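
A rough sketch of this kind of correlation check using pandas. The column names and numbers are assumptions for illustration, not the real dataset or results.

```python
import pandas as pd

# Toy values only; column names are assumed, not taken from the notebooks.
df = pd.DataFrame({
    "wireline_users": [1.0, 2.0, 3.0, 4.0, 5.0],
    "wireless_users": [2.0, 1.0, 4.0, 3.0, 5.0],
    "gdp_per_capita": [10.0, 20.0, 30.0, 40.0, 50.0],
})

# Pairwise Pearson correlations between all numeric columns.
corr = df.corr(method="pearson")

# Rank the telecom variables by their correlation with GDP per capita.
gdp_corr = (
    corr["gdp_per_capita"]
    .drop("gdp_per_capita")
    .sort_values(ascending=False)
)
print(gdp_corr)
```

With only a handful of states per year, it's worth comparing `method="spearman"` as well, since a single outlier state can dominate a Pearson coefficient.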

notebooks/03_infrastructure_relationships.ipynb - Refined the correlation analysis after I realized I needed better data cleaning. This one has the state-by-state breakdowns that were really eye-opening.

notebooks/04_network_spatial_mapping.ipynb - Got into NetworkX to map out state relationships geographically. The visualizations turned out pretty cool.

notebooks/05_panel_causality_analysis.ipynb - Attempted some serious econometric analysis here. Had to install linearmodels and figure out panel data methods. Still not 100% sure I got everything right.
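
To show what a state fixed-effects model does under the hood, here is a hand-rolled "within" estimator using pandas and numpy. This is a simplified stand-in, not the notebook's linearmodels code. It has no time effects and no standard errors, and the panel is synthetic (placeholder states, made-up values, true slope set to 2.0). `within_estimator` is a hypothetical helper name.

```python
import numpy as np
import pandas as pd

# Synthetic panel: 3 placeholder states over 4 years.
# Each state has its own fixed level; the within-state slope is 2.0 by construction.
rows = []
state_levels = {"A": 5.0, "B": 50.0, "C": -20.0}
for state, level in state_levels.items():
    for year, x in zip(range(2018, 2022), [1.0, 2.0, 4.0, 7.0]):
        density = x + level / 10  # regressor correlated with the state level
        rows.append({
            "state": state,
            "year": year,
            "tower_density": density,
            "gdp_per_capita": level + 2.0 * density,
        })
panel = pd.DataFrame(rows)


def within_estimator(df, entity, y_col, x_cols):
    """Entity fixed effects: subtract each entity's mean, then run OLS."""
    cols = [y_col] + list(x_cols)
    demeaned = df[cols] - df.groupby(entity)[cols].transform("mean")
    X = demeaned[list(x_cols)].to_numpy()
    y = demeaned[y_col].to_numpy()
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return dict(zip(x_cols, beta))


coefs = within_estimator(panel, "state", "gdp_per_capita", ["tower_density"])
print(coefs)
```

Demeaning removes anything constant within a state, so the slope comes only from changes over time inside each state. For real work, `PanelOLS` in linearmodels with `entity_effects=True` does the same demeaning and adds proper standard errors.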

notebooks/06_temporal_causality_testing.ipynb - Looking at UPI transactions vs wireless users over time. The Granger causality tests were tricky to set up properly.
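
A bare-bones sketch of the idea behind a Granger test, written with numpy so the mechanics are visible. It compares a regression on the target's own lags against one that also includes lags of the candidate cause, and returns the F-statistic. It is not the notebook's code: the series are synthetic, `granger_f_stat` and `lagged_matrix` are hypothetical helper names, and no p-value is computed.

```python
import numpy as np


def lagged_matrix(series, lags):
    """Columns are the series shifted by 1..lags, aligned with series[lags:]."""
    n = len(series)
    return np.column_stack([series[lags - k : n - k] for k in range(1, lags + 1)])


def granger_f_stat(cause, effect, lags=1):
    """F-statistic for H0: lags of `cause` add nothing beyond lags of `effect`."""
    cause = np.asarray(cause, dtype=float)
    effect = np.asarray(effect, dtype=float)
    y = effect[lags:]
    const = np.ones((len(y), 1))
    restricted = np.hstack([const, lagged_matrix(effect, lags)])
    unrestricted = np.hstack([restricted, lagged_matrix(cause, lags)])

    def rss(X):
        beta, *_ = np.linalg.lstsq(X, y, rcond=None)
        resid = y - X @ beta
        return float(resid @ resid)

    rss_r, rss_u = rss(restricted), rss(unrestricted)
    df_resid = len(y) - unrestricted.shape[1]
    return ((rss_r - rss_u) / lags) / (rss_u / df_resid)


# Synthetic monthly series: "upi" is built to follow "wireless" by one step.
rng = np.random.default_rng(0)
n = 200
wireless = rng.normal(size=n)
upi = np.zeros(n)
upi[1:] = 0.8 * wireless[:-1] + 0.1 * rng.normal(size=n - 1)

f_forward = granger_f_stat(wireless, upi, lags=2)
f_reverse = granger_f_stat(upi, wireless, lags=2)
print(f"wireless -> UPI: F = {f_forward:.1f}")
print(f"UPI -> wireless: F = {f_reverse:.1f}")
```

Real user and transaction counts trend upward, so they usually need differencing (or growth rates) before a test like this means anything. `grangercausalitytests` in statsmodels runs the same comparison and reports p-values.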

notebooks/07_gdp_prediction_models.ipynb - Built a linear regression model to predict GDP from telecom variables. Actually got decent results (R² around 0.82).
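
For reference, a small sketch of fitting a linear model and computing R² with plain numpy. The data is synthetic with known coefficients, so this only illustrates the mechanics and does not reproduce the 0.82 figure. `fit_ols` and `r_squared` are hypothetical helper names.

```python
import numpy as np


def fit_ols(X, y):
    """Least-squares fit with an intercept; returns [intercept, coef_1, ...]."""
    X1 = np.column_stack([np.ones(len(X)), X])
    beta, *_ = np.linalg.lstsq(X1, y, rcond=None)
    return beta


def r_squared(X, y, beta):
    """Share of the variance in y explained by the fitted model."""
    X1 = np.column_stack([np.ones(len(X)), X])
    resid = y - X1 @ beta
    ss_res = float(resid @ resid)
    ss_tot = float(((y - y.mean()) ** 2).sum())
    return 1.0 - ss_res / ss_tot


# Synthetic data: two stand-in telecom predictors and a noisy linear target.
rng = np.random.default_rng(42)
X = rng.normal(size=(50, 2))
y = 3.0 + 1.5 * X[:, 0] - 0.5 * X[:, 1] + 0.5 * rng.normal(size=50)

beta = fit_ols(X, y)
r2 = r_squared(X, y, beta)
print("coefficients:", np.round(beta, 2))
print(f"in-sample R^2: {r2:.2f}")
```

Note that this R² is in-sample. With a small number of states, scoring on held-out states or years gives a more honest picture of how well the model predicts.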

notebooks/08_comparative_regression_analysis.ipynb - Comparing different infrastructure types. Still working on this one.

Running the Code

You'll need Python 3.9+ and Jupyter (the pinned pandas 2.2.1 and numpy 1.26.4 don't support 3.8). I've been using conda, but pip with requirements.txt should work too:

    git clone https://github.com/SuhaanCoding/digital-infrastructure-india.git
    cd digital-infrastructure-india
    conda env create -f environment.yml
    conda activate digital-infra
    jupyter lab

Main Findings So Far

  • Wireline users have a much stronger correlation with GDP than wireless users (was not expecting this!)
  • States with high tower density don't necessarily have better economic outcomes
  • There are definitely some spatial spillover effects between neighboring states
  • UPI adoption seems to follow wireless user growth by about 4-6 months

Data Sources

Mostly pulled from:

  • TRAI reports (telecom data)
  • Government statistical yearbooks
  • RBI payment system data
  • Various state government websites

Notes

This is ongoing research, so some of the analysis might change as I refine my methods. Also, I'm still learning some of the more advanced econometric techniques, so feedback is welcome!

The data cleaning was honestly the most time-consuming part - lots of inconsistent formatting and missing values to deal with.

License

MIT License - feel free to use this for your own research.

Owner

  • Login: SuhaanCoding
  • Kind: user

Citation (CITATION.cff)

cff-version: 1.2.0
message: "If you use this research, please cite it as below."
type: software
title: "Digital Infrastructure & Economic Development in India"
abstract: "Personal research project examining relationships between telecommunications infrastructure and economic development across Indian states using various econometric and statistical methods."
authors:
  - family-names: "Khurana"
    given-names: "Suhaan"
repository-code: "https://github.com/SuhaanCoding/digital-infrastructure-india"
url: "https://github.com/SuhaanCoding/digital-infrastructure-india"
license: MIT
date-released: "2024-01-15"
version: "1.0.0"
keywords:
  - "telecommunications"
  - "economic development"
  - "India"
  - "infrastructure"
  - "data analysis"
  - "regression analysis"
  - "spatial analysis" 

Dependencies

environment.yml pypi
  • Cython ==3.0.10
  • formulaic ==1.0.1
  • mypy-extensions ==1.0.0
  • pyhdfe ==0.2.0
requirements.txt pypi
  • Cython ==3.0.10
  • beautifulsoup4 ==4.12.3
  • formulaic ==1.0.1
  • ipywidgets ==8.1.2
  • jupyter ==1.0.0
  • jupyterlab ==4.1.2
  • linearmodels ==5.4
  • lxml ==5.1.0
  • matplotlib ==3.8.3
  • mypy-extensions ==1.0.0
  • networkx ==3.2.1
  • numpy ==1.26.4
  • openpyxl ==3.1.2
  • pandas ==2.2.1
  • plotly ==5.19.0
  • pyhdfe ==0.2.0
  • requests ==2.31.0
  • scikit-learn ==1.4.1
  • scipy ==1.12.0
  • seaborn ==0.13.2
  • statsmodels ==0.14.1
  • xlrd ==2.0.1