digital-infrastructure-india
https://github.com/suhaancoding/digital-infrastructure-india
Science Score: 44.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ✓ CITATION.cff file: found
- ✓ codemeta.json file: found
- ✓ .zenodo.json file: found
- ○ DOI references
- ○ Academic publication links
- ○ Academic email domains
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: low similarity (12.2%) to scientific vocabulary
Repository
Basic Info
- Host: GitHub
- Owner: SuhaanCoding
- License: mit
- Language: Jupyter Notebook
- Default Branch: main
- Size: 1.41 MB
Statistics
- Stars: 0
- Watchers: 0
- Forks: 0
- Open Issues: 0
- Releases: 0
Metadata Files
README.md
Digital Infrastructure & Economic Development in India
This is a personal research project looking at how telecom infrastructure affects economic growth across Indian states. I've been working on it for a few months now, using data from 2018-2023.
What I'm Working On
I got curious about whether building more cell towers and digital infrastructure actually leads to better economic outcomes. So I gathered data from various Indian states and started digging into the relationships between things like:
- Telecom tower density
- Wireless/wireline user numbers
- GDP per capita
- Digital payment adoption
- School internet access
The results have been pretty interesting: there are definitely some strong patterns emerging.
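As a rough illustration of what "relationships between" these variables means in practice, here is a minimal sketch of a pairwise Pearson correlation table. The column names and every number in `toy` are made up for illustration (they are not values from the project data), and the notebooks may well compute this differently, for example with pandas.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


# Made-up numbers for five hypothetical states; NOT values from the project data.
toy = {
    "tower_density": [1.2, 2.5, 3.1, 4.8, 5.0],
    "wireline_users": [0.4, 0.9, 1.1, 2.0, 2.3],
    "gdp_per_capita": [95, 140, 150, 230, 260],
}

# Correlation of every variable with every other variable.
names = list(toy)
corr = {(a, b): pearson(toy[a], toy[b]) for a in names for b in names}

for a in names:
    print(a, [round(corr[(a, b)], 2) for b in names])
```

A high coefficient here only says two columns move together across states; it says nothing about which one drives the other.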
Files in This Repo
I've organized my analysis into different notebooks as I worked through various approaches:
- notebooks/01_spatial_infrastructure_analysis.ipynb: Started here just calculating averages for neighboring states to see if there were any spatial patterns. Pretty basic stuff but helped me understand the data.
- notebooks/02_telecom_correlation_analysis.ipynb: Dove deeper into correlations between telecom variables. Found some surprising relationships, especially with porting requests.
- notebooks/03_infrastructure_relationships.ipynb: Refined the correlation analysis after I realized I needed better data cleaning. This one has the state-by-state breakdowns that were really eye-opening.
- notebooks/04_network_spatial_mapping.ipynb: Got into NetworkX to map out state relationships geographically. The visualizations turned out pretty cool.
- notebooks/05_panel_causality_analysis.ipynb: Attempted some serious econometric analysis here. Had to install linearmodels and figure out panel data methods. Still not 100% sure I got everything right.
- notebooks/06_temporal_causality_testing.ipynb: Looking at UPI transactions vs wireless users over time. The Granger causality tests were tricky to set up properly.
- notebooks/07_gdp_prediction_models.ipynb: Built a linear regression model to predict GDP from telecom variables. Actually got decent results (R² around 0.82).
- notebooks/08_comparative_regression_analysis.ipynb: Comparing different infrastructure types. Still working on this one.
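To make the neighbor-averaging idea from the first notebook concrete, here is a minimal sketch. It is not the notebook's actual code: `NEIGHBORS` is a deliberately partial adjacency list covering only four states, `neighbor_average` is a hypothetical helper, and the tower density numbers are invented.

```python
# Partial, illustrative adjacency list: only borders among these four states
# are listed, so it is NOT a complete map of India.
NEIGHBORS = {
    "Punjab": ["Haryana", "Rajasthan"],
    "Haryana": ["Punjab", "Rajasthan"],
    "Rajasthan": ["Punjab", "Haryana", "Gujarat"],
    "Gujarat": ["Rajasthan"],
}


def neighbor_average(values, neighbors):
    """Return {state: mean of `values` over that state's neighbors}.

    Neighbors without a value are skipped; a state with no usable
    neighbors maps to None.
    """
    result = {}
    for state, adjacent in neighbors.items():
        known = [values[n] for n in adjacent if n in values]
        result[state] = sum(known) / len(known) if known else None
    return result


# Invented tower density figures, for illustration only.
tower_density = {"Punjab": 4.0, "Haryana": 6.0, "Rajasthan": 2.0, "Gujarat": 5.0}

spatial_lag = neighbor_average(tower_density, NEIGHBORS)
for state, avg in spatial_lag.items():
    print(f"{state}: own={tower_density[state]}, neighbors' mean={avg}")
```

Comparing a state's own value against its neighbors' mean is the simplest way to look for spatial patterns. The same adjacency data could also be loaded into a NetworkX graph, which is the library the fourth notebook uses.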
Running the Code
You'll need Python 3.9+ and Jupyter (the pinned pandas 2.2.1 and NumPy 1.26.4 don't support Python 3.8). I've been using conda, but pip should work too:
```bash
git clone https://github.com/SuhaanCoding/digital-infrastructure-india.git
cd digital-infrastructure-india
conda env create -f environment.yml
conda activate digital-infra
jupyter lab
```
Main Findings So Far
- Wireline users have a much stronger correlation with GDP than wireless users (was not expecting this!)
- States with high tower density don't necessarily have better economic outcomes
- There are definitely some spatial spillover effects between neighboring states
- UPI adoption seems to lag wireless user growth by about 4-6 months
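One simple way to look for a lead-lag pattern like the last finding is to shift one series against the other and see which shift gives the highest correlation. The sketch below does that on synthetic data where a 5-month lag is built in on purpose. `best_lag` is a hypothetical helper, the series are random numbers rather than project data, and this is a lagged-correlation scan, not the Granger causality test used in the notebooks.

```python
import numpy as np


def best_lag(leader, follower, max_lag):
    """Return (lag, corr) maximizing corr(leader[t], follower[t + lag]).

    Both series must have the same length; lags 0..max_lag are tried.
    """
    leader = np.asarray(leader, dtype=float)
    follower = np.asarray(follower, dtype=float)
    scores = {}
    for lag in range(max_lag + 1):
        x = leader[: len(leader) - lag]  # leader values at time t
        y = follower[lag:]               # follower values at time t + lag
        scores[lag] = np.corrcoef(x, y)[0, 1]
    lag = max(scores, key=scores.get)
    return lag, scores[lag]


# Synthetic monthly growth rates (NOT project data): the "UPI" series is the
# "wireless" series delayed by 5 months, plus a little noise.
rng = np.random.default_rng(0)
wireless_growth = rng.normal(size=60)
upi_growth = np.zeros(60)
upi_growth[5:] = wireless_growth[:-5]
upi_growth += 0.1 * rng.normal(size=60)

lag, corr_at_lag = best_lag(wireless_growth, upi_growth, max_lag=12)
print(f"best lag: {lag} months, correlation: {corr_at_lag:.2f}")
```

The scan works on growth rates rather than raw user counts because two series that both trend upward will correlate strongly at every lag, which hides the timing.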
Data Sources
Mostly pulled from:
- TRAI reports (telecom data)
- Government statistical yearbooks
- RBI payment system data
- Various state government websites
Notes
This is ongoing research, so some of the analysis might change as I refine my methods. Also, I'm still learning some of the more advanced econometric techniques, so feedback is welcome!
The data cleaning was honestly the most time-consuming part - lots of inconsistent formatting and missing values to deal with.
License
MIT License - feel free to use this for your own research.
Owner
- Login: SuhaanCoding
- Kind: user
- Repositories: 1
- Profile: https://github.com/SuhaanCoding
Citation (CITATION.cff)
cff-version: 1.2.0
message: "If you use this research, please cite it as below."
type: software
title: "Digital Infrastructure & Economic Development in India"
abstract: "Personal research project examining relationships between telecommunications infrastructure and economic development across Indian states using various econometric and statistical methods."
authors:
- family-names: "Khurana"
given-names: "Suhaan"
repository-code: "https://github.com/SuhaanCoding/digital-infrastructure-india"
url: "https://github.com/SuhaanCoding/digital-infrastructure-india"
license: MIT
date-released: "2024-01-15"
version: "1.0.0"
keywords:
- "telecommunications"
- "economic development"
- "India"
- "infrastructure"
- "data analysis"
- "regression analysis"
- "spatial analysis"
Dependencies
- Cython ==3.0.10
- beautifulsoup4 ==4.12.3
- formulaic ==1.0.1
- ipywidgets ==8.1.2
- jupyter ==1.0.0
- jupyterlab ==4.1.2
- linearmodels ==5.4
- lxml ==5.1.0
- matplotlib ==3.8.3
- mypy-extensions ==1.0.0
- networkx ==3.2.1
- numpy ==1.26.4
- openpyxl ==3.1.2
- pandas ==2.2.1
- plotly ==5.19.0
- pyhdfe ==0.2.0
- requests ==2.31.0
- scikit-learn ==1.4.1
- scipy ==1.12.0
- seaborn ==0.13.2
- statsmodels ==0.14.1
- xlrd ==2.0.1