icmi-paper
MACHINE LEARNING ENHANCED OAXACA BLINDER DECOMPOSITION
Science Score: 44.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ✓ CITATION.cff file: Found CITATION.cff file
- ✓ codemeta.json file: Found codemeta.json file
- ✓ .zenodo.json file: Found .zenodo.json file
- ○ DOI references
- ○ Academic publication links
- ○ Academic email domains
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: Low similarity (10.9%) to scientific vocabulary
Repository
MACHINE LEARNING ENHANCED OAXACA BLINDER DECOMPOSITION
Basic Info
- Host: GitHub
- Owner: Anshu989856
- License: other
- Default Branch: main
- Size: 2.25 MB
Statistics
- Stars: 0
- Watchers: 0
- Forks: 0
- Open Issues: 0
- Releases: 0
Metadata Files
README.md
ICMI-PAPER
MACHINE LEARNING ENHANCED OAXACA BLINDER DECOMPOSITION
Machine Learning-Enhanced Oaxaca-Blinder Decomposition for Analyzing Healthcare Test Price Variations Across Indian Cities
📘 Overview
This repository contains the complete research paper and supplementary materials for:
Title:
Machine Learning-Enhanced Oaxaca-Blinder Decomposition for Analyzing Healthcare Test Price Variations Across Indian Cities
Authors:
- Anshuman Parida – b23es1008@iitj.ac.in
- Pranav Pant – pant.4@iitj.ac.in
- Tarun Raj Singh – singh.188@iitj.ac.in
Affiliation:
Department of Computer Science & Engineering, Indian Institute of Technology Jodhpur, India
📄 Abstract
This study investigates the variations in healthcare test prices across Indian cities using an extended Oaxaca-Blinder decomposition method enhanced with machine learning techniques.
We introduce a novel dataset, HEALTH-PRICE, that combines city-level demographic, economic, and healthcare infrastructure data. Using traditional regression and neural models, we decompose the observed price disparities into explained components (e.g., city population, per capita income, number of labs) and unexplained components (e.g., market inefficiencies, provider behavior).
Our analysis reveals:
- Significant structural inefficiencies in diagnostic pricing
- Potential policy pathways to improve affordability
- Actionable insights for making diagnostic services more equitable
🧠 Key Contributions
- Developed an extended multi-group Oaxaca-Blinder decomposition framework.
- Integrated machine learning (MLP neural networks + SHAP) for non-linear counterfactual modeling.
- Built and released the HEALTH-PRICE dataset with over 126,000 entries across 101 Indian cities.
- Proposed targeted policy strategies based on empirical decomposition.
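The idea of a non-linear counterfactual decomposition can be illustrated without any particular neural network library. The sketch below is not the authors' implementation: it assumes each group (for example, two sets of cities) has its own fitted price model, represented here by a plain Python callable, and it decomposes the gap in mean predicted prices by evaluating group B's model on group A's rows. The function name `counterfactual_decomposition` and the two toy models are hypothetical.

```python
from statistics import fmean
from typing import Callable, Dict, Sequence

Row = Sequence[float]
Model = Callable[[Row], float]


def counterfactual_decomposition(
    model_a: Model, model_b: Model, rows_a: Sequence[Row], rows_b: Sequence[Row]
) -> Dict[str, float]:
    """Split the gap in mean predicted outcome between groups A and B.

    explained:   effect of differing characteristics, priced by model B
    unexplained: effect of differing pricing structure, at A's characteristics
    """
    mean_a = fmean([model_a(r) for r in rows_a])
    mean_b = fmean([model_b(r) for r in rows_b])
    # Counterfactual: what group A would be charged under group B's model.
    counterfactual = fmean([model_b(r) for r in rows_a])
    return {
        "gap": mean_a - mean_b,
        "explained": counterfactual - mean_b,
        "unexplained": mean_a - counterfactual,
    }


# Toy, hypothetical models standing in for fitted neural networks.
def toy_model_a(row: Row) -> float:
    return 10.0 + row[0] ** 2


def toy_model_b(row: Row) -> float:
    return 5.0 + row[0] ** 2


toy_result = counterfactual_decomposition(
    toy_model_a, toy_model_b, rows_a=[[1.0], [3.0]], rows_b=[[0.0], [2.0]]
)
print(toy_result)
```

Because the models may be non-linear, this decomposes the gap in predicted means, which need not equal the gap in observed means. Attributing the explained part to individual features (for instance with SHAP, as the paper does) is a separate step not shown here.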
📊 Dataset Description
The HEALTH-PRICE dataset includes:
- 101 Indian cities
- 1,393 diagnostic test types
- 163 diseases
- City-level predictors: population, income, lab density, city type, and zone
- Test prices collected using Selenium + BeautifulSoup scraping from Lal PathLabs
Note: The dataset and scraping scripts are not included here but are referenced in the paper and hosted externally.
🧮 Methodology
We employ both:
- OLS-based Oaxaca-Blinder decomposition
- Neural Oaxaca-Blinder decomposition using feedforward neural networks and SHAP interpretability

The study quantifies:
- Explained component: Impact of observable variables
- Unexplained component: Structural and unobservable effects
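As a rough illustration of the OLS-based variant, here is a minimal sketch of the standard two-fold Oaxaca-Blinder decomposition between two groups. It assumes group B's coefficients serve as the reference; the paper may use a different reference or a pooled model, and its multi-group extension is not reproduced here. The function names `fit_ols` and `oaxaca_blinder` and the toy numbers are hypothetical.

```python
import numpy as np


def fit_ols(X: np.ndarray, y: np.ndarray) -> np.ndarray:
    """Return OLS coefficients for y ~ X, with the intercept first."""
    X1 = np.column_stack([np.ones(len(X)), X])
    beta, *_ = np.linalg.lstsq(X1, y, rcond=None)
    return beta


def oaxaca_blinder(X_a, y_a, X_b, y_b) -> dict:
    """Two-fold decomposition of mean(y_a) - mean(y_b).

    Assumption: group B's coefficients are the reference structure.
    """
    beta_a = fit_ols(X_a, y_a)
    beta_b = fit_ols(X_b, y_b)
    # Mean characteristics, with a leading 1 for the intercept.
    xbar_a = np.concatenate([[1.0], X_a.mean(axis=0)])
    xbar_b = np.concatenate([[1.0], X_b.mean(axis=0)])
    return {
        "gap": float(y_a.mean() - y_b.mean()),
        # Differences in observables, valued at B's coefficients.
        "explained": float((xbar_a - xbar_b) @ beta_b),
        # Differences in coefficients, evaluated at A's mean observables.
        "unexplained": float(xbar_a @ (beta_a - beta_b)),
    }


# Toy, hypothetical data: one predictor, exact linear relationships.
X_a = np.array([[1.0], [2.0], [3.0]])
y_a = np.array([3.0, 5.0, 7.0])  # y = 1 + 2x
X_b = np.array([[0.0], [1.0], [2.0]])
y_b = np.array([0.0, 1.0, 2.0])  # y = x

ob_result = oaxaca_blinder(X_a, y_a, X_b, y_b)
print({k: round(v, 6) for k, v in ob_result.items()})
```

Because each OLS fit includes an intercept, the explained and unexplained parts sum to the raw gap in group means, up to floating-point error. In the toy data the gap is 4, of which 1 is explained by the difference in the predictor and 3 by the difference in coefficients.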
🧪 Technologies Used
- Python
- pandas, scikit-learn, statsmodels
- Selenium & BeautifulSoup (for scraping)
- PyTorch (for neural decomposition)
- SHAP (Shapley values for NN interpretability)
- Matplotlib & Seaborn (visualizations)
📚 Citation
Please cite this work if you use or refer to it in any academic or public context:
APA:
Parida, A., Pant, P., & Singh, T. R. (2025). Machine Learning-Enhanced Oaxaca-Blinder Decomposition for Analyzing Healthcare Test Price Variations Across Indian Cities. Indian Institute of Technology Jodhpur.
BibTeX:
```bibtex
@article{parida2025healthprice,
  title={Machine Learning-Enhanced Oaxaca-Blinder Decomposition for Analyzing Healthcare Test Price Variations Across Indian Cities},
  author={Parida, Anshuman and Pant, Pranav and Singh, Tarun Raj},
  journal={Indian Institute of Technology Jodhpur},
  year={2025}
}
```
Owner
- Name: ANSHUMAN PARIDA
- Login: Anshu989856
- Kind: user
- Repositories: 1
- Profile: https://github.com/Anshu989856
Citation (CITATION.cff)
cff-version: 1.2.0
title: "Machine Learning-Enhanced Oaxaca-Blinder Decomposition for Analyzing Healthcare Test Price Variations Across Indian Cities"
message: "If you use this research, please cite it using the metadata below."
authors:
  - family-names: Parida
    given-names: Anshuman
    affiliation: Indian Institute of Technology Jodhpur
    email: b23es1008@iitj.ac.in
  - family-names: Pant
    given-names: Pranav
    affiliation: Indian Institute of Technology Jodhpur
    email: pant.4@iitj.ac.in
  - family-names: Singh
    given-names: Tarun Raj
    affiliation: Indian Institute of Technology Jodhpur
    email: singh.188@iitj.ac.in
date-released: 2025-05-28
version: "1.0"
repository-code: "https://github.com/yourusername/your-repo-name"
type: paper
keywords:
- Oaxaca-Blinder Decomposition
- Healthcare Price Variation
- Machine Learning
- Diagnostic Testing
- Indian Cities
- Public Health Policy
- Neural Networks
abstract: >
  This study investigates price disparities in diagnostic healthcare tests across Indian cities using an enhanced Oaxaca-Blinder decomposition framework integrated with machine learning. By analyzing a novel dataset (HEALTH-PRICE), we decompose test price differences into explained and unexplained components and reveal significant market and structural inefficiencies. Our results provide data-driven insights for improving healthcare affordability and equity in India.
GitHub Events
Total
- Push event: 1
Last Year
- Push event: 1