icmi-paper

MACHINE LEARNING ENHANCED OAXACA BLINDER DECOMPOSITION

https://github.com/anshu989856/icmi-paper

Science Score: 44.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
    Found CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
  • Academic publication links
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (10.9%) to scientific vocabulary
Last synced: 6 months ago

Repository

MACHINE LEARNING ENHANCED OAXACA BLINDER DECOMPOSITION

Basic Info
  • Host: GitHub
  • Owner: Anshu989856
  • License: other
  • Default Branch: main
  • Size: 2.25 MB
Statistics
  • Stars: 0
  • Watchers: 0
  • Forks: 0
  • Open Issues: 0
  • Releases: 0
Created 9 months ago · Last pushed 9 months ago

README.md

ICMI-PAPER

MACHINE LEARNING ENHANCED OAXACA BLINDER DECOMPOSITION

Machine Learning-Enhanced Oaxaca-Blinder Decomposition for Analyzing Healthcare Test Price Variations Across Indian Cities

📘 Overview

This repository contains the complete research paper and supplementary materials for:

Title:
Machine Learning-Enhanced Oaxaca-Blinder Decomposition for Analyzing Healthcare Test Price Variations Across Indian Cities

Authors:
- Anshuman Parida – b23es1008@iitj.ac.in
- Pranav Pant – pant.4@iitj.ac.in
- Tarun Raj Singh – singh.188@iitj.ac.in

Affiliation:
Department of Computer Science & Engineering, Indian Institute of Technology Jodhpur, India


📄 Abstract

This study investigates the variations in healthcare test prices across Indian cities using an extended Oaxaca-Blinder decomposition method enhanced with machine learning techniques.

We introduce a novel dataset, HEALTH-PRICE, that combines city-level demographic, economic, and healthcare infrastructure data. Using both traditional regression and neural network models, we decompose the observed price disparities into an explained component (e.g., city population, per capita income, number of labs) and an unexplained component (e.g., market inefficiencies, provider behavior).

Our analysis reveals:
- Significant structural inefficiencies in diagnostic pricing
- Potential policy pathways to improve affordability
- Actionable insights for making diagnostic services more equitable


🧠 Key Contributions

  • Developed an extended multi-group Oaxaca-Blinder decomposition framework.
  • Integrated machine learning (MLP neural networks + SHAP) for non-linear counterfactual modeling.
  • Built and released the HEALTH-PRICE dataset with over 126,000 entries across 101 Indian cities.
  • Proposed targeted policy strategies based on empirical decomposition.
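
The counterfactual idea behind the machine learning contribution can be illustrated with a small model-agnostic sketch. This is not the paper's implementation: `counterfactual_decomposition`, `toy_predictor`, the feature names `income` and `labs`, and every number below are hypothetical placeholders, and the hand-written `toy_predictor` merely stands in for a network trained on the reference group. The sketch assumes the explained part is the change in the reference model's average prediction when it is fed the other group's features, with the remainder of the gap treated as unexplained.

```python
import math
from statistics import fmean


def counterfactual_decomposition(predict_ref, X_a, y_a, X_b, y_b):
    """Split mean(y_a) - mean(y_b) using a predictor fitted on reference group B.

    predict_ref: callable mapping one feature row to a predicted price.
    """
    gap = fmean(y_a) - fmean(y_b)
    # Average price group A would face under group B's pricing structure.
    pred_a = fmean([predict_ref(row) for row in X_a])
    # Average price the reference model predicts for group B itself.
    pred_b = fmean([predict_ref(row) for row in X_b])
    explained = pred_a - pred_b      # driven by differences in features
    unexplained = gap - explained    # remainder, including model error
    return {"gap": gap, "explained": explained, "unexplained": unexplained}


def toy_predictor(row):
    # Hypothetical non-linear stand-in for a trained MLP (assumption).
    return 50.0 + 10.0 * math.log(row["income"]) - 2.0 * math.sqrt(row["labs"])


# Made-up placeholder rows, not taken from HEALTH-PRICE.
X_a = [{"income": 4.0, "labs": 9.0}, {"income": 5.0, "labs": 16.0}]
y_a = [70.0, 74.0]
X_b = [{"income": 2.0, "labs": 4.0}, {"income": 3.0, "labs": 1.0}]
y_b = [55.0, 60.0]

parts = counterfactual_decomposition(toy_predictor, X_a, y_a, X_b, y_b)
print(parts)
```

Because the unexplained part is defined as a remainder, the two parts always sum to the observed gap, but any prediction error of the reference model is folded into the unexplained term.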

📊 Dataset Description

The HEALTH-PRICE dataset includes:
- 101 Indian cities
- 1,393 diagnostic test types
- 163 diseases
- City-level predictors: population, income, lab density, city type, and zone
- Test prices collected using Selenium + BeautifulSoup scraping from Lal PathLabs

Note: The dataset and scraping scripts are not included here but are referenced in the paper and hosted externally.


🧮 Methodology

We employ both:
- OLS-based Oaxaca-Blinder decomposition
- Neural Oaxaca-Blinder decomposition using feedforward neural networks and SHAP interpretability
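
For reference, the textbook two-fold form of the OLS-based decomposition for two groups is written out here. The notation is an assumption rather than the paper's: $A$ and $B$ denote the two groups, $B$ is taken as the reference, $\bar{X}$ is the vector of mean predictors (including a constant), and $\hat{\beta}$ is the group-specific OLS coefficient vector. The paper's multi-group extension may differ from this form.

```latex
\bar{Y}_A - \bar{Y}_B
  = \underbrace{(\bar{X}_A - \bar{X}_B)^{\top} \hat{\beta}_B}_{\text{explained}}
  + \underbrace{\bar{X}_A^{\top} (\hat{\beta}_A - \hat{\beta}_B)}_{\text{unexplained}}
```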

The study quantifies:
- Explained component: Impact of observable variables
- Unexplained component: Structural and unobservable effects
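
To make the two components concrete, here is a minimal sketch of a two-fold OLS decomposition with a single predictor and group B as the reference. It is an illustration under stated assumptions, not the code used in the paper: the function names `fit_ols` and `oaxaca_blinder` are hypothetical, and the numbers are toy values rather than HEALTH-PRICE data.

```python
from statistics import fmean


def fit_ols(x, y):
    """Fit y = intercept + slope * x by ordinary least squares (one predictor)."""
    x_bar, y_bar = fmean(x), fmean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


def oaxaca_blinder(x_a, y_a, x_b, y_b):
    """Two-fold decomposition of mean(y_a) - mean(y_b), group B as reference."""
    int_a, slope_a = fit_ols(x_a, y_a)
    int_b, slope_b = fit_ols(x_b, y_b)
    xbar_a, xbar_b = fmean(x_a), fmean(x_b)
    # Explained: gap in mean predictor values, priced at group B's coefficient.
    explained = slope_b * (xbar_a - xbar_b)
    # Unexplained: gap in coefficients (intercept and slope), evaluated at group A's mean.
    unexplained = (int_a - int_b) + (slope_a - slope_b) * xbar_a
    return {
        "gap": fmean(y_a) - fmean(y_b),
        "explained": explained,
        "unexplained": unexplained,
    }


# Toy values only (assumption): one predictor and one price per observation.
x_a, y_a = [1.0, 2.0, 3.0, 4.0], [12.0, 14.0, 16.0, 18.0]
x_b, y_b = [0.0, 1.0, 2.0, 3.0], [5.0, 6.0, 7.0, 8.0]

result = oaxaca_blinder(x_a, y_a, x_b, y_b)
print(result)
```

Since an OLS fit with an intercept passes through the group means, the explained and unexplained parts add up to the raw gap exactly, which is a useful sanity check on any implementation.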


🧪 Technologies Used

  • Python
  • pandas, scikit-learn, statsmodels
  • Selenium & BeautifulSoup (for scraping)
  • PyTorch (for neural decomposition)
  • SHAP (Shapley values for NN interpretability)
  • Matplotlib & Seaborn (visualizations)
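
As background on the Shapley values listed above, the sketch below computes exact Shapley attributions by enumerating every feature coalition. It does not use the SHAP library, which in general estimates these values rather than enumerating coalitions. It is also not the paper's code: `shapley_values`, `toy_model`, and the feature names and numbers are hypothetical, and features outside a coalition are assumed to be held at a fixed baseline value.

```python
from itertools import combinations
from math import factorial


def shapley_values(f, x, baseline):
    """Exact Shapley attribution of f(x) - f(baseline) to each feature.

    Features outside a coalition are held at their baseline value.
    Cost grows as 2^n, so this only suits a handful of features.
    """
    names = list(x)
    n = len(names)

    def value(coalition):
        mixed = {k: (x[k] if k in coalition else baseline[k]) for k in names}
        return f(mixed)

    phi = {}
    for name in names:
        others = [k for k in names if k != name]
        total = 0.0
        for size in range(n):
            # Shapley weight for coalitions of this size.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                total += weight * (value(s | {name}) - value(s))
        phi[name] = total
    return phi


def toy_model(row):
    # Hypothetical model with an interaction term (assumption).
    return row["income"] * row["labs"] + row["population"]


x = {"income": 3.0, "labs": 2.0, "population": 5.0}
baseline = {"income": 1.0, "labs": 1.0, "population": 1.0}

phi = shapley_values(toy_model, x, baseline)
print(phi)
```

The attributions sum to `toy_model(x) - toy_model(baseline)`, which is the property that lets a prediction gap be split across predictors.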

📚 Citation

Please cite this work if you use or refer to it in any academic or public context:

APA:

Parida, A., Pant, P., & Singh, T. R. (2025). Machine Learning-Enhanced Oaxaca-Blinder Decomposition for Analyzing Healthcare Test Price Variations Across Indian Cities. Indian Institute of Technology Jodhpur.

BibTeX:

```bibtex
@article{parida2025healthprice,
  title={Machine Learning-Enhanced Oaxaca-Blinder Decomposition for Analyzing Healthcare Test Price Variations Across Indian Cities},
  author={Parida, Anshuman and Pant, Pranav and Singh, Tarun Raj},
  journal={Indian Institute of Technology Jodhpur},
  year={2025}
}
```

Owner

  • Name: ANSHUMAN PARIDA
  • Login: Anshu989856
  • Kind: user

Citation (CITATION.cff)

cff-version: 1.2.0
title: "Machine Learning-Enhanced Oaxaca-Blinder Decomposition for Analyzing Healthcare Test Price Variations Across Indian Cities"
message: "If you use this research, please cite it using the metadata below."
authors:
  - family-names: Parida
    given-names: Anshuman
    affiliation: Indian Institute of Technology Jodhpur
    email: b23es1008@iitj.ac.in
  - family-names: Pant
    given-names: Pranav
    affiliation: Indian Institute of Technology Jodhpur
    email: pant.4@iitj.ac.in
  - family-names: Singh
    given-names: Tarun Raj
    affiliation: Indian Institute of Technology Jodhpur
    email: singh.188@iitj.ac.in
date-released: 2025-05-28
version: "1.0"
repository-code: "https://github.com/anshu989856/icmi-paper"
type: paper
keywords:
  - Oaxaca-Blinder Decomposition
  - Healthcare Price Variation
  - Machine Learning
  - Diagnostic Testing
  - Indian Cities
  - Public Health Policy
  - Neural Networks
abstract: >
  This study investigates price disparities in diagnostic healthcare tests across Indian cities using an enhanced Oaxaca-Blinder decomposition framework integrated with machine learning. By analyzing a novel dataset (HEALTH-PRICE), we decompose test price differences into explained and unexplained components and reveal significant market and structural inefficiencies. Our results provide data-driven insights for improving healthcare affordability and equity in India.

GitHub Events

Total
  • Push event: 1
Last Year
  • Push event: 1