pyvmte

Implementation of Mogstad, Santos, Torgovitsky 2018 ECMA "Using Instrumental Variables for Inference About Policy Relevant Treatment Parameters".

https://github.com/buddejul/pyvmte

Science Score: 54.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
    Found CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
  • DOI references
    Found 1 DOI reference(s) in README
  • Academic publication links
    Links to: wiley.com
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (15.0%) to scientific vocabulary
Last synced: 6 months ago

Repository

Basic Info
  • Host: GitHub
  • Owner: buddejul
  • Language: Python
  • Default Branch: main
  • Homepage:
  • Size: 258 KB
Statistics
  • Stars: 0
  • Watchers: 1
  • Forks: 0
  • Open Issues: 9
  • Releases: 0
Created about 2 years ago · Last pushed over 1 year ago
Metadata Files
Readme Changelog Citation

README.md

pyvmte

Project

This project implements a method for inference about general treatment effects in instrumental variables (IV) settings, first proposed in Mogstad, Santos, Torgovitsky 2018 Econometrica (henceforth MST). The main new results reported here are Monte Carlo simulations based on the data generating process (DGP) of the paper. In particular, I report extensive simulations corresponding to the identification result reported in Figure 5 of the paper.

The goal of MST is to make inference on parameters that are generally not point-identified in instrumental variable settings. For example, a well-known result is that we can only point-identify the local average treatment effect (LATE) for a given complier subpopulation in a binary-instrument binary-treatment setting. However, the researcher might be interested in the LATE for a different subpopulation or the average treatment effect (ATE).
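To make the point-identified benchmark concrete, the following minimal sketch computes the Wald (IV) estimate, which under the standard LATE assumptions recovers the LATE for compliers in the binary-instrument, binary-treatment case. The function name `wald_estimate` and the toy data are illustrative assumptions and are not part of pyvmte.

```python
import numpy as np


def wald_estimate(y, d, z):
    """Wald estimate: reduced-form effect divided by first-stage effect.

    y: outcomes, d: binary treatment, z: binary instrument (hypothetical inputs).
    """
    y, d, z = (np.asarray(v, dtype=float) for v in (y, d, z))
    # Effect of the instrument on the outcome.
    reduced_form = y[z == 1].mean() - y[z == 0].mean()
    # Effect of the instrument on treatment take-up (share of compliers).
    first_stage = d[z == 1].mean() - d[z == 0].mean()
    if first_stage == 0:
        raise ValueError("Instrument does not shift treatment; Wald estimate undefined.")
    return reduced_form / first_stage


# Toy data in which the outcome is y = 1 + 2 * d for every unit.
z = [0, 0, 0, 0, 1, 1, 1, 1]
d = [0, 0, 0, 1, 0, 1, 1, 1]
y = [1, 1, 1, 3, 1, 3, 3, 3]
late_hat = wald_estimate(y, d, z)
```

In this toy data set the treatment effect is 2 for everyone, so the Wald estimate equals 2. With heterogeneous effects it would only recover the average effect for compliers, which is why other target parameters require the partial identification approach of MST.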

The general idea of the paper is to set-identify a target parameter, where the identified set is constrained by the estimands we can point-identify. Intuitively, we need to make some assumptions about unobservables to obtain a set for the target parameter, and every parameter that we can point-identify puts restrictions on these unobservables. The main contribution of MST is to show that all identified and target parameters can be written as linear maps of so-called marginal treatment response (MTR) functions in a binary choice model. Hence, a combination of data moments and assumptions on the MTR functions implies an identified set for the target parameter. For a more detailed introduction to the method, see the report in this project.
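The structure described above can be written compactly. The following is a sketch of the formulation as it is usually stated for MST, not taken from this repository; the notation ($m_d$ for the MTR functions, $\omega^*_d$ and $\omega_{ds}$ for the weights, $\mathcal{M}$ for the set of admissible MTR functions, $\mathcal{S}$ for the set of identified estimands) is assumed for illustration, and integration is against Lebesgue measure for simplicity.

```latex
\begin{align*}
\beta^* &= \Gamma^*(m) \equiv
  E\!\left[\int_0^1 m_0(u, X)\,\omega^*_0(u, Z)\,du\right]
  + E\!\left[\int_0^1 m_1(u, X)\,\omega^*_1(u, Z)\,du\right], \\
\beta_s &= \Gamma_s(m) \equiv
  E\!\left[\int_0^1 m_0(u, X)\,\omega_{0s}(u, Z)\,du\right]
  + E\!\left[\int_0^1 m_1(u, X)\,\omega_{1s}(u, Z)\,du\right],
  \qquad s \in \mathcal{S}, \\
\underline{\beta}^* &= \inf_{m \in \mathcal{M}} \Gamma^*(m)
  \quad \text{s.t.} \quad \Gamma_s(m) = \beta_s \ \text{for all } s \in \mathcal{S}, \\
\overline{\beta}^* &= \sup_{m \in \mathcal{M}} \Gamma^*(m)
  \quad \text{s.t.} \quad \Gamma_s(m) = \beta_s \ \text{for all } s \in \mathcal{S}.
\end{align*}
```

Here $m = (m_0, m_1)$ with $m_d(u, x) = E[Y_d \mid U = u, X = x]$. Because both $\Gamma^*$ and $\Gamma_s$ are linear in $m$, restricting $\mathcal{M}$ to a finite-dimensional linear basis with linear shape constraints turns both optimization problems into linear programs.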

I originally started working on this project in an econometrics topics course in the 2023 summer term, but back then couldn't get the code to run properly. In particular, similar Monte Carlo studies resulted in estimates exhibiting severe bias due to a faulty implementation. The results reported here are now more plausible given the results in MST. While technically their estimator is only consistent (and probably not unbiased) and they don't report any simulations, the paper would probably not have been published if their method were severely biased at realistic sample sizes.

Implementation

All sets in MST (identified or estimated) are implicitly defined by linear programs (LPs). Thus, the key programming task is to compute the inputs to the linear program. I then pass these to scipy.optimize.linprog, the scipy interface to several LP solvers, including highs, which I use as the default. I also implement the copt solver, which generally performs best across a range of problems (e.g. see these benchmarks). However, for the small problems in my simulations I did not see any performance differences (if anything, the scipy API is faster).
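As a minimal sketch of this idea (not the code in this repository), the bounds of an identified set can be obtained by solving the same LP twice, once minimizing and once maximizing a linear target. The helper name `lp_bounds`, the toy coefficients, and the box constraint on the parameters are assumptions made for illustration; only `scipy.optimize.linprog` with `method="highs"` is taken from the text above.

```python
import numpy as np
from scipy.optimize import linprog


def lp_bounds(c, a_eq, b_eq, bounds=(0, 1)):
    """Return (lower, upper) bound of c @ theta s.t. a_eq @ theta = b_eq.

    theta plays the role of basis coefficients of the MTR functions;
    `bounds` restricts every coefficient to the same interval.
    """
    c = np.asarray(c, dtype=float)
    values = []
    for sign in (1, -1):
        # linprog always minimizes, so the upper bound minimizes -c @ theta.
        res = linprog(sign * c, A_eq=a_eq, b_eq=b_eq, bounds=bounds, method="highs")
        if not res.success:
            raise RuntimeError(f"LP failed: {res.message}")
        values.append(sign * res.fun)
    return values[0], values[1]


# Toy problem: three coefficients in [0, 1], one identified moment
# 0.5 * theta_1 + 0.5 * theta_2 = 0.3, target = average of all coefficients.
target = np.array([1 / 3, 1 / 3, 1 / 3])
a_eq = np.array([[0.5, 0.5, 0.0]])
b_eq = np.array([0.3])
lower, upper = lp_bounds(target, a_eq, b_eq)
```

In the toy problem the moment pins down `theta_1 + theta_2 = 0.6` but leaves `theta_3` free in [0, 1], so the target is bounded between 0.2 and roughly 0.533. Adding more identified moments as rows of `a_eq` can only shrink this interval.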

Following MST, I split the code into two parts, identification and estimation. The former implements pure identification results for a known DGP, while the latter implements estimation of the identified set based on data. Both are based on LPs that are similar in spirit but have slightly different constraints. In particular, estimation has to deal with sampling uncertainty, since in any finite sample the constraints will only be satisfied approximately. For details see the report.
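One way to handle constraints that hold only approximately is a two-step LP: first find the smallest total absolute violation of the moment constraints, then optimize the target over all coefficients whose violation is within a tolerance of that minimum. The sketch below follows that idea as a hedged illustration; the function name `estimate_bounds`, the tolerance argument `tol`, and the toy inputs are assumptions and may differ from what this project or MST actually implement.

```python
import numpy as np
from scipy.optimize import linprog


def estimate_bounds(c, a, b, tol):
    """Two-step LP sketch: bounds of c @ theta when a @ theta = b may be infeasible."""
    n_s, n_k = a.shape
    eye = np.eye(n_s)
    # Variables are (theta, t) with t_s >= |a_s @ theta - b_s|, written as
    # a @ theta - t <= b and -a @ theta - t <= -b.
    a_ub = np.block([[a, -eye], [-a, -eye]])
    b_ub = np.concatenate([b, -b])
    var_bounds = [(0, 1)] * n_k + [(0, None)] * n_s

    # Step 1: minimal total absolute deviation from the estimated moments.
    dev_cost = np.concatenate([np.zeros(n_k), np.ones(n_s)])
    first = linprog(dev_cost, A_ub=a_ub, b_ub=b_ub, bounds=var_bounds, method="highs")
    if not first.success:
        raise RuntimeError(f"First-step LP failed: {first.message}")

    # Step 2: optimize the target subject to sum(t) <= minimal deviation + tol.
    a_ub2 = np.vstack([a_ub, dev_cost])
    b_ub2 = np.append(b_ub, first.fun + tol)
    objective = np.concatenate([c, np.zeros(n_s)])
    values = []
    for sign in (1, -1):
        res = linprog(sign * objective, A_ub=a_ub2, b_ub=b_ub2,
                      bounds=var_bounds, method="highs")
        if not res.success:
            raise RuntimeError(f"Second-step LP failed: {res.message}")
        values.append(sign * res.fun)
    return values[0], values[1]


# Toy problem: the "estimated" moment theta_1 = 1.2 cannot hold exactly
# because theta_1 is restricted to [0, 1].
a = np.array([[1.0, 0.0]])
b = np.array([1.2])
c = np.array([0.5, 0.5])
est_lower, est_upper = estimate_bounds(c, a, b, tol=0.1)
```

In the toy problem the minimal deviation is 0.2 (at `theta_1 = 1`), so with `tol=0.1` the second step allows `theta_1` in [0.9, 1] and the estimated bounds are roughly (0.45, 1.0). With exact equality constraints the same problem would have no solution at all.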

Usage

To get started, create and activate the environment with

```console
$ conda env create    # or: mamba env create
$ conda activate pyvmte
```

To build the project, type

```console
$ pytask
```

To reduce runtime it is recommended to use the pytask-parallel plug-in:

```console
$ pytask -n <workers>
```

where `<workers>` is the number of parallel worker processes.

With parallelization the project builds in 5-10 minutes on my machine using 11 workers.

To reduce runtime further, it is also possible to adjust the simulation settings in config_mc_by_size.py and config_mc_by_target.py:

```python
MCSAMPLESIZES = [500, 2500, 10000]

MONTECARLOBYSIZE = MonteCarloSetup(
    samplesize=10000,
    repetitions=10000,
)

MONTECARLOBYTARGET = MonteCarloSetup(
    samplesize=10000,
    repetitions=1000,
    uhirange=np.arange(0.35, 1, 0.05),
)
```

Reducing the number of repetitions always works. Be careful with very small sample sizes, however: they can result in errors because the estimators might be undefined (the linear programs have no solution).
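If you want a simulation to continue when an LP has no solution, one option is to record missing values for that draw and report the share of failed draws afterwards. The following is a minimal sketch of that pattern and not how this project handles the issue; the helper names `bounds_or_nan` and `share_failed` and the toy inputs are hypothetical.

```python
import numpy as np
from scipy.optimize import linprog


def bounds_or_nan(c, a_eq, b_eq):
    """Return (lower, upper) of c @ theta, or (nan, nan) if an LP has no solution."""
    c = np.asarray(c, dtype=float)
    values = []
    for sign in (1, -1):
        res = linprog(sign * c, A_eq=a_eq, b_eq=b_eq, bounds=(0, 1), method="highs")
        if not res.success:
            # res.status and res.message describe why the solver stopped.
            return np.nan, np.nan
        values.append(sign * res.fun)
    return values[0], values[1]


def share_failed(draws):
    """Share of simulation draws for which the bounds are undefined."""
    lowers = np.array([bounds_or_nan(*draw)[0] for draw in draws])
    return float(np.isnan(lowers).mean())


# Two toy draws: theta = 0.5 is feasible, theta = 2 violates the [0, 1] bound.
feasible_draw = ([1.0], [[1.0]], [0.5])
infeasible_draw = ([1.0], [[1.0]], [2.0])
failure_rate = share_failed([feasible_draw, infeasible_draw])
```

Reporting the failure rate alongside the results makes it visible when a sample size is too small for the estimator to be well defined in a relevant share of draws.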

Credits

The original repo I mainly used for development can be found at pyvmte.

This project was created with cookiecutter and the econ-project-templates.

Owner

  • Name: Julian Budde
  • Login: buddejul
  • Kind: user
  • Location: Bonn, Germany

Econ PhD, Uni Bonn

Citation (CITATION)

@Unpublished{pyvmte2023,
    Title  = {Implementation of Mogstad, Santos, Torgovitsky 2018 Econometrica.},
    Author = {Julian Budde},
    Year   = {2023},
    Url    = {https://github.com/buddejul/pyvmte}
}

GitHub Events

Total
  • Create event: 1
Last Year
  • Create event: 1

Issues and Pull Requests

Last synced: 6 months ago

All Time
  • Total issues: 33
  • Total pull requests: 26
  • Average time to close issues: 27 days
  • Average time to close pull requests: 1 day
  • Total issue authors: 1
  • Total pull request authors: 1
  • Average comments per issue: 0.36
  • Average comments per pull request: 0.42
  • Merged pull requests: 23
  • Bot issues: 0
  • Bot pull requests: 0
Past Year
  • Issues: 8
  • Pull requests: 14
  • Average time to close issues: 8 days
  • Average time to close pull requests: 3 days
  • Issue authors: 1
  • Pull request authors: 1
  • Average comments per issue: 0.13
  • Average comments per pull request: 0.21
  • Merged pull requests: 11
  • Bot issues: 0
  • Bot pull requests: 0
Top Authors
Issue Authors
  • buddejul (33)
Pull Request Authors
  • buddejul (26)
Top Labels
Issue Labels
enhancement (22) bug (11)
Pull Request Labels

Dependencies

.github/workflows/main.yml actions
  • actions/checkout v3 composite
  • codecov/codecov-action v3 composite
  • mamba-org/provision-with-micromamba main composite
pyproject.toml pypi
environment.yml pypi
  • coptpy *
  • kaleido ==0.1.0.post1
  • pdbp *