pyvmte

Implementation of Mogstad, Santos, Torgovitsky 2018 ECMA "Using Instrumental Variables for Inference About Policy Relevant Treatment Parameters".

https://github.com/buddejul/pyvmte

Science Score: 54.0%

This score indicates how likely this project is to be science-related based on various indicators:

✓ CITATION.cff file: Found CITATION.cff file
✓ codemeta.json file: Found codemeta.json file
○ .zenodo.json file
✓ DOI references: Found 1 DOI reference(s) in README
✓ Academic publication links: Links to wiley.com
○ Academic email domains
○ Institutional organization owner
○ JOSS paper metadata
○ Scientific vocabulary similarity: Low similarity (15.0%) to scientific vocabulary

Last synced: 6 months ago

Repository

Basic Info

Host: GitHub
Owner: buddejul
Language: Python
Default Branch: main
Homepage:
Size: 258 KB

Statistics

Stars: 0
Watchers: 1
Forks: 0
Open Issues: 9
Releases: 0

Created about 2 years ago · Last pushed over 1 year ago

Metadata Files

Readme Changelog Citation

pyvmte

Project

This project implements a method for inference about general treatment effects in instrumental variables settings first proposed in Mogstad, Santos, Torgovitsky 2018 Econometrica (henceforth MST). The main novel results reported here are Monte Carlo simulations corresponding to the data generating process of the paper. In particular, I report extensive simulations corresponding to the identification result reported in Figure 5 of the paper.

The goal of MST is to make inference on parameters that are generally not point-identified in instrumental variable settings. For example, a well-known result is that we can only point-identify the local average treatment effect (LATE) for a given complier subpopulation in a binary-instrument binary-treatment setting. However, the researcher might be interested in the LATE for a different subpopulation or the average treatment effect (ATE).

The general idea of the paper is to set-identify a target parameter, where the identified set is constrained by the parameters we can point-identify. Intuitively, we need to make some assumptions about unobservables to provide a set for the target parameter. However, every parameter that we can point-identify puts restrictions on these unobservables. The main contribution of MST is to show that both the identified parameters and the target parameter can be written as linear maps of so-called marginal treatment response (MTR) functions in a binary choice model. Hence, a combination of data moments and assumptions on the MTR functions implies an identified set for the target parameter. For a more detailed introduction to the method see the report in this project.
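The linear-map structure can be written out as follows. This is a sketch in notation close to MST rather than a verbatim statement from the paper: $m_d(u, x) = E[Y_d \mid U = u, X = x]$ denotes the MTR functions, $p(z) = P(D = 1 \mid Z = z)$ the propensity score, $s(D, Z)$ a known function defining an IV-like estimand, and $\omega_d^\star$ the weights of the target parameter. The covariates $X$ are assumed to be contained in $Z$.

```latex
\begin{aligned}
\beta^\star &= E\left[\int_0^1 m_0(u, X)\,\omega_0^\star(u, Z)\,du\right]
             + E\left[\int_0^1 m_1(u, X)\,\omega_1^\star(u, Z)\,du\right], \\
\beta_s = E[s(D, Z)\,Y]
            &= E\left[\int_0^1 m_0(u, X)\,\omega_{0s}(u, Z)\,du\right]
             + E\left[\int_0^1 m_1(u, X)\,\omega_{1s}(u, Z)\,du\right], \\
\omega_{0s}(u, z) &= s(0, z)\,\mathbf{1}[u > p(z)], \qquad
\omega_{1s}(u, z) = s(1, z)\,\mathbf{1}[u \le p(z)].
\end{aligned}
```

Both expressions are linear in $(m_0, m_1)$. Once the MTR functions are expressed in a finite basis, the observed $\beta_s$ become linear equality constraints and $\beta^\star$ becomes a linear objective.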

I originally started working on this project in an econometrics topics course in the 2023 summer term, but back then I could not get the code to run properly. In particular, similar Monte Carlo studies produced severely biased estimates because of a faulty implementation. The results reported here are more plausible given the results in MST. Technically their estimator is only shown to be consistent (and is probably not unbiased), and they do not report any simulations. Still, the paper would probably not have been published if the method were severely biased at realistic sample sizes.

Implementation

All sets in MST (identified or estimated) are implicitly defined by linear programs (LPs). Thus, the key programming task is to compute the inputs to the linear program. I then pass these to scipy.optimize.linprog, the scipy wrapper for several LP solvers, including highs, which I use as the default. I also implement the copt solver, which generally performs best for a range of problems (e.g. see these benchmarks). However, for the small problems in my simulations I did not see any performance differences (if anything, scipy has the faster API).
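To make the LP formulation concrete, here is a minimal sketch of an identification problem solved with scipy.optimize.linprog and the highs method. Everything in it is a hypothetical toy setup, not the project's code or the DGP from MST: a binary instrument with propensity scores 0.3 and 0.7, an outcome in [0, 1], piecewise-constant MTR functions, and conditional moments of Y by treatment status and instrument value as the identified quantities.

```python
import numpy as np
from scipy.optimize import linprog

# Hypothetical toy setup (not taken from the project): binary instrument with
# propensity scores p(0) = 0.3 and p(1) = 0.7, outcome Y in [0, 1].
# m_0 and m_1 are piecewise constant on [0, 0.3), [0.3, 0.7), [0.7, 1], so
# theta = (m_0 on each interval, m_1 on each interval) has six entries.
widths = np.array([0.3, 0.4, 0.3])

# Rows: E[Y D | Z=0], E[Y D | Z=1], E[Y (1-D) | Z=0], E[Y (1-D) | Z=1].
# D = 1 iff U <= p(Z), so treated moments integrate m_1 over [0, p(z)]
# and untreated moments integrate m_0 over (p(z), 1].
A_eq = np.array([
    [0.0, 0.0, 0.0, 0.3, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.3, 0.4, 0.0],
    [0.0, 0.4, 0.3, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.3, 0.0, 0.0, 0.0],
])

# "True" MTR values, used only to generate moments consistent with the model.
theta_true = np.array([0.5, 0.3, 0.2, 0.8, 0.6, 0.4])
b_eq = A_eq @ theta_true

# Target: ATE = integral of (m_1 - m_0) over [0, 1], which is linear in theta.
c_ate = np.concatenate([-widths, widths])

# MTR values must respect the support of Y.
bounds = [(0.0, 1.0)] * 6

# linprog minimizes, so the upper bound is obtained by minimizing -c'theta.
res_min = linprog(c_ate, A_eq=A_eq, b_eq=b_eq, bounds=bounds, method="highs")
res_max = linprog(-c_ate, A_eq=A_eq, b_eq=b_eq, bounds=bounds, method="highs")

ate_lower, ate_upper = res_min.fun, -res_max.fun
ate_true = c_ate @ theta_true

print(f"identified set for the ATE: [{ate_lower:.3f}, {ate_upper:.3f}]")
print(f"ATE under theta_true: {ate_true:.3f}")
```

In this toy problem the moments pin down four of the six coefficients. Only m_0 on [0, 0.3) and m_1 on [0.7, 1] remain free, so the ATE equals 0.30 plus 0.3 times the difference of two numbers in [0, 1], and the bounds are 0 and 0.6.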

Following MST, I split the code into two parts: identification and estimation. The former implements pure identification results for a known DGP, while the latter implements estimation of the identified set from data. Both are based on LPs that are similar in spirit but have slightly different constraints. In particular, estimation has to deal with sampling uncertainty, since in any finite sample the constraints will only be satisfied approximately. For details see the report.
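One way to handle constraints that hold only approximately is a two-step LP, which is how the estimator in MST can be read: first minimize the total absolute deviation from the estimated moments, then optimize the target subject to staying within a tolerance of that minimum. The sketch below assumes this reading. The function name estimate_bounds, the tolerance name kappa, and the restriction of all coefficients to [0, 1] are illustrative choices and are not taken from the project.

```python
import numpy as np
from scipy.optimize import linprog


def estimate_bounds(c, A, b, kappa):
    """Two-step LP sketch: returns (lower, upper, minimal deviation).

    Step 1 minimizes sum_s |A_s theta - b_s| over theta in [0, 1]^k.
    Step 2 optimizes c'theta subject to a total deviation of at most
    (minimal deviation + kappa).
    """
    c = np.asarray(c, dtype=float)
    A = np.asarray(A, dtype=float)
    b = np.asarray(b, dtype=float)
    n_mom, n_par = A.shape
    eye = np.eye(n_mom)

    # Slack variables e_s >= |A_s theta - b_s|, written as two inequalities:
    #   A theta - e <= b   and   -A theta - e <= -b.
    A_ub = np.block([[A, -eye], [-A, -eye]])
    b_ub = np.concatenate([b, -b])
    bounds = [(0.0, 1.0)] * n_par + [(0.0, None)] * n_mom

    # Step 1: minimal total absolute deviation.
    cost_dev = np.concatenate([np.zeros(n_par), np.ones(n_mom)])
    step1 = linprog(cost_dev, A_ub=A_ub, b_ub=b_ub, bounds=bounds, method="highs")
    min_dev = step1.fun

    # Step 2: add the constraint sum(e) <= min_dev + kappa, then min and max.
    A_ub2 = np.vstack([A_ub, cost_dev])
    b_ub2 = np.append(b_ub, min_dev + kappa)
    cost_target = np.concatenate([c, np.zeros(n_mom)])
    lo = linprog(cost_target, A_ub=A_ub2, b_ub=b_ub2, bounds=bounds, method="highs")
    hi = linprog(-cost_target, A_ub=A_ub2, b_ub=b_ub2, bounds=bounds, method="highs")
    return lo.fun, -hi.fun, min_dev


# Two noisy "estimates" of the same scalar parameter; no theta matches both.
lower, upper, min_dev = estimate_bounds(
    c=[1.0], A=[[1.0], [1.0]], b=[0.2, 0.4], kappa=0.1
)
print(lower, upper, min_dev)
```

In the usage example the two moments disagree, so exact equality constraints would be infeasible. The first step finds the smallest achievable total deviation, and the tolerance kappa then controls how far the estimated set extends beyond the minimizers.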

Usage

To get started, create and activate the environment with

```console
$ conda/mamba env create
$ conda activate pyvmte
```

To build the project, type

```console
$ pytask
```

To reduce runtime it is recommended to use the pytask-parallel plug-in:

```console
$ pytask -n <workers>
```

where workers is the number of workers.

With parallelization the project builds in 5-10 minutes on my machine using 11 workers.

To reduce runtime further, you can also adjust the simulation settings in config_mc_by_size.py and config_mc_by_target.py:

```python
MCSAMPLESIZES = [500, 2500, 10000]

MONTECARLOBYSIZE = MonteCarloSetup(
    samplesize=10000,
    repetitions=10000,
)

MONTECARLOBYTARGET = MonteCarloSetup(
    samplesize=10000,
    repetitions=1000,
    uhirange=np.arange(0.35, 1, 0.05),
)
```

Reducing the number of repetitions is always safe. Be careful with very small sample sizes, though: they can cause errors because the estimators may be undefined (the linear programs do not have a solution).
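For reference, this is how a linear program without a solution shows up in scipy.optimize.linprog. The snippet is a minimal sketch: the helper solve_or_nan is hypothetical and only illustrates checking the success flag on the result. How the project itself reacts to such cases is not shown here.

```python
import numpy as np
from scipy.optimize import linprog


def solve_or_nan(c, A_eq, b_eq, bounds):
    """Return the optimal LP value, or NaN if no optimum was found."""
    res = linprog(c, A_eq=A_eq, b_eq=b_eq, bounds=bounds, method="highs")
    # res.success is False when the problem is infeasible or unbounded, or
    # when the solver stopped early; res.message describes the reason.
    return res.fun if res.success else np.nan


# Feasible: minimize x subject to x = 0.5 and 0 <= x <= 1.
feasible = solve_or_nan([1.0], [[1.0]], [0.5], [(0.0, 1.0)])

# Infeasible: x = 2 contradicts 0 <= x <= 1, so there is no solution.
infeasible = solve_or_nan([1.0], [[1.0]], [2.0], [(0.0, 1.0)])

print(feasible, infeasible)
```

Recording NaN for a failed repetition, rather than raising, is one possible design choice for a Monte Carlo loop. It makes the share of undefined estimates visible, but any summary statistic then has to skip those entries explicitly.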

Credits

The original repo I mainly used for development can be found at pyvmte.

This project was created with cookiecutter and the econ-project-templates.

Owner

Name: Julian Budde
Login: buddejul
Kind: user
Location: Bonn, Germany

Repositories: 2
Profile: https://github.com/buddejul

Econ PhD, Uni Bonn

Citation (CITATION)

@Unpublished{pyvmte2023,
    Title  = {Implementation of Mogstad, Santos, Torgovitsky 2018 Econometrica.},
    Author = {Julian Budde},
    Year   = {2023},
    Url    = {https://github.com/buddejul/pyvmte}
}

GitHub Events

Total

Create event: 1

Last Year

Create event: 1

Issues and Pull Requests

Last synced: 6 months ago

All Time

Total issues: 33
Total pull requests: 26
Average time to close issues: 27 days
Average time to close pull requests: 1 day
Total issue authors: 1
Total pull request authors: 1
Average comments per issue: 0.36
Average comments per pull request: 0.42
Merged pull requests: 23
Bot issues: 0
Bot pull requests: 0

Past Year

Issues: 8
Pull requests: 14
Average time to close issues: 8 days
Average time to close pull requests: 3 days
Issue authors: 1
Pull request authors: 1
Average comments per issue: 0.13
Average comments per pull request: 0.21
Merged pull requests: 11
Bot issues: 0
Bot pull requests: 0

Top Authors

Issue Authors

buddejul (33)

Pull Request Authors

buddejul (26)

Top Labels

Issue Labels

enhancement (22) bug (11)

Pull Request Labels

Dependencies

.github/workflows/main.yml actions

actions/checkout v3 composite
codecov/codecov-action v3 composite
mamba-org/provision-with-micromamba main composite

pyproject.toml pypi

environment.yml pypi

coptpy *
kaleido ==0.1.0.post1
pdbp *
