doubleml-for-py

DoubleML - Double Machine Learning in Python

https://github.com/doubleml/doubleml-for-py

Science Score: 67.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
    Found CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
    Found 7 DOI reference(s) in README
  • Academic publication links
  • Committers with academic emails
    2 of 15 committers (13.3%) from academic institutions
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (14.9%) to scientific vocabulary

Keywords

causal-inference data-science double-machine-learning econometrics machine-learning python scikit-learn statistics
Last synced: 6 months ago

Repository

DoubleML - Double Machine Learning in Python

Basic Info
  • Host: GitHub
  • Owner: DoubleML
  • License: bsd-3-clause
  • Language: Python
  • Default Branch: main
  • Homepage: https://docs.doubleml.org
  • Size: 10.1 MB
Statistics
  • Stars: 640
  • Watchers: 15
  • Forks: 98
  • Open Issues: 28
  • Releases: 30
Topics
causal-inference data-science double-machine-learning econometrics machine-learning python scikit-learn statistics
Created over 5 years ago · Last pushed 6 months ago
Metadata Files
Readme Contributing License Code of conduct Citation

README.md

DoubleML - Double Machine Learning in Python

The Python package DoubleML provides an implementation of the double / debiased machine learning framework of Chernozhukov et al. (2018). It is built on top of scikit-learn (Pedregosa et al., 2011).

Note that the Python package was developed together with an R twin based on mlr3. The R package is also available on GitHub and CRAN.

Documentation and Maintenance

Documentation and website: https://docs.doubleml.org/

DoubleML is currently maintained by @PhilippBach and @SvenKlaassen.

Bugs can be reported to the issue tracker at https://github.com/DoubleML/doubleml-for-py/issues.

Main Features

Double / debiased machine learning (Chernozhukov et al., 2018) for

  • Partially linear regression models (PLR)
  • Partially linear IV regression models (PLIV)
  • Interactive regression models (IRM)
  • Interactive IV regression models (IIVM)

The object-oriented implementation of DoubleML is very flexible. The model classes DoubleMLPLR, DoubleMLPLIV, DoubleMLIRM and DoubleMLIIVM implement the estimation of the nuisance functions via machine learning methods and the computation of the Neyman orthogonal score function. All other functionality is implemented in the abstract base class DoubleML, in particular the methods fit, bootstrap, confint, p_adjust and tune, which estimate double machine learning models and perform statistical inference. This object-oriented implementation allows high flexibility in the model specification in terms of ...

  • ... the machine learners for the nuisance functions,
  • ... the resampling schemes,
  • ... the double machine learning algorithm,
  • ... the Neyman orthogonal score functions,
  • ...
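To make these degrees of freedom concrete, the following is a minimal pure-Python sketch of cross-fitted partialling-out estimation for a partially linear regression model. It is not the DoubleML API: the names fit_linear, make_folds and plr_partialling_out, the one-dimensional toy learner and the simulated data are all assumptions chosen for illustration. The learner argument stands in for the machine learner, and n_folds stands in for the resampling scheme.

```python
import random


def fit_linear(xs, ys):
    """Toy nuisance learner (hypothetical): 1-D least squares, returns a predict function."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sxx
    return lambda x: mean_y + slope * (x - mean_x)


def make_folds(n_obs, n_folds, seed=0):
    """Resampling scheme: shuffle the indices and split them into n_folds disjoint folds."""
    idx = list(range(n_obs))
    random.Random(seed).shuffle(idx)
    return [idx[k::n_folds] for k in range(n_folds)]


def plr_partialling_out(y, d, x, learner=fit_linear, n_folds=5, seed=0):
    """Cross-fitted estimate of theta in Y = theta * D + g(X) + noise."""
    n = len(y)
    y_res, d_res = [0.0] * n, [0.0] * n
    for test in make_folds(n, n_folds, seed):
        held_out = set(test)
        train = [i for i in range(n) if i not in held_out]
        # Nuisance functions are fitted on the training folds only.
        l_hat = learner([x[i] for i in train], [y[i] for i in train])  # E[Y | X]
        m_hat = learner([x[i] for i in train], [d[i] for i in train])  # E[D | X]
        # Residuals are computed on the held-out fold.
        for i in test:
            y_res[i] = y[i] - l_hat(x[i])
            d_res[i] = d[i] - m_hat(x[i])
    # Solve the partialling-out score for theta.
    return sum(v * u for v, u in zip(d_res, y_res)) / sum(v * v for v in d_res)


# Simulated data (assumption): linear confounding, true effect 0.5.
rng = random.Random(42)
n_obs, theta_true = 2000, 0.5
x = [rng.gauss(0, 1) for _ in range(n_obs)]
d = [0.8 * xi + rng.gauss(0, 1) for xi in x]
y = [theta_true * di + 1.5 * xi + rng.gauss(0, 1) for di, xi in zip(d, x)]

theta_hat = plr_partialling_out(y, d, x)
print(f"theta_hat = {theta_hat:.3f} (true value {theta_true})")
```

Swapping the learner or the number of folds changes only the arguments of plr_partialling_out, which is the kind of flexibility the list above describes. In the package itself these roles are filled by the model classes and the fit method; the documentation gives the actual signatures.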

It can also be readily extended with regard to

  • ... new model classes that come with Neyman orthogonal score functions being linear in the target parameter,
  • ... alternative score functions via callables,
  • ... alternative resampling schemes,
  • ...
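As an illustration of why linearity in the target parameter matters, the sketch below solves a score of the form psi = psi_a * theta + psi_b in closed form and passes the score in as a callable. The function names, the callable's signature and the toy residuals are hypothetical and are not taken from the package. The standard error uses the usual plug-in variance for a linear score.

```python
import math


def score_partialling_out(y_res, d_res):
    """Hypothetical score callable: returns (psi_a, psi_b) per observation."""
    psi_a = [-v * v for v in d_res]
    psi_b = [v * u for u, v in zip(y_res, d_res)]
    return psi_a, psi_b


def solve_linear_score(score, y_res, d_res):
    """Solve sum(psi_a * theta + psi_b) = 0 and return (theta, standard error)."""
    psi_a, psi_b = score(y_res, d_res)
    n = len(psi_a)
    theta = -sum(psi_b) / sum(psi_a)
    psi = [a * theta + b for a, b in zip(psi_a, psi_b)]
    j = sum(psi_a) / n                            # derivative of the score in theta
    sigma2 = (sum(p * p for p in psi) / n) / j ** 2
    return theta, math.sqrt(sigma2 / n)


# Toy residuals (assumption): outcome and treatment after removing the effect of X.
d_res = [1.0, -1.0, 2.0, -2.0]
y_res = [0.6, -0.4, 1.1, -0.9]

theta, se = solve_linear_score(score_partialling_out, y_res, d_res)
print(f"theta = {theta:.3f}, se = {se:.4f}")
```

Any other callable that returns a pair (psi_a, psi_b) can be passed to solve_linear_score unchanged, because the solver relies only on the linear form of the score.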

An overview of the OOP structure of the DoubleML package is given in the graphic available at https://github.com/DoubleML/doubleml-for-py/blob/main/doc/oop.svg

Installation

DoubleML requires

  • Python
  • scikit-learn (imported as sklearn)
  • numpy
  • scipy
  • pandas
  • statsmodels
  • joblib
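A quick way to see which of these dependencies are already importable is sketched below using only the standard library. The helper name missing_modules is hypothetical, and the list uses import names (sklearn is the import name of scikit-learn). The check says nothing about installed versions.

```python
from importlib.util import find_spec

# Import names of the dependencies listed above.
REQUIRED_MODULES = ["sklearn", "numpy", "scipy", "pandas", "statsmodels", "joblib"]


def missing_modules(names):
    """Return the top-level module names that cannot be found in this environment."""
    return [name for name in names if find_spec(name) is None]


missing = missing_modules(REQUIRED_MODULES)
print("missing:", ", ".join(missing) if missing else "none")
```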

To install DoubleML with pip use

pip install -U DoubleML

DoubleML can be installed from source via

git clone git@github.com:DoubleML/doubleml-for-py.git
cd doubleml-for-py
pip install --editable .

Detailed installation instructions can be found in the documentation.

Contributing

DoubleML is a community effort. Everyone is welcome to contribute. To get started with your first contribution, we recommend reading our contributing guidelines and our code of conduct.

Citation

If you use the DoubleML package, a citation is highly appreciated:

You can click the "Cite this repository" button at the top of the GitHub page to cite the package directly. Alternatively, you can cite the following paper:

Bach, P., Chernozhukov, V., Kurz, M. S., and Spindler, M. (2022), DoubleML - An Object-Oriented Implementation of Double Machine Learning in Python, Journal of Machine Learning Research, 23(53): 1-6, https://www.jmlr.org/papers/v23/21-0862.html.

Bibtex-entry:

@article{DoubleML2022,
  title   = {{DoubleML} -- {A}n Object-Oriented Implementation of Double Machine Learning in {P}ython},
  author  = {Philipp Bach and Victor Chernozhukov and Malte S. Kurz and Martin Spindler},
  journal = {Journal of Machine Learning Research},
  year    = {2022},
  volume  = {23},
  number  = {53},
  pages   = {1--6},
  url     = {http://jmlr.org/papers/v23/21-0862.html}
}

Acknowledgements

Funding by the Deutsche Forschungsgemeinschaft (DFG, German Research Foundation) is acknowledged – Project Number 431701914.

References

Bach, P., Chernozhukov, V., Kurz, M. S., and Spindler, M. (2022), DoubleML - An Object-Oriented Implementation of Double Machine Learning in Python, Journal of Machine Learning Research, 23(53): 1-6, https://www.jmlr.org/papers/v23/21-0862.html.

Chernozhukov, V., Chetverikov, D., Demirer, M., Duflo, E., Hansen, C., Newey, W. and Robins, J. (2018), Double/debiased machine learning for treatment and structural parameters. The Econometrics Journal, 21: C1-C68. doi:10.1111/ectj.12097.

Pedregosa, F., Varoquaux, G., Gramfort, A., Michel, V., Thirion, B., Grisel, O., Blondel, M., Prettenhofer, P., Weiss, R., Dubourg, V., Vanderplas, J., Passos, A., Cournapeau, D., Brucher, M., Perrot, M. and Duchesnay, E. (2011), Scikit-learn: Machine Learning in Python. Journal of Machine Learning Research, 12: 2825-2830, https://jmlr.csail.mit.edu/papers/v12/pedregosa11a.html.

Owner

  • Name: DoubleML
  • Login: DoubleML
  • Kind: organization

Citation (CITATION.cff)

# This CITATION.cff file was generated with cffinit.
# Visit https://bit.ly/cffinit to generate yours today!

cff-version: 1.2.0
title: DoubleML - Double Machine Learning in Python
message: >-
  If you use this software, please cite it using the
  metadata from this file.
type: software
authors:
  - given-names: Philipp
    family-names: Bach
    email: philipp.bach@uni-hamburg.de
    affiliation: University of Hamburg
    orcid: 'https://orcid.org/0000-0002-7183-9239'
  - given-names: Victor
    family-names: Chernozhukov
    affiliation: MIT
    orcid: 'https://orcid.org/0000-0002-3250-6714'
    email: vchern@mit.edu
  - given-names: Sven
    family-names: Klaassen
    email: sven.klaassen@uni-hamburg.de
    affiliation: University of Hamburg
    orcid: 'https://orcid.org/0009-0004-9080-0809'
  - given-names: Malte S.
    family-names: Kurz
  - given-names: Martin
    family-names: Spindler
    email: martin.spindler@uni-hamburg.de
    orcid: 'https://orcid.org/0000-0002-1294-7782'
    affiliation: University of Hamburg
repository-code: 'https://github.com/DoubleML/doubleml-for-py'
url: 'https://docs.doubleml.org/stable/index.html'
license: BSD-3-Clause
references:
  - type: article
    authors:
      - given-names: Philipp
        family-names: Bach
      - given-names: Victor
        family-names: Chernozhukov
      - given-names: Malte S.
        family-names: Kurz
      - given-names: Martin
        family-names: Spindler
    title: "DoubleML - An Object-Oriented Implementation of Double Machine Learning in Python"
    journal: Journal of Machine Learning Research
    year: 2022
    volume: 23
    issue: 53
    start: 1
    end: 6
    url: http://jmlr.org/papers/v23/21-0862.html

  - type: article
    authors:
      - given-names: Philipp
        family-names: Bach
      - given-names: Malte S.
        family-names: Kurz
      - given-names: Victor
        family-names: Chernozhukov
      - given-names: Martin
        family-names: Spindler
      - given-names: Sven
        family-names: Klaassen
    title: "DoubleML: An Object-Oriented Implementation of Double Machine Learning in R"
    journal: Journal of Statistical Software
    year: 2024
    volume: 108
    issue: 3
    start: 1
    end: 56
    doi: 10.18637/jss.v108.i03
    url: https://www.jstatsoft.org/index.php/jss/article/view/v108i03

GitHub Events

Total
  • Create event: 34
  • Release event: 4
  • Issues event: 34
  • Watch event: 138
  • Delete event: 4
  • Issue comment event: 33
  • Push event: 300
  • Pull request review event: 54
  • Pull request review comment event: 55
  • Pull request event: 56
  • Fork event: 23
Last Year
  • Create event: 34
  • Release event: 4
  • Issues event: 34
  • Watch event: 138
  • Delete event: 4
  • Issue comment event: 33
  • Push event: 300
  • Pull request review event: 54
  • Pull request review comment event: 55
  • Pull request event: 56
  • Fork event: 23

Committers

Last synced: 9 months ago

All Time
  • Total Commits: 2,515
  • Total Committers: 15
  • Avg Commits per committer: 167.667
  • Development Distribution Score (DDS): 0.541
Past Year
  • Commits: 764
  • Committers: 7
  • Avg Commits per committer: 109.143
  • Development Distribution Score (DDS): 0.098
Top Committers
Name Email Commits
Sven Klaassen 4****n 1,155
Malte S. Kurz m****z@g****m 1,113
OliverSchacht 6****t 73
Philipp Bach b****p@o****m 54
Jan Teichert-Kluge j****e@u****e 52
Michaela Kecskésová x****0@f****z 38
Philipp Schwarz P****z@a****m 8
Schacht B****0@u****e 7
David Masip d****4@g****m 5
Ezequiel Smucler e****0@g****m 3
Kin Ho Lucien Lo l****o@l****e 3
LGTM Migrator l****r 1
anntis a****a@g****m 1
Kin Ho Lo k****o@b****m 1
Kin Ho Lucien Lo l****o@l****e 1
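
The all-time figures above can be reproduced from the committer table if the Development Distribution Score is taken to be one minus the top committer's share of all commits. That definition is an assumption, since the page does not state it. The short check below uses only numbers shown in this section. The past-year score cannot be checked the same way, because no per-committer breakdown for the past year is listed.

```python
# Values copied from the "All Time" statistics and the committer table.
total_commits = 2515
total_committers = 15
top_committer_commits = 1155  # Sven Klaassen

avg_commits = total_commits / total_committers

# Assumed definition: DDS = 1 - (commits by top committer / total commits).
dds = 1 - top_committer_commits / total_commits

print(round(avg_commits, 3), round(dds, 3))  # prints: 167.667 0.541
```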

Issues and Pull Requests

Last synced: 6 months ago

All Time
  • Total issues: 70
  • Total pull requests: 155
  • Average time to close issues: 3 months
  • Average time to close pull requests: 14 days
  • Total issue authors: 27
  • Total pull request authors: 14
  • Average comments per issue: 1.13
  • Average comments per pull request: 0.58
  • Merged pull requests: 142
  • Bot issues: 0
  • Bot pull requests: 1
Past Year
  • Issues: 25
  • Pull requests: 51
  • Average time to close issues: 28 days
  • Average time to close pull requests: 8 days
  • Issue authors: 6
  • Pull request authors: 7
  • Average comments per issue: 0.6
  • Average comments per pull request: 0.55
  • Merged pull requests: 43
  • Bot issues: 0
  • Bot pull requests: 0
Top Authors
Issue Authors
  • SvenKlaassen (25)
  • MalteKurz (15)
  • PhilippBach (4)
  • Alalalalaki (2)
  • gzhelev2020 (2)
  • OliverSchacht (1)
  • esmucler (1)
  • AlejandroTL (1)
  • prateekgv (1)
  • yangfanliang (1)
  • JohannWO (1)
  • kkckk1110 (1)
  • apoorvalal (1)
  • MarkovSc (1)
  • tmieno2 (1)
Pull Request Authors
  • SvenKlaassen (89)
  • MalteKurz (50)
  • OliverSchacht (14)
  • PhilippBach (5)
  • mychaelka (4)
  • JanTeichertKluge (2)
  • lucien1011 (2)
  • esmucler (2)
  • PragyanTiwari (2)
  • ShreyDixit (1)
  • vnastl (1)
  • CarlesRieraA (1)
  • anntis (1)
  • lgtm-com[bot] (1)
Top Labels
Issue Labels
enhancement (30) new feature (25) bug (11) exception handling (7) refactoring (6) documentation (5) continuous integration (2) good first issue (2) unit tests (1) question (1)
Pull Request Labels
enhancement (11) new feature (10) continuous integration (9) refactoring (8) unit tests (5) bug (4) exception handling (4) documentation (3)

Dependencies

requirements-dev.txt pypi
  • pydata-sphinx-theme * development
  • pytest * development
  • sphinx * development
  • xgboost * development
requirements.txt pypi
  • joblib *
  • numpy *
  • pandas *
  • scipy *
  • sklearn *
  • statsmodels *
setup.py pypi
  • joblib *
  • numpy *
  • pandas *
  • scipy *
  • sklearn *
  • statsmodels *
.github/workflows/codeql.yml actions
  • actions/checkout v3 composite
  • github/codeql-action/analyze v2 composite
  • github/codeql-action/autobuild v2 composite
  • github/codeql-action/init v2 composite
.github/workflows/deploy_pkg.yml actions
  • actions/checkout v3 composite
  • actions/setup-python v4 composite
  • actions/upload-artifact v3 composite
.github/workflows/pytest.yml actions
  • actions/checkout v3 composite
  • actions/setup-python v4 composite
  • codacy/codacy-coverage-reporter-action v1 composite
  • codecov/codecov-action v3 composite