medoutcon

medoutcon: Nonparametric efficient causal mediation analysis with machine learning in R - Published in JOSS (2022)

https://github.com/nhejazi/medoutcon

Science Score: 95.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
    Found 12 DOI reference(s) in README and JOSS metadata
  • Academic publication links
    Links to: arxiv.org, joss.theoj.org, zenodo.org
  • Committers with academic emails
    1 of 3 committers (33.3%) from academic institutions
  • Institutional organization owner
  • JOSS paper metadata
    Published in Journal of Open Source Software

Keywords

causal-inference causal-machine-learning inverse-probability-weights machine-learning mediation-analysis r r-package stochastic-interventions targeted-learning treatment-effects

Keywords from Contributors

bioconductor-package

Scientific Fields

Earth and Environmental Sciences Physical Sciences - 40% confidence
Engineering Computer Science - 40% confidence
Last synced: 6 months ago

Repository

:package: R/medoutcon: Efficient Causal Mediation Analysis with Natural and Interventional Direct/Indirect Effects

Statistics
  • Stars: 13
  • Watchers: 7
  • Forks: 8
  • Open Issues: 3
  • Releases: 1
Created almost 7 years ago · Last pushed 8 months ago
Metadata Files
Readme Changelog Contributing License

README.Rmd

---
output:
  rmarkdown::github_document
bibliography: "inst/REFERENCES.bib"
---



```{r, echo = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  fig.path = "README-"
)
```

# R/`medoutcon`


[![R-CMD-check](https://github.com/nhejazi/medoutcon/actions/workflows/R-CMD-check.yml/badge.svg)](https://github.com/nhejazi/medoutcon/actions/workflows/R-CMD-check.yml)
[![Coverage Status](https://img.shields.io/codecov/c/github/nhejazi/medoutcon/master.svg)](https://codecov.io/github/nhejazi/medoutcon?branch=master)
[![Project Status: Active – The project has reached a stable, usable state and is being actively developed.](https://www.repostatus.org/badges/latest/active.svg)](https://www.repostatus.org/#active)
[![MIT license](http://img.shields.io/badge/license-MIT-brightgreen.svg)](http://opensource.org/licenses/MIT)
[![DOI](https://zenodo.org/badge/DOI/10.5281/zenodo.5809519.svg)](https://doi.org/10.5281/zenodo.5809519)
[![DOI](https://joss.theoj.org/papers/10.21105/joss.03979/status.svg)](https://doi.org/10.21105/joss.03979)


> Efficient Causal Mediation Analysis for the Natural and Interventional Effects

__Authors:__ [Nima Hejazi](https://nimahejazi.org), [Iván
Díaz](https://idiaz.xyz), and [Kara
Rudolph](https://kararudolph.github.io/)

---

## What's `medoutcon`?

The `medoutcon` R package provides facilities for efficient estimation of
path-specific (in)direct effects that measure the impact of a treatment variable
$A$ on an outcome variable $Y$, through a direct path (through $A$ only) and an
indirect path (through a set of mediators $M$ only). In the presence of an
intermediate mediator-outcome confounder $Z$, itself
affected by the treatment $A$, these correspond to the _interventional_
(in)direct effects described by @diaz2020nonparametric, though similar (yet less
general) effect definitions and/or estimation strategies have appeared in
@vanderweele2014effect, @rudolph2017robust, @zheng2017longitudinal, and
@benkeser2020nonparametric. When no intermediate confounders are present, these
effect definitions simplify to the well-studied _natural_ (in)direct effects,
and our estimators are analogs of those formulated by @zheng2012targeted.  Both
an efficient one-step bias-corrected estimator with cross-fitting
[@pfanzagl1985contributions; @zheng2011cross; @chernozhukov2018double] and a
cross-validated targeted minimum loss estimator (TMLE) [@vdl2011targeted;
@zheng2011cross] are made available. `medoutcon` integrates with the [`sl3` R
package](https://github.com/tlverse/sl3) [@coyle-gh-sl3] to leverage statistical
machine learning in the estimation procedure.
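
As a sketch of the decomposition behind these effects, assume a binary
treatment and the following notation, which is introduced here for illustration
and follows our reading of @diaz2020nonparametric: $Y(a', m)$ is the
counterfactual outcome under treatment level $a'$ with the mediators set to
$m$, and $G(a^\star)$ is a random draw from the distribution of the
counterfactual mediators $M(a^\star)$ given baseline covariates $W$. The total
effect then splits into an indirect and a direct component:

$$
\underbrace{\mathbb{E}[Y(1, G(1)) - Y(0, G(0))]}_{\text{total effect}}
=
\underbrace{\mathbb{E}[Y(1, G(1)) - Y(1, G(0))]}_{\text{indirect effect (through } M\text{)}}
+
\underbrace{\mathbb{E}[Y(1, G(0)) - Y(0, G(0))]}_{\text{direct effect}}
$$

Under this reading, the natural (in)direct effects have the same form with
$G(a^\star)$ replaced by the counterfactual mediator $M(a^\star)$ itself.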

---

## Installation

Install the most recent _stable release_ from GitHub via
[`remotes`](https://CRAN.R-project.org/package=remotes):

```{r gh-master-installation, eval=FALSE}
remotes::install_github("nhejazi/medoutcon")
```

---

## Example

To illustrate how `medoutcon` may be used to estimate stochastic interventional
(in)direct effects of the exposure (`A`) on the outcome (`Y`) in the presence of
mediator(s) (`M`) and a mediator-outcome confounder (`Z`), consider the
following example:

```{r example, warning=FALSE}
library(data.table)
library(stringr)
library(medoutcon)
set.seed(02138)

# produces a simple data set based on a causal model with mediation
make_example_data <- function(n_obs = 1000) {
  ## baseline covariates
  w_1 <- rbinom(n_obs, 1, prob = 0.6)
  w_2 <- rbinom(n_obs, 1, prob = 0.3)
  w_3 <- rbinom(n_obs, 1, prob = pmin(0.2 + (w_1 + w_2) / 3, 1))
  w <- cbind(w_1, w_2, w_3)
  w_names <- paste("W", seq_len(ncol(w)), sep = "_")

  ## exposure
  a <- as.numeric(rbinom(n_obs, 1, plogis(rowSums(w) - 2)))

  ## mediator-outcome confounder affected by treatment
  z <- rbinom(n_obs, 1, plogis(rowMeans(-log(2) + w - a) + 0.2))

  ## mediator -- could be multivariate
  m <- rbinom(n_obs, 1, plogis(rowSums(log(3) * w[, -3] + a - z)))
  m_names <- "M"

  ## outcome
  y <- rbinom(n_obs, 1, plogis(1 / (rowSums(w) - z + a + m)))

  ## construct output
  dat <- as.data.table(cbind(w = w, a = a, z = z, m = m, y = y))
  setnames(dat, c(w_names, "A", "Z", m_names, "Y"))
  return(dat)
}

# simulate example data (the seed was set above)
example_data <- make_example_data(n_obs = 5000L)
w_names <- str_subset(colnames(example_data), "W")
m_names <- str_subset(colnames(example_data), "M")

# quick look at the data
head(example_data)

# compute one-step estimate of the interventional direct effect
os_de <- medoutcon(
  W = example_data[, ..w_names],
  A = example_data$A,
  Z = example_data$Z,
  M = example_data[, ..m_names],
  Y = example_data$Y,
  effect = "direct",
  estimator = "onestep"
)
os_de

# compute targeted minimum loss estimate of the interventional direct effect
tmle_de <- medoutcon(
  W = example_data[, ..w_names],
  A = example_data$A,
  Z = example_data$Z,
  M = example_data[, ..m_names],
  Y = example_data$Y,
  effect = "direct",
  estimator = "tmle"
)
tmle_de
```
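
To judge estimates like those above, it can help to have a rough reference
value. The following base-R sketch approximates the interventional effects by
Monte Carlo under the same simulation design as `make_example_data()`. It is
not part of the `medoutcon` API: the helper `approx_interventional_mean()`, the
seed, and the Monte Carlo sample size are hypothetical choices, and the effect
definitions assume the decomposition of @diaz2020nonparametric, in which the
mediator is replaced by a random draw `g` from its counterfactual distribution
given the baseline covariates.

```r
# Monte Carlo approximation of E[Y(a_prime, G(a_star))] under the structural
# equations used in make_example_data(). Hypothetical helper for illustration.
approx_interventional_mean <- function(a_prime, a_star, n_mc = 1e5) {
  # baseline covariates, generated as in make_example_data()
  w_1 <- rbinom(n_mc, 1, prob = 0.6)
  w_2 <- rbinom(n_mc, 1, prob = 0.3)
  w_3 <- rbinom(n_mc, 1, prob = pmin(0.2 + (w_1 + w_2) / 3, 1))
  w_sum <- w_1 + w_2 + w_3

  # draw the intermediate confounder Z under exposure level a
  draw_z <- function(a) {
    rbinom(n_mc, 1, plogis(w_sum / 3 - log(2) - a + 0.2))
  }

  # G(a_star): a draw of M(a_star) given W, using its own copy of Z(a_star)
  z_star <- draw_z(a_star)
  g <- rbinom(
    n_mc, 1,
    plogis(log(3) * (w_1 + w_2) + 2 * a_star - 2 * z_star)
  )

  # outcome mechanism under A = a_prime; Z(a_prime) is drawn independently
  # of g given W, and the outcome probability is averaged directly
  z_prime <- draw_z(a_prime)
  mean(plogis(1 / (w_sum - z_prime + a_prime + g)))
}

set.seed(7491)
psi_11 <- approx_interventional_mean(a_prime = 1, a_star = 1)
psi_10 <- approx_interventional_mean(a_prime = 1, a_star = 0)
psi_00 <- approx_interventional_mean(a_prime = 0, a_star = 0)

# assumed contrasts: direct = psi(1, 0) - psi(0, 0), indirect = psi(1, 1) - psi(1, 0)
approx_direct <- psi_10 - psi_00
approx_indirect <- psi_11 - psi_10
c(direct = approx_direct, indirect = approx_indirect)
```

Each call draws fresh baseline covariates, so the contrasts carry Monte Carlo
error; increasing `n_mc` reduces it.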

For details on how to use data-adaptive regression (machine learning) techniques
in the estimation of nuisance parameters, see the vignette that accompanies the
package.
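
As a rough sketch of what such a specification can look like, the block below
builds a small `sl3` Super Learner and passes it to `medoutcon()` for the
binary nuisance regressions. Several things here are assumptions rather than
statements from this README: the learner argument names (`g_learners`,
`h_learners`, `b_learners`, `q_learners`, `r_learners`) reflect our
recollection of the `medoutcon()` interface and should be checked against
`?medoutcon`; the helpers `make_binary_sl()` and `estimate_with_sl()` are
hypothetical; and the remaining learner arguments are left at their defaults.

```r
# Sketch: a small Super Learner for binary nuisance regressions.
# Assumes the sl3 learners Lrnr_mean, Lrnr_glm_fast, Lrnr_nnls, and Lrnr_sl.
make_binary_sl <- function() {
  learners <- list(
    sl3::Lrnr_mean$new(),
    sl3::Lrnr_glm_fast$new(family = stats::binomial())
  )
  sl3::Lrnr_sl$new(learners = learners, metalearner = sl3::Lrnr_nnls$new())
}

# Hypothetical wrapper: one-step estimate with the Super Learner plugged into
# the binary nuisance regressions; other learners keep their defaults.
estimate_with_sl <- function(data, w_names, m_names, effect = "indirect") {
  data <- data.table::as.data.table(data)
  sl_binary <- make_binary_sl()
  medoutcon::medoutcon(
    W = data[, w_names, with = FALSE],
    A = data$A,
    Z = data$Z,
    M = data[, m_names, with = FALSE],
    Y = data$Y,
    g_learners = sl_binary,
    h_learners = sl_binary,
    b_learners = sl_binary,
    q_learners = sl_binary,
    r_learners = sl_binary,
    effect = effect,
    estimator = "onestep"
  )
}

# Run only when the packages and the example data from above are available.
pkgs <- c("medoutcon", "sl3", "nnls", "data.table")
have_pkgs <- all(vapply(pkgs, requireNamespace, logical(1), quietly = TRUE))
if (have_pkgs && exists("example_data")) {
  os_ie_sl <- estimate_with_sl(example_data, w_names, m_names)
  print(os_ie_sl)
}
```

The wrapper defaults to `effect = "indirect"` so that it complements the direct
effect estimates shown earlier; pass `effect = "direct"` to reproduce that
analysis with the Super Learner in place of the default learners.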

---

## Issues

If you encounter any bugs or have any specific feature requests, please [file an
issue](https://github.com/nhejazi/medoutcon/issues).

---

## Contributions

Contributions are very welcome. Interested contributors should consult our
[contribution
guidelines](https://github.com/nhejazi/medoutcon/blob/master/CONTRIBUTING.md)
prior to submitting a pull request.

---

## Citation

After using the `medoutcon` R package, please cite the following:

        @article{diaz2020nonparametric,
          title={Non-parametric efficient causal mediation with intermediate
            confounders},
          author={D{\'\i}az, Iv{\'a}n and Hejazi, Nima S and Rudolph, Kara E
            and {van der Laan}, Mark J},
          year={2020},
          url = {https://arxiv.org/abs/1912.09936},
          doi = {10.1093/biomet/asaa085},
          journal={Biometrika},
          volume = {108},
          number = {3},
          pages = {627--641},
          publisher={Oxford University Press}
        }

        @article{hejazi2022medoutcon-joss,
          author = {Hejazi, Nima S and Rudolph, Kara E and D{\'\i}az,
            Iv{\'a}n},
          title = {{medoutcon}: Nonparametric efficient causal mediation
            analysis with machine learning in {R}},
          year = {2022},
          doi = {10.21105/joss.03979},
          url = {https://doi.org/10.21105/joss.03979},
          journal = {Journal of Open Source Software},
          publisher = {The Open Journal}
        }

        @software{hejazi2022medoutcon-rpkg,
          author={Hejazi, Nima S and D{\'\i}az, Iv{\'a}n and Rudolph, Kara E},
          title = {{medoutcon}: Efficient natural and interventional causal
            mediation analysis},
          year  = {2024},
          doi = {10.5281/zenodo.5809519},
          url = {https://github.com/nhejazi/medoutcon},
          note = {R package version 0.2.3}
        }

---

## License

© 2020-2024 [Nima S. Hejazi](https://nimahejazi.org)

The contents of this repository are distributed under the MIT license. See below
for details:
```
MIT License

Copyright (c) 2020-2024 Nima S. Hejazi

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
```

---

## References


Owner

  • Name: nima hejazi
  • Login: nhejazi
  • Kind: user
  • Location: Boston, Massachusetts
  • Company: Harvard Chan School of Public Health

Assistant Professor of Biostatistics at the Harvard School of Public Health

JOSS Publication

medoutcon: Nonparametric efficient causal mediation analysis with machine learning in R
Published
January 05, 2022
Volume 7, Issue 69, Page 3979
Authors
Nima S. Hejazi ORCID
Division of Biostatistics, Department of Population Health Sciences, Weill Cornell Medicine, USA
Kara E. Rudolph ORCID
Department of Epidemiology, Mailman School of Public Health, Columbia University, USA
Iván Díaz ORCID
Division of Biostatistics, Department of Population Health Sciences, Weill Cornell Medicine, USA
Editor
Mikkel Meyer Andersen ORCID
Tags
causal inference machine learning semiparametric estimation mediation analysis natural direct effect interventional direct effect

GitHub Events

Total
  • Issues event: 1
  • Push event: 2
  • Pull request event: 1
  • Pull request review event: 2
  • Fork event: 2
Last Year
  • Issues event: 1
  • Push event: 2
  • Pull request event: 1
  • Pull request review event: 2
  • Fork event: 2

Committers

Last synced: 7 months ago

All Time
  • Total Commits: 230
  • Total Committers: 3
  • Avg Commits per committer: 76.667
  • Development Distribution Score (DDS): 0.204
Past Year
  • Commits: 1
  • Committers: 1
  • Avg Commits per committer: 1.0
  • Development Distribution Score (DDS): 0.0
Top Committers
| Name | Email | Commits |
| --- | --- | --- |
| Nima Hejazi | nh@n****g | 183 |
| Philippe Boileau | p****m@g****m | 40 |
| idiazst | i****5@m****u | 7 |

Issues and Pull Requests

Last synced: 6 months ago

All Time
  • Total issues: 30
  • Total pull requests: 21
  • Average time to close issues: about 1 month
  • Average time to close pull requests: about 1 month
  • Total issue authors: 6
  • Total pull request authors: 3
  • Average comments per issue: 0.67
  • Average comments per pull request: 1.14
  • Merged pull requests: 19
  • Bot issues: 0
  • Bot pull requests: 0
Past Year
  • Issues: 2
  • Pull requests: 0
  • Average time to close issues: N/A
  • Average time to close pull requests: N/A
  • Issue authors: 2
  • Pull request authors: 0
  • Average comments per issue: 0.0
  • Average comments per pull request: 0
  • Merged pull requests: 0
  • Bot issues: 0
  • Bot pull requests: 0
Top Authors
Issue Authors
  • idiazst (14)
  • nhejazi (10)
  • PhilBoileau (2)
  • rrrlw (2)
  • psychesha21 (1)
  • erikcs (1)
Pull Request Authors
  • nhejazi (16)
  • PhilBoileau (8)
  • nt-williams (1)
Top Labels
Issue Labels
enhancement (6) bug (6)
Pull Request Labels
enhancement (8)

Dependencies

DESCRIPTION cran
  • R >= 3.2.0 depends
  • assertthat * imports
  • data.table * imports
  • dplyr * imports
  • hal9001 >= 0.4.1 imports
  • origami >= 1.0.3 imports
  • scales * imports
  • sl3 >= 1.4.3 imports
  • speedglm * imports
  • stats * imports
  • stringr * imports
  • tibble * imports
  • zeallot * imports
  • Rsolnp * suggests
  • SuperLearner * suggests
  • arm * suggests
  • covr * suggests
  • glmnet * suggests
  • knitr * suggests
  • nnls * suggests
  • ranger * suggests
  • rmarkdown * suggests
  • testthat * suggests
  • xgboost * suggests
.github/workflows/R-CMD-check.yml actions
  • actions/checkout v3 composite
  • r-lib/actions/setup-pandoc v2 composite
  • r-lib/actions/setup-r v2 composite
  • r-lib/actions/setup-tinytex v2 composite