https://github.com/blind-contours/isoxshift

This package finds, in a mixed exposure, the most efficient intervention that gets to a target outcome level. This parameter is similar to synergy in isobolic curve analysis in toxicology but geared towards shift intervention on a population. It then estimates this interaction using targeted learning.

Basic Info
  • Host: GitHub
  • Owner: blind-contours
  • Language: R
  • Default Branch: main
  • Size: 196 MB
Statistics
  • Stars: 0
  • Watchers: 1
  • Forks: 0
  • Open Issues: 0
  • Releases: 0
Created over 2 years ago · Last pushed about 2 years ago
Metadata Files
Readme Contributing

README.Rmd

---
output:
  rmarkdown::github_document
bibliography: "inst/references.bib"
always_allow_html: true
---



```{r, echo = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  fig.path = "README-"
)
```

# R/`IsoXshift`


[![R-CMD-check](https://github.com/blind-contours/IsoXshift/workflows/R-CMD-check/badge.svg)](https://github.com/blind-contours/IsoXshift/actions)
[![Coverage Status](https://img.shields.io/codecov/c/github/blind-contours/IsoXshift/master.svg)](https://codecov.io/github/blind-contours/IsoXshift?branch=master)
[![CRAN](https://www.r-pkg.org/badges/version/IsoXshift)](https://www.r-pkg.org/pkg/IsoXshift)
[![CRAN downloads](https://cranlogs.r-pkg.org/badges/IsoXshift)](https://CRAN.R-project.org/package=IsoXshift)
[![CRAN total downloads](http://cranlogs.r-pkg.org/badges/grand-total/IsoXshift)](https://CRAN.R-project.org/package=IsoXshift)
[![Project Status: Active – The project has reached a stable, usable state and is being actively developed.](https://www.repostatus.org/badges/latest/active.svg)](https://www.repostatus.org/#active)
[![MIT license](https://img.shields.io/badge/license-MIT-brightgreen.svg)](https://opensource.org/licenses/MIT)




> Isobolic Interaction Identification and Estimation using Data-Adaptive Stochastic Interventions

__Authors:__ [David McCoy](https://davidmccoy.org)

---

## What's `IsoXshift`?

The `IsoXshift` R package identifies the minimum-effort intervention on two exposures that, if the population were given these intervention levels, would result in a target outcome. This parameter reflects the most synergistic interaction or set of interactions in a mixed exposure. The target parameter is similar to the isobolic interactions used in toxicology studies, where researchers investigate what simultaneous dose of two exposures results in a target outcome, such as cancer or cell death in cultures.

From a policy perspective, this parameter represents the most efficient intervention on a mixed exposure that reaches a desired outcome. For a collection of possible interventions (for example, changes to pollutant levels), we find the exposure set that most efficiently results in an expected outcome close to the desired outcome. Efficient here means that the exposure(s) need to be shifted the least to reach the desired outcome, such as pre-industrial levels of thyroid cancer.
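
As a rough illustration of what "most efficient" means here, the sketch below runs a grid search on simulated data. It is not the package's internal algorithm: the data, the linear outcome model, the grid, and the standardized "effort" measure are all assumptions made for this example.

```r
set.seed(1)
n <- 500

# Hypothetical data: one covariate, two exposures, one outcome
w  <- rnorm(n)
a1 <- rnorm(n, mean = 2 + 0.3 * w)
a2 <- rnorm(n, mean = 1 - 0.2 * w)
y  <- 10 + 3 * a1 - 2 * a2 + 1.5 * a1 * a2 + w + rnorm(n)
dat <- data.frame(w, a1, a2, y)

# Outcome regression (a linear model stands in for a flexible learner)
fit <- lm(y ~ a1 * a2 + w, data = dat)

# Candidate intervention levels spanning the observed exposure ranges
grid <- expand.grid(
  a1 = seq(min(a1), max(a1), length.out = 30),
  a2 = seq(min(a2), max(a2), length.out = 30)
)

# g-computation: expected outcome if everyone were set to each candidate pair
grid$ey <- vapply(seq_len(nrow(grid)), function(i) {
  newdata <- data.frame(w = dat$w, a1 = grid$a1[i], a2 = grid$a2[i])
  mean(predict(fit, newdata = newdata))
}, numeric(1))

# Assumed effort measure: mean absolute distance from the observed
# exposures, scaled by each exposure's standard deviation
grid$effort <- vapply(seq_len(nrow(grid)), function(i) {
  mean(abs(grid$a1[i] - dat$a1)) / sd(dat$a1) +
    mean(abs(grid$a2[i] - dat$a2)) / sd(dat$a2)
}, numeric(1))

target  <- 12
epsilon <- 0.5

# Keep pairs whose expected outcome is within epsilon of the target,
# then pick the one that requires the least effort
candidates <- grid[abs(grid$ey - target) <= epsilon, ]
oracle <- candidates[which.min(candidates$effort), ]
oracle
```

The row in `oracle` plays the role of the oracle intervention: the pair of exposure levels that reaches the target window with the smallest shift away from what was observed.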


## Realistic Interventions


This package first uses g-computation to identify the most efficient intervention policy that reaches a desired outcome. The result is a pair of exposure levels, the oracle intervention, to which two exposures should be set. Setting an entire population to this specific oracle intervention is unrealistic, because the likelihood that certain individuals are exposed at this level may be near 0. Instead, we estimate the effect of a policy that moves everyone as close as possible to the oracle level. This is done by finding, for each individual, an intervention level as close to the oracle level as possible, subject to the restriction that the conditional likelihood of the intervened exposure does not move too far from the likelihood of the observed exposure.


We then estimate the impact of our "intention to intervene" using CV-TMLE. Using the oracle point parameter as our target, we shift individuals as close as possible to this level without violating the density ratio limit, where the density ratio compares the likelihood of the intervened exposure level to the likelihood of the observed level. Thus, each individual's actual intervention is different but is aimed at the same target, hence intention to intervene.
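
The following sketch shows one way such a restricted shift could look for a single exposure. It is only an illustration of the idea described above: the conditional density `g` is assumed known and normal, `oracle_level` is a made-up value, and `shift_toward()` is a hypothetical helper, not a function from `IsoXshift`.

```r
set.seed(2)
n <- 200
w <- rnorm(n)
a <- rnorm(n, mean = 1 + 0.5 * w, sd = 1)

# Assumed conditional density of the exposure given the covariate
g <- function(a, w) dnorm(a, mean = 1 + 0.5 * w, sd = 1)

oracle_level    <- -2  # hypothetical oracle intervention level
hn_trunc_thresh <- 10  # maximum allowed likelihood ratio

# Hypothetical helper: move from the observed level towards the target and
# stop at the last point where the observed-to-shifted likelihood ratio
# is still within the threshold
shift_toward <- function(a_obs, w_i, target, thresh, steps = 200) {
  path  <- seq(a_obs, target, length.out = steps)
  ratio <- g(a_obs, w_i) / g(path, w_i)
  ok    <- which(ratio <= thresh)  # always contains the starting point
  path[max(ok)]
}

a_shift <- mapply(
  shift_toward, a, w,
  MoreArgs = list(target = oracle_level, thresh = hn_trunc_thresh)
)

# Each individual receives a different shift, all aimed at the same target
delta <- a_shift - a
summary(delta)
mean(a_shift == oracle_level)  # share of individuals who reach the target
```

Individuals whose covariates make the oracle level plausible reach it exactly; the rest stop wherever the likelihood ratio hits the limit, which is why the average shift can be much smaller than the distance to the oracle level.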

## Joint vs. Additive Interventions 


We define interaction as the counterfactual mean of the outcome under stochastic interventions on two exposures compared to the additive counterfactual mean of the two exposures intervened on independently. These interventions or exposure changes depend on naturally observed values, as described in past literature [@diaz2012population; @haneuse2013estimation], but with our new parameter in mind. Thus, the estimand answers the question: what is the expected outcome if we were to enforce the most efficient policy intervention in a realistic setting where not everyone can actually receive that exact exposure level or levels?
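
To make the joint versus additive comparison concrete, here is a small plug-in (g-computation) sketch on simulated data. It assumes fixed additive shifts `d1` and `d2` and a linear outcome model, which is simpler than the data-adaptive shifts and CV-TMLE used by the package.

```r
set.seed(3)
n <- 2000
w  <- rnorm(n)
a1 <- rnorm(n, mean = 0.5 * w)
a2 <- rnorm(n, mean = -0.5 * w)
y  <- 2 * a1 + 1 * a2 + 1.5 * a1 * a2 + w + rnorm(n)
dat <- data.frame(w, a1, a2, y)

fit <- lm(y ~ a1 * a2 + w, data = dat)

# Mean predicted outcome after adding fixed shifts to the observed exposures
mean_under_shift <- function(d1, d2) {
  shifted <- dat
  shifted$a1 <- dat$a1 + d1
  shifted$a2 <- dat$a2 + d2
  mean(predict(fit, newdata = shifted))
}

d1 <- 1
d2 <- 1
baseline <- mean_under_shift(0, 0)

psi_a1    <- mean_under_shift(d1, 0) - baseline   # shift a1 only
psi_a2    <- mean_under_shift(0, d2) - baseline   # shift a2 only
psi_joint <- mean_under_shift(d1, d2) - baseline  # shift both

# Interaction: joint effect minus the sum of the individual effects
psi_interaction <- psi_joint - (psi_a1 + psi_a2)
c(a1 = psi_a1, a2 = psi_a2, joint = psi_joint, interaction = psi_interaction)
```

With this linear model the interaction term reduces to the fitted `a1:a2` coefficient times `d1 * d2`, so a value near zero would indicate that the two shifts act additively.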

## Target Levels and Shifting to those Levels 

To use the package, users need to provide the exposures, covariates, and outcome. They also specify `target_outcome_lvl`, the target level for the outcome, and `epsilon`, the allowed closeness to the target. For example, if the target outcome level is 15 and epsilon is 0.5, then interventions that lead to an expected outcome within 0.5 of the target, such as 15.5, are considered. The restriction limit is `hn_trunc_thresh`, which is the allowed distance from the original exposure likelihood. A value of 10, for example, indicates that the likelihood of the intervened exposure level should differ by no more than a factor of 10 from the likelihood of the original exposure level. That is, if an individual's likelihood is originally 0.1 given their covariate history and the likelihood of exposure at the intervened level is 0.01, this is a 10-fold difference and would be the limit for the intervention.
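
The two numeric examples above can be checked with a few lines of base R. The candidate outcome values are made up for illustration, and treating `epsilon` as a symmetric window around the target is an assumption of this sketch.

```r
target_outcome_lvl <- 15
epsilon            <- 0.5
hn_trunc_thresh    <- 10

# Hypothetical expected outcomes under three candidate interventions
expected_outcomes <- c(14.2, 15.5, 16.1)

# Only candidates within epsilon of the target are considered
within_window <- abs(expected_outcomes - target_outcome_lvl) <= epsilon
within_window
#> [1] FALSE  TRUE FALSE

# Likelihood of the observed and intervened exposure levels for one individual
lik_observed   <- 0.1
lik_intervened <- 0.01
density_ratio  <- lik_observed / lik_intervened

# A small tolerance guards against floating point error at the boundary
density_ratio <= hn_trunc_thresh + 1e-8
```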

A detailed guide is provided in the vignette. With these inputs, `IsoXshift` processes the data and delivers tables showcasing fold-specific results and aggregated outcomes, allowing users to glean insights effectively.

`IsoXshift` also incorporates features from the `sl3` package [@coyle-sl3-rpkg], facilitating ensemble machine learning in the estimation process. If the user does not specify any stack parameters, `IsoXshift` will automatically create an ensemble of machine learning algorithms that strike a balance between flexibility and computational efficiency.


---

## Installation

*Note:* Because the `IsoXshift` package (currently) depends on `sl3`, which
allows ensemble machine learning to be used for nuisance parameter
estimation, and `sl3` is not on CRAN, the `IsoXshift` package is not
available on CRAN and must be installed from GitHub.

`IsoXshift` uses the `sl3` package to build ensemble machine learners for each nuisance parameter.
First, install `sl3` from its development branch:


```{r sl3_devel,  eval = FALSE}
remotes::install_github("tlverse/sl3@devel")
```

Make sure `sl3` installs correctly, then install `IsoXshift`:

```{r IsoXshift,  eval = FALSE}
remotes::install_github("blind-contours/IsoXshift@main")
```

---

## Example

To illustrate how `IsoXshift` may be used to ascertain the effect of a mixed exposure, we will use synthetic data from the National Institute of Environmental Health Sciences (NIEHS). Let's first load the relevant packages:

```{r example, warning=FALSE}
library(IsoXshift)
library(devtools)
library(kableExtra)
library(sl3)

seed <- 429153
set.seed(seed)
```

We use synthetic data from the NIEHS that was designed for testing new mixture methods. This data has strong positive and negative marginal effects and certain interactions built in. It is available here: https://github.com/niehs-prime/2015-NIEHS-MIxtures-Workshop

```{r NIEHS example}
data("NIEHS_data_1", package = "IsoXshift")
```

```{r NIEH Nodes}
NIEHS_data_1$W <- rnorm(nrow(NIEHS_data_1), mean = 0, sd = 0.1)
w <- NIEHS_data_1[, c("W", "Z")]
a <- NIEHS_data_1[, c("X1", "X2", "X3", "X4", "X5", "X6", "X7")]
y <- NIEHS_data_1$Y

head(NIEHS_data_1) %>%
  kbl(caption = "NIEHS Data") %>%
  kable_classic(full_width = F, html_font = "Cambria")
```



In this data, X1 and X7 have the most synergy (a super-additive effect), so we might expect to find this pair as the most synergistic exposure relationship based on our definition. It is also possible that the most efficient intervention is one that intervenes on an antagonistic pair, shifting positive associations higher and negative lower in the antagonistic interaction.

```{r run IsoXshift, eval = TRUE, message=FALSE, warning=FALSE}

ptm <- proc.time()
sim_results <- IsoXshift(
  w = w,
  a = a,
  y = y,
  n_folds = 6,
  num_cores = 6,
  outcome_type = "continuous",
  seed = seed,
  target_outcome_lvl = 12,
  epsilon = 0.5
)
proc.time() - ptm

oracle_parameter <- sim_results$`Oracle Pooled Results`
k_fold_results <- sim_results$`K-fold Results`
oracle_targets <- sim_results$`K Fold Oracle Targets`
```

Note that these results will be more consistent with a higher number of folds; here we use 6 so that the README builds more quickly.


## K-fold Specific Results


```{r k fold results}
k_fold_results <- do.call(rbind, k_fold_results)
rownames(k_fold_results) <- NULL

k_fold_results %>%
  kbl(caption = "List of K Fold Results") %>%
  kable_classic(full_width = F, html_font = "Cambria")
```

Here we see that X1-X5 is found in all the folds. This means that, to reach our target outcome of 12 with precision up to 0.5, these two exposures are found to reach the target outcome most efficiently, that is, under minimal intervention.

The Psi column shows the expected change in the outcome under the shift compared to no shift. Type indicates which variable was shifted: X1, X5, X1 and X5 jointly, or the interaction, which compares the joint X1-X5 shift to the sum of the individual X1 and X5 shifts. For example, a Psi of -13.5 for X1 indicates that the outcome decreases by 13.5 when we attempt to shift X1 towards the oracle point parameter, which in this fold is 0.05. So, under a policy where we try to shift X1 towards the value 0.05 without violating positivity support, the outcome goes down by 13.5. The Average Delta columns show the average shift in the fold for each variable, that is, the average shift the data support in moving towards the oracle parameter. For example, the average outcome goes down by 13.5 when X1 is shifted by -0.902 on average towards our target.

Overall, the Average Delta column indicates the average shift away from each individual's observed exposure level that is made in order to move towards the target under the restrictions.

In practice, the Average Delta column shows that, with support from our data, we can only reduce X1 by about 1 and increase X5 by about 2.

## Oracle Point Parameters

The oracle intervention levels identified in each fold are:

```{r oracle targets}
oracle_targets <- do.call(rbind, oracle_targets)
oracle_targets %>%
  kbl(caption = "Oracle Targets") %>%
  kable_classic(full_width = F, html_font = "Cambria")
```

This table shows, for each exposure, the average observed level and the intervened level, which is the level the exposures are set to in order to reach the target outcome most efficiently. Avg Difference is the average difference between the intervention and observed outcome (the "effort"), and Difference is the difference between the expected outcome under intervention and the target outcome.

We see that to reach the target outcome of 12 from an observed average of 53, a substantial reduction, the most efficient intervention is to reduce X1 to around 0.05 and to increase X5 (due to its antagonistic relationship) to about 3.

To get more power, we do a pooled TMLE over our findings for the intervention with minimal effort that gets to our target outcome: 

## Oracle Parameter

```{r top negative results}
oracle_parameter %>%
  kbl(caption = "Pooled Oracle Parameter") %>%
  kable_classic(full_width = F, html_font = "Cambria")
```

This gives pooled estimates for the individual shift of each variable in the relationship, for the joint shift, and for our definition of interaction, which compares the expected outcome under the joint shift to the sum of the expected outcomes under the individual shifts.



Overall, this package finds the intervention that, with minimal effort, gets to a desired outcome in a mixed exposure. It then estimates, using CV-TMLE, a policy intervention that attempts to get a population's exposure as close as possible to this oracle intervention level without violating positivity. 


In this NIEHS data set we correctly identify the most synergistic relationship built into the data. 

More discussion is found in the vignette. 



---

## Issues

If you encounter any bugs or have any specific feature requests,
please [file an
issue](https://github.com/blind-contours/IsoXshift/issues). Further details
on filing issues are provided in our [contribution
guidelines](https://github.com/blind-contours/IsoXshift/main/contributing.md).

---

## Contributions

Contributions are very welcome. Interested contributors should consult our
[contribution
guidelines](https://github.com/blind-contours/IsoXshift/blob/master/CONTRIBUTING.md)
prior to submitting a pull request.

---

## Citation

After using the `IsoXshift` R package, please cite the following:


---

## Related

* [R/`tmle3shift`](https://github.com/tlverse/tmle3shift) - An R package
  providing an independent implementation of the same core routines for the TML
  estimation procedure and statistical methodology as is made available here,
  through reliance on a unified interface for Targeted Learning provided by the
  [`tmle3`](https://github.com/tlverse/tmle3) engine of the [`tlverse`
  ecosystem](https://github.com/tlverse).

* [R/`medshift`](https://github.com/nhejazi/medshift) - An R package providing
  facilities to estimate the causal effect of stochastic treatment regimes in
  the mediation setting, including classical (IPW) and augmented double robust
  (one-step) estimators. This is an implementation of the methodology explored
  by @diaz2020causal.

* [R/`haldensify`](https://github.com/nhejazi/haldensify) - A minimal package
  for estimating the conditional density treatment mechanism component of this
  parameter based on using the [highly adaptive
  lasso](https://github.com/tlverse/hal9001) [@coyle-hal9001-rpkg;
  @hejazi2020hal9001-joss] in combination with a pooled hazard regression. This
  package implements a variant of the approach advocated by @diaz2011super.

---

## Funding

The development of this software was supported in part through
NIH grant P42ES004705 from NIEHS.



---

## License

© 2020-2022 [David B. McCoy](https://davidmccoy.org)

The contents of this repository are distributed under the MIT license. See below
for details:
```
MIT License
Copyright (c) 2020-2022 David B. McCoy
Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
```

---

## References

Owner

  • Name: David McCoy
  • Login: blind-contours
  • Kind: user

GitHub Events

Total
  • Fork event: 1
Last Year
  • Fork event: 1

Dependencies

.github/workflows/draft-pdf.yml actions
  • actions/checkout v3 composite
  • actions/upload-artifact v1 composite
  • openjournals/openjournals-draft-action master composite
.github/workflows/r.yml actions
  • actions/checkout v2 composite
  • r-lib/actions/setup-pandoc v1 composite
  • r-lib/actions/setup-r v2 composite
  • r-lib/actions/setup-tinytex v1 composite
DESCRIPTION cran
  • R >= 2.10 depends
  • MASS * imports
  • Rdpack * imports
  • assertthat * imports
  • cvTools * imports
  • data.table * imports
  • dplyr * imports
  • foreach * imports
  • furrr * imports
  • future * imports
  • ggplot2 * imports
  • haldensify * imports
  • magrittr * imports
  • partykit * imports
  • polspline * imports
  • pracma * imports
  • purrr * imports
  • rlang * imports
  • sl3 * imports
  • stringr * imports
  • kableExtra * suggests
  • knitr * suggests
  • rmarkdown * suggests
  • testthat >= 3.0.0 suggests