https://github.com/aristotle-tek/tidyhte


Fork of ddimmery/tidyhte
Created over 3 years ago · Last pushed over 4 years ago




# tidyhte



[![Lifecycle:
experimental](https://img.shields.io/badge/lifecycle-experimental-orange.svg)](https://lifecycle.r-lib.org/articles/stages.html#experimental)
[![lint](https://github.com/ddimmery/tidyhte/actions/workflows/lint.yaml/badge.svg)](https://github.com/ddimmery/tidyhte/actions/workflows/lint.yaml)
[![codecov](https://codecov.io/gh/ddimmery/tidyhte/branch/main/graph/badge.svg?token=AHT3X4S2KQ)](https://codecov.io/gh/ddimmery/tidyhte)
[![R-CMD-check](https://github.com/ddimmery/tidyhte/actions/workflows/R-CMD-check.yaml/badge.svg)](https://github.com/ddimmery/tidyhte/actions/workflows/R-CMD-check.yaml)
[![CRAN
Status](https://www.r-pkg.org/badges/version/tidyhte)](https://cran.r-project.org/package=tidyhte)
[![License:
MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)
[![DOI](https://zenodo.org/badge/375330850.svg)](https://zenodo.org/badge/latestdoi/375330850)


`tidyhte` provides tidy semantics for estimation of heterogeneous
treatment effects through the use of [Kennedy's (n.d.) doubly-robust
learner](https://arxiv.org/abs/2004.14497).
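
For orientation, the pseudo-outcome at the center of a doubly-robust
learner can be written as below. The notation is introduced here and is
not taken from the package: $A$ is a binary treatment, $Y$ the outcome,
$X$ the covariates, $\hat{\pi}$ an estimated propensity score, and
$\hat{\mu}_a$ an estimated outcome regression under treatment $a$.

$$
\hat{\varphi}(Z) = \frac{A - \hat{\pi}(X)}{\hat{\pi}(X)\,\{1 - \hat{\pi}(X)\}}\,\bigl(Y - \hat{\mu}_A(X)\bigr) + \hat{\mu}_1(X) - \hat{\mu}_0(X)
$$

Conditional effects are then estimated by regressing $\hat{\varphi}(Z)$
on the moderators of interest.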

The goal of `tidyhte` is to use a recipe-style design: an analysis is
described once in a configuration and then applied to data. This is
intended to make it easy to scale an analysis of heterogeneous treatment
effects (HTE) from the common single-outcome / single-moderator case to
many outcomes and a wide array of moderators. It's written to be fairly
easy to extend to different models and to add additional diagnostics and
ways to output information from a set of HTE estimates.

The best place to start learning how to use `tidyhte` is the
vignettes, which run through example analyses from start to finish:
`vignette("experimental_analysis")` and
`vignette("observational_analysis")`. There is also a writeup
summarizing the method and implementation in
`vignette("methodological-details")`.

# Installation

You will be able to install the released version of tidyhte from
[CRAN](https://CRAN.R-project.org) with:

``` r
install.packages("tidyhte")
```

A released version does not yet exist, however. In the meantime, install
the development version from [GitHub](https://github.com/) with:

``` r
# install.packages("devtools")
devtools::install_github("ddimmery/tidyhte")
```

# Setting up a configuration

To set up a simple configuration, it's straightforward to use the Recipe
API:

``` r
library(tidyhte)
library(dplyr)

basic_config() %>%
    add_propensity_score_model("SL.glmnet") %>%
    add_outcome_model("SL.glmnet") %>%
    add_moderator("Stratified", x1, x2) %>%
    add_moderator("KernelSmooth", x3) %>%
    add_vimp(sample_splitting = FALSE) -> hte_cfg
```

The `basic_config` includes a number of defaults: it starts off the
SuperLearner ensembles for both treatment and outcome with linear models
(`"SL.glm"`).

# Running an Analysis

``` r
data %>%
    attach_config(hte_cfg) %>%
    make_splits(userid, .num_splits = 12) %>%
    produce_plugin_estimates(
        outcome_variable,
        treatment_variable,
        covariate1, covariate2, covariate3, covariate4, covariate5, covariate6
    ) %>%
    construct_pseudo_outcomes(outcome_variable, treatment_variable) -> data

data %>%
    estimate_QoI(covariate1, covariate2) -> results
```
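
To make the steps in this pipeline more concrete, here is a minimal
base-R sketch of cross-fitting followed by a doubly-robust pseudo-outcome
calculation. It is an illustration under stated assumptions, not the
package's implementation: the simulated data, the variable names (`a`,
`y`, `x1`, `x2`), the choice of four folds, and the use of `glm` / `lm`
in place of SuperLearner ensembles are all hypothetical.

``` r
set.seed(1)

# Simulated data (hypothetical): binary treatment `a`, outcome `y`,
# covariates `x1` and `x2`; the true conditional effect is 1 + x1.
n  <- 2000
x1 <- rnorm(n)
x2 <- rnorm(n)
a  <- rbinom(n, 1, plogis(0.5 * x2))
y  <- x2 + a * (1 + x1) + rnorm(n)
df <- data.frame(y, a, x1, x2)

# Assign each unit to one of K folds for cross-fitting.
K    <- 4
fold <- sample(rep(seq_len(K), length.out = n))

df$pseudo_outcome <- NA_real_
for (k in seq_len(K)) {
    train   <- df[fold != k, ]
    holdout <- df[fold == k, ]

    # Nuisance models are fit on the other folds only.
    ps_fit <- glm(a ~ x1 + x2, data = train, family = binomial())
    mu_fit <- lm(y ~ a * (x1 + x2), data = train)

    # Predictions for the held-out fold.
    pi_hat  <- predict(ps_fit, newdata = holdout, type = "response")
    mu1_hat <- predict(mu_fit, newdata = transform(holdout, a = 1))
    mu0_hat <- predict(mu_fit, newdata = transform(holdout, a = 0))
    mua_hat <- ifelse(holdout$a == 1, mu1_hat, mu0_hat)

    # Doubly-robust (AIPW-style) pseudo-outcome for the held-out fold.
    df$pseudo_outcome[fold == k] <-
        (holdout$a - pi_hat) / (pi_hat * (1 - pi_hat)) *
        (holdout$y - mua_hat) + mu1_hat - mu0_hat
}

# The mean pseudo-outcome estimates the average treatment effect,
# which is 1 in this simulation.
ate_hat <- mean(df$pseudo_outcome)
```

Fitting the nuisance models on one set of folds and predicting on the
held-out fold keeps each unit's pseudo-outcome independent of the models
used to construct it.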

Getting estimated CATEs for a moderator not included previously just
requires rerunning the final line:

``` r
data %>%
    estimate_QoI(covariate3) -> results
```
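
One way to see why only the last step needs to be rerun: once
pseudo-outcomes exist, a stratified estimate for any discrete moderator
reduces to a per-level summary of them. The sketch below shows this in
base R on hypothetical data. The `pseudo_outcome` column, the simulated
values, and the output layout are assumptions for illustration and do
not reproduce what `estimate_QoI` does internally.

``` r
set.seed(2)

# Hypothetical data that already carries pseudo-outcomes: the conditional
# effect is 0.5 when `covariate3` is "a" and 1.5 when it is "b".
n  <- 4000
df <- data.frame(covariate3 = sample(c("a", "b"), n, replace = TRUE))
df$pseudo_outcome <- ifelse(df$covariate3 == "a", 0.5, 1.5) + rnorm(n, sd = 2)

# Mean pseudo-outcome within each level of the moderator.
est <- tapply(df$pseudo_outcome, df$covariate3, mean)

# Standard error of each mean from the within-level standard deviation.
se <- tapply(
    df$pseudo_outcome,
    df$covariate3,
    function(v) sd(v) / sqrt(length(v))
)

cate <- data.frame(
    level     = names(est),
    estimate  = as.numeric(est),
    std_error = as.numeric(se)
)
print(cate)
```

No nuisance model is refit here, which is why adding a moderator is cheap
compared with the earlier steps.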

Replicating this on a new outcome would be as simple as running the
following, with no reconfiguration necessary.

``` r
data %>%
    attach_config(hte_cfg) %>%
    produce_plugin_estimates(
        second_outcome_variable,
        treatment_variable,
        covariate1, covariate2, covariate3, covariate4, covariate5, covariate6
    ) %>%
    construct_pseudo_outcomes(second_outcome_variable, treatment_variable) %>%
    estimate_QoI(covariate1, covariate2) -> results
```

This makes it easy to chain together analyses across many outcomes:

``` r
library("foreach")

data %>%
    attach_config(hte_cfg) %>%
    make_splits(userid, .num_splits = 12) -> data

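# `list_of_outcomes` is assumed to hold outcome names in a form that
# rlang::as_string() accepts (symbols or strings).
foreach(outcome = list_of_outcomes, .combine = "bind_rows") %do% {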
    data %>%
    produce_plugin_estimates(
        outcome,
        treatment_variable,
        covariate1, covariate2, covariate3, covariate4, covariate5, covariate6
    ) %>%
    construct_pseudo_outcomes(outcome, treatment_variable) %>%
    estimate_QoI(covariate1, covariate2) %>%
    mutate(outcome = rlang::as_string(outcome))
}
```

The function `estimate_QoI` returns results as a tibble, which makes
them easy to manipulate or plot.
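
Because the results are an ordinary tibble, standard `dplyr` verbs apply.
The sketch below uses a hand-made stand-in for the output of
`estimate_QoI`. Its column names (`outcome`, `moderator`, `level`,
`estimate`, `std_error`) and values are assumptions chosen for
illustration, not the package's documented schema.

``` r
library(dplyr)

# Hypothetical stand-in for the output of estimate_QoI(); the column names
# and numbers are made up for this example.
results <- tibble(
    outcome   = rep(c("outcome_a", "outcome_b"), each = 3),
    moderator = "covariate1",
    level     = rep(c("low", "mid", "high"), times = 2),
    estimate  = c(0.12, 0.25, 0.40, -0.05, 0.00, 0.08),
    std_error = c(0.05, 0.05, 0.06, 0.04, 0.04, 0.05)
)

# Add normal-approximation 95% confidence intervals, then keep the
# estimates whose interval excludes zero.
summary_tbl <- results %>%
    mutate(
        ci_low  = estimate - qnorm(0.975) * std_error,
        ci_high = estimate + qnorm(0.975) * std_error
    ) %>%
    filter(ci_low > 0 | ci_high < 0) %>%
    arrange(outcome, desc(estimate))

print(summary_tbl)
```

The same pattern extends to plotting: the interval columns can be passed
directly to a plotting library.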

Owner

  • Name: Andrew Peterson
  • Login: aristotle-tek
  • Kind: user
  • Location: Poitiers/ Nantes
  • Company: University of Poitiers

Data Scientist at Elenchos.ai. University of Poitiers.
