https://github.com/aristotle-tek/tidyhte


Repository

Basic Info

Host: GitHub
Owner: aristotle-tek
License: other
Default Branch: main
Homepage: https://ddimmery.github.io/tidyhte/
Size: 8.43 MB

Statistics

Stars: 0
Watchers: 1
Forks: 0
Open Issues: 0
Releases: 0

Fork of ddimmery/tidyhte

Created over 3 years ago · Last pushed over 4 years ago




# tidyhte



[![Lifecycle:
experimental](https://img.shields.io/badge/lifecycle-experimental-orange.svg)](https://lifecycle.r-lib.org/articles/stages.html#experimental)
[![lint](https://github.com/ddimmery/tidyhte/actions/workflows/lint.yaml/badge.svg)](https://github.com/ddimmery/tidyhte/actions/workflows/lint.yaml)
[![codecov](https://codecov.io/gh/ddimmery/tidyhte/branch/main/graph/badge.svg?token=AHT3X4S2KQ)](https://codecov.io/gh/ddimmery/tidyhte)
[![R-CMD-check](https://github.com/ddimmery/tidyhte/actions/workflows/R-CMD-check.yaml/badge.svg)](https://github.com/ddimmery/tidyhte/actions/workflows/R-CMD-check.yaml)
[![CRAN
Status](https://www.r-pkg.org/badges/version/tidyhte)](https://cran.r-project.org/package=tidyhte)
[![License:
MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)
[![DOI](https://zenodo.org/badge/375330850.svg)](https://zenodo.org/badge/latestdoi/375330850)


`tidyhte` provides tidy semantics for estimation of heterogeneous
treatment effects (HTE) through the use of [Kennedy's (n.d.) doubly-robust
learner](https://arxiv.org/abs/2004.14497).

`tidyhte` follows a recipe design: an analysis is specified once as a
configuration and then applied to data. This should make it easy to
scale an analysis of HTE from the common single-outcome /
single-moderator case to many outcomes and a wide array of moderators.
It's written to be fairly easy to extend to different models and to add
additional diagnostics and ways to output information from a set of HTE
estimates.

The best way to start learning `tidyhte` is with the vignettes, which
run through example analyses from start to finish:
`vignette("experimental_analysis")` and
`vignette("observational_analysis")`. There is also a writeup
summarizing the method and implementation in
`vignette("methodological-details")`.
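
For readers new to the method, here is a minimal base-R sketch of the
doubly-robust (AIPW-style) pseudo-outcome that underlies a DR-learner.
It is not `tidyhte`'s implementation: the simulated data, the `glm` /
`lm` nuisance models, and every variable name are assumptions made for
illustration, and the nuisance models are fit on the full sample rather
than cross-fit, purely for brevity.

``` r
# Illustrative sketch only: simulated data, simple parametric nuisance models,
# and no sample splitting. None of this is taken from tidyhte internals.
set.seed(1)
n <- 2000
x <- rnorm(n)
a <- rbinom(n, 1, plogis(0.5 * x))          # binary treatment
y <- 1 + x + a * (1 + 0.5 * x) + rnorm(n)   # true CATE is 1 + 0.5 * x
df <- data.frame(y = y, a = a, x = x)

# Nuisance models: propensity score and outcome regression
ps_fit <- glm(a ~ x, family = binomial(), data = df)
mu_fit <- lm(y ~ a * x, data = df)

pi_hat  <- predict(ps_fit, type = "response")
mu1_hat <- predict(mu_fit, newdata = transform(df, a = 1))
mu0_hat <- predict(mu_fit, newdata = transform(df, a = 0))
mua_hat <- ifelse(df$a == 1, mu1_hat, mu0_hat)

# Doubly-robust pseudo-outcome: plug-in contrast plus a weighted residual
df$phi <- mu1_hat - mu0_hat +
    (df$a - pi_hat) / (pi_hat * (1 - pi_hat)) * (df$y - mua_hat)

# Second stage: regress the pseudo-outcome on the moderator of interest
cate_fit <- lm(phi ~ x, data = df)
coef(cate_fit)
```

In this simulation the second-stage coefficients should land near the
true intercept and slope of the conditional effect, which is the sense
in which the pseudo-outcome can be treated as a noisy stand-in for the
individual-level effect.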

# Installation

You will be able to install the released version of tidyhte from
[CRAN](https://CRAN.R-project.org) with:

``` r
install.packages("tidyhte")
```

The CRAN release does not exist yet. In the meantime, install the
development version from [GitHub](https://github.com/) with:

``` r
# install.packages("devtools")
devtools::install_github("ddimmery/tidyhte")
```

# Setting up a configuration

To set up a simple configuration, it's straightforward to use the Recipe
API:

``` r
library(tidyhte)
library(dplyr)

basic_config() %>%
    add_propensity_score_model("SL.glmnet") %>%
    add_outcome_model("SL.glmnet") %>%
    add_moderator("Stratified", x1, x2) %>%
    add_moderator("KernelSmooth", x3) %>%
    add_vimp(sample_splitting = FALSE) -> hte_cfg
```

`basic_config()` includes a number of defaults: it starts off the
SuperLearner ensembles for both the treatment and outcome models with
linear models (`"SL.glm"`).

# Running an analysis

``` r
data %>%
    attach_config(hte_cfg) %>%
    make_splits(userid, .num_splits = 12) %>%
    produce_plugin_estimates(
        outcome_variable,
        treatment_variable,
        covariate1, covariate2, covariate3, covariate4, covariate5, covariate6
    ) %>%
    construct_pseudo_outcomes(outcome_variable, treatment_variable) -> data

data %>%
    estimate_QoI(covariate1, covariate2) -> results
```

To get estimated CATEs for a moderator not included previously, just
rerun the final `estimate_QoI` step:

``` r
data %>%
    estimate_QoI(covariate3) -> results
```
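
The `make_splits(userid, .num_splits = 12)` step used earlier assigns
observations to splits based on an identifier column. The following is a
minimal base-R sketch of what identifier-level fold assignment could
look like, assuming the intent is that all rows sharing a `userid` land
in the same split. The `toy` data and the `split_id` column name are
hypothetical, and this is not the package's actual implementation.

``` r
# Sketch of identifier-level fold assignment (an assumption about intent,
# not a copy of what make_splits() does internally).
set.seed(42)
toy <- data.frame(
    userid = rep(1:30, each = 3),   # hypothetical identifier, 3 rows per user
    y = rnorm(90)
)
num_splits <- 12

# Draw one fold label per unique user, roughly balanced across folds.
ids <- unique(toy$userid)
fold_of_id <- sample(rep_len(seq_len(num_splits), length(ids)))
names(fold_of_id) <- ids

# Map each row to the fold of its user.
toy$split_id <- unname(fold_of_id[as.character(toy$userid)])

# Check: every user appears in exactly one fold.
folds_per_user <- tapply(toy$split_id, toy$userid, function(s) length(unique(s)))
all(folds_per_user == 1)
```

Splitting at the identifier level rather than the row level keeps
repeated observations of the same unit from appearing on both sides of a
split.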

Replicating this on a new outcome would be as simple as running the
following, with no reconfiguration necessary.

``` r
data %>%
    attach_config(hte_cfg) %>%
    produce_plugin_estimates(
        second_outcome_variable,
        treatment_variable,
        covariate1, covariate2, covariate3, covariate4, covariate5, covariate6
    ) %>%
    construct_pseudo_outcomes(second_outcome_variable, treatment_variable) %>%
    estimate_QoI(covariate1, covariate2) -> results
```

This makes it easy to chain together analyses across many outcomes:

``` r
library("foreach")

data %>%
    attach_config(hte_cfg) %>%
    make_splits(userid, .num_splits = 12) -> data

foreach(outcome = list_of_outcomes, .combine = "bind_rows") %do% {
    data %>%
    produce_plugin_estimates(
        outcome,
        treatment_variable,
        covariate1, covariate2, covariate3, covariate4, covariate5, covariate6
    ) %>%
    construct_pseudo_outcomes(outcome, treatment_variable) %>%
    estimate_QoI(covariate1, covariate2) %>%
    mutate(outcome = rlang::as_string(outcome))
}
```

The function `estimate_QoI` returns results as a tibble, which makes it
easy to manipulate or plot them.
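
As a rough illustration of that kind of manipulation, the sketch below
builds a small stand-in tibble by hand and adds normal-approximation
confidence intervals with `dplyr`. The column names (`term`, `level`,
`estimate`, `std_error`) and all values are assumptions chosen for the
example; check the columns of your own `results` object before adapting
it.

``` r
library(dplyr)

# Hypothetical stand-in for `results`; column names and values are assumed,
# not taken from actual estimate_QoI() output.
results <- tibble(
    term = c("covariate1", "covariate1", "covariate2", "covariate2"),
    level = c("a", "b", "a", "b"),
    estimate = c(0.10, 0.25, -0.05, 0.40),
    std_error = c(0.04, 0.05, 0.06, 0.10)
)

# Add 95% normal-approximation intervals and sort by effect magnitude.
results_ci <- results %>%
    mutate(
        ci_low = estimate - qnorm(0.975) * std_error,
        ci_high = estimate + qnorm(0.975) * std_error,
        excludes_zero = ci_low > 0 | ci_high < 0
    ) %>%
    arrange(desc(abs(estimate)))

results_ci
```

Because the result is an ordinary data frame, the same object can be
passed on to a plotting library or written to disk without conversion.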

Owner

Name: Andrew Peterson
Login: aristotle-tek
Kind: user
Location: Poitiers/ Nantes
Company: University of Poitiers
