https://github.com/aristotle-tek/tidyhte
Repository
Basic Info
- Host: GitHub
- Owner: aristotle-tek
- License: other
- Default Branch: main
- Homepage: https://ddimmery.github.io/tidyhte/
- Size: 8.43 MB
Statistics
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
- Releases: 0
Fork of ddimmery/tidyhte
Created over 3 years ago
· Last pushed over 4 years ago
# tidyhte
[Lifecycle: experimental](https://lifecycle.r-lib.org/articles/stages.html#experimental)
[lint](https://github.com/ddimmery/tidyhte/actions/workflows/lint.yaml)
[codecov](https://codecov.io/gh/ddimmery/tidyhte)
[R-CMD-check](https://github.com/ddimmery/tidyhte/actions/workflows/R-CMD-check.yaml)
[CRAN](https://cran.r-project.org/package=tidyhte)
[License: MIT](https://opensource.org/licenses/MIT)
[DOI](https://zenodo.org/badge/latestdoi/375330850)
`tidyhte` provides tidy semantics for estimation of heterogeneous
treatment effects through the use of [Kennedy's (n.d.) doubly-robust
learner](https://arxiv.org/abs/2004.14497).
`tidyhte` follows a recipe design: an analysis is configured once and
the same configuration is then applied to the data. This should
(hopefully) make it easy to scale an analysis of HTE from the common
single-outcome / single-moderator case to many outcomes and a wide
array of moderators. It's written to be fairly easy to extend to
different models and to add additional diagnostics and ways to output
information from a set of HTE estimates.
The best place to start learning how to use `tidyhte` is the
vignettes, which run through example analyses from start to finish:
`vignette("experimental_analysis")` and
`vignette("observational_analysis")`. There is also a writeup
summarizing the method and implementation in
`vignette("methodological-details")`.
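For orientation, the core idea of a doubly-robust learner can be
sketched by hand in base R. This is only an illustration on simulated
data, not `tidyhte`'s implementation: it fits the nuisance models with
`glm` and `lm` rather than SuperLearner ensembles and skips sample
splitting entirely. All variable names and the data-generating process
are hypothetical.

``` r
# Hand-rolled doubly-robust learner sketch (base R only, no cross-fitting).
set.seed(1)
n <- 2000
x <- rnorm(n)                              # single covariate, also the moderator
a <- rbinom(n, 1, plogis(0.5 * x))         # binary treatment
y <- 1 + x + a * (1 + 0.5 * x) + rnorm(n)  # true CATE is 1 + 0.5 * x
d <- data.frame(x = x, a = a, y = y)

# Nuisance models: propensity score and one outcome regression per arm.
ps_fit <- glm(a ~ x, family = binomial(), data = d)
mu1_fit <- lm(y ~ x, data = d[d$a == 1, ])
mu0_fit <- lm(y ~ x, data = d[d$a == 0, ])

pi_hat <- predict(ps_fit, newdata = d, type = "response")
mu1_hat <- predict(mu1_fit, newdata = d)
mu0_hat <- predict(mu0_fit, newdata = d)
mu_a_hat <- ifelse(d$a == 1, mu1_hat, mu0_hat)

# Doubly-robust pseudo-outcome: plug-in contrast plus a weighted residual.
pseudo <- (d$a - pi_hat) / (pi_hat * (1 - pi_hat)) * (d$y - mu_a_hat) +
    mu1_hat - mu0_hat

# Second stage: regress the pseudo-outcome on the moderator.
cate_fit <- lm(pseudo ~ x, data = d)
coef(cate_fit)
```

The second-stage coefficients should land near the simulated intercept
and slope of the CATE; in a real analysis the nuisance models would be
fit on held-out folds, and more flexible learners would replace the
linear fits.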
# Installation
You will be able to install the released version of tidyhte from
[CRAN](https://CRAN.R-project.org) with:
``` r
install.packages("tidyhte")
```
The package is not yet available on CRAN. In the meantime, install the
development version from [GitHub](https://github.com/) with:
``` r
# install.packages("devtools")
devtools::install_github("ddimmery/tidyhte")
```
# Setting up a configuration
To set up a simple configuration, it's straightforward to use the Recipe
API:
``` r
library(tidyhte)
library(dplyr)

basic_config() %>%
    add_propensity_score_model("SL.glmnet") %>%
    add_outcome_model("SL.glmnet") %>%
    add_moderator("Stratified", x1, x2) %>%
    add_moderator("KernelSmooth", x3) %>%
    add_vimp(sample_splitting = FALSE) -> hte_cfg
```
`basic_config()` includes a number of defaults: it starts off the
SuperLearner ensembles for both treatment and outcome with linear models
(`"SL.glm"`).
# Running an Analysis
``` r
# Attach the configuration, create splits, and fit the nuisance models.
data %>%
    attach_config(hte_cfg) %>%
    make_splits(userid, .num_splits = 12) %>%
    produce_plugin_estimates(
        outcome_variable,
        treatment_variable,
        covariate1, covariate2, covariate3, covariate4, covariate5, covariate6
    ) %>%
    construct_pseudo_outcomes(outcome_variable, treatment_variable) -> data

# Estimate the quantities of interest for the chosen moderators.
data %>%
    estimate_QoI(covariate1, covariate2) -> results
```
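The pipeline above assumes a data frame named `data` with a unit
identifier, a treatment, an outcome, and six covariates. As a minimal
sketch for trying the calls, such an input could be simulated before
running them. The column names mirror those used above; the sample size,
distributions, effect sizes, and the binary coding of the treatment are
hypothetical choices.

``` r
# Simulated stand-in for `data` (base R only; all values are made up).
set.seed(42)
n <- 500

data <- data.frame(
    userid = seq_len(n),                      # unit identifier
    covariate1 = rnorm(n),
    covariate2 = rnorm(n),
    covariate3 = runif(n),
    covariate4 = rbinom(n, 1, 0.3),
    covariate5 = rnorm(n),
    covariate6 = rnorm(n),
    treatment_variable = rbinom(n, 1, 0.5)    # randomized binary treatment
)

# Outcomes whose treatment effect varies with covariate1.
data$outcome_variable <- with(
    data,
    covariate1 + treatment_variable * (1 + 0.5 * covariate1) + rnorm(n)
)
data$second_outcome_variable <- with(
    data,
    covariate2 + treatment_variable * 0.25 + rnorm(n)
)

str(data)
```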
Estimating CATEs for a moderator not included previously only requires
rerunning the final line:
``` r
data %>%
    estimate_QoI(covariate3) -> results
```
Replicating this on a new outcome would be as simple as running the
following, with no reconfiguration necessary.
``` r
data %>%
    attach_config(hte_cfg) %>%
    produce_plugin_estimates(
        second_outcome_variable,
        treatment_variable,
        covariate1, covariate2, covariate3, covariate4, covariate5, covariate6
    ) %>%
    construct_pseudo_outcomes(second_outcome_variable, treatment_variable) %>%
    estimate_QoI(covariate1, covariate2) -> results
```
This makes it easy to chain together analyses across many outcomes:
``` r
library("foreach")

# Attach the configuration and create the splits once, up front.
data %>%
    attach_config(hte_cfg) %>%
    make_splits(userid, .num_splits = 12) -> data

# Run the same analysis for each outcome and stack the results.
foreach(outcome = list_of_outcomes, .combine = "bind_rows") %do% {
    data %>%
        produce_plugin_estimates(
            outcome,
            treatment_variable,
            covariate1, covariate2, covariate3, covariate4, covariate5, covariate6
        ) %>%
        construct_pseudo_outcomes(outcome, treatment_variable) %>%
        estimate_QoI(covariate1, covariate2) %>%
        mutate(outcome = rlang::as_string(outcome))
}
```
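The loop above does not show how `list_of_outcomes` is built. Because it
calls `rlang::as_string(outcome)`, the elements are presumably symbols
rather than character strings. One way to construct such a list, assuming
the two outcome column names used earlier, might be:

``` r
library(rlang)

# Hypothetical outcome names; replace with the columns in your own data.
outcome_names <- c("outcome_variable", "second_outcome_variable")

# syms() turns a character vector into a list of symbols.
list_of_outcomes <- syms(outcome_names)

# as_string() recovers the column name from each symbol.
vapply(list_of_outcomes, as_string, character(1))
```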
The function `estimate_QoI` returns results as a tibble, which makes
them easy to manipulate or plot.
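As a rough illustration of that kind of manipulation, the sketch below
filters a results table and adds normal-approximation confidence
intervals with `dplyr`. The table is a hypothetical stand-in, not real
output: the column names (`estimand`, `term`, `level`, `estimate`,
`std_error`), the estimand labels, and every value are assumptions, so
check them against what `estimate_QoI` actually returns.

``` r
library(dplyr)

# Hypothetical stand-in for estimate_QoI() output; names and values are made up.
results <- tibble::tibble(
    estimand = c("MCATE", "MCATE", "SATE"),
    term = c("covariate1", "covariate1", NA),
    level = c("low", "high", NA),
    estimate = c(0.10, 0.35, 0.22),
    std_error = c(0.05, 0.06, 0.03)
)

# Keep the moderator-level estimates and add 95% confidence bounds.
mcate <- results %>%
    filter(estimand == "MCATE") %>%
    mutate(
        ci_low = estimate - qnorm(0.975) * std_error,
        ci_high = estimate + qnorm(0.975) * std_error
    )

mcate
```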
Owner
- Name: Andrew Peterson
- Login: aristotle-tek
- Kind: user
- Location: Poitiers/ Nantes
- Company: University of Poitiers
- Website: https://elenchos.ai
- Twitter: andrew_nyu
- Repositories: 25
- Profile: https://github.com/aristotle-tek
Data Scientist at Elenchos.ai. University of Poitiers.