tehtuner
tehtuner: An R package to fit and tune models for the conditional average treatment effect - Published in JOSS (2023)
✓JOSS paper metadata
Published in Journal of Open Source Software
Keywords
clinical-trials
heterogeneity-of-treatment-effect
r
subgroup-identification
Repository
An R Package to Fit and Tune Models to Detect Treatment Effect Heterogeneity
Basic Info
Statistics
- Stars: 5
- Watchers: 1
- Forks: 0
- Open Issues: 0
- Releases: 3
Created about 4 years ago
· Last pushed 9 months ago
Metadata Files
Readme
Changelog
Contributing
License
Code of conduct
README.Rmd
---
output: github_document
editor_options:
  chunk_output_type: console
---
```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  fig.path = "man/figures/README-",
  out.width = "100%"
)
library(ggplot2)
library(ggtext)
library(magrittr)
library(kableExtra)
library(Cairo)
library(rpart.plot)
```
# tehtuner
[CRAN](https://CRAN.R-project.org/package=tehtuner)
[DOI](https://doi.org/10.21105/joss.05453)
[GitHub Actions](https://github.com/jackmwolf/tehtuner/actions)
The goal of `tehtuner` is to fit models that detect and describe
treatment effect heterogeneity (TEH) while controlling the Type-I error rate of falsely
detecting a differential effect when the conditional average treatment effect is
uniform across the study population.
Currently `tehtuner` supports Virtual Twins models (Foster
et al., 2011) for detecting TEH, tuned with the permutation procedure proposed by Wolf et al. (2022).
Virtual Twins is a two-step approach to detecting differential treatment
effects. Subjects' conditional average treatment effects (CATEs) are first
estimated in Step 1 using a flexible model. Then, a simple and interpretable
model is fit in Step 2 to model these estimated CATEs as a function of the
covariates.
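The two steps can be sketched with base R and `rpart` alone. This is a minimal illustration, not the code `tehtuner` runs internally: the simulated data, the variable names (`x1`, `x2`, `trt`, `y`), and the use of a linear model with interactions as the Step 1 learner are all assumptions made for the example.

```r
library(rpart)  # recommended package bundled with standard R distributions

set.seed(1)
n   <- 500
dat <- data.frame(
  x1  = rnorm(n),
  x2  = rnorm(n),
  trt = rbinom(n, 1, 0.5)
)
# Hypothetical data-generating model: treatment only helps when x1 > 0
dat$y <- 1 + dat$x2 + dat$trt * 2 * (dat$x1 > 0) + rnorm(n)

# Step 1: fit an outcome model, then predict every subject's outcome under
# both arms; the difference is that subject's estimated CATE.
# (lm with interactions is a simple stand-in for a flexible learner.)
step1    <- lm(y ~ trt * (x1 + x2), data = dat)
cate_hat <- predict(step1, transform(dat, trt = 1)) -
  predict(step1, transform(dat, trt = 0))

# Step 2: model the estimated CATEs with an interpretable regression tree
step2_dat <- data.frame(cate_hat = cate_hat, dat[c("x1", "x2")])
step2     <- rpart(cate_hat ~ x1 + x2, data = step2_dat)
step2
```

In practice `tunevt()` handles both steps, along with the choice of the Step 2 tuning parameter described next.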
The Step 2 model depends on a tuning parameter. This parameter is
selected to control the Type-I error rate by permuting the data under the
null hypothesis of a constant treatment effect and identifying the minimal
null penalty parameter (MNPP), which is the smallest penalty parameter that
yields a Step 2 model with no covariate effects. The $1-\alpha$ quantile
of the null distribution of the MNPP is then used as the penalty parameter
to fit the Step 2 model on the original data.
In doing so, the Type-I error rate is controlled at $\alpha$.
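The sketch below illustrates the idea for a regression tree, under two stated assumptions. First, the MNPP of an `rpart` tree is taken to be the complexity parameter in the first row of the cp table of a fully grown tree; the package may compute it differently. Second, the null distribution is approximated by shuffling the estimated CATEs against the covariates, which is a simplification of the permutation procedure in Wolf et al. (2022). The helper `mnpp_rtree()` and the simulated data are hypothetical.

```r
library(rpart)

# Smallest complexity parameter at which the pruned tree has no splits
# (assumption: read from the first row of the cp table of a full tree).
mnpp_rtree <- function(z, covariates) {
  fit <- rpart(
    z ~ ., data = data.frame(z = z, covariates),
    control = rpart.control(cp = 0, xval = 0)
  )
  unname(fit$cptable[1, "CP"])
}

set.seed(2)
n    <- 300
covs <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
# Hypothetical estimated CATEs that do depend on x1
z <- 1 * (covs$x1 > 0) + rnorm(n, sd = 0.2)

# Observed MNPP and a simplified null distribution (shuffled CATEs)
mnpp_obs   <- mnpp_rtree(z, covs)
theta_null <- replicate(100, mnpp_rtree(sample(z), covs))

# The 1 - alpha quantile of the null MNPPs becomes the penalty parameter
alpha      <- 0.2
theta_crit <- unname(quantile(theta_null, 1 - alpha))

final <- rpart(
  z ~ ., data = data.frame(z = z, covs),
  control = rpart.control(cp = theta_crit)
)
final
```

Because the simulated CATEs vary with `x1`, the observed MNPP should exceed the critical value and the final tree should keep at least one split. Under a constant treatment effect, the tree would reduce to a single node in roughly a fraction $1-\alpha$ of data sets.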
## Installation
`tehtuner` is available on [CRAN](https://CRAN.R-project.org); you can download the release version with:
``` r
install.packages("tehtuner")
```
You can download the development version from [GitHub](https://github.com/) with:
``` r
# install.packages("devtools")
devtools::install_github("jackmwolf/tehtuner")
```
## Example
We consider simulated data from a small clinical trial with 1000 subjects.
Each subject has 10 measured covariates, 8 continuous and 2 binary.
We are interested in estimating and understanding the CATE through Virtual Twins.
```{r}
library(tehtuner)
data("tehtuner_example")
```
We will consider a Virtual Twins model using a random forest to estimate the CATEs in Step 1 and then fitting a regression tree on the estimated CATEs in Step 2 with the Type-I error rate set at $\alpha = 0.2$.
```{r cache=TRUE}
set.seed(100)
vt_cate <- tunevt(
  data = tehtuner_example, Y = "Y", Trt = "Trt", step1 = "randomforest",
  step2 = "rtree", alpha0 = 0.2, p_reps = 100, ntree = 50
)
vt_cate
```
The fitted Step 2 model can be accessed via `$vtmod`.
In this case, as we used a regression tree in Step 2, our final model is of class `rpart` (see `?rpart::rpart.object`).
```{r dev='CairoPNG', warning = FALSE}
vt_cate$vtmod
rpart.plot::rpart.plot(vt_cate$vtmod, digits = -2)
```
The fitted model for the CATE is a function of the covariates (`V1` and `V3`), so we would conclude that there is treatment effect heterogeneity at the 20% level.
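The covariates used by a regression tree can also be pulled out programmatically rather than read off the plot. The sketch below uses a small simulated data set and a stand-in tree (the names `step2_dat`, `tree`, and `split_vars` are hypothetical); the same expression could be applied to `vt_cate$vtmod` when Step 2 is a regression tree.

```r
library(rpart)

set.seed(3)
n <- 400
step2_dat <- data.frame(V1 = rnorm(n), V2 = rnorm(n), V3 = rnorm(n))
# Hypothetical estimated CATEs that depend on V1 and V3 but not on V2
step2_dat$cate_hat <- 1 * (step2_dat$V1 > 0) +
  0.5 * (step2_dat$V3 > 0) + rnorm(n, sd = 0.1)

tree <- rpart(cate_hat ~ V1 + V2 + V3, data = step2_dat)

# Each row of tree$frame is a node; `var` holds its split variable,
# or "<leaf>" for terminal nodes.
split_vars <- setdiff(unique(as.character(tree$frame$var)), "<leaf>")
split_vars

# A tree with no split variables corresponds to a constant CATE
has_teh <- length(split_vars) > 0
has_teh
```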
We can also look at the null distribution of the MNPP through `vt_cate$theta_null`.
The 80th percentile of $\hat\theta$ under the null hypothesis is
```{r}
quantile(vt_cate$theta_null, 0.8)
```
while the MNPP of our observed data is
```{r}
vt_cate$mnpp
```
The procedure fit the Step 2 model using the 80th percentile of the null distribution as the penalty parameter. Since the observed MNPP was above this value, the resulting model included covariates.
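The comparison behind this conclusion can be written out directly. The numbers below are made-up stand-ins for `vt_cate$theta_null` and `vt_cate$mnpp`, chosen only to show the logic; they are not output from the example above.

```r
# Hypothetical null MNPPs and observed MNPP (not real package output)
theta_null <- c(0.02, 0.05, 0.03, 0.08, 0.04, 0.06, 0.01, 0.07, 0.09, 0.10)
mnpp_obs   <- 0.12
alpha0     <- 0.2

# Critical value: the 1 - alpha0 quantile of the null MNPPs
theta_crit <- unname(quantile(theta_null, 1 - alpha0))

# The Step 2 model keeps covariates only if the observed MNPP exceeds
# the critical value used as the penalty parameter.
detects_teh <- mnpp_obs > theta_crit
detects_teh
```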
```{r mnpp_plot, dev='CairoPNG', echo = FALSE}
ggplot(mapping = aes(x = vt_cate$theta_null, y = after_stat(density))) +
  geom_histogram(color = "black", fill = "white", binwidth = 0.025) +
  theme_minimal() +
  scale_y_continuous(expand = expansion(mult = c(0, 0.1))) +
  # Dashed lines: observed MNPP (blue) and the 80th percentile of the null MNPPs (orange)
  geom_vline(
    aes(xintercept = c(vt_cate$mnpp, quantile(vt_cate$theta_null, 0.8))),
    color = c("#0072B2", "#D55E00"),
    linewidth = 2,
    linetype = 2
  ) +
  labs(
    y = "Density",
    x = expression(hat(theta)),
    title = expression("Sampling distribution of" ~ hat(theta) ~ "under" ~ H[0]),
    subtitle = paste0(
      "80th quantile (critical value): ",
      formatC(quantile(vt_cate$theta_null, 0.8), digits = 2, format = "f"),
      "; ",
      # digits must be named: the second positional argument of formatC() is width
      "Observed MNPP: ", formatC(vt_cate$mnpp, digits = 2, format = "f")
    )
  ) +
  theme(
    plot.subtitle = element_markdown(size = 12)
  )
```
### Running in Parallel
Version `0.2.0` added the `parallel` option to `tunevt()`, which lets the user run the permutation procedure in parallel to reduce computation time.
Before doing so, you must register a parallel backend; see `?foreach::foreach` for more information.
For example, to carry out 100 permutations across 2 processors:
```{r eval = FALSE}
cl <- parallel::makeCluster(2)
doParallel::registerDoParallel(cl)
vt_cate_parallel <- tunevt(
  data = tehtuner_example, Y = "Y", Trt = "Trt", step1 = "randomforest",
  step2 = "rtree", alpha0 = 0.2, p_reps = 100, ntree = 50, parallel = TRUE
)
parallel::stopCluster(cl)
```
## References
- Foster, J. C., Taylor, J. M., & Ruberg, S. J. (2011). Subgroup identification from randomized clinical trial data. _Statistics in Medicine, 30_(24), 2867–2880. https://doi.org/10.1002/sim.4322
- Wolf, J. M., Koopmeiners, J. S., & Vock, D. M. (2022). A permutation procedure to detect heterogeneous treatment effects in randomized clinical trials while controlling the type-I error rate. _Clinical Trials, 19_(5). https://doi.org/10.1177/17407745221095855
- Deng, C., Wolf, J. M., Vock, D. M., Carroll, D. M., Hatsukami, D. K., Leng, N., & Koopmeiners, J. S. (2023). Practical guidance on modeling choices for the virtual twins method. _Journal of Biopharmaceutical Statistics_. https://doi.org/10.1080/10543406.2023.2170404
- Wolf, J. M. (2023). tehtuner: An R package to fit and tune models for the conditional average treatment effect. _Journal of Open Source Software, 8_(86), 5453. https://doi.org/10.21105/joss.05453
Owner
- Name: Jack M Wolf
- Login: jackmwolf
- Kind: user
- Company: University of Minnesota Biostatistics
- Website: jackmwolf.rbind.io
- Twitter: _jackmwolf
- Repositories: 4
- Profile: https://github.com/jackmwolf
he/him \\ Biostatistics PhD Student
JOSS Publication
tehtuner: An R package to fit and tune models for the conditional average treatment effect
Published
June 26, 2023
Volume 8, Issue 86, Page 5453
Tags
causal inference, clinical trials
GitHub Events
Total
- Watch event: 1
- Push event: 1
Last Year
- Watch event: 1
- Push event: 1
Issues and Pull Requests
Last synced: 4 months ago
All Time
- Total issues: 8
- Total pull requests: 5
- Average time to close issues: 22 days
- Average time to close pull requests: about 3 hours
- Total issue authors: 2
- Total pull request authors: 1
- Average comments per issue: 0.75
- Average comments per pull request: 0.0
- Merged pull requests: 5
- Bot issues: 0
- Bot pull requests: 0
Past Year
- Issues: 0
- Pull requests: 0
- Average time to close issues: N/A
- Average time to close pull requests: N/A
- Issue authors: 0
- Pull request authors: 0
- Average comments per issue: 0
- Average comments per pull request: 0
- Merged pull requests: 0
- Bot issues: 0
- Bot pull requests: 0
Top Authors
Issue Authors
- jackmwolf (7)
- elimillera (1)
Pull Request Authors
- jackmwolf (5)
Top Labels
Issue Labels
enhancement (3)
documentation (2)
bug (1)
Pull Request Labels
Packages
- Total packages: 1
- Total downloads:
  - cran: 184 last month
- Total dependent packages: 0
- Total dependent repositories: 0
- Total versions: 2
- Total maintainers: 1
cran.r-project.org: tehtuner
Fit and Tune Models to Detect Treatment Effect Heterogeneity
- Homepage: https://github.com/jackmwolf/tehtuner
- Documentation: http://cran.r-project.org/web/packages/tehtuner/tehtuner.pdf
- License: GPL (≥ 3)
- Latest release: 0.3.0 (published almost 3 years ago)
Rankings
Forks count: 28.8%
Dependent packages count: 29.8%
Stargazers count: 31.7%
Dependent repos count: 35.5%
Average: 42.5%
Downloads: 86.8%
Maintainers (1)
Last synced:
4 months ago
Dependencies
DESCRIPTION
cran
- R >= 3.5.0 depends
- Rdpack * imports
- SuperLearner * imports
- earth * imports
- glmnet * imports
- party * imports
- randomForestSRC * imports
- rpart * imports
- stringr * imports
- spelling * suggests
- testthat >= 3.0.0 suggests
.github/workflows/check-standard.yaml
actions
- actions/checkout v3 composite
- r-lib/actions/check-r-package v2 composite
- r-lib/actions/setup-pandoc v2 composite
- r-lib/actions/setup-r v2 composite
- r-lib/actions/setup-r-dependencies v2 composite
