https://github.com/bdwilliamson/lvimp
Perform Inference on Summaries of Longitudinal Algorithm-Agnostic Variable Importance
Science Score: 10.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ○ CITATION.cff file
- ○ codemeta.json file
- ○ .zenodo.json file
- ○ DOI references
- ✓ Academic publication links (links to: arxiv.org)
- ○ Academic email domains
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: low similarity (11.8%) to scientific vocabulary
Repository
Perform Inference on Summaries of Longitudinal Algorithm-Agnostic Variable Importance
Basic Info
- Host: GitHub
- Owner: bdwilliamson
- License: other
- Language: R
- Default Branch: main
- Size: 53.7 KB
Statistics
- Stars: 1
- Watchers: 2
- Forks: 0
- Open Issues: 0
- Releases: 0
Metadata Files
README.md
R/lvimp: inference on longitudinal summaries of algorithm-agnostic variable importance
Software author: Brian Williamson
Methodology authors: Brian Williamson, Erica Moodie, and Susan Shortreed
Introduction
In prediction settings where data are collected over time, it is often of interest to understand both the importance of variables for predicting the response at each time point and the importance summarized over the time series. Building on recent advances in estimation and inference for variable importance measures (specifically, the vimp package), we define summaries of variable importance trajectories. These measures can be estimated and the same approaches for inference can be applied regardless of the choice of the algorithm(s) used to estimate the prediction function. This package provides functions that, given fitted values from prediction algorithms, compute algorithm-agnostic estimates that summarize population variable importance over time.
More detail may be found in our paper.
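To make the idea of summarizing an importance trajectory concrete, the following is a minimal base-R sketch of three such summaries (average, linear trend, and area under the trajectory curve) computed from point estimates alone. The numbers in `vim_est` are hypothetical, the variable names are illustrative, and reading "linear trend" as a least-squares slope and AUTC as a trapezoidal-rule area are assumptions. This is not the lvimp implementation and it performs no inference.

```r
# Hypothetical variable importance point estimates at three time points
# (illustrative numbers only, not output from lvimp)
vim_est <- c(0.33, 0.28, 0.24)
timepoints <- c(0, 1, 2)

# Average importance over the time series
vim_average <- mean(vim_est)

# Linear trend, taken here as the slope of the least-squares line of
# importance on time (an assumption about what "trend" means)
trend_fit <- stats::lm(vim_est ~ timepoints)
vim_slope <- unname(stats::coef(trend_fit)["timepoints"])

# Area under the trajectory curve (AUTC), approximated by the trapezoidal rule
vim_autc <- sum(diff(timepoints) * (head(vim_est, -1) + tail(vim_est, -1)) / 2)
```

The package's own functions additionally provide inference on such summaries, which this sketch does not attempt.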
Issues
If you encounter any bugs or have any specific feature requests, please file an issue.
R installation
You may install a development release of lvimp from GitHub via pak by running the following code:
```r
# install.packages("pak")  # if pak is not yet installed
pak::pkg_install("bdwilliamson/lvimp")
```
Example
This example shows how to use lvimp in a simple setting with simulated data.
```r
# load required functions and packages
library("vimp")
library("SuperLearner")
library("lvimp")

# generate some data from a simple setting -------------------------------------
set.seed(4747)
p <- 2
n <- 5e4
T <- 3
timepoints <- seq_len(T) - 1
beta_01 <- rep(1, T)
beta_02 <- 1 + timepoints / 4
beta_0 <- lapply(as.list(seq_len(T)), function(t) {
  matrix(c(beta_01[t], beta_02[t]))
})
# generate 2 covariates
x <- lapply(as.list(1:T), function(t) {
  as.data.frame(replicate(p, stats::rnorm(n, 0, 1)))
})
# apply the function to the x's
y <- lapply(as.list(1:T), function(t) {
  as.matrix(x[[t]]) %*% beta_0[[t]] + rnorm(n, 0, 1)
})
# "true" outcome variance
true_var <- unlist(lapply(as.list(1:T), function(t) {
  mean((y[[t]] - mean(y[[t]])) ^ 2)
}))
# note that true difference in R-squareds for variable j, under independence, is
# beta_j^2 * var(x_j) / var(y)
mse_one <- unlist(lapply(as.list(1:T), function(t) {
  mean((y[[t]] - beta_01[t] * x[[t]][, 1]) ^ 2)
}))
mse_two <- unlist(lapply(as.list(1:T), function(t) {
  mean((y[[t]] - beta_02[t] * x[[t]][, 2]) ^ 2)
}))
mse_full <- unlist(lapply(as.list(1:T), function(t) {
  mean((y[[t]] - as.matrix(x[[t]]) %*% beta_0[[t]]) ^ 2)
}))
r2_one <- 1 - mse_one / true_var
r2_two <- 1 - mse_two / true_var
r2_full <- 1 - mse_full / true_var

# estimate predictiveness, variable importance at each timepoint ---------------
set.seed(1234)
# in this case, glm is correctly specified (so only use one learner to speed things up)
vim_list_1 <- lapply(as.list(1:T), function(t) {
  vimp::cv_vim(Y = y[[t]], X = x[[t]], indx = 1, V = 10,
               type = "r_squared", SL.library = c("SL.glm"))
})
set.seed(5678)
vim_list_2 <- lapply(as.list(1:T), function(t) {
  vimp::cv_vim(Y = y[[t]], X = x[[t]], indx = 2, V = 10,
               type = "r_squared", SL.library = c("SL.glm"))
})

# obtain the average, linear trend, and AUTC for the time series ---------------
lvim_obj <- lvim(vim_list_1, timepoints = 1:3)
est_average <- lvim_average(lvim_obj, indices = 1:3)
est_trend <- lvim_trend(lvim_obj, indices = 1:3)
est_autc <- lvim_autc(lvim_obj, indices = 1:3)
```
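The example notes that, under independence, the true difference in R-squared for variable j is its squared coefficient times var(x_j), divided by var(y). As a worked check, the sketch below evaluates that formula for the simulation design, assuming independent standard-normal covariates and unit-variance noise as in the data-generating code. The names `var_y`, `true_vim_1`, and `true_vim_2` are introduced here for illustration and are not part of lvimp.

```r
# Population quantities implied by the simulation design
# (assumption: independent N(0, 1) covariates and N(0, 1) noise)
timepoints <- 0:2
beta_01 <- rep(1, 3)
beta_02 <- 1 + timepoints / 4

# var(y) = beta_1^2 * var(x_1) + beta_2^2 * var(x_2) + var(noise)
var_y <- beta_01^2 + beta_02^2 + 1

# true difference in R-squared for each variable at each time point
true_vim_1 <- beta_01^2 / var_y
true_vim_2 <- beta_02^2 / var_y

round(true_vim_1, 3)
round(true_vim_2, 3)
```

Under these assumptions the importance of the first variable falls over time while that of the second rises, because only the second coefficient grows with the time point. With n = 5e4, the estimated trajectories from the example would be expected to lie close to these values.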
Owner
- Name: Brian Williamson
- Login: bdwilliamson
- Kind: user
- Location: Seattle, Washington USA
- Company: Kaiser Permanente Washington Health Research Institute
- Website: https://bdwilliamson.github.io/
- Repositories: 46
- Profile: https://github.com/bdwilliamson
Assistant Investigator at Kaiser Permanente Washington Health Research Institute. Interested in inference in high-dimensional settings.