shadowvimp
Science Score: 26.0%
This score indicates how likely this project is to be science-related based on various indicators:
-
○CITATION.cff file
-
✓codemeta.json file
Found codemeta.json file -
✓.zenodo.json file
Found .zenodo.json file -
○DOI references
-
○Academic publication links
-
○Academic email domains
-
○Institutional organization owner
-
○JOSS paper metadata
-
○Scientific vocabulary similarity
Low similarity (17.8%) to scientific vocabulary
Last synced: 10 months ago
·
JSON representation
Repository
Basic Info
- Host: GitHub
- Owner: OktawiaStaburo
- License: apache-2.0
- Language: R
- Default Branch: main
- Size: 2.05 MB
Statistics
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
- Releases: 0
Created over 1 year ago
· Last pushed 10 months ago
Metadata Files
Readme
Changelog
License
README.Rmd
---
output: github_document
---
```{r, include = FALSE}
knitr::opts_chunk$set(
collapse = TRUE,
comment = "#>",
fig.path = "man/figures/README-",
out.width = "100%",
fig.width = 30,
fig.height = 20,
fig.align = "center"
)
```
# shadowVIMP
[](https://github.com/OktawiaStaburo/shadowVIMP/actions)
## Overview
`shadowVIMP` is an R package for variable selection that reduces the number of covariates in a statistically rigorous and informed way, identifying the most informative predictors. This package implements a method that performs statistical tests on the Variable Importance Measures (VIMP) obtained from the Random Forest (RF) algorithm to determine whether each covariate is statistically significant and truly informative. In contrast to widely used methods, such as selecting the top *n* covariates with the highest VIMP or choosing covariates with a VIMP above a certain threshold, the method implemented in `shadowVIMP` allows for a statistical justification of whether a given VIMP is sufficiently large to be unlikely due to chance. The main function of the package, `shadow_vimp()`, outputs a table indicating whether each covariate is informative, along with its associated (adjusted) p-values. In addition, the `plot_vimps()` function provides a convenient way to visualise the VIMPs obtained from the algorithm, including unadjusted, FDR- and FWER-adjusted p-values. Details on the method, a realistic example of its usage, and guidance on interpreting the results can be found in the vignette: `vignette("shadowVIMP-vignette")`.
## Installation
The `shadowVIMP` package can be installed in the following way::
``` {r, eval = FALSE}
install.packages("shadowVIMP")
```
## Usage
Imagine you are working with a dataset with many variables and you want to select the subset of the most informative covariates. Instead of computing the variable importance, displaying it on variable importance plots, and (somewhat arbitrarily) selecting the set of informative features, you can use the method implemented in this package and select the subset of features in a more robust way. The following example shows the basic use case.
```{r example, message = FALSE, warning = FALSE}
library(shadowVIMP)
library(magrittr)
library(ggplot2)
data(mtcars)
# For reproducibility
set.seed(789)
# Value of num.threads parameter
global_num_threads <- 1
# Standard usage
# WARNING 1: When working with real data, increase the value of the niters parameter or leave it at the default value.
# WARNING 2: To avoid potential issues with using multiple threads on CRAN, we set num.threads to 1, by default it is set to half of the available threads, which speeds up computation.
vimp_seq <- shadow_vimp(data = mtcars, outcome_var = "vs", niters = c(30, 100, 150), num.threads = global_num_threads)
# Summary of the results
vimp_seq
# Print informative covariates according to the pooled criterion (with and without p-value correction)
vimp_seq$final_dec_pooled
# The significance level used in each step of the procedure:
vimp_seq$alpha
# Were all covariates deemed insignificant at any step of the pre-selection? If so, which step?
# If `step_all_covariates_removed == 0`, then at least one covariate survived the pre-selection
vimp_seq$step_all_covariates_removed
# Check the time needed to execute each step of the algorithm and the entire procedure
vimp_seq$time_elapsed
# Check the call code that was used to create the inspected object
vimp_seq$call
# Check the VIMPs of the covariates and their shadows from the last step of the procedure
vimp_seq$vimp_history %>% head()
# Inspect in detail two steps of pre-selection
# vimp_seq$pre_selection$step_1
# vimp_seq$pre_selection$step_2
```
You can visualize your results in the following way:
```{r example_cont, cache=FALSE}
plot_vimps(
shadow_vimp_out = vimp_seq,
text_size = 10,
legend.text = element_text(size = 40),
axis.text = element_text(size = 30)
) +
patchwork::plot_annotation(
title = "shadowVIMP Results",
) &
theme(plot.title = element_text(size = 45, face = "bold"))
```
For a more realistic and detailed example of how to use this package, and the theory behind the method, see `vignette("shadowVIMP-vignette")`.
## Getting help
If you encounter a clear bug, please file an issue with a minimal reproducible example.
Owner
- Login: OktawiaStaburo
- Kind: user
- Repositories: 1
- Profile: https://github.com/OktawiaStaburo
GitHub Events
Total
- Issues event: 1
- Push event: 8
- Public event: 1
- Create event: 1
Last Year
- Issues event: 1
- Push event: 8
- Public event: 1
- Create event: 1
Issues and Pull Requests
Last synced: 10 months ago
All Time
- Total issues: 1
- Total pull requests: 0
- Average time to close issues: about 1 month
- Average time to close pull requests: N/A
- Total issue authors: 1
- Total pull request authors: 0
- Average comments per issue: 0.0
- Average comments per pull request: 0
- Merged pull requests: 0
- Bot issues: 0
- Bot pull requests: 0
Past Year
- Issues: 1
- Pull requests: 0
- Average time to close issues: about 1 month
- Average time to close pull requests: N/A
- Issue authors: 1
- Pull request authors: 0
- Average comments per issue: 0.0
- Average comments per pull request: 0
- Merged pull requests: 0
- Bot issues: 0
- Bot pull requests: 0
Top Authors
Issue Authors
- OktawiaStaburo (1)
Pull Request Authors
Top Labels
Issue Labels
Pull Request Labels
Packages
- Total packages: 1
-
Total downloads:
- cran 173 last-month
- Total dependent packages: 0
- Total dependent repositories: 0
- Total versions: 1
- Total maintainers: 1
cran.r-project.org: shadowVIMP
Covariate Selection Based on VIMP Permutation-Like Testing
- Homepage: https://github.com/OktawiaStaburo/shadowVIMP
- Documentation: http://cran.r-project.org/web/packages/shadowVIMP/shadowVIMP.pdf
- License: Apache License (≥ 2)
-
Latest release: 1.0.2
published about 1 year ago
Rankings
Dependent packages count: 26.2%
Dependent repos count: 32.3%
Average: 48.3%
Downloads: 86.4%
Maintainers (1)
Last synced:
10 months ago
Dependencies
.github/workflows/R-CMD-check.yaml
actions
- actions/checkout v4 composite
- r-lib/actions/check-r-package v2 composite
- r-lib/actions/setup-pandoc v2 composite
- r-lib/actions/setup-r v2 composite
- r-lib/actions/setup-r-dependencies v2 composite
DESCRIPTION
cran
- dplyr * imports
- ggforce * imports
- ggplot2 * imports
- ggpubr * imports
- magrittr * imports
- parallel * imports
- patchwork * imports
- ranger * imports
- rlang * imports
- stats * imports
- stringr * imports
- tidyr * imports
- AppliedPredictiveModeling * suggests
- knitr * suggests
- rmarkdown * suggests
- spelling * suggests
- testthat >= 3.0.0 suggests