vennlasso
Variable selection for heterogeneous populations using the vennLasso penalty
Science Score: 10.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ○ CITATION.cff file
- ○ codemeta.json file
- ○ .zenodo.json file
- ○ DOI references
- ○ Academic publication links
- ✓ Committers with academic emails: 2 of 2 committers (100.0%) from academic institutions
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: low similarity (2.6%) to scientific vocabulary
Repository
Basic Info
- Host: GitHub
- Owner: jaredhuling
- Language: C++
- Default Branch: master
- Homepage: https://jaredhuling.github.io/vennLasso
- Size: 4.45 MB
Statistics
- Stars: 5
- Watchers: 2
- Forks: 0
- Open Issues: 0
- Releases: 0
Metadata Files
README.html
vennLasso
The vennLasso package provides methods for hierarchical variable selection for models with covariate effects stratified by multiple binary factors.

Installation and Help Files

The vennLasso package can be installed from CRAN using:

    install.packages("vennLasso")

The development version can be installed using the devtools package:

    devtools::install_github("jaredhuling/vennLasso")

or by cloning and building.

Load the vennLasso package:

    library(vennLasso)

Access help file for the main fitting function vennLasso() by running:

    ?vennLasso

Help file for cross validation function cv.vennLasso() can be accessed by running:

    ?cv.vennLasso

A Quick Example
Simulate heterogeneous data:

    set.seed(100)
    dat.sim <- genHierSparseData(ncats = 3,  # number of stratifying factors
                                 nvars = 25, # number of variables
                                 nobs = 150, # number of observations per strata
                                 nobs.test = 10000,
                                 hier.sparsity.param = 0.5,
                                 prop.zero.vars = 0.75, # proportion of variables
                                                        # zero for all strata
                                 snr = 0.5,  # signal-to-noise ratio
                                 family = "gaussian")

    # design matrices
    x      <- dat.sim$x
    x.test <- dat.sim$x.test

    # response vectors
    y      <- dat.sim$y
    y.test <- dat.sim$y.test

    # binary stratifying factors
    grp      <- dat.sim$group.ind
    grp.test <- dat.sim$group.ind.test

Inspect the populations for each stratum:

    plotVenn(grp)
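
The strata shown in the Venn diagram are the distinct combinations of the binary stratifying factors. The following is a minimal base R sketch of how those strata sizes could be tabulated without plotting. It assumes the group indicator is a 0/1 matrix with one column per factor; `grp.demo` is a hypothetical stand-in for `grp`, not output from `genHierSparseData()`.

```r
# Hypothetical stand-in for the group indicator matrix:
# 200 observations, 3 binary stratifying factors (values are arbitrary)
set.seed(1)
grp.demo <- matrix(rbinom(200 * 3, 1, 0.5), ncol = 3)

# Label each observation by its combination of factor values, e.g. "1,0,1"
strata <- apply(grp.demo, 1, paste, collapse = ",")

# Count the observations falling in each observed combination
strata.sizes <- table(strata)
print(strata.sizes)
```

With three binary factors there are at most 2^3 = 8 combinations, so the table has at most eight entries.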
Fit vennLasso model with tuning parameter selected with 5-fold cross validation:

    fit.adapt <- cv.vennLasso(x, y,
                              grp,
                              adaptive.lasso = TRUE,
                              nlambda = 50,
                              family = "gaussian",
                              standardize = FALSE,
                              intercept = TRUE,
                              nfolds = 5)

Plot selected variables for each stratum (not run):

    library(igraph)
    ##
    ## Attaching package: 'igraph'
    ## The following objects are masked from 'package:stats':
    ##
    ##     decompose, spectrum
    ## The following object is masked from 'package:base':
    ##
    ##     union
    plotSelections(fit.adapt)

Predict response for test data:

    preds.vl <- predict(fit.adapt, x.test, grp.test, s = "lambda.min",
                        type = 'response')

Evaluate mean squared error:

    mean((y.test - preds.vl) ^ 2)
    ## [1] 0.6852124
    mean((y.test - mean(y.test)) ^ 2)
    ## [1] 1.011026

Compare with naive model with all interactions between covariates and stratifying binary factors:

    df.x <- data.frame(y = y, x = x, grp = grp)
    df.x.test <- data.frame(x = x.test, grp = grp.test)

    # create formula for interactions between factors and covariates
    form <- paste("y ~ (",
                  paste(paste0("x.", 1:ncol(x)), collapse = "+"),
                  ")*(grp.1*grp.2*grp.3)" )

Fit linear model and generate predictions for test set:

    lmf <- lm(as.formula(form), data = df.x)
    preds.lm <- predict(lmf, df.x.test)

Evaluate mean squared error:

    mean((y.test - preds.lm) ^ 2)
    ## [1] 0.8056107
    mean((y.test - preds.vl) ^ 2)
    ## [1] 0.6852124
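
To give a sense of how large the naive interaction model is, the sketch below counts the columns of the design matrix implied by the same formula construction. It uses only base R and hypothetical stand-in data (`x.demo`, `grp.demo`, `df.demo`) with the same dimensions as the example (25 covariates, 3 binary factors); the values themselves are arbitrary and are not the simulated data above.

```r
# Hypothetical stand-in data: 25 covariates, 3 binary stratifying factors
set.seed(1)
n.demo   <- 50
x.demo   <- matrix(rnorm(n.demo * 25), ncol = 25)
grp.demo <- matrix(rbinom(n.demo * 3, 1, 0.5), ncol = 3)

# data.frame() names unnamed matrix columns x.1, ..., x.25 and grp.1, ..., grp.3
df.demo <- data.frame(y = rnorm(n.demo), x = x.demo, grp = grp.demo)

# Same formula construction as in the comparison above
form.demo <- paste("y ~ (",
                   paste(paste0("x.", 1:ncol(x.demo)), collapse = "+"),
                   ")*(grp.1*grp.2*grp.3)")

# Each design matrix column corresponds to one coefficient lm() would try to estimate
n.coef <- ncol(model.matrix(as.formula(form.demo), data = df.demo))
print(n.coef)
```

The formula crosses the intercept and 25 covariates with the intercept and the 7 main and interaction terms of the three factors, giving (1 + 25) × (1 + 7) = 208 coefficients. This large, unpenalized parameter count is one plausible reason the naive model's test error is higher than that of the vennLasso fit.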
Owner
- Name: Jared Huling
- Login: jaredhuling
- Kind: user
- Website: http://jaredhuling.org
- Repositories: 41
- Profile: https://github.com/jaredhuling
Assistant Professor in the Division of Biostatistics at the University of Minnesota
Committers
Last synced: almost 3 years ago
All Time
- Total Commits: 112
- Total Committers: 2
- Avg Commits per committer: 56.0
- Development Distribution Score (DDS): 0.339
Top Committers
| Name | Email | Commits |
|---|---|---|
| jaredhuling | h****g@w****u | 74 |
| jaredhuling | h****7@o****u | 38 |
Issues and Pull Requests
Last synced: over 2 years ago
All Time
- Total issues: 0
- Total pull requests: 0
- Average time to close issues: N/A
- Average time to close pull requests: N/A
- Total issue authors: 0
- Total pull request authors: 0
- Average comments per issue: 0
- Average comments per pull request: 0
- Merged pull requests: 0
- Bot issues: 0
- Bot pull requests: 0
Past Year
- Issues: 0
- Pull requests: 0
- Average time to close issues: N/A
- Average time to close pull requests: N/A
- Issue authors: 0
- Pull request authors: 0
- Average comments per issue: 0
- Average comments per pull request: 0
- Merged pull requests: 0
- Bot issues: 0
- Bot pull requests: 0
Packages
- Total packages: 1
- Total downloads:
  - cran: 151 last month
- Total dependent packages: 0
- Total dependent repositories: 0
- Total versions: 7
- Total maintainers: 1
cran.r-project.org: vennLasso
Variable Selection for Heterogeneous Populations
- Homepage: https://github.com/jaredhuling/vennLasso
- Documentation: http://cran.r-project.org/web/packages/vennLasso/vennLasso.pdf
- License: GPL-2 | GPL-3 [expanded from: GPL (≥ 2)]
- Status: removed
- Latest release: 0.1.6 (published over 5 years ago)
Rankings
Maintainers (1)
Dependencies
- R >= 3.2.0 depends
- MASS * imports
- Matrix * imports
- Rcpp >= 0.11.0 imports
- VennDiagram * imports
- foreach * imports
- igraph * imports
- methods * imports
- survival * imports
- visNetwork * imports
- knitr * suggests
- rmarkdown * suggests