decorrelate
Decorrelation projection scalable to high dimensional data using estimated correlation with low rank and shrinkage
Science Score: 39.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ○ CITATION.cff file
- ✓ codemeta.json file: Found codemeta.json file
- ✓ .zenodo.json file: Found .zenodo.json file
- ✓ DOI references: Found 2 DOI reference(s) in README
- ○ Academic publication links
- ○ Academic email domains
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: Low similarity (10.9%) to scientific vocabulary
Last synced: 10 months ago
Repository
Basic Info
- Host: GitHub
- Owner: GabrielHoffman
- Language: R
- Default Branch: master
- Homepage: https://gabrielhoffman.github.io/decorrelate/
- Size: 13.4 MB
Statistics
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
- Releases: 0
Created over 5 years ago · Last pushed 10 months ago
Metadata Files
Readme
Changelog
README.Rmd
---
output: github_document
---
```{r, include = FALSE}
knitr::opts_chunk$set(
collapse = TRUE,
comment = "#>",
fig.path = "man/figures/README-",
out.width = "100%"
)
```
### Fast Probabilistic Whitening Transformation for Ultra-High Dimensional Data
Data whitening is a widely used preprocessing step that removes correlation structure, since statistical models often assume independence [(Kessy, et al. 2018)](https://doi.org/10.1080/00031305.2016.1277159). The typical procedure transforms the observed data by the inverse square root of the sample correlation matrix (**Figure 1**). For low-dimensional data (i.e. $n > p$), this produces transformed data with an identity sample covariance matrix. The procedure assumes either that the true covariance matrix is known or that it is well estimated by the sample covariance matrix. Yet using the sample covariance matrix for this transformation can be problematic since **1)** the complexity is $\mathcal{O}(p^3)$ and **2)** it is not applicable to the high-dimensional (i.e. $n \ll p$) case, where the sample covariance matrix is no longer full rank.
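As a rough illustration of the standard procedure described above, the following base R sketch whitens simulated data with the inverse square root of the sample covariance matrix and then shows the rank deficiency that arises when $n < p$. The simulated covariance structure, the dimensions, and all variable names are assumptions chosen for the example; it does not use this package.

```r
set.seed(1)

# Simulate n = 200 samples of p = 5 correlated features (n > p).
# The AR(1)-style covariance is an assumption made for illustration.
n <- 200
p <- 5
Sigma <- 0.5^abs(outer(1:p, 1:p, "-"))
X <- matrix(rnorm(n * p), n, p) %*% chol(Sigma)

# Sample covariance and its symmetric inverse square root
S <- cov(X)
eig <- eigen(S, symmetric = TRUE)
S_inv_sqrt <- eig$vectors %*% diag(1 / sqrt(eig$values)) %*% t(eig$vectors)

# Whiten: center the columns, then multiply by S^{-1/2}
X_centered <- scale(X, center = TRUE, scale = FALSE)
X_white <- X_centered %*% S_inv_sqrt

# The whitened data have an identity sample covariance (up to rounding)
max_dev <- max(abs(cov(X_white) - diag(p)))

# With n < p the sample covariance is rank deficient,
# so its inverse square root does not exist.
n_hd <- 10
p_hd <- 50
X_hd <- matrix(rnorm(n_hd * p_hd), n_hd, p_hd)
ev_hd <- eigen(cov(X_hd), symmetric = TRUE, only.values = TRUE)$values
rank_hd <- sum(ev_hd > 1e-8)   # number of non-negligible eigenvalues
```

Here `max_dev` is at the level of floating point error, while `rank_hd` is at most `n_hd - 1`, far below `p_hd`.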
Here we use a probabilistic model of the observed data to apply a whitening transformation. This Gaussian Inverse Wishart Empirical Bayes (GIW-EB) model **1)** substantially reduces computational complexity, and **2)** regularizes the eigenvalues of the sample covariance matrix to improve out-of-sample performance.
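To make the two benefits concrete, here is a minimal base R sketch of whitening with a shrunken covariance estimate of the form $(1-\lambda) S + \lambda \nu I$, applied through a thin SVD of the data so that no $p \times p$ matrix is decomposed. This is not the package's implementation: the shrinkage weight `lambda` is fixed by hand rather than estimated by empirical Bayes, and the target scale `nu`, the helper `whiten_shrunk()`, and the simulated data are all assumptions made for illustration.

```r
set.seed(2)

# High-dimensional setting: n = 20 samples, p = 100 features
n <- 20
p <- 100
X <- scale(matrix(rnorm(n * p), n, p), center = TRUE, scale = FALSE)

lambda <- 0.3   # shrinkage weight, fixed by hand here (assumption)

# Thin SVD of the scaled data: S = V diag(d^2) V^T without forming S
sv <- svd(X / sqrt(n - 1), nu = 0)
keep <- sv$d > 1e-8                 # drop the null direction created by centering
V <- sv$v[, keep, drop = FALSE]
d2 <- sv$d[keep]^2
nu <- sum(d2) / p                   # mean eigenvalue of S, scale of the identity target

# Apply (shrunken covariance)^{-1/2} to Y using only the low-rank factors
# (hypothetical helper, not part of any package)
whiten_shrunk <- function(Y) {
  YV <- Y %*% V
  w <- 1 / sqrt((1 - lambda) * d2 + lambda * nu)
  in_span <- sweep(YV, 2, w, "*") %*% t(V)             # directions spanned by V
  out_span <- (Y - YV %*% t(V)) / sqrt(lambda * nu)    # orthogonal complement
  in_span + out_span
}

# Dense reference that decomposes the full p x p matrix, for comparison only
Sigma_hat <- (1 - lambda) * crossprod(X) / (n - 1) + lambda * nu * diag(p)
e <- eigen(Sigma_hat, symmetric = TRUE)
W_dense <- e$vectors %*% (t(e$vectors) / sqrt(e$values))

max_diff <- max(abs(X %*% W_dense - whiten_shrunk(X)))
```

The low-rank route agrees with the dense computation up to rounding, but its cost scales with $n p \min(n, p)$ rather than $p^3$, and because every eigenvalue of the shrunken estimate is at least `lambda * nu`, the transformation stays well defined when $n \ll p$.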

## Installation
``` r
devtools::install_github("GabrielHoffman/decorrelate")
```
Owner
- Name: Gabriel Hoffman
- Login: GabrielHoffman
- Kind: user
- Location: New York
- Company: Icahn School of Medicine at Mount Sinai
- Website: http://gabrielhoffman.github.io/
- Repositories: 9
- Profile: https://github.com/GabrielHoffman
Statistical genomics
GitHub Events
Total
- Push event: 4
- Public event: 1
Last Year
- Push event: 4
- Public event: 1
Packages
- Total packages: 1
- Total downloads: cran 191 last-month
- Total dependent packages: 0
- Total dependent repositories: 0
- Total versions: 2
- Total maintainers: 1
cran.r-project.org: decorrelate
Decorrelation Projection Scalable to High Dimensional Data
- Homepage: https://gabrielhoffman.github.io/decorrelate/
- Documentation: http://cran.r-project.org/web/packages/decorrelate/decorrelate.pdf
- License: Artistic-2.0
- Latest release: 0.1.6.4 (published 11 months ago)
Rankings
Dependent packages count: 25.9%
Dependent repos count: 31.8%
Average: 47.8%
Downloads: 85.6%
Maintainers (1)
Last synced: 10 months ago
Dependencies
DESCRIPTION
cran
- R >= 4.2.0 depends
- methods * depends
- CholWishart * imports
- Matrix * imports
- Rcpp * imports
- Rfast * imports
- graphics * imports
- irlba * imports
- stats * imports
- utils * imports
- CCA * suggests
- RUnit * suggests
- clusterGeneration * suggests
- colorRamps * suggests
- cowplot * suggests
- ggplot2 * suggests
- knitr * suggests
- latex2exp * suggests
- mvtnorm * suggests
- pander * suggests
- rmarkdown * suggests
- whitening * suggests
- yacca * suggests