cvCovEst

cvCovEst: Cross-validated covariance matrix estimator selection and evaluation in R - Published in JOSS (2021)

https://github.com/philboileau/cvcovest

Science Score: 95.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
    Found 14 DOI reference(s) in README and JOSS metadata
  • Academic publication links
    Links to: joss.theoj.org
  • Committers with academic emails
    3 of 9 committers (33.3%) from academic institutions
  • Institutional organization owner
  • JOSS paper metadata
    Published in Journal of Open Source Software

Keywords

covariance-matrix-estimation cross-validation high-dimensional-statistics nonparametric-statistics

Keywords from Contributors

bioconductor-package bioconductor contrastive-learning dimensionality-reduction
Last synced: 6 months ago

Repository

An R package for nonparametric covariance matrix estimation in high dimensions

Basic Info
Statistics
  • Stars: 13
  • Watchers: 3
  • Forks: 4
  • Open Issues: 3
  • Releases: 2
Created almost 6 years ago · Last pushed about 2 years ago
Metadata Files
Readme Changelog Contributing License

README.Rmd

---
output: github_document
bibliography: "inst/REFERENCES.bib"
---

```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  fig.path = "man/figures/README-",
  out.width = "100%"
)
```

# R/`cvCovEst`


[![CircleCI](https://dl.circleci.com/status-badge/img/gh/PhilBoileau/cvCovEst/tree/master.svg?style=svg)](https://app.circleci.com/pipelines/github/PhilBoileau/cvCovEst?branch=master)
[![codecov](https://codecov.io/gh/PhilBoileau/cvCovEst/branch/master/graph/badge.svg?token=miHiqpGXxJ)](https://app.codecov.io/gh/PhilBoileau/cvCovEst)
[![Project Status: Active – The project has reached a stable, usable state and is being actively developed.](https://www.repostatus.org/badges/latest/active.svg)](https://www.repostatus.org/#active)
[![DOI](https://joss.theoj.org/papers/10.21105/joss.03273/status.svg)](https://doi.org/10.21105/joss.03273)
[![MIT license](http://img.shields.io/badge/license-MIT-brightgreen.svg)](https://opensource.org/license/mit/)


> Cross-Validated Covariance Matrix Estimation

__Authors:__ [Philippe Boileau](https://pboileau.ca),
[Brian Collica](https://www.linkedin.com/in/brian-collica-553b0b94), and
[Nima Hejazi](https://nimahejazi.org)

---

## What's `cvCovEst`?

`cvCovEst` implements an efficient cross-validated procedure for covariance
matrix estimation that is particularly useful in high-dimensional settings. The
general methodology uses cross-validation to data-adaptively identify the
optimal estimator of the covariance matrix from a prespecified set of candidate
estimators. An overview of the framework is provided in the package vignette.
For a more detailed description, see @boileau2021. A suite of plotting and
diagnostic tools is also included.
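
The selection idea can be sketched in a few lines of base R. This is a
simplified illustration only, not the package's implementation: the three
candidate estimators (`sample_cov`, `diagonal`, `shrink_half`) are hypothetical
stand-ins, and the loss used here, the squared Frobenius distance to the
validation-fold sample covariance matrix, is an assumption made for brevity
rather than the loss defined in @boileau2021.

```r
set.seed(1584)

# toy data: n = 60 observations of p = 10 equicorrelated variables
n <- 60
p <- 10
Sigma <- matrix(0.5, nrow = p, ncol = p) + diag(0.5, nrow = p)
dat <- matrix(rnorm(n * p), nrow = n) %*% chol(Sigma)

# hypothetical candidate estimators: each maps a data matrix to a p x p matrix
candidates <- list(
  sample_cov = function(x) cov(x),
  diagonal = function(x) diag(diag(cov(x))),
  shrink_half = function(x) {
    s <- cov(x)
    0.5 * s + 0.5 * diag(mean(diag(s)), nrow = ncol(x))
  }
)

# randomly assign each observation to one of the V folds
v_folds <- 5
fold_id <- sample(rep(seq_len(v_folds), length.out = n))

# for each fold, fit every candidate on the training rows and score it by its
# squared Frobenius distance to the validation-fold sample covariance matrix
fold_losses <- sapply(seq_len(v_folds), function(v) {
  train <- dat[fold_id != v, , drop = FALSE]
  valid <- dat[fold_id == v, , drop = FALSE]
  valid_cov <- cov(valid)
  sapply(candidates, function(est) sum((est(train) - valid_cov)^2))
})

# cross-validated risk: average loss over folds; select its minimizer
cv_risk <- rowMeans(fold_losses)
selected <- names(which.min(cv_risk))
cv_risk
selected
```

The candidate with the smallest cross-validated risk is the one selected; in
this sketch it could then be refit on the full dataset with
`candidates[[selected]](dat)`.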

---

## Installation

For standard use, install `cvCovEst` from
[CRAN](https://cran.r-project.org/package=cvCovEst):

```{r CRAN-install, eval=FALSE}
install.packages("cvCovEst")
```

The _development version_ of the package may be installed from GitHub using
[`remotes`](https://CRAN.R-project.org/package=remotes):

```{r gh-master-installation, eval=FALSE}
# install.packages("remotes")  # run first if `remotes` is not installed
remotes::install_github("PhilBoileau/cvCovEst")
```

---

## Example

To illustrate how `cvCovEst` may be used to select an optimal covariance matrix
estimator via cross-validation, consider the following toy example:

```{r example, message=FALSE, warning=FALSE}
library(MASS)
library(cvCovEst)
set.seed(1584)

# generate a 50x50 covariance matrix with unit variances and off-diagonal
# elements equal to 0.5
Sigma <- matrix(0.5, nrow = 50, ncol = 50) + diag(0.5, nrow = 50)

# sample 50 observations from multivariate normal with mean = 0, var = Sigma
dat <- mvrnorm(n = 50, mu = rep(0, 50), Sigma = Sigma)

# run CV-selector
cv_cov_est_out <- cvCovEst(
  dat = dat,
  estimators = c(
    linearShrinkLWEst, denseLinearShrinkEst,
    thresholdingEst, poetEst, sampleCovEst
  ),
  estimator_params = list(
    thresholdingEst = list(gamma = c(0.2, 2)),
    poetEst = list(lambda = c(0.1, 0.2), k = c(1L, 2L))
  ),
  cv_loss = cvMatrixFrobeniusLoss,
  cv_scheme = "v_fold",
  v_folds = 5
)

# print the table of risk estimates
# NOTE: the estimated covariance matrix is accessible via the `$estimate` slot
cv_cov_est_out$risk_df
```
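
As a hedged follow-up, the sketch below shows one way the selected estimate
might be inspected. It assumes that the returned list also carries an
`$estimator` element naming the selected estimator, which is recalled from the
package documentation rather than shown above. It uses a smaller toy problem
(`p = 10`) and fewer candidates so that it runs quickly. Comparing against the
true `Sigma` is only possible because the data are simulated.

```r
library(MASS)
library(cvCovEst)
set.seed(1584)

# smaller toy problem than the example above
p <- 10
Sigma <- matrix(0.5, nrow = p, ncol = p) + diag(0.5, nrow = p)
dat <- mvrnorm(n = 50, mu = rep(0, p), Sigma = Sigma)

cv_out <- cvCovEst(
  dat = dat,
  estimators = c(linearShrinkLWEst, thresholdingEst, sampleCovEst),
  estimator_params = list(
    thresholdingEst = list(gamma = c(0.2, 2))
  ),
  cv_loss = cvMatrixFrobeniusLoss,
  cv_scheme = "v_fold",
  v_folds = 5
)

# label of the selected estimator (assumed element) and size of its estimate
cv_out$estimator
dim(cv_out$estimate)

# Frobenius-norm error of the selected estimate and of the sample covariance
# matrix, each measured against the known Sigma
norm(cv_out$estimate - Sigma, type = "F")
norm(cov(dat) - Sigma, type = "F")
```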

---

## Issues

If you encounter any bugs or have any specific feature requests, please [file
an issue](https://github.com/PhilBoileau/cvCovEst/issues).

---

## Contributions

Contributions are very welcome. Interested contributors should consult our
[contribution
guidelines](https://github.com/PhilBoileau/cvCovEst/blob/master/CONTRIBUTING.md)
prior to submitting a pull request.

---

## Citation

Please cite the following paper when using the `cvCovEst` R software package.

```
@article{cvCovEst2021,
  doi = {10.21105/joss.03273},
  url = {https://doi.org/10.21105/joss.03273},
  year = {2021},
  publisher = {The Open Journal},
  volume = {6},
  number = {63},
  pages = {3273},
  author = {Philippe Boileau and Nima S. Hejazi and Brian Collica and Mark J. van der Laan and Sandrine Dudoit},
  title = {cvCovEst: Cross-validated covariance matrix estimator selection and evaluation in `R`},
  journal = {Journal of Open Source Software}
}
```

When describing or discussing the theory underlying the `cvCovEst` method, or
simply using the method, please cite the article below.

```
@article{boileau2022,
  author = {Philippe Boileau and Nima S. Hejazi and Mark J. van der Laan and Sandrine Dudoit},
  doi = {10.1080/10618600.2022.2110883},
  eprint = {https://doi.org/10.1080/10618600.2022.2110883},
  journal = {Journal of Computational and Graphical Statistics},
  number = {ja},
  pages = {1-28},
  publisher = {Taylor \& Francis},
  title = {Cross-Validated Loss-Based Covariance Matrix Estimator Selection in High Dimensions},
  url = {https://doi.org/10.1080/10618600.2022.2110883},
  volume = {0},
  year = {2022}
}
```

---

## License

© 2020-2023 [Philippe Boileau](https://pboileau.ca)

The contents of this repository are distributed under the MIT license. See file
[`LICENSE.md`](https://github.com/PhilBoileau/cvCovEst/blob/master/LICENSE.md)
for details.

---

## References

Owner

  • Name: Philippe Boileau
  • Login: PhilBoileau
  • Kind: user
  • Location: Berkeley, CA

PhD candidate in biostatistics at UC Berkeley

JOSS Publication

cvCovEst: Cross-validated covariance matrix estimator selection and evaluation in R
Published
July 26, 2021
Volume 6, Issue 63, Page 3273
Authors
Philippe Boileau ORCID
Graduate Group in Biostatistics, University of California, Berkeley, Center for Computational Biology, University of California, Berkeley
Nima S. Hejazi ORCID
Graduate Group in Biostatistics, University of California, Berkeley, Center for Computational Biology, University of California, Berkeley
Brian Collica ORCID
Department of Statistics, University of California, Berkeley
Mark J. van der Laan ORCID
Center for Computational Biology, University of California, Berkeley, Department of Statistics, University of California, Berkeley, Division of Biostatistics, School of Public Health, University of California, Berkeley
Sandrine Dudoit ORCID
Center for Computational Biology, University of California, Berkeley, Department of Statistics, University of California, Berkeley, Division of Biostatistics, School of Public Health, University of California, Berkeley
Editor
Frederick Boehm ORCID
Tags
covariance matrix cross-validation high-dimensional statistics loss-based estimation multivariate analysis

GitHub Events

Total
  • Watch event: 1
Last Year
  • Watch event: 1

Committers

Last synced: 7 months ago

All Time
  • Total Commits: 413
  • Total Committers: 9
  • Avg Commits per committer: 45.889
  • Development Distribution Score (DDS): 0.349
Past Year
  • Commits: 0
  • Committers: 0
  • Avg Commits per committer: 0.0
  • Development Distribution Score (DDS): 0.0
Top Committers
Name Email Commits
Philippe Boileau p****m@g****m 269
Brian Collica b****a@b****u 96
Brian Collica M****n@b****t 17
Nima Hejazi nh@n****g 15
Jamarcus Liu y****u@b****u 8
Brian Collica M****n@B****l 5
Sandrine Dudoit s****e@s****u 1
Hadley Wickham h****m@g****m 1
Florian Privé f****1@g****m 1

Issues and Pull Requests

Last synced: 6 months ago

All Time
  • Total issues: 17
  • Total pull requests: 56
  • Average time to close issues: about 2 months
  • Average time to close pull requests: 4 days
  • Total issue authors: 6
  • Total pull request authors: 6
  • Average comments per issue: 1.0
  • Average comments per pull request: 0.77
  • Merged pull requests: 56
  • Bot issues: 0
  • Bot pull requests: 0
Past Year
  • Issues: 0
  • Pull requests: 0
  • Average time to close issues: N/A
  • Average time to close pull requests: N/A
  • Issue authors: 0
  • Pull request authors: 0
  • Average comments per issue: 0
  • Average comments per pull request: 0
  • Merged pull requests: 0
  • Bot issues: 0
  • Bot pull requests: 0
Top Authors
Issue Authors
  • PhilBoileau (6)
  • nhejazi (5)
  • Marie-PerrotDockes (3)
  • AlexPars (1)
  • bcollica (1)
  • yunanwu123 (1)
Pull Request Authors
  • PhilBoileau (43)
  • bcollica (8)
  • nhejazi (3)
  • hadley (1)
  • privefl (1)
  • JamarcusLiu (1)
Top Labels
Issue Labels
enhancement (4) bug (2)
Pull Request Labels
enhancement (2) bug (1)

Packages

  • Total packages: 1
  • Total downloads:
    • cran 438 last-month
  • Total dependent packages: 3
  • Total dependent repositories: 0
  • Total versions: 11
  • Total maintainers: 1
cran.r-project.org: cvCovEst

Cross-Validated Covariance Matrix Estimation

  • Versions: 11
  • Dependent Packages: 3
  • Dependent Repositories: 0
  • Downloads: 438 Last month
Rankings
Forks count: 11.3%
Stargazers count: 16.3%
Dependent packages count: 18.7%
Average: 24.2%
Dependent repos count: 35.5%
Downloads: 39.2%
Last synced: 6 months ago

Dependencies

DESCRIPTION cran
  • R >= 4.0.0 depends
  • Matrix * imports
  • RColorBrewer * imports
  • RMTstat * imports
  • RSpectra * imports
  • Rdpack * imports
  • assertthat * imports
  • coop * imports
  • dplyr * imports
  • ggplot2 * imports
  • ggpubr * imports
  • matrixStats * imports
  • methods * imports
  • origami * imports
  • purrr * imports
  • rlang * imports
  • stats * imports
  • stringr * imports
  • tibble * imports
  • MASS * suggests
  • covr * suggests
  • future * suggests
  • future.apply * suggests
  • knitr * suggests
  • rmarkdown * suggests
  • spelling * suggests
  • testthat * suggests