scPCA

scPCA: A toolbox for sparse contrastive principal component analysis in R - Published in JOSS (2020)

https://github.com/philboileau/scpca

Science Score: 95.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
    Found 8 DOI reference(s) in README and JOSS metadata
  • Academic publication links
    Links to: joss.theoj.org
  • Committers with academic emails
    2 of 6 committers (33.3%) from academic institutions
  • Institutional organization owner
  • JOSS paper metadata
    Published in Journal of Open Source Software

Keywords

bioconductor contrastive-learning dimensionality-reduction

Keywords from Contributors

bioconductor-package covariance-matrix-estimation cross-validation high-dimensional-statistics nonparametric-statistics bioinformatics biomarker-discovery biostatistics causal-inference computational-biology

Repository

A toolbox for sparse contrastive principal component analysis

Basic Info
Statistics
  • Stars: 12
  • Watchers: 4
  • Forks: 1
  • Open Issues: 1
  • Releases: 1
Created almost 7 years ago · Last pushed about 1 year ago
Metadata Files
Readme Changelog Contributing License

README.Rmd

---
output:
  rmarkdown::github_document
bibliography: "inst/REFERENCES.bib"
---



```{r, echo = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  fig.path = "README-"
)
```

# R/`scPCA`

[![Travis CI Build Status](https://travis-ci.org/PhilBoileau/scPCA.svg?branch=master)](https://travis-ci.org/PhilBoileau/scPCA)
[![AppVeyor Build Status](https://ci.appveyor.com/api/projects/status/github/PhilBoileau/scPCA?branch=master&svg=true)](https://ci.appveyor.com/project/PhilBoileau/scPCA/)
[![Codecov test coverage](https://codecov.io/gh/PhilBoileau/scPCA/branch/master/graph/badge.svg)](https://codecov.io/gh/PhilBoileau/scPCA?branch=master)
[![Project Status: Active – The project has reached a stable, usable state and is being actively developed.](https://www.repostatus.org/badges/latest/active.svg)](https://www.repostatus.org/#active)
[![BioC status](http://www.bioconductor.org/shields/build/release/bioc/scPCA.svg)](https://bioconductor.org/checkResults/release/bioc-LATEST/scPCA)
[![Bioc Time](http://bioconductor.org/shields/years-in-bioc/scPCA.svg)](https://bioconductor.org/packages/release/bioc/html/scPCA.html)
[![status](https://joss.theoj.org/papers/7f0f1271ede7aba120d71c9b5a14c865/status.svg)](https://joss.theoj.org/papers/7f0f1271ede7aba120d71c9b5a14c865)
[![MIT license](http://img.shields.io/badge/license-MIT-brightgreen.svg)](http://opensource.org/licenses/MIT)

> Sparse Contrastive Principal Component Analysis for Computational Biology

__Authors:__ [Philippe Boileau](https://pboileau.ca/),
  [Nima Hejazi](https://nimahejazi.org),
  [Sandrine Dudoit](https://statistics.berkeley.edu/~sandrine/)

---

## What's `scPCA`?

The exploration and analysis of modern high-dimensional biological data
regularly involves the use of dimension reduction techniques in order to tease
out meaningful and interpretable information from complex experimental data,
often subject to batch effects and other noise. In tandem with the
development of sequencing technology (e.g., RNA-seq, scRNA-seq), many variants
of PCA have been developed in attempts to remedy deficiencies in
interpretability and stability that plague vanilla PCA.

Such developments have included both various forms of sparse PCA (SPCA)
[@zou2006sparse; @erichson2018sparse], which increase the stability and
interpretability of principal component loadings in high dimensions, and, more
recently, contrastive PCA (cPCA) [@abid2018exploring], which captures relevant
information in the target (experimental) data set by eliminating technical noise
through comparison to a so-called background data set. While SPCA and cPCA have
both individually proven useful in resolving distinct shortcomings of PCA,
neither is capable of tackling the issues of interpretability, stability, and
relevance simultaneously. The `scPCA` package implements
_sparse contrastive PCA_ [@boileau2020] to accomplish these tasks in the context
of high-dimensional biological data. In addition to implementing this newly
developed technique, the `scPCA` package implements cPCA and generalizations
thereof.
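
As a rough illustration of the contrastive idea described above, the base-R
sketch below takes the leading eigenvectors of the target covariance matrix
minus a scaled background covariance matrix. It is not the package's
implementation: `contrastive_pca` is a hypothetical helper, the data are
simulated, the contrast parameter is fixed by hand at 1, and no sparsity
penalty is applied.

```r
# Minimal base-R sketch of contrastive PCA (cPCA); illustrative only.
# `contrastive_pca` is a hypothetical helper, not a function exported by scPCA.
contrastive_pca <- function(target, background, contrast = 1, n_eigen = 2) {
  # Center the target columns so projections are taken about the column means
  target_centered <- scale(target, center = TRUE, scale = FALSE)
  # Contrastive covariance: target covariance minus scaled background covariance
  contrastive_cov <- cov(target_centered) - contrast * cov(background)
  # Symmetric input, so eigen() returns real eigenvalues in decreasing order
  eig <- eigen(contrastive_cov, symmetric = TRUE)
  loadings <- eig$vectors[, seq_len(n_eigen), drop = FALSE]
  list(rotation = loadings, x = target_centered %*% loadings)
}

set.seed(1)
n <- 2000
p <- 10
# Nuisance variation shared by both data sets: features 1 to 5 are noisier
noise_sd <- c(rep(3, 5), rep(1, 5))
background <- matrix(rnorm(n * p), n, p) %*% diag(noise_sd)
target <- matrix(rnorm(n * p), n, p) %*% diag(noise_sd)
# Signal of interest (assumed for this simulation): two groups of target
# observations differ only in feature 10
group <- rep(c(0, 1), each = n / 2)
target[, p] <- target[, p] + 5 * group

fit <- contrastive_pca(target, background, contrast = 1, n_eigen = 2)
# Absolute loadings of the first contrastive component, one per feature
round(abs(fit$rotation[, 1]), 2)
```

In this simulation the nuisance features carry more variance than the group
signal, so ordinary PCA of the target alone would be expected to favor them;
subtracting the background covariance is what lets the signal feature stand
out. In practice the contrast value has to be chosen rather than assumed.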

---

## Installation

For standard use, install from
[Bioconductor](https://bioconductor.org/packages/scPCA) using
[`BiocManager`](https://CRAN.R-project.org/package=BiocManager):

```{r bioc-installation, eval = FALSE}
if (!requireNamespace("BiocManager", quietly=TRUE)) {
  install.packages("BiocManager")
}
BiocManager::install("scPCA")
```

To contribute, install the bleeding-edge _development version_ from GitHub via
[`remotes`](https://CRAN.R-project.org/package=remotes):

```{r gh-master-installation, eval = FALSE}
remotes::install_github("PhilBoileau/scPCA")
```

Current and prior [Bioconductor](https://bioconductor.org) releases are
available under branches whose names are prefixed by "RELEASE_". For example, to
install the version of this package available via Bioconductor 3.10, use

```{r gh-develop-installation, eval = FALSE}
remotes::install_github("PhilBoileau/scPCA@RELEASE_3_10")
```

---

## Example

For details on how to best use the `scPCA` R package, please consult the most
recent [package
vignette](https://bioconductor.org/packages/release/bioc/vignettes/scPCA/inst/doc/scpca_intro.html)
available through the [Bioconductor
project](https://bioconductor.org/packages/scPCA).

---

## Issues

If you encounter any bugs or have any specific feature requests, please [file an
issue](https://github.com/PhilBoileau/scPCA/issues).

---

## Contributions

Contributions are welcome. Interested contributors should consult our
[contribution
guidelines](https://github.com/PhilBoileau/scPCA/blob/master/CONTRIBUTING.md)
prior to submitting a pull request.

---

## Citation

Please cite the first paper below after using the `scPCA` R software package. 
Please also make sure to cite the article describing the statistical methodology
when using scPCA or cross-validated cPCA as part of an analysis.

```
@article{boileau2020scPCAjoss,
  doi = {10.21105/joss.02079},
  url = {https://doi.org/10.21105/joss.02079},
  year = {2020},
  publisher = {The Open Journal},
  volume = {5},
  number = {46},
  pages = {2079},
  author = {Philippe Boileau and Nima Hejazi and Sandrine Dudoit},
  title = {scPCA: A toolbox for sparse contrastive principal component analysis in R},
  journal = {Journal of Open Source Software}
}

@article{boileau2020scPCA,
    author = {Boileau, Philippe and Hejazi, Nima S and Dudoit, Sandrine},
    title = "{Exploring High-Dimensional Biological Data with Sparse Contrastive Principal Component Analysis}",
    journal = {Bioinformatics},
    year = {2020},
    month = {03},
    issn = {1367-4803},
    doi = {10.1093/bioinformatics/btaa176},
    url = {https://doi.org/10.1093/bioinformatics/btaa176},
    note = {btaa176},
    eprint = {https://academic.oup.com/bioinformatics/article-pdf/doi/10.1093/bioinformatics/btaa176/32914142/btaa176.pdf},
}
```

---

## License

© 2019-2023 [Philippe Boileau](https://pboileau.ca/)

The contents of this repository are distributed under the MIT license. See file
`LICENSE` for details.

---

## References

Owner

  • Name: Philippe Boileau
  • Login: PhilBoileau
  • Kind: user
  • Location: Berkeley, CA

PhD candidate in biostatistics at UC Berkeley

JOSS Publication

scPCA: A toolbox for sparse contrastive principal component analysis in R
Published
February 25, 2020
Volume 5, Issue 46, Page 2079
Authors
Philippe Boileau ORCID
Graduate Group in Biostatistics, University of California, Berkeley
Nima S. Hejazi ORCID
Graduate Group in Biostatistics, University of California, Berkeley; Center for Computational Biology, University of California, Berkeley
Sandrine Dudoit ORCID
Center for Computational Biology, University of California, Berkeley; Department of Statistics, University of California, Berkeley; Division of Epidemiology and Biostatistics, School of Public Health, University of California, Berkeley
Editor
Charlotte Soneson ORCID
Tags
dimensionality reduction principal component analysis computational biology unwanted variation sparsity

GitHub Events

Total
  • Issues event: 1
  • Delete event: 1
  • Issue comment event: 1
  • Push event: 3
  • Pull request event: 2
  • Create event: 1
Last Year
  • Issues event: 1
  • Delete event: 1
  • Issue comment event: 1
  • Push event: 3
  • Pull request event: 2
  • Create event: 1

Committers

Last synced: 7 months ago

All Time
  • Total Commits: 324
  • Total Committers: 6
  • Avg Commits per committer: 54.0
  • Development Distribution Score (DDS): 0.238
Past Year
  • Commits: 4
  • Committers: 2
  • Avg Commits per committer: 2.0
  • Development Distribution Score (DDS): 0.5
Top Committers
Name Email Commits
Philippe Boileau p****m@g****m 247
Nima Hejazi nh@n****g 52
Nitesh Turaga n****a@g****m 12
J Wokaty j****y@s****u 10
Sandrine Dudoit s****e@s****u 2
Fabian Scheipl f****l@g****m 1

Issues and Pull Requests

Last synced: 6 months ago

All Time
  • Total issues: 25
  • Total pull requests: 39
  • Average time to close issues: about 1 month
  • Average time to close pull requests: about 13 hours
  • Total issue authors: 8
  • Total pull request authors: 3
  • Average comments per issue: 1.96
  • Average comments per pull request: 0.28
  • Merged pull requests: 39
  • Bot issues: 0
  • Bot pull requests: 0
Past Year
  • Issues: 0
  • Pull requests: 2
  • Average time to close issues: N/A
  • Average time to close pull requests: 1 minute
  • Issue authors: 0
  • Pull request authors: 1
  • Average comments per issue: 0
  • Average comments per pull request: 0.0
  • Merged pull requests: 2
  • Bot issues: 0
  • Bot pull requests: 0
Top Authors
Issue Authors
  • PhilBoileau (10)
  • LTLA (8)
  • nhejazi (2)
  • adigorla (1)
  • PietroD (1)
  • fabian-s (1)
  • klai001 (1)
  • chisin (1)
Pull Request Authors
  • PhilBoileau (28)
  • nhejazi (11)
  • fabian-s (1)
Top Labels
Issue Labels
enhancement (7) bug (2) not an issue (1)
Pull Request Labels
enhancement (2)

Packages

  • Total packages: 3
  • Total downloads:
    • bioconductor 11,957 total
  • Total dependent packages: 0
    (may contain duplicates)
  • Total dependent repositories: 0
    (may contain duplicates)
  • Total versions: 7
  • Total maintainers: 1
proxy.golang.org: github.com/PhilBoileau/scPCA
  • Versions: 1
  • Dependent Packages: 0
  • Dependent Repositories: 0
Rankings
Dependent packages count: 6.5%
Average: 6.7%
Dependent repos count: 7.0%
Last synced: 6 months ago
proxy.golang.org: github.com/philboileau/scpca
  • Versions: 1
  • Dependent Packages: 0
  • Dependent Repositories: 0
Rankings
Dependent packages count: 6.5%
Average: 6.7%
Dependent repos count: 7.0%
Last synced: 6 months ago
bioconductor.org: scPCA

Sparse Contrastive Principal Component Analysis

  • Versions: 5
  • Dependent Packages: 0
  • Dependent Repositories: 0
  • Downloads: 11,957 Total
Rankings
Dependent repos count: 0.0%
Dependent packages count: 0.0%
Average: 23.0%
Downloads: 69.0%
Last synced: 6 months ago

Dependencies

DESCRIPTION cran
  • R >= 4.0.0 depends
  • BiocParallel * imports
  • DelayedArray * imports
  • Matrix * imports
  • MatrixGenerics * imports
  • RSpectra * imports
  • Rdpack * imports
  • ScaledMatrix * imports
  • assertthat * imports
  • cluster * imports
  • coop * imports
  • dplyr * imports
  • elasticnet * imports
  • kernlab * imports
  • matrixStats * imports
  • methods * imports
  • origami * imports
  • purrr * imports
  • sparsepca * imports
  • stats * imports
  • stringr * imports
  • tibble * imports
  • BiocStyle * suggests
  • DelayedMatrixStats * suggests
  • SingleCellExperiment * suggests
  • covr * suggests
  • ggplot2 * suggests
  • ggpubr * suggests
  • knitr * suggests
  • microbenchmark * suggests
  • rmarkdown * suggests
  • sparseMatrixStats * suggests
  • splatter * suggests
  • testthat >= 2.1.0 suggests