SuperCell

Coarse-graining of large single-cell RNA-seq data into metacells

https://github.com/gfellerlab/supercell

Science Score: 26.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
  • DOI references
    Found 14 DOI reference(s) in README
  • Academic publication links
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (9.9%) to scientific vocabulary

Keywords

coarse-graining r-package scrna-seq-analysis scrna-seq-data
Last synced: 9 months ago

Repository

Basic Info
  • Host: GitHub
  • Owner: GfellerLab
  • License: gpl-3.0
  • Language: R
  • Default Branch: master
  • Homepage:
  • Size: 82.3 MB
Statistics
  • Stars: 87
  • Watchers: 4
  • Forks: 13
  • Open Issues: 3
  • Releases: 1
Created almost 6 years ago · Last pushed 11 months ago
Metadata Files
Readme License Code of conduct

README.Rmd

---
title: "Coarse-graining of large single-cell RNA-seq data into metacells"
csl: elsevier-harvard.csl
output:
  md_document:
    variant: markdown_github
---

```{r, echo = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  fig.path = "plots/"
)
```

[![R-CMD-check](https://github.com/GfellerLab/SuperCell/workflows/R-CMD-check/badge.svg)](https://github.com/GfellerLab/SuperCell/actions)
[![DOI](https://img.shields.io/badge/DOI%3A-10.1186/s12859--022--04861--1-brightgreen)](https://doi.org/10.1186/s12859-022-04861-1)
[![License](https://img.shields.io/badge/License-LICENSE-green)](./LICENSE)


# Coarse-graining of large single-cell RNA-seq data into metacells
SuperCell is an R package for coarse-graining large single-cell RNA-seq data into metacells and performing downstream analysis at the metacell level. 

The exponential growth in the size of scRNA-seq datasets is a major hurdle for downstream analyses.
One way to facilitate the analysis of large-scale and noisy scRNA-seq data is to merge transcriptionally highly similar cells into *metacells*. This concept was first introduced by [*Baran et al., 2019*](https://doi.org/10.1186/s13059-019-1812-2) (MetaCell) and by [*Iacono et al., 2018*](https://doi.org/10.1101/gr.230771.117) (bigSCale). More recent methods to build *metacells* have been described in [*Ben-Kiki et al., 2022*](https://doi.org/10.1186/s13059-022-02667-1) (MetaCell2), [*Bilous et al., 2022*](https://doi.org/10.1186/s12859-022-04861-1) (SuperCell) and [*Persad et al., 2022*](https://doi.org/10.1038/s41587-023-01716-9) (SEACells). Despite some differences in implementation, all these methods are network-based and can be summarized as follows:

**1.** A single-cell network is computed based on cell-to-cell similarity (in transcriptomic space).

**2.** Highly similar cells are identified as those forming dense regions in the single-cell network and are merged into metacells (coarse-graining).

**3.** Transcriptomic information within each metacell is combined (average or sum).

**4.** Metacell data are used for the downstream analyses instead of the large-scale single-cell data.
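
The four steps above can be illustrated with a minimal base-R sketch on simulated data. This is not SuperCell's implementation: the toy matrix, all variable names, and the use of hierarchical clustering as a stand-in for community detection on a single-cell network are assumptions made only to keep the example self-contained.

```r
set.seed(42)

# Toy data (hypothetical): 20 genes x 200 cells from 4 groups with shifted means.
n_genes <- 20
n_cells <- 200
group   <- rep(1:4, each = n_cells / 4)
shift   <- matrix(rnorm(n_genes * 4, sd = 3), nrow = n_genes)
expr    <- shift[, group] + matrix(rnorm(n_genes * n_cells), nrow = n_genes)
dimnames(expr) <- list(paste0("gene", seq_len(n_genes)),
                       paste0("cell", seq_len(n_cells)))

# Step 1: cell-to-cell similarity in a reduced transcriptomic space (PCA).
pcs    <- prcomp(t(expr), center = TRUE, scale. = TRUE)$x[, 1:5]
cell_d <- dist(pcs)

# Step 2: merge highly similar cells. Hierarchical clustering is a stand-in
# here for community detection on a single-cell network.
gamma      <- 10                  # graining level
n_meta     <- n_cells / gamma     # 20 metacells
membership <- cutree(hclust(cell_d, method = "average"), k = n_meta)

# Step 3: combine expression within each metacell (sum, then average).
metacell_sum  <- t(rowsum(t(expr), group = membership))
metacell_size <- as.vector(table(membership))
metacell_mean <- sweep(metacell_sum, 2, metacell_size, "/")

# Step 4: downstream analyses start from the genes x metacells matrix.
dim(metacell_mean)
```

Because metacells differ in size, keeping `metacell_size` alongside the averaged matrix allows later analyses to weight each metacell by the number of cells it contains.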

```{r echo=FALSE}
knitr::include_graphics("plots/Fig1A.png")
```

Unlike clustering, metacell construction does not aim to identify large groups of cells that comprehensively capture biological concepts, such as cell types, but to merge cells that share highly similar profiles and may carry redundant information. **Therefore, metacells represent a compromise structure that optimally removes redundant information in scRNA-seq data while preserving the biologically relevant heterogeneity.**

An important concept when building metacells is the **graining level** (*γ*), which we define as the ratio between the number of single cells in the initial data and the number of metacells. We suggest using *γ* between $10$ and $50$, which significantly reduces the computational resources needed for downstream analyses while preserving most of the results of the initial (i.e., single-cell) analyses.
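
As a worked example of this definition, the short sketch below computes the number of metacells implied by a given graining level. The helper name `n_metacells` and the dataset size of 50,000 cells are hypothetical and chosen only for illustration.

```r
# Graining level: gamma = (number of single cells) / (number of metacells),
# so the number of metacells is n_cells / gamma.
n_metacells <- function(n_cells, gamma) {
  stopifnot(n_cells > 0, gamma >= 1)
  round(n_cells / gamma)
}

n_cells <- 50000  # hypothetical dataset size
n_metacells(n_cells, gamma = c(10, 50))
# [1] 5000 1000
```

At the suggested range of *γ*, a dataset of this size is therefore reduced to between 1,000 and 5,000 metacells.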


## Installation

SuperCell requires [igraph](https://cran.r-project.org/web/packages/igraph/index.html), [RANN](https://cran.r-project.org/web/packages/RANN/index.html), [WeightedCluster](https://cran.r-project.org/web/packages/WeightedCluster/index.html), [corpcor](https://cran.r-project.org/web/packages/corpcor/index.html), [weights](https://cran.r-project.org/web/packages/weights/index.html), [Hmisc](https://cran.r-project.org/web/packages/Hmisc/index.html), [Matrix](https://cran.r-project.org/web/packages/Matrix/index.html), [matrixStats](https://cran.rstudio.com/web/packages/matrixStats/index.html), [plyr](https://cran.r-project.org/web/packages/plyr/index.html), [irlba](https://cran.r-project.org/web/packages/irlba/index.html), 
[grDevices](https://stat.ethz.ch/R-manual/R-devel/library/grDevices/html/00Index.html),
[patchwork](https://cran.r-project.org/web/packages/patchwork/index.html),
and [ggplot2](https://cloud.r-project.org/web/packages/ggplot2/index.html).
RNA velocity analysis additionally uses [velocyto.R](https://github.com/velocyto-team/velocyto.R).



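```{r install imported packages, eval=FALSE}
# CRAN packages listed as requirements above
# (grDevices ships with base R and needs no installation)
install.packages("igraph")
install.packages("RANN")
install.packages("WeightedCluster")
install.packages("corpcor")
install.packages("weights")
install.packages("Hmisc")
install.packages("Matrix")
install.packages("matrixStats")
install.packages("patchwork")
install.packages("plyr")
install.packages("irlba")
install.packages("ggplot2")
```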
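
To avoid reinstalling packages that are already present, one option is to check first which requirements are missing. The sketch below is only a convenience suggestion: the helper `missing_packages` is hypothetical, and the package list is copied from the requirements named above.

```r
# CRAN packages named as requirements in the text above.
deps <- c("igraph", "RANN", "WeightedCluster", "corpcor", "weights", "Hmisc",
          "Matrix", "matrixStats", "plyr", "irlba", "patchwork", "ggplot2")

# Return the subset of `pkgs` whose namespace cannot be loaded.
# Note: requireNamespace() loads the namespace of each available package.
missing_packages <- function(pkgs) {
  is_installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!is_installed]
}

to_install <- missing_packages(deps)
if (length(to_install) > 0) {
  message("Missing: ", paste(to_install, collapse = ", "))
  # install.packages(to_install)  # uncomment to install from CRAN
}
```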

Install the SuperCell package from GitHub:

```{r library, eval=FALSE}
if (!requireNamespace("remotes")) install.packages("remotes")
remotes::install_github("GfellerLab/SuperCell")

library(SuperCell)
```

## Examples

1. [Building and analyzing metacells with SuperCell](./vignettes/a_SuperCell.Rmd)
2. [RNA velocity applied to SuperCell object](./vignettes/c_RNAvelocity_for_SuperCell.Rmd)
3. [Building metacells with SuperCell and analyzing them with a standard Seurat pipeline](https://github.com/GfellerLab/SIB_workshop/blob/main/workbooks/Workbook_1__cancer_cell_lines.md)
4. [Data integration of metacells built with SuperCell](https://github.com/GfellerLab/SIB_workshop/blob/main/workbooks/Workbook_2__COVID19_integration.md)

## [License](./LICENSE)

SuperCell is developed by the group of David Gfeller at the University of Lausanne.

SuperCell can be used freely by academic groups for non-commercial purposes under the Creative Commons Attribution-NonCommercial-NoDerivs 4.0 International License (CC BY-NC-ND 4.0) (see [license](./LICENSE)). 
The product is provided free of charge, and, therefore, on an "as is" basis, without warranty of any kind.

FOR-PROFIT USERS

If you plan to use SuperCell or any data provided with the script in any for-profit application, you are required to obtain a separate license. 
To do so, please contact Nadette Bulgin [nbulgin\@licr.org](mailto:nbulgin@licr.org) at the Ludwig Institute for Cancer Research Ltd.

If required, FOR-PROFIT USERS are also expected to have proper licenses for the tools used in SuperCell, including the R packages igraph, RANN, WeightedCluster, corpcor, weights, Hmisc, Matrix, plyr, irlba, grDevices, patchwork, ggplot2 and velocyto.R.

For scientific questions, please contact Mariia Bilous ([mariia.bilous\@unil.ch](mailto:mariia.bilous@unil.ch)) or David Gfeller ([David.Gfeller\@unil.ch](mailto:David.Gfeller@unil.ch)).

## How to cite

If you use SuperCell in a publication, please cite: [Bilous et al. Metacells untangle large and complex single-cell transcriptome networks, BMC Bioinformatics (2022).](https://doi.org/10.1186/s12859-022-04861-1)

Owner

  • Name: Computational Cancer Biology Lab
  • Login: GfellerLab
  • Kind: organization
  • Location: Lausanne

Lab of David Gfeller at the University of Lausanne

GitHub Events

Total
  • Watch event: 17
  • Issue comment event: 6
  • Pull request event: 1
  • Fork event: 1
  • Create event: 1
Last Year
  • Watch event: 17
  • Issue comment event: 6
  • Pull request event: 1
  • Fork event: 1
  • Create event: 1

Issues and Pull Requests

Last synced: 10 months ago

All Time
  • Total issues: 0
  • Total pull requests: 1
  • Average time to close issues: N/A
  • Average time to close pull requests: N/A
  • Total issue authors: 0
  • Total pull request authors: 1
  • Average comments per issue: 0
  • Average comments per pull request: 0.0
  • Merged pull requests: 0
  • Bot issues: 0
  • Bot pull requests: 0
Past Year
  • Issues: 0
  • Pull requests: 1
  • Average time to close issues: N/A
  • Average time to close pull requests: N/A
  • Issue authors: 0
  • Pull request authors: 1
  • Average comments per issue: 0
  • Average comments per pull request: 0.0
  • Merged pull requests: 0
  • Bot issues: 0
  • Bot pull requests: 0
Top Authors
Issue Authors
  • AngryMaciek (3)
  • Phoenix12580 (1)
Pull Request Authors
  • leonardHerault (4)
  • mariiabilous (1)
  • pormr (1)

Packages

  • Total packages: 1
  • Total downloads:
    • cran 545 last-month
  • Total docker downloads: 48
  • Total dependent packages: 0
  • Total dependent repositories: 0
  • Total versions: 2
  • Total maintainers: 1
cran.r-project.org: SuperCell

Simplification of scRNA-Seq Data by Merging Together Similar Cells

  • Versions: 2
  • Dependent Packages: 0
  • Dependent Repositories: 0
  • Downloads: 545 Last month
  • Docker Downloads: 48
Rankings
Dependent packages count: 28.3%
Dependent repos count: 34.9%
Average: 49.9%
Downloads: 86.7%
Maintainers (1)
Last synced: 10 months ago

Dependencies

.github/workflows/R-CMD-check.yaml actions
  • actions/checkout v3 composite
  • actions/upload-artifact main composite
  • r-lib/actions/check-r-package v2 composite
  • r-lib/actions/setup-pandoc v2 composite
  • r-lib/actions/setup-r v2 composite
  • r-lib/actions/setup-r-dependencies v2 composite
.github/workflows/check-release.yaml actions
  • actions/checkout v3 composite
  • actions/upload-artifact main composite
  • r-lib/actions/check-r-package v2 composite
  • r-lib/actions/setup-r v2 composite
  • r-lib/actions/setup-r-dependencies v2 composite
DESCRIPTION cran
  • R >= 3.5.0 depends
  • Hmisc * imports
  • Matrix * imports
  • RANN * imports
  • Rtsne * imports
  • WeightedCluster * imports
  • bluster * imports
  • corpcor * imports
  • cowplot * imports
  • dbscan * imports
  • entropy * imports
  • ggplot2 * imports
  • grDevices * imports
  • gtools * imports
  • igraph * imports
  • irlba * imports
  • matrixStats * imports
  • methods * imports
  • patchwork * imports
  • plotfunctions * imports
  • plyr * imports
  • proxy * imports
  • rlang * imports
  • scales * imports
  • umap * imports
  • weights * imports
  • Seurat * suggests
  • SingleCellExperiment * suggests
  • SummarizedExperiment * suggests
  • knitr * suggests
  • remotes * suggests
  • rmarkdown * suggests
  • scater * suggests
  • testthat >= 3.0.0 suggests
  • velocyto.R * suggests