SpotClean
R package for decontaminating the spot swapping effect and recovering true expression in spatial transcriptomics data
Science Score: 46.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ○ CITATION.cff file
- ✓ codemeta.json file: found codemeta.json file
- ○ .zenodo.json file
- ✓ DOI references: found 2 DOI reference(s) in README
- ✓ Academic publication links: links to nature.com
- ✓ Committers with academic emails: 2 of 5 committers (40.0%) from academic institutions
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: low similarity (7.4%) to scientific vocabulary
Keywords
Repository
Statistics
- Stars: 32
- Watchers: 2
- Forks: 10
- Open Issues: 11
- Releases: 1
Topics
Metadata Files
README.md
SpotClean: a computational method to adjust for spot swapping in spatial transcriptomics data
SpotClean is a computational method to adjust for spot swapping in spatial transcriptomics data. Recent spatial transcriptomics experiments utilize slides containing thousands of spots with spot-specific barcodes that bind mRNA. Ideally, unique molecular identifiers at a spot measure spot-specific expression, but this is often not the case due to bleed from nearby spots, an artifact we refer to as spot swapping. SpotClean estimates the contamination rate in observed data and decontaminates the spot swapping effect, thus increasing the sensitivity and precision of downstream analyses.
Introduction
Spatial transcriptomics (ST), named Method of the Year 2020 by Nature Methods, is a powerful and widely used experimental method for profiling genome-wide gene expression across a tissue. In a typical ST experiment, fresh-frozen (or FFPE) tissue is sectioned and placed onto a slide containing spots, with each spot containing millions of capture oligonucleotides with spatial barcodes unique to that spot. The tissue is imaged, typically via Hematoxylin and Eosin (H&E) staining. Following imaging, the tissue is permeabilized to release mRNA, which then binds to the capture oligonucleotides, generating a cDNA library consisting of transcripts bound by barcodes that preserve spatial information. Data from an ST experiment consist of the tissue image coupled with RNA-sequencing data collected from each spot. A first step in processing ST data is tissue detection, where spots on the slide containing tissue are distinguished from background spots without tissue. Unique molecular identifier (UMI) counts at each spot containing tissue are then used in downstream analyses.
Ideally, a gene-specific UMI at a given spot would represent expression of that gene at that spot, and spots without tissue would show no (or few) UMIs. This is not the case in practice. Messenger RNA bleed from nearby spots causes substantial contamination of UMI counts, an artifact we refer to as spot swapping. On average, we observe that more than 30% of UMIs at a tissue spot did not originate from this spot, but from other spots contaminating it. Spot swapping confounds downstream inferences including normalization, marker gene-based annotation, differential expression and cell type decomposition.
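To make the idea of spot swapping concrete, here is a toy simulation in base R. It is a minimal sketch and not the SpotClean model: the 10 x 10 grid, the fraction of UMIs that stay in place (`stay`) and the Gaussian bleed width (`sigma`) are hypothetical values chosen only for illustration.

```r
set.seed(1)

# 10 x 10 grid of spots; the central 6 x 6 block is covered by tissue.
grid <- expand.grid(row = 1:10, col = 1:10)
in_tissue <- with(grid, row >= 3 & row <= 8 & col >= 3 & col <= 8)

# True UMI counts for one gene: expressed in tissue, absent in background.
true_counts <- ifelse(in_tissue, rpois(nrow(grid), lambda = 50), 0)

# Hypothetical bleed model: each spot keeps a fraction `stay` of its UMIs
# and spreads the rest over the other spots with Gaussian distance weights.
stay  <- 0.7
sigma <- 1.5
d <- as.matrix(dist(grid[, c("row", "col")]))
kernel <- exp(-d^2 / (2 * sigma^2))
diag(kernel) <- 0
kernel <- kernel / rowSums(kernel)            # each row sums to 1
transfer <- stay * diag(nrow(grid)) + (1 - stay) * kernel

# contrib[i, j] = expected UMIs that originate at spot i and land at spot j.
contrib  <- true_counts * transfer
observed <- colSums(contrib)

# Per-spot contamination rate: share of observed UMIs that came from elsewhere.
contamination_rate <- 1 - diag(contrib)[in_tissue] / observed[in_tissue]

summary(contamination_rate)
sum(observed[!in_tissue])   # UMIs that ended up in background spots
```

In this toy setup the total UMI count is conserved, background spots pick up non-zero counts, and every tissue spot carries some UMIs that originated elsewhere, which is the pattern described above.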
We developed SpotClean to adjust for the effects of spot swapping in ST experiments. SpotClean measures the per-spot contamination rates in observed data and decontaminates gene expression levels, thus increasing the sensitivity and precision of downstream analyses. The SpotClean package is built for 10x Visium spatial transcriptomics experiments, currently the most widely used commercial protocol. It provides functions to load raw spatial transcriptomics data from 10x Space Ranger outputs, decontaminate the spot swapping effect, estimate contamination levels, visualize expression profiles and spot labels on the slide, and connect with other widely used packages for further analyses. SpotClean can potentially be extended to other spatial transcriptomics data as long as gene expression data in both tissue and background regions are available.
Installation
Install the GitHub version:
```{r}
if (!requireNamespace("devtools", quietly = TRUE))
    install.packages("devtools")

devtools::install_github("zijianni/SpotClean", build_manual = TRUE, build_vignettes = TRUE)
```
Install the Bioconductor version:
```{r}
if (!requireNamespace("BiocManager", quietly = TRUE))
    install.packages("BiocManager")

BiocManager::install("SpotClean")
```
Load package after installation:
```{r}
library(SpotClean)
```
Tutorial
After installing the package, access the vignette by running
```{r}
vignette("SpotClean")
```
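For orientation, a typical workflow might look like the sketch below. It assumes that SpotClean and S4Vectors are installed and that the example data bundled with the package (`mbrain_raw` and the `V1_Adult_Mouse_Brain_spatial` files) are present. The function and argument names follow the package vignette at the time of writing and may differ between versions, so treat the vignette as the reference. `maxit = 10` and `candidate_radius = 20` are small illustrative settings, not recommendations.

```r
# Assumption: SpotClean and S4Vectors are installed; otherwise the sketch is skipped.
has_spotclean <- requireNamespace("SpotClean", quietly = TRUE) &&
    requireNamespace("S4Vectors", quietly = TRUE)

if (has_spotclean) {
    library(SpotClean)

    # Example raw count matrix and Space Ranger spatial files shipped with the package
    data(mbrain_raw)
    spatial_dir <- system.file(file.path("extdata", "V1_Adult_Mouse_Brain_spatial"),
                               package = "SpotClean")
    slide_info <- read10xSlide(
        tissue_csv_file   = file.path(spatial_dir, "tissue_positions_list.csv"),
        tissue_img_file   = file.path(spatial_dir, "tissue_lowres_image.png"),
        scale_factor_file = file.path(spatial_dir, "scalefactors_json.json"))

    # Combine counts and slide metadata, then decontaminate
    slide_obj  <- createSlide(count_mat = mbrain_raw, slide_info = slide_info)
    decont_obj <- spotclean(slide_obj, maxit = 10, candidate_radius = 20)

    # Estimated per-spot contamination rates and the slide-level ARC score
    print(head(S4Vectors::metadata(decont_obj)$contamination_rate))
    print(arcScore(slide_obj))
} else {
    message("SpotClean is not installed; see the Installation section.")
}
```

For your own data, the raw count matrix would come from the Space Ranger output rather than from `mbrain_raw`; the vignette describes the loading functions.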
Citation
We appreciate it if you could cite our work when using SpotClean:
Ni, Z., Prasad, A., Chen, S. et al. SpotClean adjusts for spot swapping in spatial transcriptomics data. Nat Commun 13, 2971 (2022). https://doi.org/10.1038/s41467-022-30587-y
A BibTeX entry for LaTeX users can be found by running
```{r}
citation("SpotClean")
```
Owner
- Name: Zijian Ni
- Login: zijianni
- Kind: user
- Location: Seattle, WA
- Company: Amazon
- Repositories: 2
- Profile: https://github.com/zijianni
Applied Scientist @ Amazon Prime ML
GitHub Events
Total
- Issues event: 5
- Watch event: 10
- Issue comment event: 2
Last Year
- Issues event: 5
- Watch event: 10
- Issue comment event: 2
Committers
Last synced: over 2 years ago
Top Committers
| Name | Email | Commits |
|---|---|---|
| Zijian Ni | 4****0@q****m | 57 |
| Zijian Ni | z****5@w****u | 43 |
| J Wokaty | j****y@s****u | 2 |
| Hannah A Pliner | h****r@g****m | 1 |
| Zijian Ni | 4****i | 1 |
Committer Domains (Top 20 + Academic)
Packages
- Total packages: 1
- Total downloads:
  - bioconductor: 7,824 total
- Total dependent packages: 0
- Total dependent repositories: 0
- Total versions: 7
- Total maintainers: 1
bioconductor.org: SpotClean
SpotClean adjusts for spot swapping in spatial transcriptomics data
- Homepage: https://github.com/zijianni/SpotClean
- Documentation: https://bioconductor.org/packages/release/bioc/vignettes/SpotClean/inst/doc/SpotClean.pdf
- License: GPL-3
- Latest release: 1.10.0 (published 10 months ago)
Rankings
Maintainers (1)
Dependencies
- R >= 4.2.0 depends
- Matrix * imports
- RColorBrewer * imports
- S4Vectors * imports
- Seurat * imports
- SpatialExperiment * imports
- SummarizedExperiment * imports
- dplyr * imports
- ggplot2 * imports
- grDevices * imports
- grid * imports
- methods * imports
- readbitmap * imports
- rhdf5 * imports
- rjson * imports
- rlang * imports
- stats * imports
- tibble * imports
- utils * imports
- viridis * imports
- BiocStyle * suggests
- R.utils * suggests
- knitr * suggests
- rmarkdown * suggests
- spelling * suggests
- testthat >= 2.1.0 suggests