SpotClean

R package for decontaminating the spot swapping effect and recovering true expression in spatial transcriptomics data

https://github.com/zijianni/spotclean

Science Score: 46.0%

This score indicates how likely this project is to be science-related based on various indicators:

○ CITATION.cff file
✓ codemeta.json file: Found codemeta.json file
○ .zenodo.json file
✓ DOI references: Found 2 DOI reference(s) in README
✓ Academic publication links: Links to nature.com
✓ Committers with academic emails: 2 of 5 committers (40.0%) from academic institutions
○ Institutional organization owner
○ JOSS paper metadata
○ Scientific vocabulary similarity: Low similarity (7.4%) to scientific vocabulary

Keywords

rna-seq spatial-transcriptomics

Last synced: 6 months ago

Repository

Basic Info

Host: GitHub
Owner: zijianni
Language: R
Default Branch: master
Homepage:
Size: 4.01 MB

Statistics

Stars: 32
Watchers: 2
Forks: 10
Open Issues: 11
Releases: 1

Topics

rna-seq spatial-transcriptomics

Created almost 5 years ago · Last pushed over 1 year ago

Metadata Files

Readme

SpotClean: a computational method to adjust for spot swapping in spatial transcriptomics data

SpotClean is a computational method to adjust for spot swapping in spatial transcriptomics data. Recent spatial transcriptomics experiments utilize slides containing thousands of spots with spot-specific barcodes that bind mRNA. Ideally, unique molecular identifiers at a spot measure spot-specific expression, but this is often not the case due to bleed from nearby spots, an artifact we refer to as spot swapping. SpotClean estimates the contamination rate in observed data and decontaminates the spot swapping effect, thereby increasing the sensitivity and precision of downstream analyses.

Introduction

Spatial transcriptomics (ST), named Method of the Year 2020 by Nature Methods, is a powerful and widely used experimental method for profiling genome-wide gene expression across a tissue. In a typical ST experiment, fresh-frozen (or FFPE) tissue is sectioned and placed onto a slide containing spots, with each spot containing millions of capture oligonucleotides with spatial barcodes unique to that spot. The tissue is imaged, typically via Hematoxylin and Eosin (H&E) staining. Following imaging, the tissue is permeabilized to release mRNA, which then binds to the capture oligonucleotides, generating a cDNA library consisting of transcripts bound by barcodes that preserve spatial information. Data from an ST experiment consist of the tissue image coupled with RNA-sequencing data collected from each spot. A first step in processing ST data is tissue detection, where spots on the slide containing tissue are distinguished from background spots without tissue. Unique molecular identifier (UMI) counts at each spot containing tissue are then used in downstream analyses.

Ideally, a gene-specific UMI at a given spot would represent expression of that gene at that spot, and spots without tissue would show no (or few) UMIs. This is not the case in practice. Messenger RNA bleed from nearby spots causes substantial contamination of UMI counts, an artifact we refer to as spot swapping. On average, we observe that more than 30% of UMIs at a tissue spot did not originate from this spot, but from other spots contaminating it. Spot swapping confounds downstream inferences including normalization, marker gene-based annotation, differential expression and cell type decomposition.
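To make the idea of spot swapping concrete, here is a toy simulation in base R. It is only an illustration under stated assumptions and not SpotClean's actual statistical model: the one-dimensional slide, the `stay_rate` of 0.7, and the Gaussian bandwidth of 1.5 are all hypothetical values chosen for the example. It shows how bleeding puts UMIs into background spots and how a per-spot contamination fraction can be defined.

```r
# Toy 1-D "slide": spots 1-5 carry tissue, spots 6-10 are background.
n_spots   <- 10
is_tissue <- c(rep(TRUE, 5), rep(FALSE, 5))
true_umi  <- ifelse(is_tissue, 1000, 0)  # UMIs that truly originate at each spot

# Hypothetical bleeding scheme (an assumption for illustration):
# each spot keeps `stay_rate` of its UMIs and spreads the rest over
# all spots with Gaussian distance weights.
stay_rate <- 0.7
pos       <- seq_len(n_spots)
kernel    <- exp(-outer(pos, pos, "-")^2 / (2 * 1.5^2))
kernel    <- kernel / rowSums(kernel)    # each row sums to 1

# origin[i, j]: expected UMIs originating at spot i but observed at spot j
origin   <- true_umi * (stay_rate * diag(n_spots) + (1 - stay_rate) * kernel)
observed <- colSums(origin)

# Fraction of observed UMIs at each tissue spot that came from other spots
contamination <- 1 - diag(origin)[is_tissue] / observed[is_tissue]

round(observed, 1)       # background spots are no longer empty
round(contamination, 3)  # per-tissue-spot contamination fraction
```

In this toy setup the total number of UMIs is conserved; bleeding only redistributes them, which is why background spots near the tissue boundary pick up nonzero counts.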

We developed SpotClean to adjust for the effects of spot swapping in ST experiments. SpotClean measures the per-spot contamination rates in observed data and decontaminates gene expression levels, thereby increasing the sensitivity and precision of downstream analyses. Our package SpotClean is built around 10x Visium spatial transcriptomics experiments, currently the most widely used commercial protocol. It provides functions to load raw spatial transcriptomics data from 10x Space Ranger outputs, decontaminate the spot swapping effect, estimate contamination levels, visualize expression profiles and spot labels on the slide, and connect with other widely used packages for further analyses. SpotClean can potentially be extended to other spatial transcriptomics data as long as the gene expression data in both tissue and background regions are available.

Installation

Install the GitHub version:

```{r}
if (!requireNamespace("devtools", quietly = TRUE)) install.packages("devtools")

devtools::install_github("zijianni/SpotClean", build_manual = TRUE, build_vignettes = TRUE)
```

Install the Bioconductor version:

```{r}
if (!requireNamespace("BiocManager", quietly = TRUE)) install.packages("BiocManager")

BiocManager::install("SpotClean")
```

Load package after installation:

```{r}
library(SpotClean)
```

Tutorial

After installing the package, access the vignette by running

```{r}
vignette("SpotClean")
```
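As a rough orientation before opening the vignette, the workflow described in the Introduction (load Space Ranger output, build a slide object, decontaminate, hand off to another package) might look like the sketch below. The wrapper `run_spotclean_sketch()` and its two path arguments are hypothetical and exist only for this example. The SpotClean function and argument names are written as they are believed to appear in the package vignette and should be checked against the help pages (for example `?read10xSlide`) before use. The file names inside the `spatial` folder are also an assumption and can differ between Space Ranger versions.

```r
# Hypothetical wrapper, for illustration only. Nothing from SpotClean is
# evaluated until the function is called, so defining it does not require
# the package to be installed.
run_spotclean_sketch <- function(raw_dir, spatial_dir) {
  # Raw count matrix covering both tissue and background spots
  raw_counts <- SpotClean::read10xRaw(raw_dir)

  # Spot positions, tissue image and scale factors (file names assumed)
  slide_info <- SpotClean::read10xSlide(
    tissue_csv_file   = file.path(spatial_dir, "tissue_positions_list.csv"),
    tissue_img_file   = file.path(spatial_dir, "tissue_lowres_image.png"),
    scale_factor_file = file.path(spatial_dir, "scalefactors_json.json")
  )

  # Combine counts and slide metadata into one slide object
  slide_obj <- SpotClean::createSlide(count_mat = raw_counts,
                                      slide_info = slide_info)

  # Estimate contamination and return decontaminated expression
  decont_obj <- SpotClean::spotclean(slide_obj)

  # Optional hand-off to Seurat for downstream analysis
  seurat_obj <- SpotClean::convertToSeurat(decont_obj, image_dir = spatial_dir)

  list(slide = slide_obj, decontaminated = decont_obj, seurat = seurat_obj)
}
```

A call would look like `run_spotclean_sketch("path/to/raw_feature_bc_matrix", "path/to/spatial")`, where both paths are placeholders for one sample's Space Ranger output folders.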

Citation

We would appreciate it if you cite our work when using SpotClean:

Ni, Z., Prasad, A., Chen, S. et al. SpotClean adjusts for spot swapping in spatial transcriptomics data. Nat Commun 13, 2971 (2022). https://doi.org/10.1038/s41467-022-30587-y

A BibTeX entry for LaTeX users can be found by running

```{r}
citation("SpotClean")
```

Owner

Name: Zijian Ni
Login: zijianni
Kind: user
Location: Seattle, WA
Company: Amazon

Repositories: 2
Profile: https://github.com/zijianni

Applied Scientist @ Amazon Prime ML

GitHub Events

Total

Issues event: 5
Watch event: 10
Issue comment event: 2

Last Year

Issues event: 5
Watch event: 10
Issue comment event: 2

Committers

Last synced: over 2 years ago

All Time

Total Commits: 104
Total Committers: 5
Avg Commits per committer: 20.8
Development Distribution Score (DDS): 0.452

Past Year

Commits: 5
Committers: 2
Avg Commits per committer: 2.5
Development Distribution Score (DDS): 0.4

Top Committers

Name	Email	Commits
Zijian Ni	4**0@q**m	57
Zijian Ni	z**5@w**u	43
J Wokaty	j**y@s**u	2
Hannah A Pliner	h**r@g**m	1
Zijian Ni	4****i	1

Committer Domains (Top 20 + Academic)

sph.cuny.edu: 1
wisc.edu: 1
qq.com: 1

Packages

Total packages: 1
Total downloads:
- bioconductor 7,824 total

Total dependent packages: 0
Total dependent repositories: 0
Total versions: 7
Total maintainers: 1

bioconductor.org: SpotClean

SpotClean adjusts for spot swapping in spatial transcriptomics data

Homepage: https://github.com/zijianni/SpotClean
Documentation: https://bioconductor.org/packages/release/bioc/vignettes/SpotClean/inst/doc/SpotClean.pdf
License: GPL-3
Latest release: 1.10.0
published 10 months ago

Versions: 7
Dependent Packages: 0
Dependent Repositories: 0
Downloads: 7,824 Total

Rankings

Dependent repos count: 0.0%

Dependent packages count: 0.0%

Forks count: 8.4%

Stargazers count: 12.6%

Average: 22.2%

Downloads: 89.9%

Maintainers (1)

zni25@wisc.edu

Last synced: 6 months ago

Dependencies

DESCRIPTION cran

R >= 4.2.0 depends
Matrix * imports
RColorBrewer * imports
S4Vectors * imports
Seurat * imports
SpatialExperiment * imports
SummarizedExperiment * imports
dplyr * imports
ggplot2 * imports
grDevices * imports
grid * imports
methods * imports
readbitmap * imports
rhdf5 * imports
rjson * imports
rlang * imports
stats * imports
tibble * imports
utils * imports
viridis * imports
BiocStyle * suggests
R.utils * suggests
knitr * suggests
rmarkdown * suggests
spelling * suggests
testthat >= 2.1.0 suggests
