spatopic
An R package for fast topic inference to identify tissue architecture in multiplexed images
Repository
Basic Info
- Host: GitHub
- Owner: xiyupeng
- License: gpl-3.0
- Language: R
- Default Branch: main
- Homepage: https://xiyupeng.github.io/SpaTopic/
- Size: 67.2 MB
Statistics
- Stars: 11
- Watchers: 1
- Forks: 2
- Open Issues: 0
- Releases: 3
Metadata Files
README.md
SpaTopic (SpatialTopic)
An R package for fast topic inference to identify tissue architecture in multiplexed images. It implements a novel spatial topic model to identify highly interpretable immunologic topics across multiple multiplexed images, simply given the cell location and cell type information as input.
In the R package, we adapt an approach originally developed for image segmentation in computer vision, incorporating spatial information into the flexible design of regions (image partitions, analogous to documents in language modeling). We further refined the approach to address unique challenges in cellular images and provide an efficient C++ implementation of the algorithm in this R package.
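The "regions as documents" analogy can be illustrated with a deliberately simplified sketch. The code below uses a fixed square grid in place of the flexible, spatially informed regions that the model infers, and it runs on simulated cells. The `grid_size` value, the coordinates, and the cell types are all illustrative assumptions, and this is not the algorithm implemented in the package.

``` r
set.seed(1)

# Simulated cells: coordinates and cell types are made up for illustration.
n_cells <- 500
cells <- data.frame(
  X = runif(n_cells, 0, 2000),
  Y = runif(n_cells, 0, 2000),
  type = sample(c("B", "CD4 T", "Macrophage", "Basal"), n_cells, replace = TRUE)
)

# Assumption: a fixed grid stands in for the model's flexible regions.
grid_size <- 400
cells$region <- paste(floor(cells$X / grid_size),
                      floor(cells$Y / grid_size), sep = "_")

# Region-by-cell-type counts, analogous to a document-term matrix.
region_counts <- table(cells$region, cells$type)
head(region_counts)
```

Each row of `region_counts` plays the role of a document and each cell type the role of a word, which is the kind of input a standard topic model works on.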
Compared to other KNN-based methods (such as KNN-kmeans, the default neighborhood analysis in the Seurat v5 R package), SpaTopic runs much faster on large-scale image datasets with minimal memory usage. For example, on the Nanostring CosMx NSCLC dataset, SpaTopic can spatially cluster 100,000 cells on a single image within 1 minute on a standard MacBook Air (see the tutorial).
News: SpatialTopic has now been published in Nature Communications:
Peng, X., Smithy, J.W., Yosofvand, M. et al. Scalable topic modelling decodes spatial tissue architecture for large-scale multiplexed imaging analysis. Nat Commun 16, 6619 (2025). https://doi.org/10.1038/s41467-025-61821-y
A tutorial is available to help you get started:
https://xiyupeng.github.io/SpaTopic/articles/SpaTopic.html
Installation
The R package SpaTopic is now available on CRAN and can be installed with the following code.
``` r
install.packages("SpaTopic")
```
The development version of SpaTopic can be installed from the GitHub repository.
``` r
install.packages("devtools")
devtools::install_github("xiyupeng/SpaTopic")
```
Dependency
SpaTopic depends on the following R packages:
- Rcpp for C++ code
- RcppProgress for C++ code
- RcppArmadillo for C++ code
- RANN for fast KNN
- foreach for parallel computing
- sf for spatial analysis
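`install.packages()` normally resolves these dependencies automatically, so the following is only a troubleshooting sketch. It checks which of the packages listed above can be loaded in the current R library; the `missing_deps` name is illustrative.

``` r
# Packages named in the dependency list.
deps <- c("Rcpp", "RcppProgress", "RcppArmadillo", "RANN", "foreach", "sf")

# requireNamespace() returns TRUE when a package can be loaded,
# without attaching it to the search path.
installed <- vapply(deps, requireNamespace, logical(1), quietly = TRUE)

missing_deps <- deps[!installed]
if (length(missing_deps) > 0) {
  message("Missing packages: ", paste(missing_deps, collapse = ", "))
  # install.packages(missing_deps)  # uncomment to install from CRAN
}
```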
Usage
The required input of SpaTopic is a data frame containing the cells of a single image, or a list of data frames for multiple images. Each data frame consists of four columns: the image ID, the X and Y cell coordinates, and the cell type.
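As a minimal sketch of that input shape, the code below builds a small data frame and the corresponding list for two images using base R only. The column names `image`, `X`, `Y`, and `type` follow the `lung5` example data; the coordinates and cell types are simulated, and whether the package requires these exact names or this column order is an assumption not verified here.

``` r
set.seed(42)

# Hypothetical cells from two images; coordinates and types are simulated.
n <- 6
cells <- data.frame(
  image = rep(c("image1", "image2"), each = n / 2),
  X = round(runif(n, 0, 5000), 3),
  Y = round(runif(n, 0, 5000), 3),
  type = c("Dendritic", "Macrophage", "CD4 T", "B", "Macrophage", "Basal"),
  stringsAsFactors = FALSE
)

# Single-image input: one data frame with columns image, X, Y, type.
single_image <- cells[cells$image == "image1", ]

# Multi-image input: a list with one data frame per image.
multi_image <- split(cells, cells$image)

str(multi_image, max.level = 1)
```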
``` r
library(SpaTopic)
packageVersion("SpaTopic")
#> [1] '1.2.0'
library(sf)
## The input can be a data frame or a list of data frames
data("lung5")
head(lung5)
#>      image        X        Y           type
#> 1_1 image1 4215.889 158847.7      Dendritic
#> 2_1 image1 6092.889 158834.7     Macrophage
#> 3_1 image1 7214.889 158843.7 Neuroendocrine
#> 4_1 image1 7418.889 158813.7     Macrophage
#> 5_1 image1 7446.889 158845.7     Macrophage
#> 6_1 image1 3254.889 158838.7          CD4 T
gibbs.res <- SpaTopic_inference(lung5, ntopics = 7, sigma = 50, region_radius = 400)
```
``` r
print(gibbs.res)
#> SpaTopic Results
#> ----------------
#> Number of topics: 7
#> Perplexity: 11.31563
#>
#> Topic Content(Topic distribution across cell types):
#>                                 topic1       topic2       topic3      topic4
#> Alveolar Epithelial Type 1 0.035870295 6.511503e-03 4.541367e-06 0.026643327
#> Alveolar Epithelial Type 2 0.025386476 3.553900e-02 4.541367e-06 0.017427665
#> Artery                     0.007545591 2.624548e-06 9.128148e-04 0.001856373
#> B                          0.018581190 5.800251e-04 3.446035e-01 0.015203195
#> Basal                      0.025846292 7.753466e-01 1.730261e-03 0.089193312
#>                                  topic5       topic6      topic7
#> Alveolar Epithelial Type 1 2.987411e-06 6.348481e-06 0.005341969
#> Alveolar Epithelial Type 2 2.987411e-06 6.348481e-06 0.006994451
#> Artery                     5.320579e-03 2.044846e-02 0.006041096
#> B                          1.912242e-02 3.434528e-03 0.018434717
#> Basal                      4.902342e-03 6.348481e-06 0.006549552
#> ...
#>
#> Use $Z.trace for posterior probabilities of topic assignments for each cell
#> Use $cell_topics for final topic assignments for each cell
#> Use $parameters for accessing model parameters
```
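One way to read the topic content is to ask which cell type carries the most weight in each topic. The sketch below does this for the five rows shown in the printed output, typed in by hand as a plain matrix, so it runs without the package. Because the printed output is truncated, the ranking covers only these five cell types and may differ from a ranking over the full matrix. The name `top_type` is illustrative.

``` r
# First five rows of the printed topic content, entered by hand.
topic_content <- matrix(
  c(0.035870295, 6.511503e-03, 4.541367e-06, 0.026643327, 2.987411e-06, 6.348481e-06, 0.005341969,
    0.025386476, 3.553900e-02, 4.541367e-06, 0.017427665, 2.987411e-06, 6.348481e-06, 0.006994451,
    0.007545591, 2.624548e-06, 9.128148e-04, 0.001856373, 5.320579e-03, 2.044846e-02, 0.006041096,
    0.018581190, 5.800251e-04, 3.446035e-01, 0.015203195, 1.912242e-02, 3.434528e-03, 0.018434717,
    0.025846292, 7.753466e-01, 1.730261e-03, 0.089193312, 4.902342e-03, 6.348481e-06, 0.006549552),
  nrow = 5, byrow = TRUE,
  dimnames = list(
    c("Alveolar Epithelial Type 1", "Alveolar Epithelial Type 2",
      "Artery", "B", "Basal"),
    paste0("topic", 1:7)
  )
)

# For each topic (column), the cell type with the largest weight
# among these five rows.
top_type <- rownames(topic_content)[apply(topic_content, 2, which.max)]
names(top_type) <- colnames(topic_content)
top_type
```

Among these rows, Basal cells dominate topic 2 and B cells dominate topic 3, which matches the large entries (0.775 and 0.345) visible in the printed table.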
For detailed usage of SpaTopic, please check:
- The package home page, which hosts the tutorial: https://xiyupeng.github.io/SpaTopic/
- A step-by-step tutorial to reproduce the results on the CosMx lung cancer sample is available here.
Example Data
The example image used in the tutorial can be downloaded from here. It is stored in a Seurat v5 object.
Example Output
The algorithm generates two key statistics for further analysis:
1. topic content, a spatially-resolved topic distribution over cell types, and
2. topic assignment for each cell within images.
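Per-cell topic assignments are often summarized as the share of each topic within each image. The sketch below shows that summary on simulated stand-ins: the `image` and `topic` vectors are made up, and how `$cell_topics` actually encodes assignments (for example, as integer labels) is an assumption not verified here.

``` r
set.seed(7)

# Simulated stand-ins: image labels and one topic label (1..7) per cell.
image <- rep(c("image1", "image2"), times = c(300, 200))
topic <- sample(1:7, length(image), replace = TRUE)

# Cells per topic within each image, then row-wise proportions.
counts <- table(image = image, topic = factor(topic, levels = 1:7))
props <- prop.table(counts, margin = 1)
round(props, 2)
```

Fixing the factor levels to `1:7` keeps a column for every topic even if a topic happens to be absent from an image.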
Topic Spatial Distribution over images
Topic Content
Citation
Peng, X., Smithy, J.W., Yosofvand, M. et al. Scalable topic modelling decodes spatial tissue architecture for large-scale multiplexed imaging analysis. Nat Commun 16, 6619 (2025). https://doi.org/10.1038/s41467-025-61821-y
Preprint: Xiyu Peng, James W. Smithy, Mohammad Yosofvand, Caroline E. Kostrzewa, MaryLena Bleile, Fiona D. Ehrich, Jasme Lee, Michael A. Postow, Margaret K. Callahan, Katherine S. Panageas, Ronglai Shen. Decoding Spatial Tissue Architecture: A Scalable Bayesian Topic Model for Multiplexed Imaging Analysis. bioRxiv. doi: https://doi.org/10.1101/2024.10.08.617293
Contact
If you have any problems, please contact:
Xiyu Peng (pansypeng124@gmail.com, pengx@stat.tamu.edu)
Owner
- Name: Xiyu
- Login: xiyupeng
- Kind: user
- Location: New York
- Company: Memorial Sloan Kettering Cancer Center
- Website: https://sites.google.com/view/xiyupeng/home
- Repositories: 1
- Profile: https://github.com/xiyupeng
interested in bioinformatics and computational biology
GitHub Events
Total
- Create event: 1
- Release event: 1
- Issues event: 5
- Watch event: 6
- Issue comment event: 2
- Push event: 27
- Fork event: 2
Last Year
- Create event: 1
- Release event: 1
- Issues event: 5
- Watch event: 6
- Issue comment event: 2
- Push event: 27
- Fork event: 2
Issues and Pull Requests
Last synced: 10 months ago
All Time
- Total issues: 5
- Total pull requests: 0
- Average time to close issues: 3 months
- Average time to close pull requests: N/A
- Total issue authors: 3
- Total pull request authors: 0
- Average comments per issue: 0.0
- Average comments per pull request: 0
- Merged pull requests: 0
- Bot issues: 0
- Bot pull requests: 0
Past Year
- Issues: 3
- Pull requests: 0
- Average time to close issues: 2 months
- Average time to close pull requests: N/A
- Issue authors: 3
- Pull request authors: 0
- Average comments per issue: 0.0
- Average comments per pull request: 0
- Merged pull requests: 0
- Bot issues: 0
- Bot pull requests: 0
Top Authors
Issue Authors
- xiyupeng (3)
- wzhangwhu (1)
- syparkmd (1)
Pull Request Authors
Top Labels
Issue Labels
Pull Request Labels
Packages
- Total packages: 1
-
Total downloads:
- cran 179 last-month
- Total dependent packages: 0
- Total dependent repositories: 0
- Total versions: 3
- Total maintainers: 1
cran.r-project.org: SpaTopic
Topic Inference to Identify Tissue Architecture in Multiplexed Images
- Homepage: https://github.com/xiyupeng/SpaTopic
- Documentation: http://cran.r-project.org/web/packages/SpaTopic/SpaTopic.pdf
- License: GPL (≥ 3)
-
Latest release: 1.2.0
published over 1 year ago