saccelerator

https://github.com/spatialhackathon/saccelerator

Science Score: 67.0%

This score indicates how likely this project is to be science-related based on various indicators:

✓ CITATION.cff file: Found CITATION.cff file
✓ codemeta.json file: Found codemeta.json file
✓ .zenodo.json file: Found .zenodo.json file
✓ DOI references: Found 3 DOI reference(s) in README
✓ Academic publication links: Links to: biorxiv.org
○ Academic email domains
○ Institutional organization owner
○ JOSS paper metadata
○ Scientific vocabulary similarity: Low similarity (13.7%) to scientific vocabulary

Last synced: 10 months ago

Repository

Basic Info

Host: GitHub
Owner: SpatialHackathon
License: mit-0
Language: Python
Default Branch: native
Homepage: https://spatialhackathon.github.io/SACCELERATOR/
Size: 4.37 MB

Statistics

Stars: 23
Watchers: 3
Forks: 5
Open Issues: 62
Releases: 0

Created over 2 years ago · Last pushed 11 months ago

Metadata Files

Readme, Contributing, License, Code of conduct, Citation

SACCELERATOR - a flexible framework for applying spatially aware clustering methods

Spatial omics have transformed tissue architecture and cellular heterogeneity analysis by integrating molecular data with spatial localization. In spatially resolved transcriptomics, identifying spatial domains is critical for analysis of anatomical regions within heterogeneous datasets and understanding tissue function. Since 2020, more than 50 spatially aware clustering methods have been developed for this task. However, the reliability of existing benchmarks is undermined by their narrow focus on Visium and brain tissue datasets, as well as the dependence on questionable ground truth annotations. Here, we implemented a consensus framework that surpasses traditional benchmarking practices.

Our framework comprises a community-driven benchmark-like platform that streamlines data formatting, method integration, and metric evaluation while accommodating new methods and datasets. Currently, the platform includes 22 spatially aware clustering methods across 15 datasets spanning 9 technologies and diverse tissue types. The benchmark approach uncovered significant limitations in generalizability and reproducibility where methods that perform well on healthy tissues often falter on cancer samples. We also found that anatomical labels commonly used as ground truths are often biased, potentially error-prone, and in some cases, unsuitable for benchmarking efforts.

In light of these issues, we adopt a flexible, consensus-driven approach with an expert in the loop. This goes beyond traditional ensemble/consensus methods and allows researchers to interact with intermediate results to determine which tools should be used to generate a consensus. We believe that including an expert in the loop is critical to ensure that the computational analysis matches the biological question at hand. We also believe that when the goal of the analysis is to uncover novel biology, tissue experts are accessible more often than not.
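To make the idea of an expert choosing which results feed a consensus more concrete, here is a minimal Python sketch. It is not the consensus procedure used by this framework: it uses a generic co-association (evidence accumulation) scheme, and the method names, the toy labels, and the 0.5 threshold are all assumptions made for illustration.

```python
from itertools import combinations


def co_association(labelings):
    """Fraction of labelings in which each pair of observations co-clusters."""
    n = len(labelings[0])
    if any(len(labels) != n for labels in labelings):
        raise ValueError("all labelings must cover the same observations")
    agreement = {}
    for i, j in combinations(range(n), 2):
        together = sum(labels[i] == labels[j] for labels in labelings)
        agreement[(i, j)] = together / len(labelings)
    return agreement


def consensus_labels(labelings, threshold=0.5):
    """Link pairs that co-cluster in more than `threshold` of the labelings,
    then give each connected component its own label (union-find)."""
    n = len(labelings[0])
    parent = list(range(n))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for (i, j), score in co_association(labelings).items():
        if score > threshold:
            parent[find(j)] = find(i)

    roots = {}
    return [roots.setdefault(find(i), len(roots)) for i in range(n)]


# Hypothetical outputs of four methods on six spots (illustrative only).
results = {
    "method_a": [0, 0, 0, 1, 1, 1],
    "method_b": [1, 1, 1, 0, 0, 0],  # same partition as method_a, renamed
    "method_c": [0, 0, 1, 1, 2, 2],
    "method_d": [0, 1, 0, 1, 0, 1],  # disagrees with the others
}

# The expert inspects the results and keeps only the methods they trust.
selected = ["method_a", "method_b", "method_c"]
consensus = consensus_labels([results[name] for name in selected])
print(consensus)
```

Because the scheme only asks whether two observations share a cluster, it does not depend on how each method numbers its clusters. Connected components can chain weakly linked groups together, so a real analysis would likely use a more robust step to turn the co-association scores into domains.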

General setup

This framework provides, and allows users to contribute, "modules" written in their preferred programming language (as long as that is either R or Python). A module is a set of scripts that implements one item in one of the following categories: a dataset, a computational method, or an evaluation metric. Interfaces between the categories enable seamless integration of new data, methods, or metrics, making the framework extensible and community-driven.

Modules

This repository contains some templates and examples of how to implement your module so that it interfaces seamlessly with other modules in the workflow. For example, if you want to implement a new method, you do not need to worry about input data or evaluation metrics as long as you follow the template for reading input and writing output. If you adhere to the input and output guidelines, your method should interface with our default data modules and default evaluation metric modules.
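As a rough illustration of what following a template for reading input and writing output can look like, here is a minimal Python sketch of a method module. The flag names (`--coordinates`, `--n_clusters`, `--out_dir`), the input column names, the output file name `domains.tsv`, and the placeholder clustering are assumptions made for this example, not the repository's actual interface. The real templates are the authoritative source for arguments and file formats.

```python
import argparse
import csv
from pathlib import Path


def read_coordinates(path):
    """Read a tab-separated file with columns id, x, y (assumed layout)."""
    with open(path, newline="") as handle:
        reader = csv.DictReader(handle, delimiter="\t")
        return [(row["id"], float(row["x"]), float(row["y"])) for row in reader]


def assign_domains(spots, n_clusters):
    """Placeholder 'method': split spots into n_clusters equal-width bands along x.

    A real module would call a spatially aware clustering method here.
    """
    xs = [x for _, x, _ in spots]
    lo, hi = min(xs), max(xs)
    width = (hi - lo) / n_clusters or 1.0  # avoid a zero width if all x are equal
    labels = {}
    for spot_id, x, _ in spots:
        labels[spot_id] = min(int((x - lo) / width), n_clusters - 1)
    return labels


def write_domains(labels, out_dir):
    """Write one label per observation to <out_dir>/domains.tsv (assumed name)."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    out_path = out_dir / "domains.tsv"
    with open(out_path, "w", newline="") as handle:
        writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
        writer.writerow(["id", "label"])
        for spot_id, label in labels.items():
            writer.writerow([spot_id, label])
    return out_path


def main(argv=None):
    """Parse the (hypothetical) command-line interface and run the module."""
    parser = argparse.ArgumentParser(description="Hypothetical method module")
    parser.add_argument("--coordinates", required=True)
    parser.add_argument("--n_clusters", type=int, required=True)
    parser.add_argument("--out_dir", required=True)
    args = parser.parse_args(argv)
    spots = read_coordinates(args.coordinates)
    labels = assign_domains(spots, args.n_clusters)
    return write_domains(labels, args.out_dir)
```

Calling `main(["--coordinates", "coordinates.tsv", "--n_clusters", "2", "--out_dir", "out"])` runs the sketch on a local file; a script version would call `main()` at the bottom of the file. The point of the design is that only `assign_domains` changes between methods, while reading and writing stay fixed so that data and metric modules can rely on them.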

The existing modules are:

- data (currently 28)
  - LIBD Visium DLPFC dataset (4 samples, each with 3 replicates)
  - SEAADdata
  - STARmap-2018-mouse-cortex
  - STARmapplus
  - abcatlaswmbthalamus
  - cosmxliver
  - cosmxlung
  - her2st-breast-cancer
  - locuscoeruleus
  - merfishdevheart
  - mousebrainsagittalanterior
  - mousebrainsagittalposterior
  - mousekidneycoronal
  - osmfishSsp
  - pachtersimulation
  - slideseq2olfactorybulb
  - sotipsimulation
  - spatialDLPFC
  - stereoseqdevelopingDrosophilaembryoslarvae
  - stereoseqliver
  - stereoseqmouseembryo
  - stereoseqolfactorybulb
  - visiumbreastcancerSEDR
  - visiumchickenheart
  - visiumhdcancercolon
  - xenium-breast-cancer
  - xenium-mouse-brain-SergioSalas
- methods (currently 24)
  - BANKSY
  - BayesSpace
  - CellCharter
  - DRSC
  - DeepST
  - Giotto
  - GraphST
  - SCAN-IT
  - SCMEB
  - SEDR
  - SOTIP
  - STAGATE
  - SpaceFlow
  - SpiceMix
  - bass
  - conST
  - maple
  - meringue
  - precast
  - scanpy
  - seurat
  - spaGCN
  - spatialGE
  - stardust
- evaluation metrics (currently 17)
  - ARI
  - CHAOS
  - Calinski-Harabasz
  - Completeness
  - Davies-Bouldin
  - Entropy
  - FMI
  - Homogeneity
  - LISI
  - MCC
  - NMI
  - PAS
  - SpatialARI
  - Vmeasure
  - cluster-specific-silhouette
  - domain-specific-f1
  - jaccard
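To give a sense of what an evaluation metric module computes, the following sketch implements the adjusted Rand index (ARI), the first metric in the list above, using only the Python standard library. It follows the standard pair-counting definition of ARI and is not taken from the repository's ARI module; the function name and the toy labels are illustrative.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_true, labels_pred):
    """Adjusted Rand index between two labelings of the same observations."""
    if len(labels_true) != len(labels_pred):
        raise ValueError("labelings must have the same length")
    n = len(labels_true)

    # Contingency counts: how many observations share each (true, pred) pair.
    pair_counts = Counter(zip(labels_true, labels_pred))
    true_counts = Counter(labels_true)
    pred_counts = Counter(labels_pred)

    # Number of observation pairs placed together by both, or by each labeling.
    sum_pairs = sum(comb(c, 2) for c in pair_counts.values())
    sum_true = sum(comb(c, 2) for c in true_counts.values())
    sum_pred = sum(comb(c, 2) for c in pred_counts.values())
    total_pairs = comb(n, 2)

    # Agreement expected by chance, and the maximum attainable agreement.
    expected = sum_true * sum_pred / total_pairs if total_pairs else 0.0
    max_index = (sum_true + sum_pred) / 2
    if max_index == expected:
        # Degenerate case, e.g. both labelings are one cluster or all singletons.
        return 1.0
    return (sum_pairs - expected) / (max_index - expected)


# Toy domain labels for four spots (illustrative only).
annotation = ["L1", "L1", "L2", "L2"]
same_partition = [1, 1, 0, 0]   # identical grouping, different names
crossed = [0, 1, 0, 1]          # splits every annotated domain

print(adjusted_rand_index(annotation, same_partition))
print(adjusted_rand_index(annotation, crossed))
```

ARI depends only on which observations are grouped together, so renaming clusters does not change the score. It does require a reference labeling, which is why the quality of the ground truth annotations matters for metrics of this kind.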

Contributions

Our workflow is set up to allow everyone to contribute "modules", whether it is a dataset, a computational method, or an evaluation metric.

Please refer to our Contribution guide and the Module documentation for more details.

Contributing and Code of Conduct

Read our Contributing Guide and Code of Conduct.

Contributors

Jieran S., Søren Helweg Dam, niklasmueboe, peicai, kbiharie, heylf, Nav, pakiessling, Qirong Mao, zaira seferbekova, Mark Robinson, sebastiantiesmeyer, Liya Zaygerman, Meghan Turner, Anastasiia Okhtienko, Brian Long, Giorgia Moranzoni, Tom Chartrand, alam-shahul, Sven Twardziok

Citation

If you are using SACCELERATOR, please cite:

Sun, J. et al. Beyond benchmarking: an expert-guided consensus approach to spatially aware clustering. bioRxiv https://doi.org/10.1101/2025.06.23.660861 (2025).

@article {saccelerator2025,
  author = {Sun, Jieran and Biharie, Kirti and Cai, Peiying and M{\"u}ller-B{\"o}tticher, Niklas and Kiessling, Paul and Turner, Meghan A. and Dam, S{\o}ren H. and Heyl, Florian and Kathirchelvan, Sarusan and Emons, Martin and Gunz, Samuel and Twardziok, Sven and El-Heliebi, Amin and Zacharias, Martin and SpaceHack 2.0 participants and Eils, Roland and Reinders, Marcel and Gottardo, Raphael and Kuppe, Christoph and Long, Brian and Mahfouz, Ahmed and Robinson, Mark D. and Ishaque, Naveed},
  title = {Beyond benchmarking: an expert-guided consensus approach to spatially aware clustering},
  year = {2025},
  doi = {10.1101/2025.06.23.660861},
  publisher = {Cold Spring Harbor Laboratory},
  URL = {https://www.biorxiv.org/content/early/2025/06/27/2025.06.23.660861},
  journal = {bioRxiv}
}

License

We have adopted the "MIT No Attribution" (MIT-0) License. It is currently attributed to the "SpaceHack organizers", but please also make sure to add your name to your contributions. More on MIT-0 here

Owner

Name: SpaceHack
Login: SpatialHackathon
Kind: organization

Website: https://spatialhackathon.github.io/
Repositories: 3
Profile: https://github.com/SpatialHackathon

Bringing people together to address challenges in analysis of spatial transcriptomics data

Citation (CITATION.cff)

# This CITATION.cff file was generated with cffinit.
# Visit https://bit.ly/cffinit to generate yours today!

cff-version: 1.2.0
title: >-
  SpaceHack 2.0: an expert in the loop consensus driven
  framework for spatially aware clustering
message: >-
  If you use this software, please cite it using the
  metadata from this file.
type: software
authors:
  - name: SpaceHack 2.0. Participants
repository-code: 'https://github.com/SpatialHackathon/SpaceHack2023'
url: 'https://spatialhackathon.github.io/past.html'
abstract: >-
  Spatial omics have transformed tissue architecture and
  cellular heterogeneity analysis by integrating molecular
  data with spatial localization. In spatially resolved
  transcriptomics, identifying spatial domains is critical
  for analysis of anatomical regions within heterogeneous
  datasets and understanding tissue function. Since 2020,
  more than 50 spatially aware clustering methods have been
  developed for this task. However, the reliability of
  existing benchmarks is undermined by their narrow focus on
  Visium and brain tissue datasets, as well as the
  dependence on questionable ground truth annotations. Here,
  we implemented a consensus framework that surpasses
  traditional benchmarking practices.


  Our framework comprises a community-driven benchmark-like
  platform that streamlines data formatting, method
  integration, and metric evaluation while accommodating new
  methods and datasets. Currently, the platform includes 22
  spatially aware clustering methods across 15 datasets
  spanning 9 technologies and diverse tissue types. The
  benchmark approach uncovered significant limitations in
  generalizability and reproducibility where methods that
  perform well on healthy tissues often falter on cancer
  samples. We also found that anatomical labels commonly
  used as ground truths are often biased, potentially
  error-prone, and in some cases, unsuitable for
  benchmarking efforts.


  In light of these issues, we adopt a flexible
  expert-in-the-loop consensus-driven approach. This goes
  beyond traditional ensemble/consensus methods, and allows
  researchers to interact with intermediate results to
  determine which tools should be used to generate a
  consensus. We believe that the inclusion of an
  expert-in-the-loop is critical to ensure that the
  computational analysis matches the biological question at
  hand, and we believe that when the focus of the analysis
  is to uncover novel biological discoveries, tissue
  experts are accessible more often than not.
license: MIT-0

GitHub Events

Total

Watch event: 2
Issue comment event: 4
Push event: 12
Pull request review comment event: 1
Pull request review event: 2
Pull request event: 4

Last Year

Watch event: 2
Issue comment event: 4
Push event: 12
Pull request review comment event: 1
Pull request review event: 2
Pull request event: 4

Dependencies

data/STARmap-2018-mouse-cortex/environment.yml conda

anndata 0.10.3.*
gdown 4.7.1.*
pandas 2.1.4.*
scipy 1.11.4.*

data/xenium-mouse-brain-SergioSalas/environment.yml conda

anndata 0.10.3.*
gdown 4.7.1.*
pandas 2.1.4.*
scipy 1.11.4.*
