https://github.com/broadinstitute/tangram

Spatial alignment of single cell transcriptomic data.

Science Score: 54.0%

This score indicates how likely this project is to be science-related based on various indicators:

○
CITATION.cff file
✓
codemeta.json file
Found codemeta.json file
○
.zenodo.json file
✓
DOI references
Found 3 DOI reference(s) in README
✓
Academic publication links
Links to: biorxiv.org, nature.com
✓
Committers with academic emails
3 of 17 committers (17.6%) from academic institutions
✓
Institutional organization owner
Organization broadinstitute has institutional domain (www.broadinstitute.org)
○
JOSS paper metadata
○
Scientific vocabulary similarity
Low similarity (13.5%) to scientific vocabulary

Keywords

computational-biology gene-expression scrna-seq snrna-seq spatial-data visium

Last synced: 6 months ago

Repository

Spatial alignment of single cell transcriptomic data.

Basic Info

Host: GitHub
Owner: broadinstitute
License: bsd-3-clause
Language: Jupyter Notebook
Default Branch: master
Homepage:
Size: 133 MB

Statistics

Stars: 320
Watchers: 12
Forks: 60
Open Issues: 58
Releases: 9

Topics

computational-biology gene-expression scrna-seq snrna-seq spatial-data visium

Created over 5 years ago · Last pushed 8 months ago

Metadata Files

Readme License

Tangram is a Python package, written in PyTorch and based on scanpy, for mapping single-cell (or single-nucleus) gene expression data onto spatial gene expression data. The single-cell dataset and the spatial dataset should be collected from the same anatomical region/tissue type, ideally from a biological replicate, and need to share a set of genes. Tangram aligns the single-cell data in space by fitting gene expression on the shared genes. The best way to familiarize yourself with Tangram is to check out our tutorial and our documentation. If you don't use squidpy yet, check out our previous tutorial.

[Figure: Tangram overview]

How to install Tangram

To install Tangram, make sure you have PyTorch and scanpy installed. For more details on the dependencies, see the environment.yml file.

Set up a conda environment for Tangram:

    conda env create -f environment.yml

Install tangram-sc from the shell:

    conda activate tangram-env
    pip install tangram-sc

To start using Tangram, import tangram in your Jupyter notebooks and/or scripts:

    import tangram as tg

Two ways to run Tangram

How to run Tangram at cell level

Load your spatial data and your single cell data (which should be in AnnData format), and pre-process them using tg.pp_adatas:

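    import scanpy as sc
    import tangram as tg

    ad_sp = sc.read_h5ad(path)  # path to the spatial dataset
    ad_sc = sc.read_h5ad(path)  # path to the single-cell dataset
    tg.pp_adatas(ad_sc, ad_sp, genes=None)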

The function pp_adatas finds the genes shared by ad_sc and ad_sp and saves them in the .uns of both AnnData objects for later mapping and analysis. It also restricts the shared genes to the set of training genes passed via genes. If genes=None, Tangram maps using all genes shared by the two datasets. Once the datasets are pre-processed, we can map:
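The gene bookkeeping described here can be pictured with a small standard-library sketch. It is not Tangram's implementation: the helper name select_training_genes and the toy gene lists are assumptions made for illustration, and details such as case handling or filtering of all-zero genes are left out.

```python
def select_training_genes(sc_genes, sp_genes, genes=None):
    """Return (overlap, training) gene lists, preserving single-cell order."""
    sp_set = set(sp_genes)
    # Genes measured in both datasets.
    overlap = [g for g in sc_genes if g in sp_set]
    if genes is None:
        # No marker list supplied: train on every shared gene.
        return overlap, list(overlap)
    wanted = set(genes)
    # Keep only the requested genes that are actually shared.
    training = [g for g in overlap if g in wanted]
    return overlap, training


overlap, training = select_training_genes(
    ["Gad1", "Slc17a7", "Pvalb", "Sst"],   # single-cell genes (toy data)
    ["Slc17a7", "Sst", "Gad1", "Vip"],     # spatial genes (toy data)
    genes=["Sst", "Vip", "Gad1"],          # requested training genes
)
print(overlap)
print(training)
```

In this toy case Vip is requested but absent from the single-cell data, so it is silently dropped from the training set.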

    ad_map = tg.map_cells_to_space(ad_sc, ad_sp)

The returned AnnData, ad_map, is a cell-by-voxel structure where ad_map.X[i, j] gives the probability for cell i to be in voxel j. This structure can be used to project gene expression from the single-cell data to space, which is done with tg.project_genes:
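To make the cell-by-voxel reading concrete, the sketch below turns unconstrained scores into a row-wise probability matrix with a softmax, so that each cell's row sums to 1 over voxels. The softmax step and the toy scores are assumptions for illustration; the text only states that ad_map.X[i, j] is a probability.

```python
import math


def row_softmax(scores):
    """Convert each row of raw scores into probabilities that sum to 1."""
    probs = []
    for row in scores:
        m = max(row)  # subtract the row maximum for numerical stability
        exps = [math.exp(v - m) for v in row]
        total = sum(exps)
        probs.append([e / total for e in exps])
    return probs


# 2 cells x 3 voxels of made-up scores.
M = row_softmax([[2.0, 0.0, 0.0], [0.0, 0.0, 0.0]])

# Most likely voxel for each cell (ties resolve to the first voxel).
best_voxel = [row.index(max(row)) for row in M]
```

The second cell has identical scores everywhere, so its probability mass is spread evenly across the three voxels.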

    ad_ge = tg.project_genes(ad_map, ad_sc)

The returned ad_ge is a voxel-by-gene AnnData, similar to the spatial data ad_sp, but with gene expression projected from the single cells. This makes it possible to extend gene throughput, or to correct for dropouts, when the single-cell data have higher quality (or more genes) than the spatial data. It can also be used to transfer cell types onto space.
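One way to read such a projection is as a matrix product: the voxel-by-gene estimate is the transpose of the cell-by-voxel map times the cell-by-gene expression. The sketch below assumes that reading and uses plain lists; project is a hypothetical helper, not tg.project_genes.

```python
def project(M, S):
    """Return the voxel-by-gene matrix M^T S.

    M: cells x voxels mapping probabilities.
    S: cells x genes expression values.
    """
    n_cells, n_voxels, n_genes = len(M), len(M[0]), len(S[0])
    out = [[0.0] * n_genes for _ in range(n_voxels)]
    for c in range(n_cells):
        for v in range(n_voxels):
            for g in range(n_genes):
                # Each cell contributes its expression, weighted by the
                # probability that it sits in voxel v.
                out[v][g] += M[c][v] * S[c][g]
    return out


M = [[1.0, 0.0], [0.5, 0.5]]   # 2 cells x 2 voxels (toy data)
S = [[4.0, 0.0], [0.0, 2.0]]   # 2 cells x 2 genes (toy data)
projected = project(M, S)
```

Because the second cell is split evenly between both voxels, its gene contributes to each voxel at half weight.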

How to run Tangram at cluster level

To train faster and use less memory, Tangram mapping can be done at the cell-cluster level. This modification was introduced by Sten Linnarsson.

Prepare the input data as you would for cell-level Tangram mapping. Then map using the following code:

    ad_map = tg.map_cells_to_space(
        ad_sc,
        ad_sp,
        mode='clusters',
        cluster_label='subclass_label',
    )

The provided cluster_label must be a column of ad_sc.obs. The example above maps at the 'subclass_label' level, and 'subclass_label' is a column of ad_sc.obs.
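Cluster-level mapping works on one expression profile per cluster instead of one per cell. A minimal sketch of that aggregation follows; summing within each cluster is an assumption chosen for illustration, and cluster_profiles is a hypothetical helper rather than part of Tangram's API.

```python
def cluster_profiles(S, labels):
    """Aggregate a cells x genes matrix into one summed profile per cluster."""
    profiles = {}
    for row, label in zip(S, labels):
        # Create the accumulator the first time a cluster is seen.
        acc = profiles.setdefault(label, [0.0] * len(row))
        for g, value in enumerate(row):
            acc[g] += value
    return profiles


S = [[1.0, 0.0], [2.0, 1.0], [0.0, 5.0]]   # 3 cells x 2 genes (toy data)
labels = ["L2/3", "L2/3", "Pvalb"]         # one cluster label per cell
profiles = cluster_profiles(S, labels)
```

Three cells collapse into two profiles here, which is where the savings in memory and training time come from.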

To project gene expression to space, use tg.project_genes and be sure to set the cluster_label argument to the same cluster label that was used for mapping.

    ad_ge = tg.project_genes(
        ad_map,
        ad_sc,
        cluster_label='subclass_label',
    )

How to run Tangram with refinements to improve consistency

To improve the consistency of Tangram mappings, additional refinement terms can be added to the mapping. This modification was introduced by the research group Data Science in Systems Biology at the Technical University of Munich (lead: Markus List).

Prepare the input data as you would for cell-level Tangram mapping. Then map with the same function but with additional regularization parameters:

    ad_map = tg.map_cells_to_space(
        ad_sc,
        ad_sp,
        mode='cells',
        cluster_label='subclass_label',
        lambda_r=2.95e-9,
        lambda_l2=1.00e-18,
        lambda_neighborhood_g1=0.96,
        lambda_ct_islands=0.17,
        lambda_getis_ord=0.71,
    )

If any of lambda_neighborhood_g1, lambda_ct_islands and lambda_getis_ord is non-zero, spatial information must be provided in ad_sc.obsm['spatial']. If lambda_ct_islands is non-zero, cluster_label must also be provided. The example above maps at the 'subclass_label' level, and 'subclass_label' is a column of ad_sc.obs.
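These two requirements are easy to check before starting a long mapping run. The function below is a hypothetical pre-flight check written against plain Python containers: obs_columns and obsm_keys stand in for the column names of ad_sc.obs and the keys of ad_sc.obsm, and none of it is part of Tangram.

```python
def check_refinement_inputs(obs_columns, obsm_keys, cluster_label=None,
                            lambda_neighborhood_g1=0.0,
                            lambda_ct_islands=0.0,
                            lambda_getis_ord=0.0):
    """Return a list of problems; an empty list means the inputs look consistent."""
    problems = []
    spatial_terms = (lambda_neighborhood_g1, lambda_ct_islands, lambda_getis_ord)
    # Any non-zero spatial term requires coordinates under the 'spatial' key.
    if any(v != 0 for v in spatial_terms) and "spatial" not in obsm_keys:
        problems.append("spatial coordinates missing from obsm['spatial']")
    # The cell-type-islands term additionally requires a valid cluster label.
    if lambda_ct_islands != 0:
        if cluster_label is None:
            problems.append("lambda_ct_islands is non-zero but cluster_label is not set")
        elif cluster_label not in obs_columns:
            problems.append("cluster_label %r is not a column of obs" % cluster_label)
    return problems


ok = check_refinement_inputs(
    ["subclass_label"], ["spatial"],
    cluster_label="subclass_label",
    lambda_ct_islands=0.17,
)
missing = check_refinement_inputs([], [], lambda_ct_islands=0.17)
```

With a real dataset, the first two arguments could be filled from list(ad_sc.obs.columns) and list(ad_sc.obsm.keys()).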

Details on these regularization parameters can be found at https://www.biorxiv.org/content/10.1101/2025.01.27.634996v1.

To project gene expression to space, use tg.project_genes and be sure to set the cluster_label argument to the same cluster label that was used for mapping.

    ad_ge = tg.project_genes(
        ad_map,
        ad_sc,
        cluster_label='subclass_label',
    )

How Tangram works under the hood

Tangram instantiates a Mapper object, passing the following arguments:
- S: single-cell matrix with shape cells-by-genes. Here the gene dimension is the number of training genes.
- G: spatial data matrix with shape voxels-by-genes. A voxel can contain multiple cells.

Then, Tangram searches for a mapping matrix M, with shape cells-by-voxels, where the element M_ij is the probability that cell i is in voxel (spot) j. Tangram computes M by maximizing an objective built on cos_sim, the cosine similarity: the gene expression of the mapped single cells should be as similar as possible to the spatial data G in the cosine-similarity sense.
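The objective can be sketched numerically. The version below sums, over genes, the cosine similarity between each column of the projected expression M^T S and the matching column of G. This per-gene form is an assumption consistent with the description here, not a transcription of Tangram's loss, and it takes M as cells-by-voxels so that M^T S is voxels-by-genes.

```python
import math


def cos_sim(a, b):
    """Cosine similarity of two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    if na == 0.0 or nb == 0.0:
        return 0.0  # convention chosen here for all-zero vectors
    return dot / (na * nb)


def objective(M, S, G):
    """Sum over genes of cos_sim between projected and measured expression."""
    n_cells, n_voxels, n_genes = len(S), len(G), len(G[0])
    # Projected expression: the voxel-by-gene matrix M^T S.
    pred = [[sum(M[c][v] * S[c][g] for c in range(n_cells))
             for g in range(n_genes)]
            for v in range(n_voxels)]
    # One similarity per gene, comparing its profile across voxels.
    return sum(
        cos_sim([pred[v][g] for v in range(n_voxels)],
                [G[v][g] for v in range(n_voxels)])
        for g in range(n_genes)
    )


S = [[1.0, 0.0], [0.0, 1.0]]   # 2 cells x 2 genes (toy data)
G = [[2.0, 0.0], [0.0, 3.0]]   # 2 voxels x 2 genes (toy data)
good = objective([[1.0, 0.0], [0.0, 1.0]], S, G)   # cell 0 in voxel 0, cell 1 in voxel 1
bad = objective([[0.0, 1.0], [1.0, 0.0]], S, G)    # the two cells swapped
```

The correct assignment scores higher than the swapped one even though G is on a different scale than S, because cosine similarity ignores magnitude.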

The above accounts for basic Tangram usage. In our manuscript, we modified the loss function in several ways to add various kinds of prior knowledge, such as the number of cells contained in each voxel.

Frequently Asked Questions

Do I need a GPU for running Tangram?

Mapping with cluster mode is fine on a standard laptop. For mapping at single cell level, GPU is not required but is recommended. We run most of our mappings on a single P100 which maps ~50k cells in a few minutes.
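A small helper can choose the device at run time. The sketch below relies only on torch.cuda.is_available() and falls back to the CPU when PyTorch or a GPU is missing; pick_device is a hypothetical name, and passing the result through a device argument of tg.map_cells_to_space is shown as a comment and is an assumption about the installed Tangram version.

```python
def pick_device():
    """Return 'cuda:0' when a CUDA GPU is usable, otherwise 'cpu'."""
    try:
        import torch
    except ImportError:
        # PyTorch is not installed in this environment.
        return "cpu"
    return "cuda:0" if torch.cuda.is_available() else "cpu"


device = pick_device()
print(device)

# Assumed usage, not executed here:
# ad_map = tg.map_cells_to_space(ad_sc, ad_sp, device=device)
```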

How do I choose a list of training genes?

A good way to start is to use the top 1k unique marker genes, stratified across cell types, as training genes. Alternatively, you can map using the whole transcriptome. Ideally, training genes should carry high-quality signal: if most training genes are rich in dropouts or were measured with poor RNA probes, your mapping will not be accurate.
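Stratifying markers across cell types can be sketched as taking the same number of top-ranked genes from every cell type and keeping each gene once. The ranked lists below are toy data, and top_markers is a hypothetical helper; in practice the rankings would come from a differential-expression analysis of the single-cell data.

```python
def top_markers(ranked_by_type, n_per_type):
    """Take the top n markers from each cell type and drop duplicates."""
    selected, seen = [], set()
    for cell_type in sorted(ranked_by_type):   # sorted for a stable order
        for gene in ranked_by_type[cell_type][:n_per_type]:
            if gene not in seen:
                seen.add(gene)
                selected.append(gene)
    return selected


ranked = {
    "Pvalb": ["Pvalb", "Gad1", "Syt2"],   # best marker first (toy data)
    "Sst": ["Sst", "Gad1", "Npy"],
}
markers = top_markers(ranked, 2)
```

Gad1 ranks highly for both cell types but appears only once in the result, so the final list can be shorter than n_per_type times the number of cell types.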

Do I need cell segmentation for mapping on Visium data?

You do not need to segment cells in your histology for mapping on spatial transcriptomics data (including Visium and Slide-seq). You do, however, need cell segmentation if you wish to deconvolve the data (i.e., deterministically assign a single-cell profile to each cell within a spatial voxel).

I run out of memory when I map: what should I do?

Split your spatial data into several parts and map each part separately. If that is not sufficient, you will need to downsample your single-cell data as well.
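Splitting the spatial data can be as simple as partitioning the voxel indices. The sketch below produces contiguous, nearly equal chunks; split_indices is a hypothetical helper, and cutting by index rather than by spatial coordinates is an assumption made to keep the example short.

```python
def split_indices(n_voxels, n_parts):
    """Split range(n_voxels) into n_parts contiguous, nearly equal chunks."""
    if n_parts < 1:
        raise ValueError("n_parts must be at least 1")
    base, extra = divmod(n_voxels, n_parts)
    chunks, start = [], 0
    for p in range(n_parts):
        # The first `extra` chunks receive one additional voxel.
        size = base + (1 if p < extra else 0)
        chunks.append(list(range(start, start + size)))
        start += size
    return chunks


parts = split_indices(10, 3)

# Assumed usage, not executed here: subset the spatial AnnData per chunk,
# e.g. ad_sp[chunk, :], and map each subset separately.
```

Keep in mind that when parts are mapped separately, each mapping only sees the voxels in its own part.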

How to cite Tangram

Tangram was released in the following publication:

Biancalani* T., Scalia* G. et al. - Deep learning and alignment of spatially-resolved whole transcriptomes of single cells in the mouse brain with Tangram. Nature Methods 18, 1352–1362 (2021)

The refinement strategies for Tangram were released in the following publication:

Stahl* M., Straßer* L. et al. - Refinement Strategies for Tangram for Reliable Single-Cell to Spatial Mapping. bioRxiv

If you have questions, please contact the authors of the method:
- Tommaso Biancalani - biancalt@gene.com
- Gabriele Scalia - gabriele.scalia@roche.com

PyPI maintainers:
- Hejin Huang - huang.hejin@gene.com
- Shreya Gaddam - gaddams@gene.com
- Tommaso Biancalani - biancalt@gene.com
- Ziqing Lu - luz21@gene.com

The artwork has been curated by:
- Anna Hupalowska - ahupalow@broadinstitute.org

Owner

Name: Broad Institute
Login: broadinstitute
Kind: organization
Location: Cambridge, MA

Website: http://www.broadinstitute.org/
Twitter: broadinstitute
Repositories: 1,083
Profile: https://github.com/broadinstitute

Broad Institute of MIT and Harvard

GitHub Events

Total

Issues event: 9
Watch event: 56
Issue comment event: 16
Push event: 3
Pull request event: 2
Fork event: 11

Last Year

Issues event: 9
Watch event: 56
Issue comment event: 16
Push event: 3
Pull request event: 2
Fork event: 11

Committers

Last synced: 9 months ago

All Time

Total Commits: 270
Total Committers: 17
Avg Commits per committer: 15.882
Development Distribution Score (DDS): 0.407

Past Year

Commits: 0
Committers: 0
Avg Commits per committer: 0.0
Development Distribution Score (DDS): 0.0

Top Committers

Name	Email	Commits
ziqlu0722	r**2@g**m	160
Tommaso Biancalani	t**l@b**g	22
gaddams	s**m@b**g	17
Tommaso Biancalani	t**l@t**6	13
gscalia	g**a@g**m	12
gaddamshreya	g**s@g**m	9
Tommaso Biancalani	t**l@p**m	9
Gaddam	g**s@n**m	7
Hejin Huang	h****3	6
Tommaso Biancalani	t**l@t**u	5
hejinhuang	h**0@g**m	2
Tommaso Biancalani	t**l@q**p	2
Gaddam	g**s@n**m	2
Gaddam	g**s@n**m	1
Tommaso Biancalani	t**l@Q**l	1
Tommaso Biancalani	t**l@b**o	1
Hejin0701	9****1	1

Committer Domains (Top 20 + Academic)

broadinstitute.org: 2
nl004.eth.ghpc1.sc1.roche.com: 1
ng047.eth.ghpc1.sc1.roche.com: 1
nl003.eth.ghpc1.sc1.roche.com: 1
gene.com: 1

Issues and Pull Requests

Last synced: 6 months ago

All Time

Total issues: 95
Total pull requests: 36
Average time to close issues: 4 months
Average time to close pull requests: 12 days
Total issue authors: 77
Total pull request authors: 10
Average comments per issue: 2.12
Average comments per pull request: 0.06
Merged pull requests: 25
Bot issues: 0
Bot pull requests: 0

Past Year

Issues: 10
Pull requests: 3
Average time to close issues: N/A
Average time to close pull requests: 4 months
Issue authors: 9
Pull request authors: 2
Average comments per issue: 0.4
Average comments per pull request: 0.0
Merged pull requests: 1
Bot issues: 0
Bot pull requests: 0


Top Authors

Issue Authors

kguion1 (5)
giovp (4)
Lena926 (3)
abhishekmaj08 (3)
simonmfr (3)
jayypaul (2)
KunHHE (2)
Nick-Eagles (2)
asmlgkj (2)
quentinblampey (2)
pengjini (1)
chandraS76 (1)
peizyi (1)
VivianQM (1)
boyangzhang1993 (1)

Pull Request Authors

ziqlu0722 (23)
lewlin (3)
almaan (2)
merlestahl (2)
whatever60 (1)
alexanderchang1 (1)
HelloWorldLTY (1)
gaddamshreya (1)
anupriyatripathi (1)
giovp (1)

Top Labels

Issue Labels

Pull Request Labels

Packages

Total packages: 3
Total downloads:
- pypi 1,326 last-month

Total dependent packages: 5
(may contain duplicates)
Total dependent repositories: 9
(may contain duplicates)
Total versions: 13
Total maintainers: 2

proxy.golang.org: github.com/broadinstitute/Tangram

Documentation: https://pkg.go.dev/github.com/broadinstitute/Tangram#section-documentation
License: bsd-3-clause
Latest release: v0.2.1
published about 5 years ago

Versions: 2
Dependent Packages: 0
Dependent Repositories: 0

Rankings

Dependent packages count: 5.4%

Average: 5.6%

Dependent repos count: 5.8%

Last synced: 6 months ago

proxy.golang.org: github.com/broadinstitute/tangram

Documentation: https://pkg.go.dev/github.com/broadinstitute/tangram#section-documentation
License: bsd-3-clause
Latest release: v0.2.1
published about 5 years ago

Versions: 2
Dependent Packages: 0
Dependent Repositories: 0

Rankings

Dependent packages count: 5.4%

Average: 5.6%

Dependent repos count: 5.8%

Last synced: 6 months ago

pypi.org: tangram-sc

Spatial alignment of single cell transcriptomic data.

Homepage: https://github.com/broadinstitute/Tangram
Documentation: https://tangram-sc.readthedocs.io/
License: bsd-3-clause
Latest release: 1.0.4
published about 3 years ago

Versions: 9
Dependent Packages: 5
Dependent Repositories: 9
Downloads: 1,326 Last month
Docker Downloads: 0

Rankings

Docker downloads count: 1.4%

Dependent repos count: 4.9%

Stargazers count: 5.2%

Forks count: 6.7%

Average: 6.7%

Dependent packages count: 7.3%

Downloads: 14.8%

Maintainers (2)

tbiancalani ziqlu

Last synced: 6 months ago
