digitaldlsorter

digitalDLSorteR: An R package to deconvolute bulk RNA-Seq using scRNA-Seq data

https://github.com/diegommcc/digitaldlsorter

Science Score: 23.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
  • codemeta.json file
  • .zenodo.json file
  • DOI references
    Found 18 DOI reference(s) in README
  • Academic publication links
    Links to: ncbi.nlm.nih.gov, nature.com, frontiersin.org
  • Committers with academic emails
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (14.8%) to scientific vocabulary

Keywords

deconvolution deep-learning rna-seq single-cell
Last synced: 6 months ago

Repository


Basic Info
  • Host: GitHub
  • Owner: diegommcc
  • Language: R
  • Default Branch: master
  • Homepage:
  • Size: 339 MB
Statistics
  • Stars: 6
  • Watchers: 1
  • Forks: 4
  • Open Issues: 0
  • Releases: 0
Topics
deconvolution deep-learning rna-seq single-cell
Created over 5 years ago · Last pushed over 3 years ago
Metadata Files
Readme

README.md

digitalDLSorteR


An R package to deconvolute bulk RNA-seq from scRNA-seq data based on Neural Networks


The digitalDLSorteR R package provides a set of tools to deconvolute cell type proportions of bulk RNA-seq data through the development of context-specific deconvolution models based on single-cell RNA-seq (scRNA-seq) data. These models are able to accurately estimate cell type proportions of bulk RNA-seq samples from specific biological environments. For more details about the algorithm and the functionalities implemented in this package, see Torroja and Sanchez-Cabo, 2019, Mañanes et al., 2024, and https://diegommcc.github.io/digitalDLSorteR/.

Installation

digitalDLSorteR is available on CRAN and can be installed as follows:

```r
install.packages("digitalDLSorteR")
```

The version under development is available on GitHub:

```r
if (!requireNamespace("remotes", quietly = TRUE))
  install.packages("remotes")
remotes::install_github("diegommcc/digitalDLSorteR")
```

The package depends on the tensorflow R package, so a working Python interpreter with the TensorFlow Python library installed is required. The installTFpython function provides an easy way to create a conda environment called digitaldlsorter-env with all the necessary dependencies. We recommend installing the TensorFlow Python library this way, although a custom installation is also possible. See the Keras/TensorFlow installation and configuration article on the package website for more details.

```r
library("digitalDLSorteR")
installTFpython(install.conda = TRUE)
```
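After installation, it can be useful to confirm that the TensorFlow Python module is visible from R. The following is a minimal sketch, not part of digitalDLSorteR: `check_tf_env` is a hypothetical helper name, and it assumes the reticulate package (listed among the package imports) is installed and that the conda environment is named digitaldlsorter-env as described above.

```r
# Hypothetical helper (not part of digitalDLSorteR): reports whether the
# TensorFlow Python module can be found in a given conda environment.
# The default environment name is the one installTFpython() is described
# as creating; change it if a custom installation was used.
check_tf_env <- function(envname = "digitaldlsorter-env") {
  if (!requireNamespace("reticulate", quietly = TRUE)) {
    message("The reticulate package is not installed.")
    return(FALSE)
  }
  ok <- tryCatch({
    # required = TRUE makes reticulate raise an error if the environment
    # cannot be found, instead of silently falling back to another Python.
    reticulate::use_condaenv(envname, required = TRUE)
    reticulate::py_module_available("tensorflow")
  }, error = function(e) {
    message("Could not use environment '", envname, "': ", conditionMessage(e))
    FALSE
  })
  isTRUE(ok)
}
```

Calling `check_tf_env()` in a fresh R session returns `TRUE` only if the environment exists and TensorFlow can be found in it. A fresh session matters because reticulate binds to a single Python interpreter per session.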

Rationale of digitalDLSorteR

The algorithm trains Deep Neural Network (DNN) models on simulated bulk RNA-seq samples whose cell composition is known. These pseudo-bulk RNA-seq samples are generated by aggregating pre-characterized scRNA-seq data from specific biological environments. The resulting models can accurately deconvolute new bulk RNA-seq samples from the same environment because they account for environment-dependent transcriptional changes in specific cells, such as immune cells in complex diseases (e.g., specific subtypes of cancer or atherosclerosis). This addresses a limitation of other methods: in the case of immune cells, for instance, published methods often rely on purified transcriptional profiles from peripheral blood mononuclear cells, even though these cells vary greatly with environmental conditions. This feature, together with the fact that scRNA-seq datasets improve over time (the more cells, the more variability the models learn), should lead to more accurate and comprehensive models.
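The idea of building pseudo-bulk samples with known composition can be illustrated with a small base R sketch. This is not the package's implementation: the toy count matrix, the cell type names, and the `simulate_pseudobulk` helper are all made up for illustration, and the simulation procedure in digitalDLSorteR itself is more elaborate.

```r
set.seed(1)

# Toy scRNA-seq count matrix (invented data): 50 genes x 90 cells,
# with 30 cells from each of three hypothetical cell types.
n_genes <- 50
cell_types <- rep(c("Tcell", "Bcell", "Tumor"), each = 30)
counts <- matrix(
  rpois(n_genes * length(cell_types), lambda = 5),
  nrow = n_genes,
  dimnames = list(paste0("gene", seq_len(n_genes)),
                  paste0("cell", seq_along(cell_types)))
)

# Hypothetical helper: simulate one pseudo-bulk sample by drawing random
# cell type proportions, sampling cells accordingly, and summing counts.
simulate_pseudobulk <- function(counts, cell_types, n_cells = 100) {
  types <- sort(unique(cell_types))
  # Random proportions that sum to 1 (normalized exponential draws).
  props <- rexp(length(types))
  props <- props / sum(props)
  # Turn proportions into an integer number of cells per type.
  n_per_type <- as.vector(rmultinom(1, size = n_cells, prob = props))
  names(n_per_type) <- types
  # Sample cell indices (with replacement) within each cell type.
  picked <- unlist(lapply(types, function(ct) {
    pool <- which(cell_types == ct)
    pool[sample.int(length(pool), n_per_type[[ct]], replace = TRUE)]
  }))
  list(
    expression  = rowSums(counts[, picked, drop = FALSE]),
    proportions = n_per_type / n_cells
  )
}

# Build a small training set: X holds the pseudo-bulk expression profiles
# (samples x genes), Y holds the known proportions (samples x cell types).
sims <- replicate(20, simulate_pseudobulk(counts, cell_types), simplify = FALSE)
X <- t(sapply(sims, `[[`, "expression"))
Y <- t(sapply(sims, `[[`, "proportions"))
```

In this sketch, a model would be trained to predict the rows of `Y` from the rows of `X`. Because the proportions are fixed by construction, every simulated sample comes with its ground truth.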

Usage

The package can be used in two main ways:

  1. Using pre-trained models included in the digitalDLSorteRmodels (https://github.com/diegommcc/digitalDLSorteRmodels) R package to deconvolute new bulk RNA-seq samples from the same environment. So far, the available models can deconvolute samples from human breast cancer (GSE75688 data from Chung et al., 2017 used as reference) and colorectal cancer (GSE132465, GSE132257 and GSE144735 data from Lee, Hong, Etlioglu, Cho et al., 2020 used as reference). For more details about this workflow, please see the Using pre-trained context-specific deconvolution models article. Disclaimer: these models are intended as a quick option for deconvoluting samples from the same biological environment, but we strongly recommend generating new models with data manually curated by the users.
  2. Building new deconvolution models from pre-characterized scRNA-seq datasets. This is the main way to use digitalDLSorteR. For more information on the workflow, see the article Building new deconvolution models.

To use pre-trained context-specific deconvolution models, digitalDLSorteR relies on the digitalDLSorteRmodels data R package. It should therefore be installed from GitHub along with digitalDLSorteR, as follows:

```r
remotes::install_github("diegommcc/digitalDLSorteRmodels")
```

Once digitalDLSorteRmodels is loaded, the pre-trained models are available. See the article Using pre-trained context-specific deconvolution models for an example.

Final remarks

  • Regarding pre-trained models, if you generate new models and want to make them available to other users through the digitalDLSorteRmodels R package, contact us!
  • We provide some pre-trained models that take into account genes that seem to be relevant for these environmental conditions. However, as these genes might differ depending on the bulk RNA-seq data to be deconvoluted, we strongly recommend creating new models through the workflow explained in the Building new deconvolution models article.
  • Contributions and suggestions are welcome!

Citation

If you use digitalDLSorteR in your research, please cite Torroja and Sanchez-Cabo, 2019 (first description of the algorithm) and Mañanes et al., 2024 (a version for spatial transcriptomics data whose development has also served to improve digitalDLSorteR).

References

Chung, W., Eum, H. H., Lee, H. O., Lee, K. M., Lee, H. B., Kim, K. T., et al. (2017). Single-cell RNA-seq enables comprehensive tumour and immune cell profiling in primary breast cancer. Nat. Commun. 8(1), 15081. doi:10.1038/ncomms15081
Lee, H. O., Hong, Y., Etlioglu, H. E., et al. (2020). Lineage-dependent gene expression programs influence the immune landscape of colorectal cancer. Nat. Genet. 52, 594-603. doi:10.1038/s41588-020-0636-z
Torroja, C. and Sánchez-Cabo, F. (2019). digitalDLSorter: A Deep Learning algorithm to quantify immune cell populations based on scRNA-seq data. Frontiers in Genetics 10, 978. doi:10.3389/fgene.2019.00978
Mañanes, D., Rivero-García, I., Relaño, C., Jimenez-Carretero, D., Torres, M., Sancho, D., Torroja, C. and Sánchez-Cabo, F. (2024). SpatialDDLS: An R package to deconvolute spatial transcriptomics data using neural networks. Bioinformatics 40(2). doi:10.1093/bioinformatics/btae072

Owner

  • Name: Diego Mañanes
  • Login: diegommcc
  • Kind: user
  • Location: Madrid, Spain
  • Company: Spanish National Center for Cardiovascular Research (CNIC)

PhD student at Spanish National Center for Cardiovascular Research (CNIC). Int. in computational biology, data science, and immunology.

GitHub Events

Total
  • Watch event: 6
  • Push event: 2
Last Year
  • Watch event: 6
  • Push event: 2

Committers

Last synced: over 2 years ago

All Time
  • Total Commits: 111
  • Total Committers: 4
  • Avg Commits per committer: 27.75
  • Development Distribution Score (DDS): 0.387
Past Year
  • Commits: 2
  • Committers: 1
  • Avg Commits per committer: 2.0
  • Development Distribution Score (DDS): 0.0
Top Committers
Name | Email | Commits
Diego | d****6@g****m | 68
Diego Mañanes | d****s@M****l | 21
Diego Mañanes Cayero | C****c@c****s | 21
Diego Mañanes | d****s@M****e | 1

Issues and Pull Requests

Last synced: over 1 year ago

All Time
  • Total issues: 1
  • Total pull requests: 0
  • Average time to close issues: 4 days
  • Average time to close pull requests: N/A
  • Total issue authors: 1
  • Total pull request authors: 0
  • Average comments per issue: 2.0
  • Average comments per pull request: 0
  • Merged pull requests: 0
  • Bot issues: 0
  • Bot pull requests: 0
Past Year
  • Issues: 0
  • Pull requests: 0
  • Average time to close issues: N/A
  • Average time to close pull requests: N/A
  • Issue authors: 0
  • Pull request authors: 0
  • Average comments per issue: 0
  • Average comments per pull request: 0
  • Merged pull requests: 0
  • Bot issues: 0
  • Bot pull requests: 0
Top Authors
Issue Authors
  • Mingjiang-git (1)

Packages

  • Total packages: 1
  • Total downloads:
    • cran 272 last-month
  • Total dependent packages: 0
  • Total dependent repositories: 0
  • Total versions: 7
  • Total maintainers: 1
cran.r-project.org: digitalDLSorteR

Deconvolution of Bulk RNA-Seq Data Based on Deep Learning

  • Versions: 7
  • Dependent Packages: 0
  • Dependent Repositories: 0
  • Downloads: 272 Last month
Rankings
Forks count: 14.9%
Stargazers count: 22.5%
Dependent packages count: 29.8%
Average: 31.8%
Dependent repos count: 35.5%
Downloads: 56.4%
Maintainers (1)
Last synced: over 1 year ago

Dependencies

DESCRIPTION cran
  • R >= 4.0.0 depends
  • Matrix * imports
  • Matrix.utils * imports
  • RColorBrewer * imports
  • S4Vectors * imports
  • SingleCellExperiment * imports
  • SummarizedExperiment * imports
  • dplyr * imports
  • ggplot2 * imports
  • ggpubr * imports
  • gtools * imports
  • keras * imports
  • methods * imports
  • pbapply * imports
  • reshape2 * imports
  • reticulate * imports
  • rlang * imports
  • stats * imports
  • tensorflow * imports
  • tidyr * imports
  • tools * imports
  • zinbwave * imports
  • BiocParallel * suggests
  • DelayedArray * suggests
  • DelayedMatrixStats * suggests
  • HDF5Array * suggests
  • knitr * suggests
  • rhdf5 * suggests
  • rmarkdown * suggests
  • testthat * suggests
.github/workflows/check-bioc.yml actions
  • actions/cache v2 composite
  • actions/checkout v2 composite
  • actions/upload-artifact master composite
  • conda-incubator/setup-miniconda v2 composite
  • docker/build-push-action v1 composite
  • r-lib/actions/setup-pandoc master composite
  • r-lib/actions/setup-r master composite