scaden

Deep Learning based cell composition analysis with Scaden.

https://github.com/kevinmenden/scaden

Science Score: 10.0%

This score indicates how likely this project is to be science-related based on various indicators:

○ CITATION.cff file
○ codemeta.json file
○ .zenodo.json file
○ DOI references
○ Academic publication links
✓ Committers with academic emails: 1 of 3 committers (33.3%) from academic institutions
○ Institutional organization owner
○ JOSS paper metadata
○ Scientific vocabulary similarity: Low similarity (18.8%) to scientific vocabulary

Keywords

bioinformatics cell-composition-analysis deconvolution deep-learning machine-learning rna-seq single-cell-rna-seq

Last synced: 10 months ago

Repository


Basic Info

Host: GitHub
Owner: KevinMenden
License: mit
Language: Python
Default Branch: master
Homepage: https://scaden.readthedocs.io
Size: 1020 KB

Statistics

Stars: 82
Watchers: 2
Forks: 29
Open Issues: 19
Releases: 0

Topics

bioinformatics cell-composition-analysis deconvolution deep-learning machine-learning rna-seq single-cell-rna-seq

Created about 7 years ago · Last pushed over 1 year ago

Metadata Files

Readme Changelog License

Single-cell assisted deconvolutional network

Scaden

Scaden is a deep-learning based algorithm for cell type deconvolution of bulk RNA-seq samples. It was developed at the DZNE Tübingen and the ZMNH in Hamburg. The method is published in Science Advances: "Deep-learning based cell composition analysis from tissue expression profiles".

Complete documentation is available at https://scaden.readthedocs.io

Scaden overview. a) Generation of artificial bulk samples with known cell type composition from scRNA-seq data. b) Training of Scaden model ensemble on simulated training data. c) Scaden ensemble architecture. d) A trained Scaden model can be used to deconvolve complex bulk mixtures.
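Step a) of the overview can be illustrated with a small sketch. It mixes single-cell profiles into one artificial bulk sample with known cell type fractions. This is not Scaden's own simulation code: the function name simulate_bulk_sample, its parameters, and the toy data are hypothetical, and the sampling scheme (random fractions, cells drawn with replacement, counts summed per gene) is only an assumption about how such a mixture could be built.

```python
import random


def simulate_bulk_sample(cells_by_type, n_cells=100, rng=None):
    """Mix single-cell profiles into one artificial bulk sample.

    cells_by_type maps a cell type name to a list of expression profiles,
    each a list of per-gene counts of equal length (hypothetical layout).
    Returns (bulk_profile, fractions).
    """
    if rng is None:
        rng = random.Random()
    cell_types = sorted(cells_by_type)

    # Draw random weights and normalise them into target fractions.
    weights = [rng.random() for _ in cell_types]
    total = sum(weights)
    target = [w / total for w in weights]

    n_genes = len(cells_by_type[cell_types[0]][0])
    bulk = [0.0] * n_genes
    counts = {}
    for cell_type, fraction in zip(cell_types, target):
        k = round(fraction * n_cells)
        counts[cell_type] = k
        for _ in range(k):
            # Sample a cell of this type with replacement and add its counts.
            profile = rng.choice(cells_by_type[cell_type])
            for gene_index, value in enumerate(profile):
                bulk[gene_index] += value

    sampled = sum(counts.values())
    if sampled == 0:
        raise ValueError("n_cells is too small to sample any cell")
    # Report the fractions actually realised after rounding.
    fractions = {ct: counts[ct] / sampled for ct in cell_types}
    return bulk, fractions


# Toy data: two cell types, two cells each, three genes.
toy = {
    "T cell": [[5, 0, 1], [4, 1, 0]],
    "B cell": [[0, 6, 2], [1, 5, 3]],
}
bulk, fractions = simulate_bulk_sample(toy, n_cells=50, rng=random.Random(0))
print(bulk, fractions)
```

Repeating this many times gives pairs of (bulk profile, fractions) that can serve as training examples, which is the idea behind step b).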

Installation guide

Scaden can be easily installed on a Linux system, and should also work on Mac. There are currently two options for installing Scaden, either using Bioconda or via pip.

pip

To install Scaden via pip, simply run the following command:

pip install scaden

GPU

If you want to make use of your GPU, you will have to additionally install tensorflow-gpu.

For pip:

pip install tensorflow-gpu

For conda:

conda install tensorflow-gpu
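To confirm that the GPU is actually visible after installation, a quick check such as the following may help. It is a minimal sketch that assumes TensorFlow 2.1 or newer, where tf.config.list_physical_devices is available; the helper name count_visible_gpus is hypothetical and not part of Scaden.

```python
def count_visible_gpus():
    """Return the number of GPUs TensorFlow can see, or None if
    TensorFlow is not installed in this environment."""
    try:
        import tensorflow as tf
    except ImportError:
        return None
    # Assumes TensorFlow >= 2.1, where this function is public API.
    return len(tf.config.list_physical_devices("GPU"))


gpu_count = count_visible_gpus()
if gpu_count is None:
    print("TensorFlow is not installed")
elif gpu_count == 0:
    print("TensorFlow is installed but sees no GPU; training would run on CPU")
else:
    print(f"TensorFlow sees {gpu_count} GPU(s)")
```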

Docker

If you would rather not install Scaden and prefer to use a Docker container, we provide that as well. For every release, we provide two versions: one for CPU and one for GPU usage. To pull the CPU container, use this command:

docker pull ghcr.io/kevinmenden/scaden/scaden

For the GPU container:

docker pull ghcr.io/kevinmenden/scaden/scaden-gpu

Webtool (beta)

Additionally, we now provide a web tool:

https://scaden.ims.bio

It contains pre-generated training datasets for several tissues, and all you need to do is to upload your expression data. Please note that this is still in preview.

Usage

We provide detailed instructions on how to use Scaden in our documentation: https://scaden.readthedocs.io

A deconvolution workflow with Scaden consists of four major steps:

data simulation
data processing
training
prediction

If training data is already available, you can start at the data processing step. Otherwise you will first have to process scRNA-seq datasets and perform data simulation to generate a training dataset. As an example workflow, you can use the scaden example command to generate example data and go through the whole pipeline.

First, make an example data directory and generate the example data:

mkdir example_data
scaden example --out example_data/

This generates the files "example_counts.txt", "example_celltypes.txt" and "example_bulk_data.txt" in the "example_data" directory. Next, you can generate training data:

scaden simulate --data example_data/ -n 100 --pattern "*_counts.txt"

This generates 100 samples of training data in your current working directory. The file you need for your next step is called "data.h5ad". Now you need to perform the preprocessing using the training data and the bulk data file:

scaden process data.h5ad example_data/example_bulk_data.txt

As a result, you should now have a file called "processed.h5ad" in your directory. Now you can perform training. The following command performs training for 5000 steps per model and saves the trained weights to the "model" directory, which will be created:

scaden train processed.h5ad --steps 5000 --model_dir model

Finally, you can use the trained model to perform prediction:

scaden predict --model_dir model example_data/example_bulk_data.txt

Now you should have a file called "scaden_predictions.txt" in your working directory, which contains your estimated cell compositions.
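The example workflow above can also be driven from a script. The sketch below only assembles the same commands and flags shown in this section and, by default, prints them instead of running them. The helper names build_scaden_commands and run_pipeline and their parameters are hypothetical. It assumes Python 3.8 or newer (for shlex.join) and, when dry_run is disabled, that the scaden command is on your PATH.

```python
import os
import shlex
import subprocess


def build_scaden_commands(data_dir="example_data", n_samples=100,
                          steps=5000, model_dir="model"):
    """Return the example workflow as a list of argument lists."""
    bulk_file = f"{data_dir}/example_bulk_data.txt"
    return [
        ["scaden", "example", "--out", f"{data_dir}/"],
        ["scaden", "simulate", "--data", f"{data_dir}/",
         "-n", str(n_samples), "--pattern", "*_counts.txt"],
        ["scaden", "process", "data.h5ad", bulk_file],
        ["scaden", "train", "processed.h5ad",
         "--steps", str(steps), "--model_dir", model_dir],
        ["scaden", "predict", "--model_dir", model_dir, bulk_file],
    ]


def run_pipeline(commands, data_dir="example_data", dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    if not dry_run:
        os.makedirs(data_dir, exist_ok=True)  # equivalent of mkdir example_data
    for argv in commands:
        print(shlex.join(argv))
        if not dry_run:
            # check=True stops the pipeline at the first failing step.
            subprocess.run(argv, check=True)


commands = build_scaden_commands()
run_pipeline(commands)  # dry run: prints the five commands
```

Passing the arguments as lists avoids shell quoting problems with the "*_counts.txt" pattern, which must reach Scaden unexpanded.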

1. System requirements

Scaden was developed and tested on Linux (Ubuntu 16.04 and 18.04). It has not been tested on Windows or Mac, but should also be usable on these systems when installed with pip or Bioconda. Scaden does not require any special hardware (e.g. a GPU), but we recommend at least 16 GB of memory.

Scaden requires Python 3. All package dependencies should be handled automatically when installing with pip or conda.
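These requirements can be checked before starting a long run. The following is a rough sketch rather than an official tool: the function name preflight and the report keys are hypothetical, and the memory lookup relies on os.sysconf, which is only expected to work on Linux and similar systems (the value is reported as None elsewhere).

```python
import os
import shutil
import sys


def preflight(min_memory_gb=16):
    """Collect a few environment facts relevant to running Scaden."""
    try:
        total_bytes = os.sysconf("SC_PAGE_SIZE") * os.sysconf("SC_PHYS_PAGES")
        memory_gb = total_bytes / 1024 ** 3
    except (AttributeError, ValueError, OSError):
        # os.sysconf or these names are unavailable on this platform.
        memory_gb = None
    return {
        "python3": sys.version_info.major == 3,
        "scaden_on_path": shutil.which("scaden") is not None,
        "memory_gb": memory_gb,
        "memory_ok": None if memory_gb is None else memory_gb >= min_memory_gb,
    }


report = preflight()
for key, value in report.items():
    print(f"{key}: {value}")
```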

Owner

Name: Kevin Menden
Login: KevinMenden
Kind: user
Location: Tübingen, Germany
Company: AIRAmed

Repositories: 6
Profile: https://github.com/KevinMenden

Research scientist interested in machine learning, data science and bioinformatics

GitHub Events

Total

Watch event: 11
Issue comment event: 1
Pull request event: 1
Fork event: 2
Create event: 1

Last Year

Watch event: 11
Issue comment event: 1
Pull request event: 1
Fork event: 2
Create event: 1

Committers

Last synced: over 2 years ago

All Time

Total Commits: 152
Total Committers: 3
Avg Commits per committer: 50.667
Development Distribution Score (DDS): 0.092

Past Year

Commits: 0
Committers: 0
Avg Commits per committer: 0.0
Development Distribution Score (DDS): 0.0

Top Committers

Name	Email	Commits
kevinmenden	k**n@t**e	138
Sergio Oller	s**r@g**m	13
eboileau	b**u@u**e	1

Committer Domains (Top 20 + Academic)

uni-heidelberg.de: 1
t-online.de: 1

Issues and Pull Requests

Last synced: 11 months ago

All Time

Total issues: 85
Total pull requests: 22
Average time to close issues: 3 months
Average time to close pull requests: 1 day
Total issue authors: 43
Total pull request authors: 5
Average comments per issue: 2.98
Average comments per pull request: 0.23
Merged pull requests: 19
Bot issues: 0
Bot pull requests: 1

Past Year

Issues: 2
Pull requests: 1
Average time to close issues: N/A
Average time to close pull requests: N/A
Issue authors: 2
Pull request authors: 1
Average comments per issue: 1.0
Average comments per pull request: 0.0
Merged pull requests: 0
Bot issues: 0
Bot pull requests: 0


Top Authors

Issue Authors

KevinMenden (15)
khkk378 (7)
Zhaohui-Ruan (4)
mHagiw (3)
nagendraKU (3)
idalarsson (3)
hathawayxxh (3)
vdet (2)
bio-visualisation (2)
sydneysue (2)
Kai6662 (2)
ThomasThaewel (2)
HelloYiHan (2)
yunners (2)
zeehio (2)

Pull Request Authors

KevinMenden (16)
zeehio (4)
alex-d13 (2)
eboileau (1)
dependabot[bot] (1)

Top Labels

Issue Labels

enhancement (13) bug (4) question (3) stale (1) priority (1) dependencies (1) help wanted (1)

Pull Request Labels

enhancement (1) dependencies (1)

Packages

Total packages: 1
Total downloads:
- pypi 58 last-month

Total dependent packages: 0
Total dependent repositories: 1
Total versions: 14
Total maintainers: 1

pypi.org: scaden

Cell type deconvolution using single cell data

Homepage: https://github.com/KevinMenden/scaden
Documentation: https://scaden.readthedocs.io/
License: MIT License
Latest release: 1.1.2
published about 5 years ago

Versions: 14
Dependent Packages: 0
Dependent Repositories: 1
Downloads: 58 Last month

Rankings

Forks count: 7.7%

Stargazers count: 8.3%

Dependent packages count: 10.0%

Average: 13.2%

Downloads: 18.1%

Dependent repos count: 21.7%

Maintainers (1)

kmenden

Last synced: 10 months ago

Dependencies

setup.py pypi

anndata *
click *
h5py *
numpy *
pandas *
rich *
scikit-learn *
tensorflow >=2.0
