rdscan
A snakemake workflow for regions of difference discovery in Mycobacterium tuberculosis complex (MTBC) samples
pipeline for MTBC putative regions of difference discovery
Description
RDscan is a Snakemake workflow that finds deletions and putative regions of difference (RDs) in Mycobacterium tuberculosis complex (MTBC) genomes. It can also detect already known or user-defined RDs.
Installation
The usage of this workflow is described in the Snakemake Workflow Catalog. Alternatively, it can be installed as described below.
Use the Conda package manager and BioConda channel to install RDscan.
If you do not have Conda installed, do the following:
```bash
# Download Conda installer
wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh
# Set permissions (make the installer executable)
chmod +x Miniconda3-latest-Linux-x86_64.sh
# Install
bash Miniconda3-latest-Linux-x86_64.sh
```
Set up channels:
```bash
conda config --add channels defaults
conda config --add channels bioconda
conda config --add channels conda-forge
```
Get RDscan snakemake workflow:
```bash
git clone https://github.com/dbespiatykh/RDscan.git
```
Install all required dependencies:
```bash
cd RDscan
conda install -c conda-forge mamba
mamba env create --file environment.yml
```
Usage
Rulegraph of the pipeline
Activate RDscan environment:
```bash
conda activate RDscan
```
Run pipeline:
```bash
snakemake --conda-frontend mamba --use-conda -j {Number of cores}
```
If you are running the pipeline for the first time, a dry run is recommended to check that everything is in working order. Use the -n flag for this:
```bash
snakemake -n
```
Output
The results directory will contain four tables: RD_putative.tsv, RD_known.tsv, RD_known.xlsx, and RD_known.bin.tsv.
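After a run, it can be handy to confirm that all four tables were actually written. The following is a minimal sketch, not part of RDscan itself: the helper name `check_rdscan_outputs` is hypothetical, and it assumes the tables sit directly in the directory passed as the first argument (defaulting to `results`).

```bash
# Hypothetical helper: report which of the four expected tables are
# missing or empty in a results directory (default: ./results).
check_rdscan_outputs() {
  dir=${1:-results}
  missing=0
  for f in RD_putative.tsv RD_known.tsv RD_known.xlsx RD_known.bin.tsv; do
    if [ ! -s "$dir/$f" ]; then
      echo "missing or empty: $dir/$f"
      missing=1
    fi
  done
  # Exit status 0 when all four tables are present and non-empty, 1 otherwise.
  return "$missing"
}

# Example call; prints a note instead of failing when tables are absent.
check_rdscan_outputs results || echo "Some RDscan outputs were not found."
```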
Example of the RD_putative.tsv:
Table containing all discovered putative RDs.
RD - Known RDs that intersect with deletion breakpoints; SIZE - Estimated size of the predicted deletion.
Values in cells represent deletion length in the sample.
| CHROM | START | END | SIZE | RD | TYPE | ERR015582 | ERR017778 | ERR017782 | ERR019852 |
| --------- | ------ | ------ | ---- | --- | ---- | --------- | --------- | --------- | --------- |
| NC000962 | 333828 | 338580 | 5800 | | DEL | 7113 | 7084 | 7050 | |
| NC000962 | 340400 | 340645 | 245 | | DEL | | | | |
| NC000962 | 350935 | 351175 | 238 | | DEL | | 300 | 204 | 240 |
| NC000962 | 361769 | 362988 | 1391 | | DEL | 1833 | 1392 | 1833 | 1390 |
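As an illustration of how this table can be read programmatically, the sketch below counts how many putative deletions each sample carries. It is not part of the workflow: the function name `count_deletions_per_sample` is hypothetical, and the code assumes the file is tab-separated, has the header row shown above, and stores per-sample values from column 7 onward, with an empty cell meaning no deletion call.

```bash
# Hypothetical helper: count non-empty deletion cells per sample.
# Assumes tab-separated input with sample columns starting at column 7.
count_deletions_per_sample() {
  awk -F'\t' '
    NR == 1 { for (i = 7; i <= NF; i++) name[i] = $i; n = NF; next }
    { for (i = 7; i <= n; i++) if ($i != "") count[i]++ }
    END { for (i = 7; i <= n; i++) printf "%s\t%d\n", name[i], count[i] + 0 }
  ' "$1"
}

# Small tab-separated excerpt mirroring three rows of the example table.
tmp=$(mktemp)
printf 'CHROM\tSTART\tEND\tSIZE\tRD\tTYPE\tERR015582\tERR017778\tERR017782\tERR019852\n' > "$tmp"
printf 'NC000962\t340400\t340645\t245\t\tDEL\t\t\t\t\n' >> "$tmp"
printf 'NC000962\t350935\t351175\t238\t\tDEL\t\t300\t204\t240\n' >> "$tmp"
printf 'NC000962\t361769\t362988\t1391\t\tDEL\t1833\t1392\t1833\t1390\n' >> "$tmp"

# Prints one "sample<TAB>count" line per sample column.
count_deletions_per_sample "$tmp"
rm -f "$tmp"
```

On a real run, the same function could be pointed at `results/RD_putative.tsv`, provided the layout matches the assumptions above.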
Example of the RD_known.tsv:
Table containing the proportion of coverage in particular RDs.
| Sample | N-RD25tbA | N-RD25tbB | N-RD25bov/cap | N-RD25das |
| --------- | ---------- | ---------- | ------------- | --------- |
| ERR015582 | 0.883562 | 0.856164 | 0.856164 | 0.808219 |
| ERR017778 | 0 | 0 | 0 | 0.41791 |
| ERR017782 | 1.021277 | 1.042553 | 1.106383 | 0.978723 |
| ERR019852 | 0 | 0 | 0 | 0.386364 |
Example of the RD_known.xlsx:
Same as RD_known.tsv, but in XLSX format with conditional formatting applied.
The conditional formatting corresponds to the threshold value in the config.yml file.

Example of the RD_known.bin.tsv:
Binary version of the RD_known.tsv.
| Sample | N-RD25tbA | N-RD25tbB | N-RD25bov/cap | N-RD25das |
| --------- | ---------- | ---------- | ------------- | --------- |
| ERR015582 | 0 | 0 | 0 | 0 |
| ERR017778 | 1 | 1 | 1 | 0 |
| ERR017782 | 0 | 0 | 0 | 0 |
| ERR019852 | 1 | 1 | 1 | 0 |
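The relationship between the coverage table and its binary counterpart can be sketched as a simple cutoff. This is an assumption drawn from the two examples, not a description of RDscan's actual implementation: a cell becomes 1 when the coverage proportion falls below a cutoff, and 0 otherwise. The function name `binarize_coverage` is hypothetical, and the cutoff of 0.05 is a placeholder. The value the workflow really uses is the threshold set in config.yml.

```bash
# Hypothetical helper: turn coverage proportions into 0/1 calls.
# A cell becomes 1 when its value is below the cutoff (assumed rule).
# Usage: binarize_coverage <tab-separated table> <cutoff>
binarize_coverage() {
  awk -F'\t' -v cutoff="$2" '
    BEGIN { OFS = "\t" }
    NR == 1 { print; next }                      # keep the header as-is
    {
      for (i = 2; i <= NF; i++) $i = (($i + 0 < cutoff + 0) ? 1 : 0)
      print
    }
  ' "$1"
}

# Tab-separated excerpt of the RD_known.tsv example.
tmp=$(mktemp)
printf 'Sample\tN-RD25tbA\tN-RD25tbB\tN-RD25bov/cap\tN-RD25das\n' > "$tmp"
printf 'ERR015582\t0.883562\t0.856164\t0.856164\t0.808219\n' >> "$tmp"
printf 'ERR017778\t0\t0\t0\t0.41791\n' >> "$tmp"

# 0.05 is a placeholder cutoff, not the configured threshold.
binarize_coverage "$tmp" 0.05
rm -f "$tmp"
```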
Citation
If you use RDscan for your research, please cite the pipeline:
D. Bespiatykh, J. Bespyatykh, I. Mokrousov, and E. Shitikov, A Comprehensive Map of Mycobacterium tuberculosis Complex Regions of Difference, mSphere, Volume 6, Issue 4, 21 July 2021, Page e00535-21, https://doi.org/10.1128/mSphere.00535-21
All references for the tools used by RDscan can be found in the CITATIONS.md file.
License