https://github.com/birgitrijvers/cancer_microbiome_scripts
This repository contains a collection of scripts used during researching the microbiome in tumours.
Science Score: 36.0%
This score indicates how likely this project is to be science-related based on various indicators:
-
○CITATION.cff file
-
✓codemeta.json file
Found codemeta.json file -
○.zenodo.json file
-
✓DOI references
Found 6 DOI reference(s) in README -
✓Academic publication links
Links to: ncbi.nlm.nih.gov -
○Academic email domains
-
○Institutional organization owner
-
○JOSS paper metadata
-
○Scientific vocabulary similarity
Low similarity (10.6%) to scientific vocabulary
Repository
This repository contains a collection of scripts used during researching the microbiome in tumours.
Basic Info
- Host: GitHub
- Owner: BirgitRijvers
- Language: Python
- Default Branch: master
- Size: 56.6 KB
Statistics
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
- Releases: 0
Metadata Files
README.md
Cancer Microbiome Scripts
The Cancer Microbiome Scripts repository is a collection of miscellaneous scripts used during analyzing the microbiome in tumour material based on metatranscriptomic data.
Created during the BSc graduation internship of Birgit Rijvers at the Clinical Bioinformatics group of Erasmus University Medical Center.
The main goal of this internship was to create a pipeline that: - Takes sequencing data (FASTQ) as input - Performs quality control - Maps reads to human reference genome (host depletion) - Removes reads that mapped - Taxonomically classifies the reads left - Reports quality metrics and reads counts at the main steps
In addition, scripts were created for various tasks. These are also provided in this repository.
The final version of this pipeline is available in another repository: https://github.com/BirgitRijvers/emc-cancermicro, but this repository contains the first versions of the pipeline which were written in Python.
Scripts included
Pipeline scripts
switchpipe.py
Runs the full pipeline, switches between commands based on single or paired end input data.
pipeline.py
Runs pipeline without QC on paired end data.
pipeline_single.py
Runs pipeline without QC on single end data.
Other scripts
ncbiidfetcher.py
Fetches taxonomy IDs from NCBI based on taxonomic names.
krakenclassificationcounter.py
Counts certain classifications (Human, root, unclassified & user specified) in Kraken2 output (not report style).
bracken_runner.py
Runs Bracken on all Kraken2 reports in a directory.
samplesheeter.py
Creates a CSV samplesheet compatible with nf-core pipelines based on (compressed) FASTQ files in a directory.
fastpcountextracter.py
Creates a CSV with QC readcounts extracted from JSON fastp output.
Citations
Tools
Chen S, Zhou Y, Chen Y, Gu J. fastp: an ultra-fast all-in-one FASTQ preprocessor. Bioinformatics. 2018 Sep 1;34(17):i884-i890. doi: 10.1093/bioinformatics/bty560. PMID: 30423086; PMCID: PMC6129281.
Wood, D.E., Lu, J. & Langmead, B. Improved metagenomic analysis with Kraken 2. Genome Biol 20, 257 (2019). https://doi.org/10.1186/s13059-019-1891-0
Vasimuddin Md, Sanchit Misra, Heng Li, Srinivas Aluru. Efficient Architecture-Aware Acceleration of BWA-MEM for Multicore Systems. IEEE Parallel and Distributed Processing Symposium (IPDPS), 2019. 10.1109/IPDPS.2019.00041
Twelve years of SAMtools and BCFtools Petr Danecek, James K Bonfield, Jennifer Liddle, John Marshall, Valeriu Ohan, Martin O Pollard, Andrew Whitwham, Thomas Keane, Shane A McCarthy, Robert M Davies, Heng Li GigaScience, Volume 10, Issue 2, February 2021, giab008, https://doi.org/10.1093/gigascience/giab008
Dabdoub, SM (2016). kraken-biom: Enabling interoperative format conversion for Kraken results (Version 1.2) [Software]. Available at https://github.com/smdabdoub/kraken-biom.
Entrez Programming Utilities Help [Internet]. Bethesda (MD): National Center for Biotechnology Information (US); 2010-. Available from: https://www.ncbi.nlm.nih.gov/books/NBK25501/
Owner
- Login: BirgitRijvers
- Kind: user
- Repositories: 1
- Profile: https://github.com/BirgitRijvers
GitHub Events
Total
- Push event: 2
Last Year
- Push event: 2