eskapee

A Nextflow pipeline for ESKAPEE pathogen detection in metagenomes

https://github.com/royercj/eskapee

Science Score: 31.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
    Found CITATION.cff file
  • codemeta.json file
  • .zenodo.json file
  • DOI references
    Found 7 DOI reference(s) in README
  • Academic publication links
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (14.4%) to scientific vocabulary
Last synced: 10 months ago

Repository

Basic Info
  • Host: GitHub
  • Owner: royercj
  • License: mit
  • Language: Python
  • Default Branch: main
  • Size: 2.92 MB
Statistics
  • Stars: 0
  • Watchers: 1
  • Forks: 0
  • Open Issues: 0
  • Releases: 0
Created over 2 years ago · Last pushed over 2 years ago
Metadata Files
Readme · Changelog · License · Code of conduct · Citation

README.md

ESKAPEE pathogen detection pipeline

Welcome! This pipeline is currently a work in progress. I developed it as part of a Student Worksite Experience Project (SWEP) internship at the Centers for Disease Control and Prevention (sponsored by Leidos) in the summer of 2023. It has been prototyped around detection of Escherichia coli in human metagenome samples, with plans to expand to detection of all ESKAPEE pathogens. Please read below for planned updates, the pipeline premise, and other information.

Currently in the works:

  • Update arguments for passing in the Kraken2 database
  • Add additional ARG detection tools
  • Create condensed report from ARG tool results
  • Expand gene set to include other ESKAPEE pathogens
  • Convert drep module to subworkflow

Future updates to include:

  • Add functionality to accept WGS isolates in addition to metagenomes

Introduction

ESKAPEE is an acronym for Enterococcus faecium, Staphylococcus aureus, Klebsiella pneumoniae, Acinetobacter baumannii, Pseudomonas aeruginosa, Enterobacter species, and Escherichia coli. These pathogens pose significant global threats to human health. They may be resistant to antibiotics and other treatments, and are frequently found in hospital and medical settings as infections of ports, catheters, and wounds. Identifying ESKAPEE pathogens can be challenging because they frequently appear as commensals in the normal human microbiome, which makes pathogenic strains hard to distinguish. ESKAPEE pathogens may also be difficult to culture in the lab, or may lose virulence in culture, complicating their identification via traditional culture and PCR methods.

nf-core/eskapee

  1. Read QC (FastQC)
  2. Present QC for raw reads (MultiQC)

Usage

Note: If you are new to Nextflow and nf-core, please refer to this page on how to set up Nextflow. Make sure to test your setup with -profile test before running the workflow on actual data.

Now, you can run the pipeline using:

    nextflow run nf-core/eskapee \
      -profile <docker/singularity/.../institute> \
      --input samplesheet.csv \
      --outdir <OUTDIR>

Warning: Please provide pipeline parameters via the CLI or Nextflow -params-file option. Custom config files including those provided by the -c Nextflow option can be used to provide any configuration except for parameters; see docs.
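As a rough illustration of the -params-file route mentioned above, the following Python sketch writes the two parameters shown in the run command (input and outdir) to a JSON file and assembles the matching nextflow command line. The helper names write_params_file and build_command, the file name params.json, the output directory results, and the choice of the docker profile are all assumptions made for this example; they are not part of the pipeline. The command is only printed, not executed.

```python
import json
import shlex
from pathlib import Path


def write_params_file(path, input_csv, outdir):
    """Write pipeline parameters to a JSON file for Nextflow's -params-file option.

    Only the two parameters shown in the README's run command are included.
    """
    params = {"input": str(input_csv), "outdir": str(outdir)}
    Path(path).write_text(json.dumps(params, indent=2) + "\n")
    return params


def build_command(params_file, profile="docker"):
    """Assemble the nextflow invocation as an argument list (not executed here)."""
    return [
        "nextflow", "run", "nf-core/eskapee",
        "-profile", profile,
        "-params-file", str(params_file),
    ]


if __name__ == "__main__":
    # Hypothetical file and directory names, chosen only for illustration.
    write_params_file("params.json", "samplesheet.csv", "results")
    print(shlex.join(build_command("params.json")))
```

Keeping parameters in a file like this leaves any custom config passed with -c free for non-parameter settings, in line with the warning above.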

For more details, please refer to the usage documentation and the parameter documentation.

Pipeline output

To see the results of a test run with a full-size dataset, refer to the results tab on the nf-core website pipeline page. For more details about the output files and reports, please refer to the output documentation.

Credits

nf-core/eskapee was originally written by cjroyer.

We thank the following people for their extensive assistance in the development of this pipeline:

Contributions and Support

If you would like to contribute to this pipeline, please see the contributing guidelines.

For further information or help, don't hesitate to get in touch on the Slack #eskapee channel (you can join with this invite).

Citations

An extensive list of references for the tools used by the pipeline can be found in the CITATIONS.md file.

You can cite the nf-core publication as follows:

The nf-core framework for community-curated bioinformatics pipelines.

Philip Ewels, Alexander Peltzer, Sven Fillinger, Harshil Patel, Johannes Alneberg, Andreas Wilm, Maxime Ulysse Garcia, Paolo Di Tommaso & Sven Nahnsen.

Nat Biotechnol. 2020 Feb 13. doi: 10.1038/s41587-020-0439-x.

Owner

  • Name: CJRoyer
  • Login: royercj
  • Kind: user

Graduate student in the Master's of Bioinformatics program at Georgia Tech. My focus is metagenomics, specifically the human gut microbiome.

Citation (CITATIONS.md)

# nf-core/eskapee: Citations

## [nf-core](https://pubmed.ncbi.nlm.nih.gov/32055031/)

> Ewels PA, Peltzer A, Fillinger S, Patel H, Alneberg J, Wilm A, Garcia MU, Di Tommaso P, Nahnsen S. The nf-core framework for community-curated bioinformatics pipelines. Nat Biotechnol. 2020 Mar;38(3):276-278. doi: 10.1038/s41587-020-0439-x. PubMed PMID: 32055031.

## [Nextflow](https://pubmed.ncbi.nlm.nih.gov/28398311/)

> Di Tommaso P, Chatzou M, Floden EW, Barja PP, Palumbo E, Notredame C. Nextflow enables reproducible computational workflows. Nat Biotechnol. 2017 Apr 11;35(4):316-319. doi: 10.1038/nbt.3820. PubMed PMID: 28398311.

## Pipeline tools

- [FastQC](https://www.bioinformatics.babraham.ac.uk/projects/fastqc/)

- [MultiQC](https://pubmed.ncbi.nlm.nih.gov/27312411/)
  > Ewels P, Magnusson M, Lundin S, Käller M. MultiQC: summarize analysis results for multiple tools and samples in a single report. Bioinformatics. 2016 Oct 1;32(19):3047-8. doi: 10.1093/bioinformatics/btw354. Epub 2016 Jun 16. PubMed PMID: 27312411; PubMed Central PMCID: PMC5039924.

## Software packaging/containerisation tools

- [Anaconda](https://anaconda.com)

  > Anaconda Software Distribution. Computer software. Vers. 2-2.4.0. Anaconda, Nov. 2016. Web.

- [Bioconda](https://pubmed.ncbi.nlm.nih.gov/29967506/)

  > Grüning B, Dale R, Sjödin A, Chapman BA, Rowe J, Tomkins-Tinch CH, Valieris R, Köster J; Bioconda Team. Bioconda: sustainable and comprehensive software distribution for the life sciences. Nat Methods. 2018 Jul;15(7):475-476. doi: 10.1038/s41592-018-0046-7. PubMed PMID: 29967506.

- [BioContainers](https://pubmed.ncbi.nlm.nih.gov/28379341/)

  > da Veiga Leprevost F, Grüning B, Aflitos SA, Röst HL, Uszkoreit J, Barsnes H, Vaudel M, Moreno P, Gatto L, Weber J, Bai M, Jimenez RC, Sachsenberg T, Pfeuffer J, Alvarez RV, Griss J, Nesvizhskii AI, Perez-Riverol Y. BioContainers: an open-source and community-driven framework for software standardization. Bioinformatics. 2017 Aug 15;33(16):2580-2582. doi: 10.1093/bioinformatics/btx192. PubMed PMID: 28379341; PubMed Central PMCID: PMC5870671.

- [Docker](https://dl.acm.org/doi/10.5555/2600239.2600241)

- [Singularity](https://pubmed.ncbi.nlm.nih.gov/28494014/)
  > Kurtzer GM, Sochat V, Bauer MW. Singularity: Scientific containers for mobility of compute. PLoS One. 2017 May 11;12(5):e0177459. doi: 10.1371/journal.pone.0177459. eCollection 2017. PubMed PMID: 28494014; PubMed Central PMCID: PMC5426675.

Dependencies

modules/nf-core/checkm/lineagewf/meta.yml cpan
modules/nf-core/custom/dumpsoftwareversions/meta.yml cpan
modules/nf-core/fastp/meta.yml cpan
modules/nf-core/fastqc/meta.yml cpan
modules/nf-core/gunzip/meta.yml cpan
modules/nf-core/kraken2/kraken2/meta.yml cpan
modules/nf-core/krona/ktimporttext/meta.yml cpan
modules/nf-core/maxbin2/meta.yml cpan
modules/nf-core/metabat2/metabat2/meta.yml cpan
modules/nf-core/multiqc/meta.yml cpan
modules/nf-core/unicycler/meta.yml cpan
pyproject.toml pypi