balrog-iso

A Nextflow pipeline for annotating antimicrobial resistance and other features in isolated bacterial genomes

https://github.com/edwardbirdlab/balrog-iso

Science Score: 67.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
    Found CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
    Found 3 DOI reference(s) in README
  • Academic publication links
    Links to: zenodo.org
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (11.7%) to scientific vocabulary
Last synced: 6 months ago

Repository

A Nextflow pipeline for annotating antimicrobial resistance and other features in isolated bacterial genomes

Basic Info
  • Host: GitHub
  • Owner: edwardbirdlab
  • License: other
  • Language: Nextflow
  • Default Branch: main
  • Homepage:
  • Size: 959 KB
Statistics
  • Stars: 0
  • Watchers: 2
  • Forks: 0
  • Open Issues: 1
  • Releases: 0
Created over 1 year ago · Last pushed 10 months ago
Metadata Files
Readme License Citation

README.md

BALROG-ISO

Bacterial Antimicrobial Resistance annOtation of Genomes - ISOlate whole genome

Report Bug · Request Feature

About BALROG-ISO

BALROG-ISO (Bacterial Antimicrobial Resistance annOtation of Genomes - ISOlate whole genome) is a comprehensive, high-throughput Nextflow pipeline that uses next-generation short reads to investigate bacterial antimicrobial resistance (AMR) and its mobility in whole-genome sequences of bacterial isolates. While AMR characterization is the main goal of BALROG-ISO, it also provides taxonomic classification, gene identities, and assignment of gene origin (i.e. plasmid or chromosome) for the submitted isolate(s).

[!NOTE] BALROG-ISO is updated periodically to keep improving the pipeline. If you have any requests or recommended changes (e.g. support for other data types), please reach out via email (edwardbirdlab@gmail.com | edwardbird@ksu.edu) or request a feature.

If you experience any trouble or find bugs when running BALROG-ISO, please report issues or bugs and they will be addressed as soon as possible.

Not the BALROG pipeline you're looking for?

BALROG-MSR: Bacterial Antimicrobial Resistance annOtation of Genomes - Metagenomic Short Read
BALROG-MON: Bacterial Antimicrobial Resistance annOtation of Genomes - Metagenomic Oxford Nanopore

Workflow Overview

*See sections below for details on subworkflows

<!-- GETTING STARTED -->

Getting Started

Before you get too far along, read this section to make sure this is the right pipeline for you and that your equipment and samples meet the requirements. (Don't worry, there isn't too much to do.)

1. What Data Do I Need?

In its current form, BALROG-ISO expects Illumina/Aviti paired-end, short-read data. In its standard configuration, BALROG-ISO requires 100 GB of RAM.

[!NOTE] If you would like to run BALROG-ISO with long-read data, feel free to request a feature.

2. Dependencies

All dependencies are managed via Docker containers hosted on Docker Hub. In addition to Nextflow, one of the following container runtimes is required: Docker, Singularity, or Apptainer.
- Nextflow (>= 23.04.0.5857)
  - Install Nextflow
- Docker/Singularity/Apptainer
  - Install Docker
  - Install Singularity
  - Install Apptainer
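
Before launching a run, it can help to confirm that a container runtime is actually on your `PATH`. The following is a minimal sketch, not part of BALROG-ISO itself; the function name `find_container_runtime` is hypothetical, and it only checks that an executable exists, not that the runtime is configured correctly.

```sh
#!/bin/sh
# Hypothetical pre-flight helper (not shipped with BALROG-ISO):
# print the first container runtime found on PATH, or fail if none exists.
find_container_runtime() {
    for tool in docker singularity apptainer; do
        # command -v succeeds only if the tool can be resolved on PATH
        if command -v "$tool" >/dev/null 2>&1; then
            echo "$tool"
            return 0
        fi
    done
    echo "no container runtime found (docker, singularity, apptainer)" >&2
    return 1
}
```

Calling `find_container_runtime` prints one runtime name on success and returns a non-zero status otherwise, so it can guard a launch script.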

3. Installation

Preferred Method - Download Release

    wget https://github.com/edwardbirdlab/BALROG-ISO/archive/refs/tags/1.0.0.tar.gz
    tar -xzf 1.0.0.tar.gz

Method 2 - Clone Repo

    git clone https://github.com/edwardbirdlab/BALROG-ISO

4. Creating a Sample Sheet

BALROG-ISO takes a CSV (comma-separated values) sheet as input. Note that the "sample" column will be the prefix of all output files for that sample. This version does not automatically combine reads that share a sample name, so please combine sequencing runs manually before starting the pipeline.

Example Format:

    sample,r1,r2
    Sample_Name_1,/absolute/path/to/sample1_R1.fastq.gz,/absolute/path/to/sample1_R2.fastq.gz
    Sample_Name_2,/absolute/path/to/sample2_R1.fastq.gz,/absolute/path/to/sample2_R2.fastq.gz
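
For many isolates, the sheet can be generated rather than typed by hand. The sketch below is an illustration only and assumes your files follow a `<sample>_R1.fastq.gz` / `<sample>_R2.fastq.gz` naming pattern (as in the example above); the function name `make_samplesheet` is hypothetical and not part of the pipeline.

```sh
#!/bin/sh
# Hypothetical helper (not shipped with BALROG-ISO): print a sample sheet
# for every <sample>_R1.fastq.gz / <sample>_R2.fastq.gz pair in a directory.
make_samplesheet() {
    # Resolve the directory to an absolute path, since the sheet uses absolute paths.
    dir=$(cd "$1" && pwd) || return 1
    echo "sample,r1,r2"
    for r1 in "$dir"/*_R1.fastq.gz; do
        # If nothing matches, the glob stays literal; skip it.
        [ -e "$r1" ] || continue
        r2="${r1%_R1.fastq.gz}_R2.fastq.gz"
        if [ ! -e "$r2" ]; then
            echo "warning: no R2 mate for $r1" >&2
            continue
        fi
        # The sample name is the file name with the read suffix removed.
        sample=$(basename "$r1" _R1.fastq.gz)
        echo "$sample,$r1,$r2"
    done
}
```

For example, `make_samplesheet /path/to/fastq_dir > samplesheet.csv` writes the sheet to a file. Unpaired R1 files are skipped with a warning on stderr, so review the output before starting a run.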

5. Nextflow Configuration

When creating a Nextflow config, ensure that a container runtime (Singularity, Apptainer, or Docker) is enabled. If you are using Slurm, you can use the included Beocat Slurm config as a template. Most nf-core configs are also supported. If you have never created a Nextflow config, or are having issues, reach out to your local system administrator.
- Nextflow Configuration
- nf-core configs
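
As a rough starting point, the snippet below writes a small config file that enables Singularity and submits jobs through Slurm. It is a sketch under assumptions, not the Beocat config shipped with the repository: the file name `config.cfg` simply matches the run commands in this README, and the queue name `batch` is a placeholder you would replace with a partition from your own cluster.

```sh
#!/bin/sh
# Sketch only: write a minimal Nextflow config to ./config.cfg.
# The quoted 'EOF' keeps the heredoc body literal (no shell expansion).
cat > config.cfg <<'EOF'
// Enable exactly one container runtime.
singularity {
    enabled    = true
    autoMounts = true
}
// docker.enabled = true   // alternative, e.g. on a workstation

process {
    executor = 'slurm'
    queue    = 'batch'     // placeholder partition name (assumption)
}
EOF
```

The resulting file can then be passed to Nextflow with `-c config.cfg`.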

6. Pipeline Configuration

If you want to change any BALROG-ISO parameters from their defaults, set them in the "nextflow.config" file or on the command line. Configurable parameters are outlined in the detailed sections below, as well as in the config file.

Required Parameters

    --samplesheet /path/to/samplesheet --run_name "NameOfRun"

Optional Parameters

    --sequencing_adapter_type illumina

Defines which adapter set to use.
Default: illumina (options = illumina, aviti, custom)

    --custom_sequencing_adapter_r1 "ATGCATGC"

Sequence of the read 1 adapter.
Default: NaN

    --custom_sequencing_adapter_r2 "ATGCATGC"

Sequence of the read 2 adapter.
Default: NaN

    --fastp_minlen 100

The minimum read length.
Default: 100

    --fastp_q 20

The minimum q-score threshold.
Default: 20

    --busco_lineage bacteria_odb10

Sets which BUSCO lineage to use. Changing this is recommended if you have an expected taxon.
Default: bacteria_odb10

    --plasmer_min_len 500

Sets the minimum sequence length to be included in plasmid prediction. Lowering this below 500 is not recommended.
Default: 500

    --plasmer_max_len 500000

Sets the sequence length above which sequences are automatically predicted to be chromosomal in origin.
Default: 500000

    --amrfinder_lineage Escherichia

Enables species-specific models in AMRFinderPlus. See the AMRFinder documentation for the supported species and how to specify them.
Default: NaN

    --resfinder_lineage "Escherichia coli"

Enables species-specific models in ResFinder. See the ResFinder documentation for the supported species and how to specify them.
Default: NaN

<!-- RUNNING BALROG-ISO -->

Running BALROG-ISO

  1. Running the Whole Pipeline

         nextflow run /path/to/edwardbirdlab/BALROG-ISO -c /path/to/config.cfg

  2. Generate MultiQC

         nextflow run /path/to/edwardbirdlab/BALROG-ISO -c /path/to/config.cfg --workflow-opt multiqc
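
The run command and the parameters documented above can be combined on one command line. The helper below is a hypothetical sketch (the name `build_balrog_cmd` is not part of the pipeline): it only prints the assembled command so you can review it before running, and it does not quote arguments containing spaces.

```sh
#!/bin/sh
# Hypothetical helper (not shipped with BALROG-ISO): print a nextflow command
# built from the pipeline path, config, sample sheet, run name, and any
# optional parameters passed as extra arguments. Nothing is executed.
build_balrog_cmd() {
    if [ "$#" -lt 4 ]; then
        echo "usage: build_balrog_cmd PIPELINE_DIR CONFIG SAMPLESHEET RUN_NAME [OPTIONS...]" >&2
        return 1
    fi
    pipeline_dir=$1
    config=$2
    samplesheet=$3
    run_name=$4
    shift 4
    printf 'nextflow run %s -c %s --samplesheet %s --run_name %s' \
        "$pipeline_dir" "$config" "$samplesheet" "$run_name"
    # Append optional parameters exactly as given.
    for opt in "$@"; do
        printf ' %s' "$opt"
    done
    printf '\n'
}
```

For instance, `build_balrog_cmd /path/to/edwardbirdlab/BALROG-ISO /path/to/config.cfg /path/to/samplesheet NameOfRun --busco_lineage bacteria_odb10 --fastp_q 20` prints a single command that sets the required parameters plus two optional ones.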

<!-- CORE STEPS OF WORKFLOW -->

Core Steps of Workflow

1. Quality Control

Raw QC

Human Read Removal Tool

Trimming

Final QC - FastQC : Trimmed Read

2. Read Assembly

Genome Assembly

Assembly Stats

  • QUAST : Assembly Metrics Report

Genome Completeness

  • BUSCO: Single-Copy Ortholog "Completeness"

3. Taxonomic Classification

4. Annotation

Sequence Origin Assignment - Plasmer : Plasmid prediction

Functional Genome Annotation - Prokka

MultiAMR Resistance Gene Annotation - hAMRonization : Unified ARG Results Report from...
1) CARD using RGI
2) AMRFinderPlus
3) ResFinder

5. Output Collection and Summary

<!-- CITATIONS -->

Citations

As there is currently no paper associated with BALROG-ISO, please cite this GitHub page. Also, feel free to contact me (edwardbirdlab@gmail.com | edwardbird@ksu.edu) to let me know!

This pipeline and its options rely on many tools. See 'CITATION.md' for the full list of tools used.

<!-- LICENSE -->

License

Distributed under the MIT License. See LICENSE for more information.

<!-- CONTACT -->

Contact

Edward Bird - edwardbirdlab@gmail.com | edwardbird@ksu.edu

Owner

  • Login: edwardbirdlab
  • Kind: user

Citation (CITATION.cff)

cff-version: 1.2.0
title: 'BALROG-ISO'
message: 'If you use this software, please cite it as below.'
type: software
authors:
  - given-names: Edward
    family-names: Bird
    email: edwardbird@ksu.edu
    affiliation: Kansas State University
    orcid: 'https://orcid.org/0009-0006-3782-9367'
identifiers:
  - type: doi
    value: 
    description: Zenodo DOI
repository-code: 'https://github.com/edwardbirdlab/BALROG-ISO'
abstract: >-
    Bacterial Antimicrobial Resistance annOtation of Genomes - ISOlate
keywords:
  - nextflow
  - metagenome
  - amr
  - antibiotic
  - amrfinder
  - resfinder
  - card
  - resistance
  - isolate
  - clinical
  - environmental
  - short read
license: MIT

GitHub Events

Total
  • Release event: 2
  • Push event: 45
  • Pull request event: 2
  • Create event: 3
Last Year
  • Release event: 2
  • Push event: 45
  • Pull request event: 2
  • Create event: 3