Science Score: 67.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
    Found CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
    Found 1 DOI reference(s) in README
  • Academic publication links
    Links to: ncbi.nlm.nih.gov
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (9.5%) to scientific vocabulary
Last synced: 6 months ago

Repository

Basic Info
  • Host: GitHub
  • Owner: 01life
  • License: mit
  • Language: Perl
  • Default Branch: main
  • Size: 12.2 MB
Statistics
  • Stars: 10
  • Watchers: 0
  • Forks: 0
  • Open Issues: 0
  • Releases: 0
Created over 2 years ago · Last pushed 10 months ago
Metadata Files
Readme Changelog Contributing License Code of conduct Citation

README.md

ViromeFlowX

Nextflow · run with conda · run with docker · run with singularity · run with slurm

Introduction

ViromeFlowX is a comprehensive Nextflow-based automated workflow for mining viral genomes from metagenomic sequencing data. Understanding the link between the human gut virome and diseases has garnered significant interest in the research community. Extracting virus-related information from metagenomic sequencing data is crucial for unravelling virus composition, host interactions, and disease associations. However, current metagenomic analysis workflows for viral genomes vary in effectiveness, posing challenges for researchers seeking the most up-to-date tools. To address this, we present ViromeFlowX, a user-friendly Nextflow workflow that automates viral genome assembly, identification, classification, and annotation. This streamlined workflow integrates cutting-edge tools that take raw sequencing data through to taxonomic annotation and functional analysis. Application to a dataset of 200 metagenomic samples yielded high-quality viral genomes. ViromeFlowX enables efficient mining of viral genomic data, offering a valuable resource to investigate the gut virome’s role in virus-host interactions and virus-related diseases.

ViromeFlowX workflow overview

Pipeline summary

  1. Quality Control (Trimmomatic, Bowtie2)
  2. Assembly (metaSPAdes)
  3. Viral Taxonomic Classification (Kraken2)
  4. Viral Contig Identification (VirFinder, VirSorter2, CheckV, CD-HIT)
  5. Gene Prediction & Functional Annotation (Prodigal, bedtools2, DIAMOND)
  6. Viral Taxonomic Assignment (usearch, BLAST, TaxonKit, CoverM)

Getting Started

Pre-requisites

To ensure the smoothest possible analysis using ViromeFlowX, we recommend taking the time to pre-build both the software components and the reference databases before you begin your analysis. This preparatory step will help guarantee a more efficient and hassle-free experience.

  • Environment Setup: Most of the tools required by the pipeline can be conveniently installed using a Conda environment. Use the command below to create a new Conda environment based on the environment.yml configuration file.

    conda env create -f environment.yml

Note: usearch cannot be installed through Conda. Download and install it manually, following the official documentation.
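Because usearch is installed separately from the Conda environment, it can be worth checking that every tool is reachable before starting a run. The following is a minimal sketch, not part of ViromeFlowX itself: the helper name `missing_tools` is hypothetical, and the executable names in `EXPECTED_TOOLS` are assumptions that may differ from your installation (VirFinder is an R package, so it is not listed).

```python
import shutil

# Executable names are assumptions; adjust them to match your installation.
EXPECTED_TOOLS = [
    "trimmomatic", "bowtie2", "metaspades.py", "kraken2", "virsorter",
    "checkv", "cd-hit", "prodigal", "bedtools", "diamond",
    "blastn", "taxonkit", "coverm", "usearch",
]


def missing_tools(tools=EXPECTED_TOOLS):
    """Return the names from `tools` that cannot be found on PATH."""
    return [name for name in tools if shutil.which(name) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Not found on PATH: " + ", ".join(missing))
    else:
        print("All expected tools were found on PATH.")
```

Run the script inside the activated Conda environment so that the environment's `bin` directory is on PATH.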

Installation and Usage

  1. Clone the repository:

         git clone https://github.com/01life/ViromeFlowX

  2. Prepare a samplesheet, samplesheet.csv, with your input data. It should look as follows:

         id,reads1,read2
         sample_2,/PATH/sample_L002_R1.fastq.gz,/PATH/sample_L002_R2.fastq.gz
         sample_3,/PATH/sample_L003_R1.fastq.gz,/PATH/sample_L003_R2.fastq.gz

  3. Start running the pipeline:

         nextflow run ViromeFlowX \
             -profile <docker/singularity/conda/.../institute> \
             --input samplesheet.csv \
             --outdir <OUTDIR>

If you are new to Nextflow and nf-core, check the Nextflow installation guide. Ensure your setup passes the -profile test before processing real data.
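For projects with many samples, the samplesheet from step 2 can be generated rather than typed by hand. The sketch below is one possible approach, not something the pipeline ships with. It assumes paired files named `<sample>_R1.fastq.gz` and `<sample>_R2.fastq.gz` in one directory and uses the filename prefix as the sample id; both are assumptions, as are the helper names `build_rows` and `write_samplesheet`. The header is copied verbatim from the example in step 2.

```python
import csv
from pathlib import Path

# Column names copied from the samplesheet example in step 2.
HEADER = ["id", "reads1", "read2"]
R1_SUFFIX = "_R1.fastq.gz"  # assumed naming convention
R2_SUFFIX = "_R2.fastq.gz"  # assumed naming convention


def build_rows(fastq_dir):
    """Pair each *_R1.fastq.gz file in fastq_dir with its *_R2.fastq.gz mate."""
    rows = []
    for r1 in sorted(Path(fastq_dir).glob("*" + R1_SUFFIX)):
        sample_id = r1.name[: -len(R1_SUFFIX)]
        r2 = r1.with_name(sample_id + R2_SUFFIX)
        if not r2.exists():
            raise FileNotFoundError(f"Missing mate for {r1.name}: expected {r2.name}")
        rows.append([sample_id, str(r1.resolve()), str(r2.resolve())])
    return rows


def write_samplesheet(fastq_dir, out_path):
    """Write the samplesheet CSV and return the number of samples written."""
    rows = build_rows(fastq_dir)
    with open(out_path, "w", newline="") as handle:
        writer = csv.writer(handle)
        writer.writerow(HEADER)
        writer.writerows(rows)
    return len(rows)
```

For example, `write_samplesheet("/PATH", "samplesheet.csv")` would write one row per read pair found under `/PATH`. A missing mate raises an error instead of silently producing a single-end row.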

  4. Advanced usage

The pipeline runs QC -> metaSPAdes (min_len=1k) -> Identify (VirFinder, VirSorter2, CheckV) -> Taxonomic Classify (Kraken2) -> Geneset -> Taxonomic Classify Assignment (demovir, pfam, protein, crAss, genome) -> Abundance. Use --help to list the available parameters. For comprehensive tutorials and implementation guidelines, refer to the Usage Documentation.

    nextflow run /path/to/project/ViromeFlowX --help

  5. Output information

To better understand the output files generated by ViromeFlowX and how to interpret them, refer to the Output Documentation.
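When runs are launched from a script (for instance, once per cohort), the command from step 3 can be assembled as an argument list. This is a minimal sketch under stated assumptions: `build_command` is a hypothetical helper, `results` is a placeholder output directory, and only the options shown in step 3 (`-profile`, `--input`, `--outdir`) are used.

```python
import shlex


def build_command(pipeline_dir, profile, samplesheet, outdir):
    """Assemble the `nextflow run` call from step 3 as an argument list."""
    return [
        "nextflow", "run", str(pipeline_dir),
        "-profile", profile,
        "--input", str(samplesheet),
        "--outdir", str(outdir),
    ]


if __name__ == "__main__":
    # "docker" is one of the profiles named in step 3; "results" is a placeholder.
    cmd = build_command("ViromeFlowX", "docker", "samplesheet.csv", "results")
    # Print the command for inspection rather than running it.
    print(shlex.join(cmd))
    # To launch the run, pass the list to subprocess.run(cmd, check=True).
```

Passing a list rather than a single string avoids shell quoting problems when paths contain spaces.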

Citation

If you find ViromeFlowX useful in your research, please cite the publication: ViromeFlowX: a Comprehensive Nextflow-based Automated Workflow for Mining Viral Genomes from Metagenomic Sequencing Data.

Wang X, Ding Z, Yang Y, et al. ViromeFlowX: a Comprehensive Nextflow-based Automated Workflow for Mining Viral Genomes from Metagenomic Sequencing Data. Microbial Genomics, 2024, 10(2): 001202.

Owner

  • Name: 01life
  • Login: 01life
  • Kind: organization

Citation (CITATIONS.md)

# nf-core/virome: Citations

## [nf-core](https://pubmed.ncbi.nlm.nih.gov/32055031/)

> Ewels PA, Peltzer A, Fillinger S, Patel H, Alneberg J, Wilm A, Garcia MU, Di Tommaso P, Nahnsen S. The nf-core framework for community-curated bioinformatics pipelines. Nat Biotechnol. 2020 Mar;38(3):276-278. doi: 10.1038/s41587-020-0439-x. PubMed PMID: 32055031.

## [Nextflow](https://pubmed.ncbi.nlm.nih.gov/28398311/)

> Di Tommaso P, Chatzou M, Floden EW, Barja PP, Palumbo E, Notredame C. Nextflow enables reproducible computational workflows. Nat Biotechnol. 2017 Apr 11;35(4):316-319. doi: 10.1038/nbt.3820. PubMed PMID: 28398311.

## Pipeline tools

- [FastQC](https://www.bioinformatics.babraham.ac.uk/projects/fastqc/)

- [MultiQC](https://pubmed.ncbi.nlm.nih.gov/27312411/)
  > Ewels P, Magnusson M, Lundin S, Käller M. MultiQC: summarize analysis results for multiple tools and samples in a single report. Bioinformatics. 2016 Oct 1;32(19):3047-8. doi: 10.1093/bioinformatics/btw354. Epub 2016 Jun 16. PubMed PMID: 27312411; PubMed Central PMCID: PMC5039924.

## Software packaging/containerisation tools

- [Anaconda](https://anaconda.com)

  > Anaconda Software Distribution. Computer software. Vers. 2-2.4.0. Anaconda, Nov. 2016. Web.

- [Bioconda](https://pubmed.ncbi.nlm.nih.gov/29967506/)

  > Grüning B, Dale R, Sjödin A, Chapman BA, Rowe J, Tomkins-Tinch CH, Valieris R, Köster J; Bioconda Team. Bioconda: sustainable and comprehensive software distribution for the life sciences. Nat Methods. 2018 Jul;15(7):475-476. doi: 10.1038/s41592-018-0046-7. PubMed PMID: 29967506.

- [BioContainers](https://pubmed.ncbi.nlm.nih.gov/28379341/)

  > da Veiga Leprevost F, Grüning B, Aflitos SA, Röst HL, Uszkoreit J, Barsnes H, Vaudel M, Moreno P, Gatto L, Weber J, Bai M, Jimenez RC, Sachsenberg T, Pfeuffer J, Alvarez RV, Griss J, Nesvizhskii AI, Perez-Riverol Y. BioContainers: an open-source and community-driven framework for software standardization. Bioinformatics. 2017 Aug 15;33(16):2580-2582. doi: 10.1093/bioinformatics/btx192. PubMed PMID: 28379341; PubMed Central PMCID: PMC5870671.

- [Docker](https://dl.acm.org/doi/10.5555/2600239.2600241)

- [Singularity](https://pubmed.ncbi.nlm.nih.gov/28494014/)
  > Kurtzer GM, Sochat V, Bauer MW. Singularity: Scientific containers for mobility of compute. PLoS One. 2017 May 11;12(5):e0177459. doi: 10.1371/journal.pone.0177459. eCollection 2017. PubMed PMID: 28494014; PubMed Central PMCID: PMC5426675.

GitHub Events

Total
  • Watch event: 4
  • Delete event: 3
  • Push event: 3
  • Pull request event: 3
  • Create event: 1
Last Year
  • Watch event: 4
  • Delete event: 3
  • Push event: 3
  • Pull request event: 3
  • Create event: 1

Dependencies

.github/workflows/awsfulltest.yml actions
  • actions/upload-artifact v3 composite
  • nf-core/tower-action v3 composite
.github/workflows/awstest.yml actions
  • actions/upload-artifact v3 composite
  • nf-core/tower-action v3 composite
.github/workflows/branch.yml actions
  • mshick/add-pr-comment v1 composite
.github/workflows/ci.yml actions
  • actions/checkout v3 composite
  • nf-core/setup-nextflow v1 composite
.github/workflows/fix-linting.yml actions
  • actions/checkout v3 composite
  • actions/setup-node v3 composite
.github/workflows/linting.yml actions
  • actions/checkout v3 composite
  • actions/setup-node v3 composite
  • actions/setup-python v4 composite
  • actions/upload-artifact v3 composite
  • mshick/add-pr-comment v1 composite
  • nf-core/setup-nextflow v1 composite
  • psf/black stable composite
.github/workflows/linting_comment.yml actions
  • dawidd6/action-download-artifact v2 composite
  • marocchino/sticky-pull-request-comment v2 composite
pyproject.toml pypi
environment.yml pypi
  • imbalanced-learn ==0.6.2
  • numpy ==1.20.3