viromeflowx
Science Score: 67.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ✓ CITATION.cff file: Found CITATION.cff file
- ✓ codemeta.json file: Found codemeta.json file
- ✓ .zenodo.json file: Found .zenodo.json file
- ✓ DOI references: Found 1 DOI reference(s) in README
- ✓ Academic publication links: Links to ncbi.nlm.nih.gov
- ○ Academic email domains
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: Low similarity (9.5%) to scientific vocabulary
Repository
Basic Info
- Host: GitHub
- Owner: 01life
- License: mit
- Language: Perl
- Default Branch: main
- Size: 12.2 MB
Statistics
- Stars: 10
- Watchers: 0
- Forks: 0
- Open Issues: 0
- Releases: 0
Metadata Files
README.md
Introduction
ViromeFlowX is a comprehensive Nextflow-based automated workflow for mining viral genomes from metagenomic sequencing data. Understanding the link between the human gut virome and diseases has garnered significant interest in the research community. Extracting virus-related information from metagenomic sequencing data is crucial for unravelling virus composition, host interactions, and disease associations. However, current metagenomic analysis workflows for viral genomes vary in effectiveness, posing challenges for researchers seeking the most up-to-date tools. To address this, we present ViromeFlowX, a user-friendly Nextflow workflow that automates viral genome assembly, identification, classification, and annotation. This streamlined workflow integrates cutting-edge tools that take raw sequencing data through to taxonomic annotation and functional analysis. Application to a dataset of 200 metagenomic samples yielded high-quality viral genomes. ViromeFlowX enables efficient mining of viral genomic data, offering a valuable resource to investigate the gut virome’s role in virus-host interactions and virus-related diseases.
Pipeline summary
- Quality Control (trimmomatic, bowtie2)
- Assembly (metaspades)
- Viral Taxonomic Classify (Kraken2)
- Viral Contigs Identification (VirFinder, VirSorter2, CheckV, cdhit)
- Gene Prediction & Functional Annotation (Prodigal, bedtools2, DIAMOND)
- Viral Taxonomic Classify Assignment (usearch, Blast, taxonkit, CoverM)
Getting Started
Pre-requisites
To ensure the smoothest possible analysis using ViromeFlowX, we recommend taking the time to pre-build both the software components and the reference databases before you begin your analysis. This preparatory step will help guarantee a more efficient and hassle-free experience.
- Environment Setup: Most of the tools required by the pipeline can be conveniently installed using a Conda environment. Use the command below to create a new Conda environment based on the environment.yml configuration file.
bash
conda env create -f environment.yml
Note: The usearch tool cannot be installed through conda. You need to download and install it manually. Please refer to the official documentation for installation instructions.
- Database Setup: Refer to the Database Installation and Configuration guide for downloading and configuring the required databases.
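Before launching a run, it can help to confirm that the tools are reachable in the activated environment, including the manually installed usearch. The following is a minimal sketch and not part of ViromeFlowX: the function name `missing_tools` is hypothetical, and the executable names in the usage comment are assumptions that may differ from what environment.yml actually installs.

```shell
#!/usr/bin/env bash
# Print each command name that cannot be found on PATH, one per line.
# Returns 0 if every name was found, 1 otherwise.
missing_tools() {
  local status=0 tool
  for tool in "$@"; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      printf '%s\n' "$tool"
      status=1
    fi
  done
  return "$status"
}

# Example call (executable names are assumptions; adjust to your installation):
# missing_tools bowtie2 kraken2 prodigal diamond usearch
```

An empty result means every listed name resolved; any printed name needs to be installed or added to PATH before the pipeline is started.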
Install and Usage
Clone the repository
git clone https://github.com/01life/ViromeFlowX
Prepare a samplesheet
Create a samplesheet.csv with your input data that looks as follows:
csv
id,reads1,read2
sample_2,/PATH/sample_L002_R1.fastq.gz,/PATH/sample_L002_R2.fastq.gz
sample_3,/PATH/sample_L003_R1.fastq.gz,/PATH/sample_L003_R2.fastq.gz
Start running the pipeline
nextflow run ViromeFlowX \
  -profile <docker/singularity/conda/.../institute> \
  --input samplesheet.csv \
  --outdir <OUTDIR>
If you are new to Nextflow and nf-core, check the Nextflow installation guide. Ensure your setup passes the -profile test before processing real data.
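A malformed samplesheet is an easy mistake to catch before a long run starts. The sketch below checks the layout shown in the example above (a header line, then an id and two read paths per row) and confirms that the read files exist. It is an illustration only: `validate_samplesheet` is a hypothetical helper, not something the pipeline ships, and it assumes that paths contain no commas and that the header row needs no checking by name.

```shell
#!/usr/bin/env bash
# Check a three-column, comma-separated samplesheet.
# Prints one message per problem; returns 0 if none were found.
# Assumption: the first line is a header and is skipped without inspection.
validate_samplesheet() {
  local sheet="$1" status=0 lineno=0 id r1 r2 extra f
  [ -r "$sheet" ] || { echo "cannot read: $sheet"; return 2; }
  # The "|| [ -n ... ]" part also processes a last line without a newline.
  while IFS=, read -r id r1 r2 extra || [ -n "$id" ]; do
    lineno=$((lineno + 1))
    [ "$lineno" -eq 1 ] && continue            # skip the header row
    [ -z "$id$r1$r2$extra" ] && continue       # skip blank lines
    if [ -z "$id" ] || [ -z "$r1" ] || [ -z "$r2" ] || [ -n "$extra" ]; then
      echo "line $lineno: expected exactly 3 fields"
      status=1
      continue
    fi
    for f in "$r1" "$r2"; do
      [ -f "$f" ] || { echo "line $lineno: file not found: $f"; status=1; }
    done
  done < "$sheet"
  return "$status"
}

# Example call:
# validate_samplesheet samplesheet.csv && echo "samplesheet looks consistent"
```

The function only reports problems and never modifies the file, so it is safe to run repeatedly while fixing a samplesheet.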
- Advanced usage
The pipeline runs QC -> Metaspades (min_len=1k) -> Identify (VirFinder, VirSorter2, CheckV) -> Taxonomic Classify (Kraken2) -> Geneset -> Taxonomic Classify Assignment (demovir, pfam, protein, crAss, genome) -> Abundance. Use --help to list the available parameters. For comprehensive tutorials and implementation guidelines, please refer to our Usage Documentation.
bash
nextflow run /path/to/project/ViromeFlowX --help
- Output information
To better understand the output files generated by ViromeFlowX and how to interpret them, refer to the Output Documentation.
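To review the exact command before committing to a run, the basic invocation can be assembled and printed first. This is a small sketch under stated assumptions: `viromeflowx_cmd` is a hypothetical helper, it uses only the options that appear in this README (-profile, --input, --outdir), and it does no quoting, so values containing whitespace would need to be quoted by hand.

```shell
#!/usr/bin/env bash
# Print the nextflow command for a run without executing it.
# Arguments: <profile> <samplesheet> <outdir>
viromeflowx_cmd() {
  if [ "$#" -ne 3 ]; then
    echo "usage: viromeflowx_cmd <profile> <samplesheet> <outdir>" >&2
    return 2
  fi
  # Rebuild the positional parameters as the full command line,
  # then join them with single spaces via "$*".
  set -- nextflow run ViromeFlowX -profile "$1" --input "$2" --outdir "$3"
  printf '%s\n' "$*"
}

# Example call:
# viromeflowx_cmd docker samplesheet.csv results
```

Printing rather than executing keeps the helper free of side effects; the printed line can be inspected and then pasted into the terminal.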
Citation
If you find ViromeFlowX useful in your research, please cite the publication: ViromeFlowX: a Comprehensive Nextflow-based Automated Workflow for Mining Viral Genomes from Metagenomic Sequencing Data.
Wang X, Ding Z, Yang Y, et al. ViromeFlowX: a Comprehensive Nextflow-based Automated Workflow for Mining Viral Genomes from Metagenomic Sequencing Data[J]. Microbial Genomics, 2024, 10(2): 001202.
Owner
- Name: 01life
- Login: 01life
- Kind: organization
- Repositories: 1
- Profile: https://github.com/01life
Citation (CITATIONS.md)
# nf-core/virome: Citations
## [nf-core](https://pubmed.ncbi.nlm.nih.gov/32055031/)
> Ewels PA, Peltzer A, Fillinger S, Patel H, Alneberg J, Wilm A, Garcia MU, Di Tommaso P, Nahnsen S. The nf-core framework for community-curated bioinformatics pipelines. Nat Biotechnol. 2020 Mar;38(3):276-278. doi: 10.1038/s41587-020-0439-x. PubMed PMID: 32055031.
## [Nextflow](https://pubmed.ncbi.nlm.nih.gov/28398311/)
> Di Tommaso P, Chatzou M, Floden EW, Barja PP, Palumbo E, Notredame C. Nextflow enables reproducible computational workflows. Nat Biotechnol. 2017 Apr 11;35(4):316-319. doi: 10.1038/nbt.3820. PubMed PMID: 28398311.
## Pipeline tools
- [FastQC](https://www.bioinformatics.babraham.ac.uk/projects/fastqc/)
- [MultiQC](https://pubmed.ncbi.nlm.nih.gov/27312411/)
  > Ewels P, Magnusson M, Lundin S, Käller M. MultiQC: summarize analysis results for multiple tools and samples in a single report. Bioinformatics. 2016 Oct 1;32(19):3047-8. doi: 10.1093/bioinformatics/btw354. Epub 2016 Jun 16. PubMed PMID: 27312411; PubMed Central PMCID: PMC5039924.
## Software packaging/containerisation tools
- [Anaconda](https://anaconda.com)
  > Anaconda Software Distribution. Computer software. Vers. 2-2.4.0. Anaconda, Nov. 2016. Web.
- [Bioconda](https://pubmed.ncbi.nlm.nih.gov/29967506/)
  > Grüning B, Dale R, Sjödin A, Chapman BA, Rowe J, Tomkins-Tinch CH, Valieris R, Köster J; Bioconda Team. Bioconda: sustainable and comprehensive software distribution for the life sciences. Nat Methods. 2018 Jul;15(7):475-476. doi: 10.1038/s41592-018-0046-7. PubMed PMID: 29967506.
- [BioContainers](https://pubmed.ncbi.nlm.nih.gov/28379341/)
  > da Veiga Leprevost F, Grüning B, Aflitos SA, Röst HL, Uszkoreit J, Barsnes H, Vaudel M, Moreno P, Gatto L, Weber J, Bai M, Jimenez RC, Sachsenberg T, Pfeuffer J, Alvarez RV, Griss J, Nesvizhskii AI, Perez-Riverol Y. BioContainers: an open-source and community-driven framework for software standardization. Bioinformatics. 2017 Aug 15;33(16):2580-2582. doi: 10.1093/bioinformatics/btx192. PubMed PMID: 28379341; PubMed Central PMCID: PMC5870671.
- [Docker](https://dl.acm.org/doi/10.5555/2600239.2600241)
- [Singularity](https://pubmed.ncbi.nlm.nih.gov/28494014/)
  > Kurtzer GM, Sochat V, Bauer MW. Singularity: Scientific containers for mobility of compute. PLoS One. 2017 May 11;12(5):e0177459. doi: 10.1371/journal.pone.0177459. eCollection 2017. PubMed PMID: 28494014; PubMed Central PMCID: PMC5426675.
GitHub Events
Total
- Watch event: 4
- Delete event: 3
- Push event: 3
- Pull request event: 3
- Create event: 1
Last Year
- Watch event: 4
- Delete event: 3
- Push event: 3
- Pull request event: 3
- Create event: 1
Dependencies
- actions/upload-artifact v3 composite
- nf-core/tower-action v3 composite
- actions/upload-artifact v3 composite
- nf-core/tower-action v3 composite
- mshick/add-pr-comment v1 composite
- actions/checkout v3 composite
- nf-core/setup-nextflow v1 composite
- actions/checkout v3 composite
- actions/setup-node v3 composite
- actions/checkout v3 composite
- actions/setup-node v3 composite
- actions/setup-python v4 composite
- actions/upload-artifact v3 composite
- mshick/add-pr-comment v1 composite
- nf-core/setup-nextflow v1 composite
- psf/black stable composite
- dawidd6/action-download-artifact v2 composite
- marocchino/sticky-pull-request-comment v2 composite
- imbalanced-learn ==0.6.2
- numpy ==1.20.3
