veba

A modular end-to-end suite for in silico recovery, clustering, and analysis of prokaryotic, microeukaryotic, and viral genomes from metagenomes

https://github.com/jolespin/veba

Science Score: 67.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
    Found CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
    Found 8 DOI reference(s) in README
  • Academic publication links
    Links to: zenodo.org
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (11.6%) to scientific vocabulary

Keywords

bioinformatics genomics metagenomics metatranscriptomics microbial microbiome
Last synced: 4 months ago

Repository

Basic Info
  • Host: GitHub
  • Owner: jolespin
  • License: agpl-3.0
  • Language: Python
  • Default Branch: main
  • Homepage:
  • Size: 55.2 MB
Statistics
  • Stars: 94
  • Watchers: 4
  • Forks: 10
  • Open Issues: 21
  • Releases: 26
Created over 3 years ago · Last pushed 6 months ago
Metadata Files
Readme Changelog License Citation

README.md

What is VEBA?

The Viral Eukaryotic Bacterial Archaeal (VEBA) open-source software suite was developed with characterizing all domains of microorganisms as the primary objective (not post hoc adjustments) including prokaryotic, eukaryotic, and viral organisms. VEBA is an end-to-end metagenomics and bioprospecting software suite that can directly recover and analyze eukaryotic and viral genomes in addition to prokaryotic genomes with native support for candidate phyla radiation (CPR). VEBA implements a novel iterative binning procedure and an optional hybrid sample-specific/multi-sample framework that recovers more genomes than non-iterative methods. To optimize the microeukaryotic gene calling and taxonomic classifications, VEBA includes a consensus microeukaryotic database containing protists and fungi compiled from several existing databases. VEBA also provides a unique clustering-based dereplication strategy allowing for sample-specific genomes and proteins to be directly compared across non-overlapping biological samples. VEBA also automates biosynthetic gene cluster identification and novelty scores for bioprospecting.

VEBA's mission is to make robust (meta-)genomics/transcriptomics analysis effortless. The philosophy of VEBA is that workflows should be modular, generalizable, and easy-to-use with minimal intermediate steps. The approach implemented in VEBA is to (try and) think 2 steps ahead of what you may need to do and automate the task for you.

^__^


Citation

  • Espinoza JL, Phillips A, Prentice MB, Tan GS, Kamath PL, Lloyd KG, Dupont CL. Unveiling the microbial realm with VEBA 2.0: a modular bioinformatics suite for end-to-end genome-resolved prokaryotic, (micro)eukaryotic and viral multi-omics from either short- or long-read sequencing. Nucleic Acids Res. 2024 Jun 22:gkae528. doi: 10.1093/nar/gkae528. PMID: 38909293.
  • Espinoza JL, Dupont CL. VEBA: a modular end-to-end suite for in silico recovery, clustering, and analysis of prokaryotic, microeukaryotic, and viral genomes from metagenomes. BMC Bioinformatics. 2022 Oct 12;23(1):419. doi: 10.1186/s12859-022-04973-8. PMID: 36224545.

In addition to the above, please cite the software dependencies described under the Dependency Citation Table.

^__^


Tell developers what you do and what you need

The objective of VEBA is to provide high-quality metagenomics and metatranscriptomics workflows to the community. Understanding the user-base's needs will help me develop VEBA so it can make your research life easier.

Your insight matters. If you have 30 seconds to spare, please fill out this quick 5-question Google Form (no e-mail needed).

^__^


Current

[!NOTE] As of v2.4.2, the binning-prokaryotic.py module is not entirely reproducible because SemiBin2 and Binette are stochastic (see GitHub issues). I've developed a workaround for Binette's stochastic behavior, but the SemiBin2 issue has not yet been resolved. This is expected to change in future versions.

For details on daily changes, please refer to the change log or release history.

^__^


Getting started with VEBA

Installation and Database Configuration Guide for software installation and database configuration.

Usage and Resource Requirements Guide for parameters and module descriptions.

Walkthrough Guides for tutorials and workflows on how to use VEBA.

Visual Guides for video walkthroughs on how to use VEBA.

Quick Guides for interpreting module outputs.

Test Data for testing installation/methods.

Usage Example:

e.g., Running preprocess module.

1) Syntax compatible with Conda:

    source activate VEBA
    veba --module preprocess --params "{PARAMS}"

2) Syntax compatible with Conda and Docker/Singularity containers:

    source activate VEBA-preprocess_env
    preprocess.py "{PARAMS}"
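For scripted runs, the wrapper-style call above can be assembled programmatically. The following is a minimal sketch, not part of VEBA: `build_veba_command` is a hypothetical helper that uses only the `veba` executable and the `--module` and `--params` options shown in the usage example. It does not handle environment activation or the direct `preprocess.py` style, and `{PARAMS}` remains a placeholder.

```python
import shlex


def build_veba_command(module: str, params: str) -> list[str]:
    """Build the argument list for `veba --module <module> --params "<params>"`.

    Hypothetical helper for illustration; it is not provided by VEBA.
    """
    if not module:
        raise ValueError("module name must not be empty")
    # The parameter string stays a single argument, matching the quoted
    # "{PARAMS}" placeholder in the usage example.
    return ["veba", "--module", module, "--params", params]


# Placeholder parameter string; real options are described in the usage guide.
command = build_veba_command("preprocess", "{PARAMS}")

# shlex.join produces a shell-safe string that can be logged or pasted.
print(shlex.join(command))
```

The list form can be handed to `subprocess.run(command, check=True)` from a session where the relevant environment is already active and `veba` is on the `PATH`.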

Check out the VEBA Change Log for details between each update and insight into what is being implemented in the upcoming version.

^__^


Output structure

VEBA is built on the GenoPype architecture, which creates a reproducible and easy-to-navigate directory structure. GenoPype's philosophy is to use the same names for all files but to have sample names as subdirectories. This makes it easier to glob files for grepping, concatenating, etc. Nextflow support is in the works...

Example of GenoPype layout:

```
# Project directory
project_directory/

# Temporary directory
project_directory/tmp/

# Log directory
project_directory/logs/
project_directory/logs/[step]__[program-name].e
project_directory/logs/[step]__[program-name].o
project_directory/logs/[step]__[program-name].returncode

# Checkpoint directory
project_directory/checkpoints/
project_directory/checkpoints/

# Intermediate directories for each step
project_directory/intermediate/
project_directory/intermediate/[step]__[program-name]/

# Output directory
project_directory/output/

# Commands
project_directory/commands.sh
```

*VEBA* has all the directories created by `GenoPype` above but is built to hold multiple samples under the same project.

Example of *VEBA*'s default directory layout:

```
ID="sample_1"

# Main output directory
veba_output/

# Assembly directory
veba_output/assembly

# Assembly output for ${ID} sample
veba_output/assembly/${ID}/output/

# Prokaryotic binning for ${ID} sample
veba_output/binning/prokaryotic/${ID}/output/

# Eukaryotic binning
veba_output/binning/eukaryotic/${ID}/output/

# Viral binning
veba_output/binning/viral/${ID}/output/
```

The above are default output locations, but they can be customized.
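Because every sample writes to the same relative path, per-sample files can be gathered with a single glob pattern. The sketch below assumes the default `<root>/<module>/<sample>/output/` layout described in this section. The helpers `collect_sample_outputs` and `concatenate`, and the file name `example.tsv`, are hypothetical and used only for illustration.

```python
from pathlib import Path


def collect_sample_outputs(root: Path, module: str, filename: str) -> dict[str, Path]:
    """Map sample ID -> path for `<root>/<module>/<sample>/output/<filename>`.

    Hypothetical helper; `module` may be nested, e.g. "binning/prokaryotic".
    """
    matches = {}
    for path in sorted((root / module).glob(f"*/output/{filename}")):
        # The sample ID is the directory two levels above the file.
        sample_id = path.parent.parent.name
        matches[sample_id] = path
    return matches


def concatenate(paths: dict[str, Path], destination: Path) -> None:
    """Concatenate the collected files, in sample-ID order, into one file."""
    with destination.open("w") as out:
        for sample_id in sorted(paths):
            out.write(paths[sample_id].read_text())
```

For example, `collect_sample_outputs(Path("veba_output"), "assembly", "example.tsv")` would return one entry per sample directory that contains `output/example.tsv`, and samples without that file are simply absent from the result.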

^__^


Frequently Asked Questions

If perusing the Frequently Asked Questions doesn't address your question, feel free to submit a [Question]

^__^


Owner

  • Name: Josh L. Espinoza
  • Login: jolespin
  • Kind: user
  • Location: Pacific Ocean
  • Company: J. Craig Venter Institute

I like nature, coding, and rock climbing. Staff Scientist at J. Craig Venter Institute

Citation (CITATIONS.md)

### Software Dependencies:

| Dependency         | Citation                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                |
|--------------------|-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| antismash          | Blin K, Shaw S, Kloosterman AM, Charlop-Powers Z, van Wezel   GP, Medema MH, Weber T. antiSMASH 6.0: improving cluster detection and   comparison capabilities. Nucleic Acids Res. 2021 Jul 2;49(W1):W29-W35. doi:   10.1093/nar/gkab335. PMID: 33978755; PMCID: PMC8262755.                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                            |
| barrnap            | Seemann, T. barrnap 0.9 : rapid ribosomal RNA prediction.   https://github.com/tseemann/barrnap                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                         |
| bbtools            | https://sourceforge.net/projects/bbmap/                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                 |
| bowtie2            | Langmead B, Salzberg SL. Fast gapped-read alignment with   Bowtie 2. Nat Methods. 2012 Mar 4;9(4):357-9. doi: 10.1038/nmeth.1923. PMID:   22388286; PMCID: PMC3322381.                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                  |
| busco              | Manni M, Berkeley MR, Seppey M, Simão FA, Zdobnov EM. BUSCO   Update: Novel and Streamlined Workflows along with Broader and Deeper   Phylogenetic Coverage for Scoring of Eukaryotic, Prokaryotic, and Viral   Genomes. Mol Biol Evol. 2021 Sep 27;38(10):4647-4654. doi:   10.1093/molbev/msab199. PMID: 34320186; PMCID: PMC8476166.                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                 |
| checkm2            | Alex Chklovski, Donovan H. Parks, Ben J. Woodcroft, Gene W.   Tyson bioRxiv 2022.07.11.499243; doi:   https://doi.org/10.1101/2022.07.11.499243                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                         |
| checkv             | Nayfach S, Camargo AP, Schulz F, Eloe-Fadrosh E, Roux S,   Kyrpides NC. CheckV assesses the quality and completeness of   metagenome-assembled viral genomes. Nat Biotechnol. 2021 May;39(5):578-585.   doi: 10.1038/s41587-020-00774-7. Epub 2020 Dec 21. PMID: 33349699; PMCID:   PMC8116208.                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                         |
| chopper            | Wouter De Coster, Rosa Rademakers, NanoPack2: population-scale   evaluation of long-read sequencing data, Bioinformatics, Volume 39, Issue 5,   May 2023, btad311, https://doi.org/10.1093/bioinformatics/btad311                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                       |
| clipkit            | Steenwyk JL, Buida TJ 3rd, Li Y, Shen XX, Rokas A. ClipKIT: A   multiple sequence alignment trimming software for accurate phylogenomic   inference. PLoS Biol. 2020 Dec 2;18(12):e3001007. doi:   10.1371/journal.pbio.3001007. PMID: 33264284; PMCID: PMC7735675.                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                     |
| concoct            | Alneberg J, Bjarnason BS, de Bruijn I, Schirmer M, Quick J,   Ijaz UZ, Lahti L, Loman NJ, Andersson AF, Quince C. Binning metagenomic   contigs by coverage and composition. Nat Methods. 2014 Nov;11(11):1144-6.   doi: 10.1038/nmeth.3103. Epub 2014 Sep 14. PMID: 25218180.                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                          |
| coverm             | https://github.com/wwood/CoverM                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                         |
| dada2              | Callahan BJ, McMurdie PJ, Rosen MJ, Han AW, Johnson AJ, Holmes   SP. DADA2: High-resolution sample inference from Illumina amplicon data. Nat   Methods. 2016 Jul;13(7):581-3. doi: 10.1038/nmeth.3869. Epub 2016 May 23.   PMID: 27214047; PMCID: PMC4927377.                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                          |
| das_tool           | Sieber CMK, Probst AJ, Sharrar A, Thomas BC, Hess M, Tringe   SG, Banfield JF. Recovery of genomes from metagenomes via a dereplication,   aggregation and scoring strategy. Nat Microbiol. 2018 Jul;3(7):836-843. doi:   10.1038/s41564-018-0171-1. Epub 2018 May 28. PMID: 29807988; PMCID:   PMC6786971.                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                             |
| dbcan2             | Han Zhang, Tanner Yohe, Le Huang, Sarah Entwistle, Peizhi Wu,   Zhenglu Yang, Peter K Busk, Ying Xu, Yanbin Yin, dbCAN2: a meta server for   automated carbohydrate-active enzyme annotation, Nucleic Acids Research,   Volume 46, Issue W1, 2 July 2018, Pages W95–W101,   https://doi.org/10.1093/nar/gky418                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                          |
| diamond            | Buchfink B, Reuter K, Drost HG. Sensitive protein alignments at tree-of-life scale using DIAMOND. Nat Methods. 2021 Apr;18(4):366-368. doi: 10.1038/s41592-021-01101-x. Epub 2021 Apr 7. PMID: 33828273; PMCID: PMC8026399. |
| fastani            | Jain C, Rodriguez-R LM, Phillippy AM, Konstantinidis KT, Aluru S. High throughput ANI analysis of 90K prokaryotic genomes reveals clear species boundaries. Nat Commun. 2018 Nov 30;9(1):5114. doi: 10.1038/s41467-018-07641-9. PMID: 30504855; PMCID: PMC6269478. |
| fastp              | Chen S, Zhou Y, Chen Y, Gu J. fastp: an ultra-fast all-in-one FASTQ preprocessor. Bioinformatics. 2018 Sep 1;34(17):i884-i890. doi: 10.1093/bioinformatics/bty560. PMID: 30423086; PMCID: PMC6129281. |
| fastq_preprocessor | Espinoza JL, Dupont CL. VEBA: a modular end-to-end suite for in silico recovery, clustering, and analysis of prokaryotic, microeukaryotic, and viral genomes from metagenomes. BMC Bioinformatics. 2022 Oct 12;23(1):419. doi: 10.1186/s12859-022-04973-8. PMID: 36224545; PMCID: PMC9554839. |
| fasttree           | Price MN, Dehal PS, Arkin AP. FastTree 2--approximately maximum-likelihood trees for large alignments. PLoS One. 2010 Mar 10;5(3):e9490. doi: 10.1371/journal.pone.0009490. PMID: 20224823; PMCID: PMC2835736. |
| featurecounts      | Liao Y, Smyth GK, Shi W. featureCounts: an efficient general purpose program for assigning sequence reads to genomic features. Bioinformatics. 2014 Apr 1;30(7):923-30. doi: 10.1093/bioinformatics/btt656. Epub 2013 Nov 13. PMID: 24227677. |
| flye               | Kolmogorov, M., Yuan, J., Lin, Y. et al. Assembly of long, error-prone reads using repeat graphs. Nat Biotechnol 37, 540–546 (2019). https://doi.org/10.1038/s41587-019-0072-8 |
| genomad            | Antonio Pedro Camargo, Simon Roux, Frederik Schulz, Michal Babinski, Yan Xu, Bin Hu, Patrick S. G. Chain, Stephen Nayfach, Nikos C. Kyrpides. bioRxiv 2023.03.05.531206; doi: https://doi.org/10.1101/2023.03.05.531206 |
| gtdbtk             | Chaumeil PA, Mussig AJ, Hugenholtz P, Parks DH. GTDB-Tk v2: memory friendly classification with the genome taxonomy database. Bioinformatics. 2022 Nov 30;38(23):5315-5316. doi: 10.1093/bioinformatics/btac672. PMID: 36218463; PMCID: PMC9710552. |
| hmmer              | Eddy SR. Accelerated Profile HMM Searches. PLoS Comput Biol. 2011 Oct;7(10):e1002195. doi: 10.1371/journal.pcbi.1002195. Epub 2011 Oct 20. PMID: 22039361; PMCID: PMC3197634. |
| humann             | Francesco Beghini, Lauren J McIver, Aitor Blanco-Míguez, Leonard Dubois, Francesco Asnicar, Sagun Maharjan, Ana Mailyan, Paolo Manghi, Matthias Scholz, Andrew Maltez Thomas, Mireia Valles-Colomer, George Weingart, Yancong Zhang, Moreno Zolfo, Curtis Huttenhower, Eric A Franzosa, Nicola Segata (2021) Integrating taxonomic, functional, and strain-level profiling of diverse microbial communities with bioBakery 3. eLife 10:e65088 https://doi.org/ |
| iqtree             | Minh BQ, Schmidt HA, Chernomor O, Schrempf D, Woodhams MD, von Haeseler A, Lanfear R. IQ-TREE 2: New Models and Efficient Methods for Phylogenetic Inference in the Genomic Era. Mol Biol Evol. 2020 May 1;37(5):1530-1534. doi: 10.1093/molbev/msaa015. Erratum in: Mol Biol Evol. 2020 Aug 1;37(8):2461. PMID: 32011700; PMCID: PMC7182206. |
| kofamscan          | Aramaki T, Blanc-Mathieu R, Endo H, Ohkubo K, Kanehisa M, Goto S, Ogata H. KofamKOALA: KEGG Ortholog assignment based on profile HMM and adaptive score threshold. Bioinformatics. 2020 Apr 1;36(7):2251-2252. doi: 10.1093/bioinformatics/btz859. PMID: 31742321; PMCID: PMC7141845. |
| krona              | Ondov BD, Bergman NH, Phillippy AM. Interactive metagenomic visualization in a Web browser. BMC Bioinformatics. 2011 Sep 30;12:385. doi: 10.1186/1471-2105-12-385. PMID: 21961884; PMCID: PMC3190407. |
| maxbin2            | Wu YW, Tang YH, Tringe SG, Simmons BA, Singer SW. MaxBin: an automated binning method to recover individual genomes from metagenomes using an expectation-maximization algorithm. Microbiome. 2014 Aug 1;2:26. doi: 10.1186/2049-2618-2-26. PMID: 25136443; PMCID: PMC4129434. |
| megahit            | Li D, Liu CM, Luo R, Sadakane K, Lam TW. MEGAHIT: an ultra-fast single-node solution for large and complex metagenomics assembly via succinct de Bruijn graph. Bioinformatics. 2015 May 15;31(10):1674-6. doi: 10.1093/bioinformatics/btv033. Epub 2015 Jan 20. PMID: 25609793. |
| metabat2           | Kang DD, Li F, Kirton E, Thomas A, Egan R, An H, Wang Z. MetaBAT 2: an adaptive binning algorithm for robust and efficient genome reconstruction from metagenome assemblies. PeerJ. 2019 Jul 26;7:e7359. doi: 10.7717/peerj.7359. PMID: 31388474; PMCID: PMC6662567. |
| metaeuk            | Levy Karin E, Mirdita M, Söding J. MetaEuk-sensitive, high-throughput gene discovery, and annotation for large-scale eukaryotic metagenomics. Microbiome. 2020 Apr 3;8(1):48. doi: 10.1186/s40168-020-00808-x. PMID: 32245390; PMCID: PMC7126354. |
| metaflye           | Kolmogorov, M., Bickhart, D.M., Behsaz, B. et al. metaFlye: scalable long-read metagenome assembly using repeat graphs. Nat Methods 17, 1103–1110 (2020). https://doi.org/10.1038/s41592-020-00971-x |
| metaspades         | Nurk S, Meleshko D, Korobeynikov A, Pevzner PA. metaSPAdes: a new versatile metagenomic assembler. Genome Res. 2017 May;27(5):824-834. doi: 10.1101/gr.213959.116. Epub 2017 Mar 15. PMID: 28298430; PMCID: PMC5411777. |
| minimap2           | Heng Li, Minimap2: pairwise alignment for nucleotide sequences, Bioinformatics, Volume 34, Issue 18, September 2018, Pages 3094–3100, https://doi.org/10.1093/bioinformatics/bty191 |
| microbeannotator   | Ruiz-Perez, C.A., Conrad, R.E. & Konstantinidis, K.T.   MicrobeAnnotator: a user-friendly, comprehensive functional annotation   pipeline for microbial genomes. BMC Bioinformatics 22, 11 (2021).   https://doi.org/10.1186/s12859-020-03940-5                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                         |
| mmseqs2            | Steinegger M, Söding J. MMseqs2 enables sensitive protein   sequence searching for the analysis of massive data sets. Nat Biotechnol.   2017 Nov;35(11):1026-1028. doi: 10.1038/nbt.3988. Epub 2017 Oct 16. PMID:   29035372.                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                           |
| muscle             | Edgar RC. Muscle5: High-accuracy alignment ensembles enable   unbiased assessments of sequence homology and phylogeny. Nat Commun. 2022 Nov   15;13(1):6968. doi: 10.1038/s41467-022-34630-w. PMID: 36379955; PMCID:   PMC9664440.                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                      |
| prodigal           | Hyatt D, Chen GL, Locascio PF, Land ML, Larimer FW, Hauser LJ.   Prodigal: prokaryotic gene recognition and translation initiation site   identification. BMC Bioinformatics. 2010 Mar 8;11:119. doi:   10.1186/1471-2105-11-119. PMID: 20211023; PMCID: PMC2848648.                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                    |
| prodigal-gv        | Antonio Pedro Camargo, Simon Roux, Frederik Schulz, Michal Babinski, Yan Xu, Bin Hu, Patrick S. G. Chain, Stephen Nayfach, Nikos C. Kyrpides. You can move, but you can't hide: identification of mobile genetic elements with geNomad. bioRxiv 2023.03.05.531206; doi: https://doi.org/10.1101/2023.03.05.531206 |
| pyrodigal          | Larralde, M., (2022). Pyrodigal: Python bindings and interface to Prodigal, an efficient method for gene prediction in prokaryotes. Journal of Open Source Software, 7(72), 4296, https://doi.org/10.21105/joss.04296 |
| qiime2             | Bolyen E, Rideout JR, Dillon MR, Bokulich NA, Abnet CC, Al-Ghalith GA, Alexander H, Alm EJ, Arumugam M, Asnicar F, Bai Y, Bisanz JE, Bittinger K, Brejnrod A, Brislawn CJ, Brown CT, Callahan BJ, Caraballo-Rodríguez AM, Chase J, Cope EK, Da Silva R, Diener C, Dorrestein PC, Douglas GM, Durall DM, Duvallet C, Edwardson CF, Ernst M, Estaki M, Fouquier J, Gauglitz JM, Gibbons SM, Gibson DL, Gonzalez A, Gorlick K, Guo J, Hillmann B, Holmes S, Holste H, Huttenhower C, Huttley GA, Janssen S, Jarmusch AK, Jiang L, Kaehler BD, Kang KB, Keefe CR, Keim P, Kelley ST, Knights D, Koester I, Kosciolek T, Kreps J, Langille MGI, Lee J, Ley R, Liu YX, Loftfield E, Lozupone C, Maher M, Marotz C, Martin BD, McDonald D, McIver LJ, Melnik AV, Metcalf JL, Morgan SC, Morton JT, Naimey AT, Navas-Molina JA, Nothias LF, Orchanian SB, Pearson T, Peoples SL, Petras D, Preuss ML, Pruesse E, Rasmussen LB, Rivers A, Robeson MS 2nd, Rosenthal P, Segata N, Shaffer M, Shiffer A, Sinha R, Song SJ, Spear JR, Swafford AD, Thompson LR, Torres PJ, Trinh P, Tripathi A, Turnbaugh PJ, Ul-Hasan S, van der Hooft JJJ, Vargas F, Vázquez-Baeza Y, Vogtmann E, von Hippel M, Walters W, Wan Y, Wang M, Warren J, Weber KC, Williamson CHD, Willis AD, Xu ZZ, Zaneveld JR, Zhang Y, Zhu Q, Knight R, Caporaso JG. Reproducible, interactive, scalable and extensible microbiome data science using QIIME 2. Nat Biotechnol. 2019 Aug;37(8):852-857. doi: 10.1038/s41587-019-0209-9. Erratum in: Nat Biotechnol. 2019 Sep;37(9):1091. PMID: 31341288; PMCID: PMC7015180. |
| rnaspades          | Bushmanova E, Antipov D, Lapidus A, Prjibelski AD. rnaSPAdes: a de novo transcriptome assembler and its application to RNA-Seq data. Gigascience. 2019 Sep 1;8(9):giz100. doi: 10.1093/gigascience/giz100. PMID: 31494669; PMCID: PMC6736328. |
| samtools           | Li H, Handsaker B, Wysoker A, Fennell T, Ruan J, Homer N, Marth G, Abecasis G, Durbin R; 1000 Genome Project Data Processing Subgroup. The Sequence Alignment/Map format and SAMtools. Bioinformatics. 2009 Aug 15;25(16):2078-9. doi: 10.1093/bioinformatics/btp352. Epub 2009 Jun 8. PMID: 19505943; PMCID: PMC2723002. |
| seqkit             | Shen W, Le S, Li Y, Hu F. SeqKit: A Cross-Platform and Ultrafast Toolkit for FASTA/Q File Manipulation. PLoS One. 2016 Oct 5;11(10):e0163962. doi: 10.1371/journal.pone.0163962. PMID: 27706213; PMCID: PMC5051824. |
| skani              | Shaw, J., Yu, Y.W. Fast and robust metagenomic sequence comparison through sparse chaining with skani. Nat Methods 20, 1661–1665 (2023). https://doi.org/10.1038/s41592-023-02018-3 |
| spades             | Bankevich A, Nurk S, Antipov D, Gurevich AA, Dvorkin M, Kulikov AS, Lesin VM, Nikolenko SI, Pham S, Prjibelski AD, Pyshkin AV, Sirotkin AV, Vyahhi N, Tesler G, Alekseyev MA, Pevzner PA. SPAdes: a new genome assembly algorithm and its applications to single-cell sequencing. J Comput Biol. 2012 May;19(5):455-77. doi: 10.1089/cmb.2012.0021. Epub 2012 Apr 16. PMID: 22506599; PMCID: PMC3342519. |
| tiara              | Karlicki M, Antonowicz S, Karnkowska A. Tiara: deep learning-based classification system for eukaryotic sequences. Bioinformatics. 2022 Jan 3;38(2):344-350. doi: 10.1093/bioinformatics/btab672. PMID: 34570171; PMCID: PMC8722755. |
| trnascan-se        | Chan PP, Lin BY, Mak AJ, Lowe TM. tRNAscan-SE 2.0: improved detection and functional classification of transfer RNA genes. Nucleic Acids Res. 2021 Sep 20;49(16):9077-9096. doi: 10.1093/nar/gkab688. PMID: 34417604; PMCID: PMC8450103. |
| veryfasttree       | César Piñeiro, José M Abuín, Juan C Pichel, Very Fast Tree: speeding up the estimation of phylogenies for large alignments through parallelization and vectorization strategies, Bioinformatics, Volume 36, Issue 17, September 2020, Pages 4658–4659, https://doi.org/10.1093/bioinformatics/btaa582 |
| virfinder          | Ren J, Ahlgren NA, Lu YY, Fuhrman JA, Sun F. VirFinder: a novel k-mer based tool for identifying viral sequences from assembled metagenomic data. Microbiome. 2017 Jul 6;5(1):69. doi: 10.1186/s40168-017-0283-5. PMID: 28683828; PMCID: PMC5501583. |

GitHub Events

Total
  • Create event: 13
  • Release event: 4
  • Issues event: 68
  • Watch event: 17
  • Delete event: 7
  • Issue comment event: 54
  • Push event: 71
  • Pull request review event: 1
  • Pull request event: 35
  • Fork event: 4
Last Year
  • Create event: 13
  • Release event: 4
  • Issues event: 68
  • Watch event: 17
  • Delete event: 7
  • Issue comment event: 54
  • Push event: 71
  • Pull request review event: 1
  • Pull request event: 35
  • Fork event: 4

Issues and Pull Requests

Last synced: 4 months ago

All Time
  • Total issues: 41
  • Total pull requests: 15
  • Average time to close issues: 15 days
  • Average time to close pull requests: less than a minute
  • Total issue authors: 8
  • Total pull request authors: 3
  • Average comments per issue: 1.15
  • Average comments per pull request: 0.0
  • Merged pull requests: 12
  • Bot issues: 0
  • Bot pull requests: 0
Past Year
  • Issues: 41
  • Pull requests: 15
  • Average time to close issues: 15 days
  • Average time to close pull requests: less than a minute
  • Issue authors: 8
  • Pull request authors: 3
  • Average comments per issue: 1.15
  • Average comments per pull request: 0.0
  • Merged pull requests: 12
  • Bot issues: 0
  • Bot pull requests: 0
Top Authors
Issue Authors
  • jolespin (33)
  • nicolazarte (3)
  • javiercnav (3)
  • jamesck2 (2)
  • abissett (2)
  • jarrodscott (2)
  • ksyoungnm (1)
  • grendon (1)
  • joao1980 (1)
  • pereiramemo (1)
  • 411an13 (1)
  • Liping-L (1)
  • Andy-Thmpsn (1)
  • cmkobel (1)
  • bheimbu (1)
Pull Request Authors
  • jolespin (72)
  • zackhenny (1)
  • homer6 (1)
  • jarrodscott (1)

Dependencies

install/docker/Dockerfile docker
  • mambaorg/micromamba 1.4.9 build