parallel-virfinder

Run virfinder in parallel

https://github.com/quadram-institute-bioscience/parallel-virfinder

Science Score: 75.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
    Found CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
    Found 3 DOI reference(s) in README
  • Academic publication links
    Links to: springer.com
  • Academic email domains
  • Institutional organization owner
    Organization quadram-institute-bioscience has institutional domain (quadram.ac.uk)
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (10.3%) to scientific vocabulary

Repository

Run virfinder in parallel

Basic Info
  • Host: GitHub
  • Owner: quadram-institute-bioscience
  • License: apache-2.0
  • Language: Python
  • Default Branch: main
  • Size: 90.8 KB
Statistics
  • Stars: 7
  • Watchers: 5
  • Forks: 0
  • Open Issues: 0
  • Releases: 4
Created over 3 years ago · Last pushed over 3 years ago
Metadata Files
Readme License Citation

README.md

parallel-virfinder


Run virfinder in parallel, saving both scores and FASTA file

Installation

Via conda:

```bash
conda install -c bioconda -c conda-forge parallel-virfinder
```

Usage

```bash
parallel-virfinder.py -i input.fasta -o output.csv -n THREADS [-f output.fasta]
```
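When calling the wrapper from a script, it can help to assemble the argument list programmatically. The following is a minimal sketch, not part of the package: `build_command` is a hypothetical helper, and it relies only on the flags shown in the help text under Options, where `-n` sets the number of parallel processes and `-t` the temporary directory.

```python
import shlex


def build_command(input_fasta, output_csv, fasta_out=None, processes=4,
                  tmpdir="/tmp", min_score=0.9, max_pvalue=0.05):
    """Assemble an argument list for parallel-virfinder.py.

    Hypothetical helper: only flags listed in the tool's help text are used,
    and the defaults mirror the documented ones.
    """
    cmd = [
        "parallel-virfinder.py",
        "-i", str(input_fasta),   # input FASTA file
        "-o", str(output_csv),    # output CSV file
        "-n", str(processes),     # number of parallel processes
        "-t", str(tmpdir),        # temporary directory
        "-s", str(min_score),     # minimum score
        "-p", str(max_pvalue),    # maximum p-value
    ]
    if fasta_out is not None:
        cmd += ["-f", str(fasta_out)]  # optionally save a FASTA file
    return cmd


cmd = build_command("input.fasta", "output.csv",
                    fasta_out="output.fasta", processes=8)
print(shlex.join(cmd))

# To actually run it (requires parallel-virfinder.py on PATH):
# import subprocess
# subprocess.run(cmd, check=True)
```

Building a list rather than a single string avoids shell quoting problems when file names contain spaces.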

Options

```text
usage: parallel-virfinder.py [-h] -i INPUT -o OUTPUT [-f FASTA] [-n PARALLEL]
                             [-t TMPDIR] [-s MIN_SCORE] [-p MAXPVALUE]
                             [--no-check] [-v] [-d]

Execute virfinder on a FASTA file in parallel

options:
  -h, --help            show this help message and exit
  -i INPUT, --input INPUT
                        Input FASTA file
  -o OUTPUT, --output OUTPUT
                        Output CSV file
  -f FASTA, --fasta FASTA
                        Save FASTA file
  -n PARALLEL, --parallel PARALLEL
                        Number of parallel processes [default: 4]
  -t TMPDIR, --tmpdir TMPDIR
                        Temporary directory [default: /tmp]

VirFinder options:
  -s MINSCORE, --min-score MINSCORE
                        Minimum score [default: 0.9]
  -p MAXPVALUE, --max-p-value MAXPVALUE
                        Maximum p-value [default: 0.05]

Running options:
  --no-check            Do not check dependencies at startup
  -v, --verbose         Verbose output
  -d, --debug           Debug output and do not remove temporary files
```
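To illustrate what the two VirFinder thresholds mean, here is a minimal sketch of filtering a scores table by minimum score and maximum p-value. It is not the wrapper's own code. The column names (`name`, `length`, `score`, `pvalue`), the toy rows, and the choice of inclusive comparisons applied jointly are all assumptions made for the example; check the header of your actual output CSV before reusing it.

```python
import csv
import io


def filter_hits(rows, min_score=0.9, max_pvalue=0.05):
    """Keep rows that pass both thresholds (assumed inclusive)."""
    return [
        row for row in rows
        if float(row["score"]) >= min_score
        and float(row["pvalue"]) <= max_pvalue
    ]


# Toy data with hypothetical column names; not real VirFinder output.
example_csv = """name,length,score,pvalue
contig_1,5000,0.97,0.01
contig_2,3200,0.95,0.20
contig_3,8100,0.42,0.03
"""

rows = list(csv.DictReader(io.StringIO(example_csv)))
hits = filter_hits(rows)
print([row["name"] for row in hits])  # → ['contig_1']
```

Here `contig_2` fails the p-value cutoff and `contig_3` fails the score cutoff, so only `contig_1` is kept.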

Test

Clone this repository, activate the conda environment and run:

```bash
# Activate the appropriate conda environment, if needed
bash test/test.sh
```

Benchmark

Compared with a parallel implementation in R, this wrapper runs faster and uses less memory. See benchmark.

Citations

If you use parallel-virfinder, please cite the following papers:

  • Ren, J., Ahlgren, N. A., Lu, Y. Y., Fuhrman, J. A., & Sun, F. (2017). VirFinder: a novel k-mer based tool for identifying viral sequences from assembled metagenomic data. Microbiome, 5(1), 1-20.

  • Telatin, A., Fariselli, P., & Birolo, G. (2021). SeqFu: a suite of utilities for the robust and reproducible manipulation of sequence files. Bioengineering, 8(5), 59.

License

VirFinder (see license) is free for academic or non-commercial use only. SeqFu and this wrapper are free to use.

Owner

  • Name: Quadram Institute Bioscience
  • Login: quadram-institute-bioscience
  • Kind: organization

Citation (CITATION.cff)

# This CITATION.cff file was generated with cffinit.
# Visit https://bit.ly/cffinit to generate yours today!

cff-version: 1.2.0
title: parallel-virfinder
message: >-
  Please cite VirFinder and SeqFu if using this
  package
type: software
authors:
  - given-names: Andrea
    family-names: Telatin
    orcid: 'https://orcid.org/0000-0001-7619-281X'
