parallel-virfinder
Run virfinder in parallel
https://github.com/quadram-institute-bioscience/parallel-virfinder
Science Score: 75.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ✓ CITATION.cff file: Found CITATION.cff file
- ✓ codemeta.json file: Found codemeta.json file
- ✓ .zenodo.json file: Found .zenodo.json file
- ✓ DOI references: Found 3 DOI reference(s) in README
- ✓ Academic publication links: Links to springer.com
- ○ Academic email domains
- ✓ Institutional organization owner: Organization quadram-institute-bioscience has institutional domain (quadram.ac.uk)
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: Low similarity (10.3%) to scientific vocabulary
Repository
Run virfinder in parallel
Basic Info
- Host: GitHub
- Owner: quadram-institute-bioscience
- License: apache-2.0
- Language: Python
- Default Branch: main
- Size: 90.8 KB
Statistics
- Stars: 7
- Watchers: 5
- Forks: 0
- Open Issues: 0
- Releases: 4
Metadata Files
README.md
parallel-virfinder
Run virfinder in parallel, saving both scores and FASTA file
Installation
Via conda:
```bash
conda install -c bioconda -c conda-forge parallel-virfinder
```
Usage
```bash
parallel-virfinder.py -i input.fasta -o output.csv -n THREADS [-f output.fasta]
```
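The README does not describe how the wrapper distributes sequences across processes. As a rough illustration of one common approach (not necessarily what parallel-virfinder does internally), the sketch below parses FASTA text and deals the records round-robin into a fixed number of chunks, each of which could then be scored by a separate process. The function names `read_fasta` and `split_round_robin` are hypothetical.

```python
def read_fasta(text):
    """Parse FASTA-formatted text into a list of (header, sequence) tuples."""
    records = []
    header, seq = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if any.
            if header is not None:
                records.append((header, "".join(seq)))
            header, seq = line[1:], []
        else:
            seq.append(line)
    if header is not None:
        records.append((header, "".join(seq)))
    return records


def split_round_robin(records, n_chunks):
    """Deal records into n_chunks lists so chunk sizes differ by at most one."""
    chunks = [[] for _ in range(n_chunks)]
    for i, rec in enumerate(records):
        chunks[i % n_chunks].append(rec)
    return chunks


# Toy input: five short records split across two hypothetical workers.
example = ">a\nACGT\n>b\nGG\nCC\n>c\nTTTT\n>d\nA\n>e\nCCGG\n"
records = read_fasta(example)
chunks = split_round_robin(records, 2)
print([len(c) for c in chunks])
```

Round-robin assignment keeps the number of records per chunk balanced, although it does not balance total sequence length; a length-aware split would be a reasonable refinement for assemblies with very uneven contig sizes.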
Options
```text
usage: parallel-virfinder.py [-h] -i INPUT -o OUTPUT [-f FASTA] [-n PARALLEL]
                             [-t TMPDIR] [-s MIN_SCORE] [-p MAXPVALUE]
                             [--no-check] [-v] [-d]

Execute virfinder on a FASTA file in parallel

options:
  -h, --help            show this help message and exit
  -i INPUT, --input INPUT
                        Input FASTA file
  -o OUTPUT, --output OUTPUT
                        Output CSV file
  -f FASTA, --fasta FASTA
                        Save FASTA file
  -n PARALLEL, --parallel PARALLEL
                        Number of parallel processes [default: 4]
  -t TMPDIR, --tmpdir TMPDIR
                        Temporary directory [default: /tmp]

VirFinder options:
  -s MIN_SCORE, --min-score MIN_SCORE
                        Minimum score [default: 0.9]
  -p MAXPVALUE, --max-p-value MAXPVALUE
                        Maximum p-value [default: 0.05]

Running options:
  --no-check            Do not check dependencies at startup
  -v, --verbose         Verbose output
  -d, --debug           Debug output and do not remove temporary files
```
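To make the two VirFinder thresholds concrete, here is a minimal sketch of how a minimum score and a maximum p-value could be applied together to a table of predictions. It uses the documented defaults (0.9 and 0.05), but the column names (`name`, `length`, `score`, `pvalue`), the inclusive comparisons, and the helper `filter_hits` are assumptions made for illustration; check the actual output CSV before relying on them.

```python
import csv
import io


def filter_hits(rows, min_score=0.9, max_pvalue=0.05):
    """Keep rows that pass both thresholds.

    Inclusive bounds (>= and <=) are an assumption, not confirmed behaviour.
    """
    return [
        r for r in rows
        if float(r["score"]) >= min_score and float(r["pvalue"]) <= max_pvalue
    ]


# Made-up example data; the column names are assumed, not taken from the tool.
example_csv = """name,length,score,pvalue
contig_1,5000,0.97,0.01
contig_2,3200,0.95,0.20
contig_3,8100,0.42,0.60
"""

rows = list(csv.DictReader(io.StringIO(example_csv)))
hits = filter_hits(rows)
print([r["name"] for r in hits])
```

In this toy table only `contig_1` passes: `contig_2` has a high score but fails the p-value cutoff, and `contig_3` fails both.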
Test
Clone this repository, activate the conda environment and run:
```bash
# Activate the appropriate conda environment, if needed
bash test/test.sh
```
Benchmark
Compared with a parallel implementation in R, this wrapper performs better (shorter run times, lower memory usage). See benchmark.
Citations
If you use parallel-virfinder, please cite the following papers:
Ren, J., Ahlgren, N. A., Lu, Y. Y., Fuhrman, J. A., & Sun, F. (2017). VirFinder: a novel k-mer based tool for identifying viral sequences from assembled metagenomic data. Microbiome, 5(1), 1-20.
Telatin, A., Fariselli, P., & Birolo, G. (2021). Seqfu: a suite of utilities for the robust and reproducible manipulation of sequence files. Bioengineering, 8(5), 59.
License
VirFinder (see license) is free for academic or non-commercial use only. SeqFu and this wrapper are free to use.
Owner
- Name: Quadram Institute Bioscience
- Login: quadram-institute-bioscience
- Kind: organization
- Website: https://quadram.ac.uk/
- Repositories: 25
- Profile: https://github.com/quadram-institute-bioscience
Citation (CITATION.cff)
# This CITATION.cff file was generated with cffinit.
# Visit https://bit.ly/cffinit to generate yours today!
cff-version: 1.2.0
title: parallel-virfinder
message: >-
  Please cite VirFinder and SeqFu if using this
  package
type: software
authors:
  - given-names: Andrea
    family-names: Telatin
    orcid: 'https://orcid.org/0000-0001-7619-281X'