revica
A reference-based viral consensus genome assembly pipeline
Repository
Basic Info
- Host: GitHub
- Owner: asereewit
- Language: Nextflow
- Default Branch: main
- Size: 288 MB
Statistics
- Stars: 0
- Watchers: 1
- Forks: 3
- Open Issues: 0
- Releases: 0
Metadata Files
README.md
REVICA
Revica is a reference-based viral consensus genome assembly pipeline for some of the most common respiratory viruses. Revica currently supports genome assembly of:
- Enterovirus (EV)
- Seasonal human coronavirus (HCOV)
- Human metapneumovirus (HMPV)
- Human respiratory syncytial virus (HRSV)
- Human parainfluenza virus (HPIV)
- Measles morbillivirus (MeV)
- Influenza A virus (IAV)
- Influenza B virus (IBV)
- Human adenovirus (HAdV)
Workflow

Usage
- Install Nextflow
- Install Docker
To run Revica:
nextflow run asereewit/revica -r main -latest --input example_samplesheet.csv --output example_output -profile docker
To run Revica on AWS:
nextflow run asereewit/revica -r main -latest --input example_samplesheet.csv --output example_output -profile docker -c your_nextflow_aws.config
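For scripted or batch runs, the two commands above can be assembled programmatically. The following is a minimal sketch in Python that uses only the flags shown above. The helper name `build_revica_command` and its parameters are hypothetical and not part of Revica.

```python
import shlex


def build_revica_command(samplesheet, outdir, aws_config=None):
    """Return the `nextflow run` argument list for a Revica run.

    Hypothetical helper: it only uses flags that appear in the
    commands above and does not validate the input files.
    """
    cmd = [
        "nextflow", "run", "asereewit/revica",
        "-r", "main", "-latest",
        "--input", samplesheet,
        "--output", outdir,
        "-profile", "docker",
    ]
    if aws_config is not None:
        # Extra Nextflow config file, as in the AWS example.
        cmd += ["-c", aws_config]
    return cmd


if __name__ == "__main__":
    cmd = build_revica_command("example_samplesheet.csv", "example_output")
    # Print a shell-quoted command line for inspection.
    print(shlex.join(cmd))
    # To launch the run instead: subprocess.run(cmd, check=True)
```

Returning a list rather than a single string keeps file names with spaces intact when the list is passed to `subprocess.run`.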
Options
|Option|Explanation|
|------|-----------|
| `--input` | samplesheet in csv format with fastq information |
| `--output` | output directory (default: revica_output) |
| `--db` | (multi)fasta file to overwrite the bundled viral database |
| `--run_name` | name for the summary tsv file (default: 'run') |
| `--skip_fastqc` | skip quality control using FastQC (default: false) |
| `--skip_fastp` | skip adapter and read trimming using fastp (default: false) |
| `--run_kraken2` | run Kraken2 for classifying reads (default: false) |
| `--kraken2_db` | Kraken2 database for read classification; must be specified when using `--run_kraken2` |
| `--kraken2_variants_host_filter` | use reads that didn't map to the Kraken2 database for downstream consensus calling |
| `--save_kraken2_unclassified_reads` | save reads that didn't map to the specified Kraken2 database |
| `--save_kraken2_classified_reads` | save reads that map to the specified Kraken2 database |
| `--trim_len` | minimum read length to keep (default: 50) |
| `--save_trimmed_reads` | save trimmed fastq |
| `--sample` | downsample fastq to a certain fraction or number of reads |
| `--ref_min_median_cov` | minimum median coverage on a reference for consensus assembly (default: 3) |
| `--ref_min_genome_cov` | minimum reference coverage percentage for consensus assembly (default: 60%) |
| `--ivar_consensus_t` | minimum frequency threshold to call consensus (default: 0.6) |
| `--ivar_consensus_q` | minimum quality score threshold to call consensus (default: 20) |
| `--ivar_consensus_m` | minimum depth to call consensus (default: 5) |
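To make the two reference coverage thresholds in the table concrete, here is a minimal sketch in Python of how such a gate could work. It is an illustration only and not Revica's actual implementation. The function name is hypothetical, "reference coverage percentage" is assumed to mean the share of reference positions with at least one read, and both thresholds are assumed to be inclusive.

```python
from statistics import median


def passes_coverage_thresholds(depths, min_median_cov=3, min_genome_cov=60.0):
    """Decide whether a reference has enough coverage for consensus assembly.

    depths: per-position read depth along the reference.
    Assumption: genome coverage is the percentage of positions with depth >= 1.
    Assumption: a value equal to a threshold passes.
    """
    if not depths:
        return False
    median_cov = median(depths)
    covered = sum(1 for d in depths if d >= 1)
    genome_cov = 100.0 * covered / len(depths)
    return median_cov >= min_median_cov and genome_cov >= min_genome_cov


if __name__ == "__main__":
    # Median depth 5.5 and 80% of positions covered: passes both defaults.
    print(passes_coverage_thresholds([0, 0, 5, 6, 7, 8, 9, 10, 4, 3]))  # True
    # Median depth 5 but only 50% of positions covered: fails the 60% cutoff.
    print(passes_coverage_thresholds([0] * 5 + [10] * 5))  # False
```

The second example shows why both checks matter: a reference can have a high median depth while large parts of it remain uncovered.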
Usage notes
- Samplesheet example: `assets/samplesheet.csv`
- You can create a samplesheet using the bundled Python script: `python bin/fastq_dir_samplesheet.py fastq_dir samplesheet_name.csv`
- Memory and CPU usage for pipeline processes can be adjusted in `conf/base.config`
- Process arguments can be adjusted in `conf/modules.config`
- You can use your own reference(s) for consensus genome assembly by specifying the `--db` parameter followed by your fasta file.
  - Reference header format: `>reference_accession reference_tag reference_header_info`
  - It's important to tag the fasta sequences for the same species or gene segment with the same name or abbreviation in the header. Otherwise, the pipeline will generate a consensus genome for every reference where the median coverage of the first alignment exceeds the specified threshold (default: 3).
  - Revica works with segmented viral genomes. Keep the different gene segments as separate sequences and tag them in the reference fasta file.
- If you are using Docker on Linux, check out Docker's post-installation steps for Linux (especially cgroup swap limit capabilities support) to configure Linux to work better with Docker.
- By default, Docker has full access to the host's RAM and CPU resources, but if you are using macOS, go to Settings -> Resources in Docker Desktop to make sure enough resources are allocated to Docker containers.
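Before passing a custom reference file with `--db`, it can help to check that every header follows the `>reference_accession reference_tag reference_header_info` format and to see which references share a tag. The following is a minimal sketch in Python. The function names, accessions, and tags are hypothetical and used for illustration only; this is not part of Revica.

```python
from collections import defaultdict


def parse_reference_header(line):
    """Split '>accession tag free text' into (accession, tag, info)."""
    if not line.startswith(">"):
        raise ValueError(f"not a FASTA header: {line!r}")
    parts = line[1:].strip().split(maxsplit=2)
    if len(parts) < 2:
        raise ValueError(f"header is missing a reference tag: {line!r}")
    accession, tag = parts[0], parts[1]
    info = parts[2] if len(parts) > 2 else ""
    return accession, tag, info


def group_references_by_tag(fasta_lines):
    """Map each reference tag to the accessions that carry it."""
    groups = defaultdict(list)
    for line in fasta_lines:
        if line.startswith(">"):
            accession, tag, _ = parse_reference_header(line)
            groups[tag].append(accession)
    return dict(groups)


if __name__ == "__main__":
    # Made-up headers: two references for one species, one gene segment.
    example = [
        ">REF001 HRSV example strain A",
        "ACGT",
        ">REF002 HRSV example strain B",
        "ACGT",
        ">REF003 IAV_HA example segment 4",
        "ACGT",
    ]
    print(group_references_by_tag(example))
    # {'HRSV': ['REF001', 'REF002'], 'IAV_HA': ['REF003']}
```

A tag that maps to a single accession when several were expected usually points to a typo in the header, which would cause those references to be treated as unrelated.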
Contact
For bug reports, please email aseree@uw.edu or raise an issue on GitHub.
Owner
- Login: asereewit
- Kind: user
- Repositories: 1
- Profile: https://github.com/asereewit
Citation (CITATION.cff)
message: "If you use this software, please cite it as below."
title: "Revica"
abstract: Revica is a reference-based viral consensus genome assembly pipeline
authors:
  - family-names: Sereewit
    given-names: Jaydee
    orcid: https://orcid.org/0000-0002-7937-6398
  - family-names: Greninger
    given-names: Alexander
    orcid: https://orcid.org/0000-0002-7443-0527
date-released: 2022-04-26
repository-code: https://github.com/greninger-lab/revica