https://github.com/cfia-ncfad/nf-ionampliseq
Ion Torrent AmpliSeq sequencing data analysis pipeline for FMDV and CSFV
Statistics
- Stars: 0
- Watchers: 0
- Forks: 1
- Open Issues: 1
- Releases: 3
Fork of peterk87/nf-ionampliseq
Created about 5 years ago
· Last pushed 10 months ago
https://github.com/CFIA-NCFAD/nf-ionampliseq/blob/master/
# CFIA-NCFAD/nf-ionampliseq
Read mapping, variant calling and consensus sequence generation workflow for Ion Torrent AmpliSeq sequence data of [FMDV] and [CSFV].
## Introduction
The pipeline is built using [Nextflow](https://www.nextflow.io), a workflow tool for running tasks portably across multiple compute infrastructures. It comes with Docker containers, which simplify installation and make results highly reproducible.
## Quick Start
1. Install [`nextflow`](https://nf-co.re/usage/installation)
2. Install either [`Docker`](https://docs.docker.com/engine/installation/) or [`Singularity`](https://www.sylabs.io/guides/3.0/user-guide/) for full pipeline reproducibility _(please only use [`Conda`](https://conda.io/miniconda.html) as a last resort; see [docs](https://nf-co.re/usage/configuration#basic-configuration-profiles))_
3. Download the pipeline and test it on a minimal dataset with a single command:
```bash
nextflow run CFIA-NCFAD/nf-ionampliseq -profile test,<docker/singularity/conda>
```
> Please check [nf-core/configs](https://github.com/nf-core/configs#documentation) to see if a custom config file to run nf-core pipelines already exists for your institute. If so, you can simply use `-profile <institute>` in your command. This will enable either `docker` or `singularity` and set the appropriate execution settings for your local compute environment.
4. Start running your own analysis!
```bash
nextflow run CFIA-NCFAD/nf-ionampliseq -profile <docker/singularity/conda> --input '/path/to/iontorrent/*.bam'
```
See [usage docs](docs/usage.md) for all of the available options when running the pipeline.
## Documentation
The CFIA-NCFAD/nf-ionampliseq pipeline comes with documentation, which you can read at [https://CFIA-NCFAD/nf-ionampliseq/docs](https://CFIA-NCFAD/nf-ionampliseq/docs) or find in the [`docs/` directory](docs).
This workflow includes several built-in analysis packages for Ion Torrent AmpliSeq sequence data of [CSFV] and [FMDV]. Users can also specify their own analysis packages; however, these files must be compatible with the Ion Torrent Software Suite, including [tmap] and [tvc].
### Input
There are three ways to specify input: `--input`, `--rundir`, or `--sample_sheet` together with `--panel` (or with `--ref_fasta` and `--bed_file`). For all input modes, the primary input is BAM files generated by [Torrent Suite][] [tmap][] tooling.
The simplest way of running this workflow is with `--input` pointing at the BAM files produced by Ion Torrent [Torrent Suite]:
```bash
nextflow run CFIA-NCFAD/nf-ionampliseq -profile <docker/singularity/conda> --input '/path/to/*.bam'
```
With BAM file inputs specified via `--input`, the sample name and correct AmpliSeq panel (either [CSFV] or [FMDV]) will be determined from the BAM file headers.
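To preview what a BAM header contains before running the workflow, you can inspect its `@RG` (read group) lines. The sketch below is only an illustration: `rg_sample_names` is a hypothetical helper, and it assumes the sample name is stored in the `SM` tag of the `@RG` lines, as defined by the SAM specification. The header fields that the pipeline actually reads may differ.

```bash
# Hypothetical helper (not part of the pipeline): print the unique SM (sample)
# tags found in @RG header lines. Reads SAM header text on stdin.
rg_sample_names() {
  awk -F'\t' '/^@RG/ {
    for (i = 2; i <= NF; i++)
      if ($i ~ /^SM:/) print substr($i, 4)   # drop the leading "SM:"
  }' | sort -u
}

# Usage (requires samtools to be installed):
#   samtools view -H /path/to/sample.bam | rg_sample_names
```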
You can also specify the Ion Torrent sequencing run directory as input with `--rundir`. All BAM files matching `IonCode_*_rawlib.bam` will be run through the workflow, with sample names retrieved from the `ion_params_00.json` file.
```bash
nextflow run CFIA-NCFAD/nf-ionampliseq -profile <docker/singularity/conda> --rundir /path/to/rundir
```
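To check in advance which files a run directory would contribute, you can list the BAM files that match the `IonCode_*_rawlib.bam` pattern. This is a minimal sketch: `list_rundir_bams` is a hypothetical helper, and it searches the whole directory tree, which may be broader than what the pipeline itself does.

```bash
# Hypothetical helper (not part of the pipeline): list BAM files under a run
# directory whose names match the IonCode_*_rawlib.bam pattern.
list_rundir_bams() {
  local rundir="$1"
  find "$rundir" -type f -name 'IonCode_*_rawlib.bam' | sort
}

# Usage:
#   list_rundir_bams /path/to/rundir
```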
If you wish to use custom sample names and a specific AmpliSeq panel, it is recommended that you specify the following:
- `--sample_sheet`
  - CSV file with 2 columns:
    - Column 1: sample name
    - Column 2: path to raw BAM file from Ion Torrent (absolute path recommended)
- `--panel`
  - either `csf` or `fmd` for a built-in AmpliSeq panel; otherwise, you will need to specify a reference genome(s) FASTA file (`--ref_fasta`) and a detailed BED file (`--bed_file`)
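As an illustration of the sample sheet layout described above, the following sketch writes a two-column CSV with absolute BAM paths. `make_sample_sheet` is a hypothetical helper, and two things here are assumptions rather than documented behaviour: the file is written without a header row, and each sample name is taken from the BAM file name. See the [usage docs](docs/usage.md) for the exact format the pipeline expects.

```bash
# Hypothetical helper (not part of the pipeline): write a 2-column sample sheet
# (sample name, absolute BAM path) for every *.bam file in a directory.
# Assumptions: no header row; sample name = BAM file name without ".bam".
make_sample_sheet() {
  local bam_dir="$1" out_csv="$2" abs_dir bam
  abs_dir="$(cd "$bam_dir" && pwd)"   # absolute paths are recommended
  : > "$out_csv"                      # create or truncate the output file
  for bam in "$abs_dir"/*.bam; do
    [ -e "$bam" ] || continue         # skip when the glob matches nothing
    printf '%s,%s\n' "$(basename "$bam" .bam)" "$bam" >> "$out_csv"
  done
}

# Usage:
#   make_sample_sheet /path/to/bams samplesheet.csv
```

The resulting file could then be passed to the workflow with, for example, `--sample_sheet samplesheet.csv --panel fmd`, after editing the first column to the custom names you want.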
### Steps
1. [BAM Sample Info](docs/output.md#bam-sample-info) - Sample info extracted from BAM file headers.
2. [FASTQ Reads](docs/output.md#fastq-reads) - BAM to FASTQ output with [Samtools][].
3. [FastQC](docs/output.md#fastqc) - Read quality control using [FastQC][].
4. [Mash](docs/output.md#mash) - Top reference genome determination by [Mash][] screen.
5. [TMAP](docs/output.md#tmap) - Read mapping using the Thermo Fisher mapper [tmap].
6. [Samtools](docs/output.md#samtools) - Read mapping stats calculation with [Samtools][].
7. [Mosdepth](docs/output.md#mosdepth) - Coverage stats calculated by [Mosdepth][].
8. [TVC](docs/output.md#tvc) - Variant calling using the Thermo Fisher variant caller [tvc].
9. [Bcftools](docs/output.md#bcftools) - Variant filtering for majority consensus sequence generation and variant statistics for MultiQC report.
10. [Consensus Sequence](docs/output.md#consensus-sequence) - Majority consensus sequence with `N` masking of low/no coverage positions using [Bcftools][].
11. [Edlib Pairwise Alignment](docs/output.md#edlib-pairwise-alignment) - Pairwise global alignment and edit distance between reference and consensus sequences using [Edlib].
12. [BLAST Analysis](docs/output.md#blast-analysis) - Optional nucleotide [BLAST][] analysis against a user-specified BLAST DB.
13. [MultiQC](docs/output.md#multiqc) - Aggregate report describing results from the whole pipeline. Consensus sequences are embedded in the [MultiQC][] HTML report and can be downloaded from it.
14. [Pipeline information](docs/output.md#pipeline-information) - Report metrics generated during the workflow execution.
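To make the idea behind step 4 concrete, the sketch below picks the best-matching reference from the tab-separated output of `mash screen`, whose columns are identity, shared-hashes, median-multiplicity, p-value, query-ID and query-comment. `top_mash_ref` is a hypothetical helper that simply takes the row with the highest identity. The criteria the pipeline itself uses to choose the top reference are not described here and may differ.

```bash
# Hypothetical helper (not part of the pipeline): print the query-ID (column 5)
# of the row with the highest identity (column 1) in `mash screen` output.
top_mash_ref() {
  local screen_tsv="$1"
  sort -t $'\t' -k1,1gr "$screen_tsv" | head -n 1 | cut -f 5
}

# Usage (requires mash; file names are placeholders):
#   mash screen references.msh reads.fastq > screen.tsv
#   top_mash_ref screen.tsv
```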
### Output
For more information about the analysis steps and output of the pipeline, see the [output documentation](docs/output.md).
## Credits
CFIA-NCFAD/nf-ionampliseq was originally written by Peter Kruczkiewicz.
## Contributions and Support
If you encounter any issues when running this pipeline, please see the documentation above.
If you would like to contribute to this pipeline, please see the [contributing guidelines](.github/CONTRIBUTING.md).
The development of this pipeline tries to follow the guidelines and best-practices established by [nf-core](https://nf-co.re/) and was bootstrapped using [nf-core tools](https://nf-co.re/tools#creating-a-new-workflow). One day this pipeline may be added to nf-core.
## Citation
If you use CFIA-NCFAD/nf-ionampliseq for your analysis, please cite it using the following DOI: [10.5281/zenodo.16821383](https://doi.org/10.5281/zenodo.16821383)
```text
Kruczkiewicz, P. and Lung, O. (2025) CFIA-NCFAD/nf-ionampliseq: 2.0.1. Zenodo. doi:10.5281/zenodo.16821383.
```
See [citations.bib](citations.bib) for citation information for the software and bioinformatics tools that CFIA-NCFAD/nf-ionampliseq uses.
You can cite the `nf-core` publication as follows:
> **The nf-core framework for community-curated bioinformatics pipelines.**
>
> Philip Ewels, Alexander Peltzer, Sven Fillinger, Harshil Patel, Johannes Alneberg, Andreas Wilm, Maxime Ulysse Garcia, Paolo Di Tommaso & Sven Nahnsen.
>
> _Nat Biotechnol._ 2020 Feb 13. doi: [10.1038/s41587-020-0439-x](https://dx.doi.org/10.1038/s41587-020-0439-x).
> ReadCube: [Full Access Link](https://rdcu.be/b1GjZ)
[Bcftools]: https://samtools.github.io/bcftools/bcftools.html
[BLAST]: https://blast.ncbi.nlm.nih.gov/
[CSFV]: https://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi
[Edlib]: https://github.com/Martinsos/edlib
[FastQC]: https://www.bioinformatics.babraham.ac.uk/projects/fastqc/
[FMDV]: https://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi?mode=Info&id=12110&lvl=3&lin=f&keep=1&srchmode=1&unlock
[Mash]: https://doi.org/10.1186/s13059-019-1841-x
[Mosdepth]: https://github.com/brentp/mosdepth
[MultiQC]: https://docs.seqera.io/multiqc
[Samtools]: https://www.htslib.org/
[TMAP]: https://github.com/iontorrent/TS/
[Torrent Suite]: https://github.com/iontorrent/TS
[TVC]: http://updates.iontorrent.com/tvc_standalone/
Owner
- Name: CFIA NCFAD - Genomics Unit
- Login: CFIA-NCFAD
- Kind: organization
Canadian Food Inspection Agency National Centre for Foreign Animal Disease