Recent Releases of https://github.com/cfia-ncfad/nf-flu
https://github.com/cfia-ncfad/nf-flu - 3.3.8 - ambiguous bases in consensus fix
This bugfix patch release fixes an issue where a large number of ambiguous bases in the IRMA consensus can hinder
reference selection (#67). This release also addresses an issue with using the Clair3 Biocontainers image resulting in
incomplete variant calling results, affecting nf-flu executions with the docker or singularity profiles. The
official Clair3 image is used instead. nf-flu executions using Conda and Mamba are unaffected.
Changes
- Create majority consensus from IRMA allAlleles.txt files for BLASTN search
- Add irma-alleles2fasta.v, statically compiled binary (irma-alleles2fasta) and Bash build script for parsing IRMA allAlleles.txt to output naive majority consensus (i.e. whatever the top non-dash allele is at each position) so that the sequence used for BLASTN search does not contain any ambiguous bases.
- Updated nanopore.nf subworkflow to use IRMA majority consensus with no ambiguous bases for BLASTN search so that longer, more contiguous matches are possible to aid in top reference sequence selection in some cases.
- Updated parse_influenza_blast_results.py to better handle extraction of sample name and segment number from BLASTN query accession/version (qaccver).
- Using official Clair3 Docker image and updating Clair3 to v1.0.5
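The naive majority consensus described in the changes above can be illustrated with a minimal Python sketch. It is not the irma-alleles2fasta implementation (which is written in V); the input shape, a list of (position, allele, count) tuples, and the function name are assumptions made for illustration only.

```python
def majority_consensus(allele_rows):
    """Return the most frequent non-dash allele at each position, in position order.

    allele_rows: iterable of (position, allele, count) tuples.
    This input shape is hypothetical, not the real allAlleles.txt layout.
    """
    best = {}  # position -> (count, allele)
    for position, allele, count in allele_rows:
        if allele == "-":
            continue  # ignore gap alleles so the output has no dashes
        # Strict ">" keeps the first allele seen when counts are tied.
        if position not in best or count > best[position][0]:
            best[position] = (count, allele)
    # Positions where only a dash was observed are simply absent from the output.
    return "".join(best[pos][1] for pos in sorted(best))


rows = [
    (1, "A", 90), (1, "G", 10),
    (2, "-", 60), (2, "C", 40),  # dash is the top allele, so C is used instead
    (3, "T", 50), (3, "C", 50),  # tie: the first allele seen (T) is kept
]
consensus = majority_consensus(rows)
print(consensus)
```

Because every position resolves to a single base, the resulting sequence contains no ambiguity codes, which is the property the BLASTN search relies on.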
What's Changed
- Fix #67 where IRMA consensus ambiguous bases can result in poor reference sequence selection by @peterk87 in https://github.com/CFIA-NCFAD/nf-flu/pull/68
Full Changelog: https://github.com/CFIA-NCFAD/nf-flu/compare/3.3.7...3.3.8
- Nextflow
Published by peterk87 over 2 years ago
https://github.com/cfia-ncfad/nf-flu - 3.3.7 - IBV PB1/PB2 mislabeling bugfix patch release
This bugfix patch release fixes an issue with mislabeling of PB1 and PB2 segments for Influenza B virus results (#65).
What's Changed
- Fix PB1 and PB2 mislabeling in IBV by @peterk87 in https://github.com/CFIA-NCFAD/nf-flu/pull/66
Full Changelog: https://github.com/CFIA-NCFAD/nf-flu/compare/3.3.6...3.3.7
- Nextflow
Published by peterk87 over 2 years ago
https://github.com/cfia-ncfad/nf-flu - CFIA-NCFAD/nf-flu v3.3.6
Fixes
- docs updated to show proper profile to run test profiles for Illumina and Nanopore locally (#52)
- test_nanopore profile has been updated to run locally with the test samplesheet.csv updated with URLs to FASTQ files at CFIA-NCFAD/nf-test-datasets
- read samplesheet CSV in parse_influenza_blast_results.py with all columns read as string rather than inferred (#54)
- handle cloud storage paths and non-HTTP/FTP URLs in user samplesheets (#55)
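A short Python sketch may help show why reading every samplesheet column as a string matters, and how a cloud storage path differs from a local path. It uses only the standard library rather than whatever parser nf-flu uses; the column names, sample names and the s3:// bucket are hypothetical.

```python
import csv
import io
from urllib.parse import urlparse

# Hypothetical samplesheet: sample names that look like numbers.
samplesheet = io.StringIO(
    "sample,reads\n"
    "007,reads/007.fastq.gz\n"
    "1e3,s3://example-bucket/1e3.fastq.gz\n"
)

rows = list(csv.DictReader(samplesheet))  # csv yields every field as str
names = [row["sample"] for row in rows]

# What type inference could do to the same values: the names are lost.
inferred = [float(name) for name in names]

# A URL scheme distinguishes remote/cloud paths from local ones.
schemes = [urlparse(row["reads"]).scheme for row in rows]

print(names, inferred, schemes)
```

Keeping the values as strings preserves "007" and "1e3" as sample names, whereas numeric inference would collapse them to 7 and 1000.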
- Nextflow
Published by peterk87 over 2 years ago
https://github.com/cfia-ncfad/nf-flu - 3.3.5 - negative control fix
Fixes
- handling of empty IRMA amended_consensus/ when running a negative control or blank sequence (#47)
- Nextflow
Published by peterk87 almost 3 years ago
https://github.com/cfia-ncfad/nf-flu - 3.3.4 - subtype report summary columns fix
Fixes
- Subtyping report summary sheet "1_Subtype Predictions" shows only N subtype results
- Nextflow
Published by peterk87 almost 3 years ago
https://github.com/cfia-ncfad/nf-flu - 3.3.3 - subtyping report and Nanopore variant calling fixes
This release fixes issues with subtype report generation script (parse_influenza_blast_results.py), primarily subtype predictions being N/A for samples where the top BLAST hits are user-specified sequences for the HA and NA segments.
Fixes
- subtype prediction based off majority H/N prediction of all BLAST hits instead of just the top X matches (#40)
- the top hit for H/N can also be a user-specified sequence without subtype information
- top segment matches are now sorted by sample name, segment name and BLAST bitscore
- output concatenated Nanopore FASTQ to ${outdir}/fastq by default (#43)
- Handle ambiguous bases in reference sequences by having Clair3 not convert those positions to N and Bcftools produce a warning instead of an error (#42)
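The majority-based subtype prediction mentioned in the fixes above could look roughly like the following Python sketch. It is an illustration under assumptions, not the logic of parse_influenza_blast_results.py: the function name is hypothetical, and hits without subtype information (such as user-specified sequences) are represented as None.

```python
from collections import Counter


def majority_subtype(hit_subtypes):
    """Return the most common subtype label among BLAST hits, or None.

    hit_subtypes: list of labels such as "H5", with None for hits that
    carry no subtype information (hypothetical representation).
    """
    counts = Counter(s for s in hit_subtypes if s)
    if not counts:
        return None  # no hit had subtype information
    return counts.most_common(1)[0][0]


# The top hit is a user-specified sequence with no subtype (None),
# yet the remaining hits still give a clear majority.
ha_hits = [None, "H5", "H5", "H7", "H5"]
prediction = majority_subtype(ha_hits)
print(prediction)
```

Voting over all hits means a top hit without subtype metadata no longer forces an N/A prediction.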
Changes
- subtyping report results are now ordered in the same order as the input samplesheet.csv, that is, samples appear in the report in the same order as in the samplesheet.csv file
- Nextflow
Published by peterk87 almost 3 years ago
https://github.com/cfia-ncfad/nf-flu - 3.3.2 - IBV subtype/genotype parsing fix
This patch release fixes an IBV subtype/genotype parsing issue when generating subtyping report using the new metadata format introduced in 3.3.0 (#32).
- Nextflow
Published by peterk87 almost 3 years ago
https://github.com/cfia-ncfad/nf-flu - 3.3.1 - Conda/Mamba profile fix
This release fixes an issue with Conda/Mamba env creation introduced in 3.3.0 when using conda/mamba profile (#35).
- Nextflow
Published by peterk87 almost 3 years ago
https://github.com/cfia-ncfad/nf-flu - 3.3.0 - updated Influenza sequence DB
This release migrates to a more recently updated set of Influenza virus sequences, since the NCBI Influenza DB FTP data was last updated on 2020-10-13. By default, all Orthomyxoviridae virus sequences were parsed from the daily updated NCBI Viruses AllNucleotide.fa and AllNuclMetadata.csv.gz and uploaded to Figshare as Zstd compressed files. nf-flu no longer uses the influenza.fna.gz and genomeset.dat.gz files for Influenza sequences and metadata, respectively.
Fixes
- More up-to-date Influenza sequences database used by default (#24)
- Nextflow
Published by peterk87 almost 3 years ago
https://github.com/cfia-ncfad/nf-flu - 3.2.1 - NoDataError bugfix patch
Fixes
- Empty BLAST results file parsing NoDataError (#27) (Thanks @MatFish for reporting this issue!)
- Nextflow
Published by peterk87 almost 3 years ago
https://github.com/cfia-ncfad/nf-flu - 3.2.0
Added
- Influenza B virus support (#14)
- Polars for faster parsing of BLAST results (#14)
Fixes
- Irregular Illumina paired-end FASTQ files not producing IRMA assemblies (#20)
Updates
- Updated README.md to include references and citations
- Nextflow
Published by peterk87 about 3 years ago
https://github.com/cfia-ncfad/nf-flu - 3.1.6 - Clair3 Biocontainers images
This is a patch release for a minor change to use Biocontainers Docker and Singularity images for Clair3 to avoid hitting limits on pulls from Docker Hub and since Biocontainers images are half the size of hkubal/clair3 images.
Also, updated CI workflow and added issue template forms for feature request and questions.
- Nextflow
Published by peterk87 about 3 years ago
https://github.com/cfia-ncfad/nf-flu - 3.1.5 - bug fixes and updates
Added
- --use_mamba to enable using Mamba in place of Conda when using -profile conda for faster creation of Conda environments
Updates
- Clair3: 0.1.10 -> 1.0.2
Fixes
- user-specified Clair3 models not being found (#11)
- Conda profile not enabling Conda (#15)
- IRMA wanting too much /tmp space; IRMA's tmp dir will be output to the current working directory of the process job (#13) (Thanks @Codes1985 for reporting and solving this issue!)
- Nextflow
Published by peterk87 about 3 years ago
https://github.com/cfia-ncfad/nf-flu - 3.1.4 - custom Clair3 model support
This release addresses issue #11 by adding a new option --clair3_user_variant_model <PATH TO CLAIR3 MODEL> that allows users to provide a Clair3 model not included with Clair3, e.g. a Rerio Clair3 model for r10 flowcells.
- Nextflow
Published by peterk87 about 3 years ago
https://github.com/cfia-ncfad/nf-flu - 3.1.3 - patch lowercase subtype support
[3.1.3] - 2023-04-28
Patch release to handle lowercase subtypes (e.g. h1n5) from NCBI Influenza DB.
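As a rough illustration of case-insensitive subtype handling, the following Python sketch normalizes strings such as h1n5 to separate H and N labels. The regular expression and function name are assumptions for illustration and are not taken from the nf-flu code.

```python
import re

# Hypothetical pattern: H<number>N<number>, matched case-insensitively.
SUBTYPE_RE = re.compile(r"H(\d+)N(\d+)", re.IGNORECASE)


def parse_subtype(text):
    """Return ("H<n>", "N<n>") for strings like "h1n5" or "H3N2", else None."""
    match = SUBTYPE_RE.fullmatch(text.strip())
    if match is None:
        return None
    return f"H{int(match.group(1))}", f"N{int(match.group(2))}"


lower = parse_subtype("h1n5")
upper = parse_subtype("H3N2")
print(lower, upper)
```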
- Nextflow
Published by peterk87 about 3 years ago
https://github.com/cfia-ncfad/nf-flu - 3.1.2 - Patch fix Channel issue for params.ref_db
Patch release to fix an issue where, when a user reference sequences FASTA is specified, the Channel created from the file is not treated as a value. The code has been reverted to use the Nextflow file function.
- Nextflow
Published by peterk87 almost 4 years ago
https://github.com/cfia-ncfad/nf-flu - 3.1.1 - Patch fix FASTA concatenation issue
Patch release to fix issue when a user-specified sequences FASTA is provided and the FASTA is concatenated with the NCBI influenza sequences FASTA, but there is no new-line character at the end of the FASTA files. New line characters are added to the FASTA files to avoid incorrect concatenation.
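The concatenation problem can be reproduced in a few lines of Python. This is a minimal sketch of the failure and of one way to avoid it, not the code used in the pipeline; the function name and the sequences are made up for illustration.

```python
def concat_fastas(texts):
    """Concatenate FASTA contents, ensuring each ends with a newline."""
    parts = []
    for text in texts:
        if text and not text.endswith("\n"):
            text += "\n"  # add the missing trailing newline
        parts.append(text)
    return "".join(parts)


user = ">user_seq\nACGT"     # hypothetical file with no trailing newline
ncbi = ">ncbi_seq\nTTGA\n"

naive = user + ncbi          # header is glued onto the previous sequence line
fixed = concat_fastas([user, ncbi])


def count_headers(text):
    """Count lines that start a FASTA record."""
    return sum(line.startswith(">") for line in text.splitlines())


print(count_headers(naive), count_headers(fixed))
```

In the naive result the second header ends up on the same line as the first record's sequence (ACGT>ncbi_seq), so a parser sees one record instead of two.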
- Nextflow
Published by peterk87 almost 4 years ago
https://github.com/cfia-ncfad/nf-flu - 3.1.0 - Nanopore subworkflow
The workflow's name has been changed from nf-iav-illumina to nf-flu and the official repo for nf-flu will be CFIA-NCFAD/nf-flu going forward.
Version 3 is a major release adding a Nanopore influenza sequence analysis subworkflow using IRMA for initial assembly and BLAST against NCBI Influenza DB sequences and optionally, user-specified sequences to identify the top reference sequence for each segment for each sample. A standard read mapping/variant calling analysis is performed: for each sample, Nanopore reads are mapped separately against each gene segment reference sequence using Minimap2; variant calling of read alignments is performed using Clair3; depth-masked consensus sequence is generated using Bcftools. Consensus sequences are BLAST searched against NCBI Influenza (and user-specified sequences) to generate a BLAST summary report and H/N subtyping report. MultiQC is used to summarize results into an interactive HTML report.
NOTE: Read mapping/variant calling analysis has not been ported to the Illumina sequence analysis subworkflow.
3.1.0 changes
- Added back bin/fastq_dir_to_samplesheet.py for Illumina --input samplesheet creation from Illumina FASTQ reads directory
- Fixed issue #12. Nanopore sample sheet can specify a mix of single FASTQ files and/or directories containing FASTQ files. Different reads with the same sample name will be merged prior to analysis. FASTQs can be GZIP compressed and have the extensions: .fastq, .fq, .fastq.gz, .fq.gz. Updated CI tests to test for this flexible sample sheet handling.
- Switched to GitHub YAML form for bug report template from Markdown template.
- CI tests now output results/pipeline_info/ and .nextflow.log as artifacts for easier debugging of issues.
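The flexible Nanopore sample sheet handling described above (accepted FASTQ extensions, reads with the same sample name merged) can be sketched in Python as follows. This is an illustration under assumptions rather than the pipeline's Nextflow logic: the function names and the (sample, path) input shape are hypothetical, and directory expansion is left out.

```python
from collections import defaultdict

# Extensions listed in the release notes.
FASTQ_EXTENSIONS = (".fastq", ".fq", ".fastq.gz", ".fq.gz")


def is_fastq(filename):
    """True if the filename ends with one of the accepted FASTQ extensions."""
    return filename.endswith(FASTQ_EXTENSIONS)


def group_by_sample(entries):
    """Group FASTQ paths by sample name so they can be merged per sample.

    entries: iterable of (sample, path) pairs (hypothetical input shape).
    """
    grouped = defaultdict(list)
    for sample, path in entries:
        if is_fastq(path):
            grouped[sample].append(path)
    return dict(grouped)


entries = [
    ("s1", "run1/a.fastq.gz"),
    ("s1", "run2/b.fq"),
    ("s2", "c.fastq"),
    ("s2", "notes.txt"),  # not a FASTQ file, so it is skipped
]
grouped = group_by_sample(entries)
print(grouped)
```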
- Nextflow
Published by peterk87 about 4 years ago