Recent Releases of grenepipe
grenepipe - grenepipe v0.15.0
Notable Changes
- Upgrade freebayes 1.3.8 → 1.3.9, to fix freebayes/freebayes#764
- Add min-contig-size and max-contigs-per-group options
- Add script to collect and plot benchmarks
- Clean up log and benchmark paths
- Refine generate table script to accept accession numbers
- Python
Published by lczech 11 months ago
grenepipe - grenepipe v0.14.0
This update restructures some intermediate file names of the calling steps. Furthermore, we now request dummy done files for almost all rules, which are created once a rule finishes successfully. Usually, Snakemake does not need those, and can figure out on its own whether a rule was executed successfully or failed. However, there are some circumstances where this mechanism does not work, and Snakemake fails us. Hence, we now work around this manually via those dummy files.
As a consequence of these changes, note that when you upgrade to this version of grenepipe, any existing analysis run will likely re-compute most of its files, as the restructured file names and the dummy files will not be present.
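The done-file idea described above can be sketched in plain Python. This is only an illustration of the general marker-file pattern, not grenepipe's actual rules: the function name `run_with_done_file` and the file names in the usage note are hypothetical. In a Snakemake workflow the same effect is commonly obtained with a `touch()` output, but whether grenepipe does it exactly that way is not stated here.

```python
from pathlib import Path


def run_with_done_file(outputs, done_file, action):
    """Run `action` and create `done_file` only if it finished without raising.

    Returns True if the step was executed, False if it was skipped because
    the marker and all expected outputs were already present.
    """
    done = Path(done_file)
    # Only trust earlier results if the marker AND all outputs exist.
    if done.exists() and all(Path(p).exists() for p in outputs):
        return False
    # A stale marker from an earlier run must not vouch for this attempt.
    done.unlink(missing_ok=True)
    action()  # if this raises, the marker is never written
    done.parent.mkdir(parents=True, exist_ok=True)
    done.touch()
    return True
```

A caller would pass the expected outputs, a marker path such as `calls.done`, and a callable that does the work. A run that crashes halfway leaves no marker behind, so the step is repeated next time.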
Major Changes
- Add GenomicsDBImport as new default for merging GATK HaplotypeCaller gvcfs #55
- Switch to new picard command line syntax with dashes
- Split java options from memory resources in config (with backwards compatibility)
- Restructure calling intermediate files
- Request done files for (almost) all rules
- Upgrade tools:
  - picard 3.2.0 → 3.3.0
  - gatk4 4.5.0.0 → 4.6.1.0
  - freebayes 1.3.7 → 1.3.8, to fix freebayes/freebayes#796
Bugfixes
- Fix picard CollectMultipleMetrics java memory issue #60
- Fix freebayes issue freebayes/freebayes#796
- Make bowtie indexing a non-local rule
- Fix slurm config number of threads for variant calling
Published by lczech about 1 year ago
grenepipe - grenepipe v0.13.4
Notable Changes
- Refine default resource config to work around https://github.com/snakemake/snakemake/issues/2997 and https://github.com/snakemake/snakemake/issues/3191
- Fix HAF-pipe merge samples script, and refine success checks for HAF-pipe rules
Published by lczech about 1 year ago
grenepipe - grenepipe v0.13.3
Notable Changes
- Update slurm cluster config files for new snakemake executor, fix #51, fix #54
- Add --shallow (no directory recursion) option to generate-table script @meixilin
- Add F1/R2 naming scheme (Genome Sequence Archive) to generate-table script
Published by lczech about 1 year ago
grenepipe - grenepipe v0.13.2
Notable Changes
- Fix conda setup for samtools rules, which broke with recent pandas/numpy versions (again)
Published by lczech about 1 year ago
grenepipe - grenepipe v0.13.1
Minor update with some quality-of-life improvements after the last big release, v0.13.0.
Notable Changes
- Add extra snakemake logging to work around Snakemake bug (https://github.com/snakemake/snakemake/issues/2974)
- Add backwards compatibility with the config.yaml pre-v0.13.0
Published by lczech over 1 year ago
grenepipe - grenepipe v0.13.0
This is a major update of grenepipe, as conda suddenly broke backwards compatibility, and so nothing was working any more. We used this opportunity to update almost all tools in the pipeline to their most recent versions, and also added some new features. Furthermore, we restructured the internal files for compliance with the Snakemake workflow catalog, and restructured the output files of the pipeline for more convenience and clarity.
Major Changes
- Upgrade from Snakemake v6.0.5 to v8.15.2 by default
  - python 3.7.10 → 3.12
  - pandas 1.3.1 → 2.2.2
  - numpy 1.21.2 → 2.0.0
- Upgrade tools:
  - adapterremoval 2.3.1 → 2.3.3
  - bcftools 1.16 → 1.20
  - bowtie2 2.4.1 → 2.5.4
  - bwa 0.7.17 → 0.7.18
  - cutadapt 2.10 → 4.9
  - fastqc 0.11.9 → 0.12.1
  - freebayes 1.3.1 → 1.3.7
  - gatk4 4.1.4.1 → 4.5.0.0
  - multiqc 1.10.1 → 1.22.3
  - picard 2.27.4 → 3.2.0
  - qualimap 2.2.2d → 2.3.0
  - samtools 1.16.1 → 1.20
  - seqkit 2.2.0 → 2.8.2
  - snpeff 4.3.1t → 5.2
  - vep ensembl 104 → 112
- Restructure all output files for more user convenience. See wiki for details.
New Features
- Add variant calling from bam files workflow (instead of starting from fastq files) #47
- Add automatic download of reference genome and known references #41
- Make trimming tool optional #35
Bugfixes
- Fix conda env for bcftools #37
- Deactivate bcftools stats by default
- Add picard-compatible default to bwa mem
- Add reference genome name check #44
- Update ref genome extensions for new GATK
Published by lczech over 1 year ago
grenepipe - grenepipe v0.12.2
Notable Changes
- Fix snakemake wrapper utils, which suddenly broke backwards compatibility (see here)
- Fix vcftools, which also suddenly broke, despite fixed version (see here)
Published by lczech over 2 years ago
grenepipe - grenepipe v0.12.1
Notable Changes
- Add single end mode to generate table script
- Add java options to picard tools
- Replace GATK by Picard for CreateSequenceDictionary rule
- Fix minor platform issue with empty cells in samples table
Published by lczech almost 3 years ago
grenepipe - grenepipe v0.12.0
This release restructures the merging of mapped bam files prior to variant calling, see Processing of the mapped reads for details. It has breaking changes in the config file (e.g., renaming the entry for the samples table to data: samples-table, see below), so you will need to use the new config file to start an analysis.
Furthermore, some tool versions were updated (although those should be non-breaking changes), and the overall robustness of the conda environments has been greatly increased, in particular for running on MacOS. All environments work with conda and with mamba now, but we still strongly recommend using mamba; in our tests, conda needed 4h and mamba 15min to install all environments.
Notable Changes
- Change data: samples to data: samples-table in the config file, and change some tool parameter keys
- Rework mapped read merging to occur before all other bam processing
- Fix bam read group ID tag to match sample units. The ID was equal to the sample name before; now with the reworked merging of units, we also use the unit in the read group ID
- Add further config validations of the samples table
- Improved tests and CI, testing on Ubuntu and MacOS now, with conda and mamba
Bug Fixes
- Fix and update several conda environments, in particular for picard and qualimap, which were not solving with conda before. This fixes #25 and #11. As a result, some tools have been upgraded:
  - bcftools 1.13 → 1.16
  - picard 2.20.1 → 2.27.4
  - samtools 1.12 → 1.16.1
  - vcflib 1.0.2 → 1.0.3
- Fix #29, correctly set bwa mem2 threads
Published by lczech about 3 years ago
grenepipe - grenepipe v0.11.1
Bug Fixes
- Fix temp directory in samtools sort for bwa mem2 and bwa aln mapping tools #28
Published by lczech about 3 years ago
grenepipe - grenepipe v0.11.0
Getting closer to v1.0.0! For now, this and the next couple of releases will still have some changes in the config.yaml that are not backwards compatible, but we hope that this makes it future-proof in the long run.
Notable Changes
- De-indent reference genome and known variants in config
- Switch to our improved version of HAF-pipe (https://github.com/petrov-lab/HAFpipe-line/pull/9)
- Add HAF-pipe per-sample concatenated tables
- Add flag to turn off bcftools stats, to avoid having to run variant calling just to get the MultiQC report
- Add target for per-sample merged bam output
- Add bcftools filter tool for VCF filtering
- Add dbsnp option to all GATK tools, rename GATK config params
- Add interleaved fastq test to generate table script for more robustness
Bug Fixes
- Fix samtools tmp dir creation issues throughout the pipeline
- Fix typo vqrs instead of vqsr
- Fix GATK helper function location
- Fix vep plugin download issue
- Fix mapdamage conda env
- Fix vcf index file duplication
- Fix missing R package for MapDamage
- Add flag files to avoid incomplete job executions
Published by lczech about 3 years ago
grenepipe - grenepipe v0.10.0
This is the release version accompanying the grenepipe publication:
grenepipe: A flexible, scalable, and reproducible pipeline to automate variant calling from sequence reads. Lucas Czech and Moises Exposito-Alonso. Bioinformatics. 2022. doi:10.1093/bioinformatics/btac600
Notable Changes
- Add HAF-pipe rules for computing haplotype-based allele frequencies
- Add proper 1001g-based known-variants file as an example and for testing
- Add copy samples script
Bug Fixes
- Fix hard filter name duplication in filtered VCF
- Fix bcftools sample sorting order issue
- Ungroup GATK HaplotypeCaller merging
- Ungroup filter merge step
- Add progressbar and termcolor python dependencies to grenepipe env
Published by lczech over 3 years ago
grenepipe - grenepipe v0.9.0
Notable Changes
- Properly implement GATK Variant Quality Score Recalibration (VQSR)
- Add bcftools call for individual samples instead of combined calling
- Add read clipping with BamUtil
- Add SeqKit for reporting statistics of the reference genome
- Add bcftools stats reporting on the final vcf
- Add settings for keeping/removing intermediate files, to save disk storage
Bug Fixes
- Fix all issues related to samtools sort needing a temporary directory to properly run
- Fix using optional mapping steps in combination with DeDup
- Fix fastqc log output, which was not properly reported
Published by lczech over 3 years ago
grenepipe - grenepipe v0.8.0
Notable Changes
- Change main grenepipe environment to Snakemake 6.0.5 (for now)
- Add platform info to bam read group tags #20
- Add support for local snpEff database #13
- Remove shadow rules, as they create too many intermediate files
Bug Fixes
- Fix snpEff download directory to work without trailing slash #12
- Fix R1 matching in generate table script
- Switch from pandas append to concat to avoid deprecation
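For readers hitting the same pandas deprecation in their own scripts, the switch looks roughly as follows. `DataFrame.append` was deprecated in pandas 1.4 and removed in 2.0, with `pd.concat` as the replacement. The helper name `append_rows` and the column names are made up for this sketch and are not taken from grenepipe's code.

```python
import pandas as pd


def append_rows(table, rows):
    """Return a new table with `rows` (a list of dicts) added at the end.

    Old pattern, no longer available in pandas >= 2.0:
        table = table.append(row, ignore_index=True)
    """
    # Build one frame from the new rows, then concatenate once.
    return pd.concat([table, pd.DataFrame(rows)], ignore_index=True)


table = pd.DataFrame([{"sample": "A", "unit": 1}])
table = append_rows(table, [{"sample": "B", "unit": 1}, {"sample": "B", "unit": 2}])
```

Collecting rows and concatenating once also avoids copying the whole table for every added row, which is what repeated `append` calls did.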
Published by lczech over 3 years ago
grenepipe - grenepipe v0.7.0
Notable Changes
- Add SeqPrep for trimming
- Remove write-protection from intermediate called files, as this caused more trouble than benefits
Bug Fixes
- Fix general incompatibility issues caused by conflicting versions of python/pandas/numpy in rule-specific conda environments
Published by lczech almost 4 years ago
grenepipe - grenepipe v0.6.1
Bug Fixes
- Fix an issue where the requested wall time in the cluster profile was ignored for some mapping tools.
Published by lczech almost 4 years ago
grenepipe - grenepipe v0.6.0
Notable Changes
- Add bwa mem2 for mapping
- Add better support for genomes with small contigs / scaffolds to avoid submitting individual compute jobs for each of them
- Refactor FastQC rules to use both files of paired end reads, or merge trimmed reads
- Declare the restrict-regions setting experimental
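The small-contig support mentioned in this list amounts to packing many short contigs into a few groups, so that one compute job handles a whole group. The following is a minimal sketch of one possible greedy grouping, not grenepipe's actual algorithm. The parameter names mirror the min-contig-size and max-contigs-per-group options named in the v0.15.0 notes, but their exact semantics here (including `0` meaning "no cap") are assumptions.

```python
def group_contigs(contigs, min_contig_size, max_contigs_per_group=0):
    """Greedily pack small contigs into groups.

    contigs: list of (name, length) tuples in reference order.
    Contigs of at least `min_contig_size` get a group of their own.
    Smaller ones are collected until the group reaches that size, or,
    if `max_contigs_per_group` > 0, until it holds that many contigs.
    Returns a list of groups, each a list of contig names.
    """
    groups = []
    current, current_size = [], 0
    for name, length in contigs:
        if length >= min_contig_size:
            groups.append([name])
            continue
        current.append(name)
        current_size += length
        full = current_size >= min_contig_size
        capped = 0 < max_contigs_per_group <= len(current)
        if full or capped:
            groups.append(current)
            current, current_size = [], 0
    if current:
        # Leftover small contigs form a final, possibly undersized group.
        groups.append(current)
    return groups
```

Each returned group could then be handed to a single calling job, for example as a list of regions, instead of submitting one job per scaffold.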
Bug Fixes
- Fix picard CollectMultipleMetrics missing dependency
- Fix missing fasta dict dependency in filtering rule
- Fix incompatibility issue with python 3.8 and pandas under snakemake > 6
- Fix incompatibility between samtools 1.9 and python 3.7
Published by lczech over 4 years ago
grenepipe - grenepipe v0.5.1
Bug Fixes
- Fix the order of mpileup columns in merged pileup files to match the order of the input samples table.
Published by lczech over 4 years ago
grenepipe - grenepipe v0.5.0
This release adds better support for bam files, adds support for (m)pileup files, and provides fixes for conda dependency issues on clusters where conda and python are not well set up.
Notable Changes
- Add targets and rules to create bam files only, see here
- Add targets and rules to create (m)pileup files
- Allow changing the bcftools call calling method
Bug Fixes
- Fix conda dependency specifications for Bowtie2, Cutadapt, Picard, Trimmomatic, bwa, freebayes, MultiQC, VEP, snpeff, and all GATK tools
- Fix generate table script to work with symlinks
- Fix prepare step issue with gz-compressed reference genomes on CentOS
- Fix incompatibility between bcftools and Picard by switching to vcflib for merging
- Fix issue with bcftools calls containing IUPAC ambiguity codes
- Fix issue with freebayes not sorting large vcf files properly
Published by lczech over 4 years ago
grenepipe - grenepipe v0.4.0
Grenepipe is now a single-command tool. The prep step is no longer needed!
Notable Changes
- Refactor to remove prep step requirement
- Add VEP annotation tool
- Change SnpEff config setup, moving the database name to the params section
- Change SnpEff stats output directory to annotated
Bug Fixes
- Fix yet more missing dependency issues in the MultiQC and bwa wrappers
Published by lczech almost 5 years ago
grenepipe - grenepipe v0.3.0
Notable Changes
- Add bwa aln mapping tool
- Add extra parameters config settings for tools that did not have them yet
- Remove vqsr filtering / GATK VariantRecalibrator
- Add testing setup and test cases for all tools
Bug Fixes
- Fix issue with Picard CollectMultipleMetrics omitting empty files
- Fix bowtie2 conda environment issues due to changed dependency
- Disable incompatibility between bowtie2 and picard
- Disable recalibrate-base-qualities usage without known-variants
Published by lczech almost 5 years ago
grenepipe - grenepipe v0.2.0
Official release for the arXiv manuscript describing grenepipe:
Lucas Czech, Moises Exposito-Alonso. grenepipe: A flexible, scalable, and reproducible pipeline to automate variant and frequency calling from sequence reads. arXiv. 2021. arXiv:2103.15167
Notable Changes
- Refine helper scripts for sample table generation and for FTP download
- Improve documentation in Wiki
- Better parallelization of FreeBayes
- Add Picard CollectMultipleMetrics QC statistics
- Add samtools view filtering step after mapping
- Add frequency table output
Bug Fixes
- Fix freebayes threads counting
- Fix fastqc rule based on wrapper issue
- Fix vcfpy version parsing issue
- Fix gzip compression issue in prep rule
- Fix slurm submission log file path setup
Published by lczech almost 5 years ago
grenepipe - grenepipe v0.1.0
Initial release of grenepipe. Has sufficient documentation to get it working (hopefully), is stable enough for production (presumably), but still contains bugs (probably), and needs more features (definitely).
Published by lczech about 5 years ago