Recent Releases of nf-pipeline-regenie
nf-pipeline-regenie - v1.9.4
This release fixes a bug in rare variant workflows and introduces a few minor upgrades and a config file restructure to increase flexibility. The pipeline now uses regenie v4.0.
Upgrades
- Fixed a bug when exporting log files for regenie step2 in rare variants test
- Add the possibility of running joint burden tests in step 2 (like minp, acat, sbat). This can be set using the new parameter rarevars_joint_test
- Now use regenie v4.0
- Starting resources required for regenie step2 rare variant analysis are now set at 8 CPUs and 42G of memory
Code refactoring
- Use regenie directly from the official docker images for more flexibility
- Updated docker image with upgrades to plink2 (now 20241206) and python packages
- Minor changes to the configuration files
Published by edg1983 over 1 year ago
nf-pipeline-regenie - v1.9.3
This release fixes issues related to using a list of SNPs / genes to limit the analysis in multi-model or multi-project modalities.
The pipeline multi-model and multi-project modalities now work correctly when a list of SNPs / genes is configured with regenie_extract_snps or regenie_extract_genes.
Two new columns are now accepted in the multi-project input table to configure a restricted analysis based on a list of SNVs and/or a list of genes (see also the updated docs):
- extractsnpslist (optional): a text file containing a list of variant IDs to restrict GWAS analysis
- extractgeneslist (optional): a text file containing a list of gene IDs to restrict rare variant analysis
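As an illustration, the two optional columns could be added to a multi-project table as sketched below. Only extractsnpslist and extractgeneslist come from these release notes; the project_id column, the file paths, and the tab-separated layout are assumptions made for the example, so check the pipeline documentation for the actual table format.

```python
import csv
import io

# extractsnpslist and extractgeneslist are the column names given in the release notes.
# "project_id" and the tab-separated layout are hypothetical, used only for illustration.
COLUMNS = ["project_id", "extractsnpslist", "extractgeneslist"]


def build_table(projects):
    """Return the table as text; optional columns missing from a project are left empty."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer,
        fieldnames=COLUMNS,
        delimiter="\t",
        restval="",  # value written when a project does not set an optional column
        lineterminator="\n",
    )
    writer.writeheader()
    writer.writerows(projects)
    return buffer.getvalue()


# Hypothetical projects: one restricted by variants, one by genes, one unrestricted.
projects = [
    {"project_id": "gwas_restricted", "extractsnpslist": "lists/snps.txt"},
    {"project_id": "rarevar_restricted", "extractgeneslist": "lists/genes.txt"},
    {"project_id": "unrestricted"},
]

table_text = build_table(projects)
print(table_text)
```

Because both columns are optional, a project that leaves them empty would simply run without any restriction.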
Published by edg1983 about 2 years ago
nf-pipeline-regenie - v1.9.2
Minor release fixing a few bugs affecting mainly the multi-project execution:
- missing additional columns in the multi-project input table are now handled correctly
- fixed a crash that occurred when the cat covar column was not set, because catcovar was then set to null
- fixed a crash that occurred when no additional datasets were configured in a multi-project table
- analysis reports are now generated when requested
Published by edg1983 over 2 years ago
nf-pipeline-regenie - v1.9.1
Updates in this release:
- Move to regenie version 3.3
- This release fixes problems with conditional / interaction analysis. To ensure consistency of conditional / interaction analysis across chunks, you must provide an additional genotype dataset containing the SNPs you want to condition on (or test interaction for).
- There are 2 new parameters to specify the additional dataset for conditional / interaction testing: additional_geno_file and additional_geno_format. See the documentation for more details.
- Fix the issue causing specified chromosome(s) to be ignored when using a bgen input
- Fix the issue with reports when there were no hits passing the top-hits threshold
- Require Nextflow version < 23.07 to avoid problems in report generation due to the new --no-home default for singularity in newer Nextflow versions
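The Nextflow version requirement for this release can be checked before launching a run. The following is a minimal sketch, not part of the pipeline: the function names are hypothetical, and it assumes version strings of the form 23.04.1, optionally followed by a suffix such as -edge.

```python
def parse_version(text):
    """Turn '23.04.1' into (23, 4, 1); a suffix such as '-edge' is ignored."""
    core = text.strip().split("-")[0]
    return tuple(int(part) for part in core.split("."))


def is_supported(nextflow_version, upper_bound="23.07"):
    """Return True when the version is strictly below the bound stated in the release notes."""
    return parse_version(nextflow_version) < parse_version(upper_bound)


# Example version strings chosen for illustration only.
for version in ["22.10.6", "23.04.1", "23.07.0-edge", "23.10.0"]:
    status = "ok" if is_supported(version) else "too new for v1.9.1"
    print(f"{version}: {status}")
```

The version string has to be supplied by the caller, for example copied from the output of nextflow -version.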
Published by edg1983 over 2 years ago
nf-pipeline-regenie - v1.9
This is a massive new release including many new features:
- analysis is now fully parallelized in step 1 for both L0 and L1 levels. This reduces the resources requested for L1 regression and speeds up analysis, especially when analyzing a large batch of binary phenotypes
- the pipeline can now accept for step 2 a dataset provided in multiple chunk files (convenient for large WGS cohorts where data may be provided as multiple chunk files)
- new multi-projects mode that allows you to provide a manifest file describing multiple projects to be run together. This provides a convenient way to configure similar analyses when you have pre-processed phenotype/covariate data. For example, you want to test various combinations of covariates for the same phenotypes, or you want to replicate the same analysis using different conditional snp lists or GxE / GxG variables.
- VCF input dataset is now converted to bgen instead of pgen to improve performance
- drop the need for a snplist file to split step2 execution when you use BGEN input. This simplifies the input and avoids an unnecessary additional step for snplist generation.
- you can now perform GxE and GxG analysis at step2 by providing a covariate or a variant ID to test interaction
- you can perform conditional analysis in step2 by providing a list of SNPs to condition on
- analysis can now be restricted to a specific variant list and/or gene list (not compatible with step2 split analysis)
- improved reports rendering for long tables of top hits / top loci
- various bug fixes
Published by edg1983 over 2 years ago
nf-pipeline-regenie - v1.8.1
This version mainly fixes issues for multi-model execution with complex analyses and adds some details to the documentation.
- mixed models with and without covariates are now processed correctly in multi-model run
- fix a bug in multi-model parsing; complex model chunks are now handled correctly
- fix a bug in HTML report generation, causing only one phenotype report to be generated when multiple phenotypes were tested in the same run. Now one report is generated for each phenotype
- better documentation for multi-model output
Published by edg1983 almost 3 years ago
nf-pipeline-regenie - v1.8
This release introduces many improvements in the code and some new features.
Improvements
- refactor the code for multi-models execution: analysis runs are now managed using project meta-data channel without the need to spawn additional nextflow processes. This makes monitoring, execution and resuming easier to manage
- groupKey is now implemented to ensure each phenotype is analyzed independently without waiting for all processes to complete at each step before proceeding
- phenotype and covariates validation are now performed with Python scripts, removing dependency on Java and jbang
- generation of HTML reports is now based on quarto with notebooks based on Python and the GWASLab packages. This reduces the amount of time and resources needed to generate the reports
- The docker container has been reorganized into one main container for analysis and one managing only the reports generation to be more flexible in case we want to improve the reports.
New features
- Easier to configure multi-models execution: now you can run multi-models using the same config file structure and adding the models_table parameter
- Better HTML reports based on quarto and using the GWASLab python package. Reports now include regional plots for the top loci.
- A more organized workflow for multi-models execution that allows better control of the computational resources
- Better conversion of VCF / BCF to PGEN with more control on importing options
- Improved documentation on GitHub pages
Published by edg1983 almost 3 years ago
nf-pipeline-regenie - v1.7.5
This update contains critical fixes for the multi-models mode. The actual execution of a single GWAS run has not changed.
A new configuration template is now present in the templates folder to facilitate the setup of multi-models runs.
In the previous version, the multi-model submission system failed most of the time due to a bug in the configuration template. Please update to this version if you want to use the multi-models mode!
Published by edg1983 almost 3 years ago
nf-pipeline-regenie - v1.7.4
This is the first version ready for public release.
The pipeline can perform GWAS and rare variant analysis at a large scale using regenie and massive parallelization. All major data formats are accepted (bgen, bed, pgen, vcf) and multiple models can be tested automatically starting from a models definition table and a phenotypes table.
Results will include full summary stats, top hits (SNPs annotated with overlapping genes for GWAS, or genes for rare variant analysis), and clumped loci annotated with overlapping and nearby genes (for SNPs only). An HTML report with basic plots can also be generated for each tested phenotype (although this is not recommended when analyzing a large number of phenotypes).
Published by edg1983 almost 3 years ago