nf-gwas

A nextflow pipeline to perform state-of-the-art genome-wide association studies.

https://github.com/mfz/nf-gwas



README.md

Attempt to get nf-gwas running on the AllOfUS research platform

This is an attempt to get nf-gwas running on the AllOfUS research platform. I am only interested in obtaining the regenie GWAS results, not in creating Manhattan plots etc.

This is my first time working with Nextflow and the Google Cloud Platform, so there are likely better ways to do this.

The Dockerfile is modified to include the Google Cloud SDK and bgenix, which is used to split the bgen files.

Nextflow stages input files (for example .bgen) by copying them to the working directory. Even if only a small slice is needed, the whole file gets copied. To avoid this, splitbgen.nf can be used to split the bgen files into chunks beforehand.
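As a rough illustration of the chunking idea, the sketch below generates non-overlapping genomic ranges and the corresponding bgenix invocation for each chunk. It is not taken from splitbgen.nf: the helper names, the chunk size, and the chromosome label are assumptions for illustration. It relies only on bgenix's `-g` and `-incl-range` options, and it assumes a bgenix index already exists for the input file.

```python
from typing import List


def bgenix_ranges(chrom: str, start: int, end: int, chunk_size: int) -> List[str]:
    """Return inclusive, non-overlapping 'chrom:start-end' strings covering [start, end]."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    ranges = []
    pos = start
    while pos <= end:
        chunk_end = min(pos + chunk_size - 1, end)
        ranges.append(f"{chrom}:{pos}-{chunk_end}")
        pos = chunk_end + 1
    return ranges


def bgenix_command(bgen_path: str, region: str) -> List[str]:
    """Build the argument list for extracting one region (hypothetical helper).

    bgenix writes the selected variants to stdout, so the caller is
    expected to redirect the output into the chunk file.
    """
    return ["bgenix", "-g", bgen_path, "-incl-range", region]
```

The chromosome label has to match the naming used inside the bgen file (for example `1`, `01`, or `chr1`), so check that before generating ranges.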

Similarly, splitancestry.nf can be used to split the input PLINK files by ancestry. splitancestry.nf also sets FID to IID in the .fam files, to agree with the .bgen sample files.
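The FID adjustment can be pictured as follows. This is a minimal sketch, not the code used in splitancestry.nf; the function names are hypothetical, and it assumes the standard six-column PLINK .fam layout (FID, IID, father, mother, sex, phenotype) with whitespace-separated fields.

```python
def set_fid_to_iid(fam_line: str) -> str:
    """Return a .fam line with the family ID replaced by the individual ID."""
    fields = fam_line.split()
    if len(fields) != 6:
        raise ValueError(f"expected 6 columns in .fam line, got {len(fields)}")
    fields[0] = fields[1]  # FID := IID
    return " ".join(fields)


def rewrite_fam(src_path: str, dst_path: str) -> None:
    """Copy a .fam file to dst_path, setting FID to IID on every non-empty line."""
    with open(src_path) as src, open(dst_path, "w") as dst:
        for line in src:
            if line.strip():
                dst.write(set_fid_to_iid(line) + "\n")
```

Writing to a separate output path keeps the original .fam file untouched, which makes it easy to compare the two afterwards.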

Some modifications were made to the nf-gwas pipeline to restrict functionality to only run regenie, and to try to avoid copying too many files.

The AllOfUS-specific configuration is in conf/aou.conf. The Google Cloud profile can be found in ~/.nextflow/config on the AllOfUS research platform; it provides the gls profile. The spot profile, which enables running on spot instances, was added to nextflow.config.

Steps:

  • create an ancestry file with columns FID, IID, ancestry and run the splitancestry.nf workflow (this workflow has to be run outside of the nf-gwas directory) using profile gls

  • run splitbgen.nf workflow using profile gls

  • configure conf/aou.conf with phenotype and covariates

  • run main.nf workflow using profiles gls and spot
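For the first step, a quick sanity check of the ancestry file before launching the workflow can save a failed cloud run. The sketch below groups samples by ancestry label. It assumes a tab-separated file with a header row containing FID, IID and ancestry; the delimiter and the function name are assumptions, since the expected file format is not spelled out above.

```python
import csv
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def group_by_ancestry(lines: Iterable[str]) -> Dict[str, List[Tuple[str, str]]]:
    """Map each ancestry label to its (FID, IID) pairs, preserving input order."""
    reader = csv.DictReader(lines, delimiter="\t")  # assumed delimiter
    required = {"FID", "IID", "ancestry"}
    if reader.fieldnames is None or not required.issubset(reader.fieldnames):
        raise ValueError("ancestry file needs columns FID, IID, ancestry")
    groups: Dict[str, List[Tuple[str, str]]] = defaultdict(list)
    for row in reader:
        groups[row["ancestry"]].append((row["FID"], row["IID"]))
    return dict(groups)
```

Passing an open file handle works as well as a list of lines, for example `group_by_ancestry(open("ancestry.tsv"))` with a hypothetical file name. The group sizes give a quick view of how many samples each per-ancestry split will contain.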

Excerpts from original README.md

nf-gwas is a Nextflow pipeline for running biobank-scale genome-wide association studies (GWAS). The pipeline automatically performs numerous pre- and post-processing steps, integrates regression modeling from the REGENIE package, and currently supports single-variant, gene-based and interaction testing. All modules are structured in sub-workflows, which allows the pipeline to be extended to other methods and tools in the future. nf-gwas includes extensive reporting functionality that lets users inspect thousands of phenotypes and navigate interactive Manhattan plots directly in the web browser.

Citation

Please cite our paper if you use nf-gwas:

Schönherr S, Schachtl-Riess JF, Di Maio S*, Filosi M, Mark M, Lamina C, Fuchsberger C, Kronenberg F, Forer L. Performing highly parallelized and reproducible GWAS analysis on biobank-scale data. NAR Genom Bioinform. 2024 Feb 7;6(1):lqae015. doi: 10.1093/nargab/lqae015. PMID: 38327871; PMCID: PMC10849172.

License

nf-gwas is MIT Licensed and was developed at the Institute of Genetic Epidemiology, Medical University of Innsbruck, Austria.

Contact

Owner

  • Login: mfz
  • Kind: user
  • Location: Reykjavik, Iceland

Citation (CITATION.cff)

cff-version: 1.2.0
message: "If you use this software, please cite it as below."
authors:
- family-names: "Schönherr"
  given-names: "Sebastian"
- family-names: "Forer"
  given-names: "Lukas"
title: "nf-gwas - A nextflow pipeline to perform GWAS."
version: 0.3.4
url: "https://github.com/genepi/nf-gwas"
