https://github.com/animesh/bgwas

R package to perform Bayesian Genome-Wide Association Studies

https://github.com/animesh/bgwas

Science Score: 23.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
  • codemeta.json file
  • .zenodo.json file
  • DOI references
    Found 3 DOI reference(s) in README
  • Academic publication links
    Links to: biorxiv.org, ncbi.nlm.nih.gov
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (13.4%) to scientific vocabulary
Last synced: 10 months ago · JSON representation

Repository

R package to perform Bayesian Genome-Wide Association Studies

Basic Info
  • Host: GitHub
  • Owner: animesh
  • License: gpl-2.0
  • Default Branch: master
  • Homepage:
  • Size: 606 MB
Statistics
  • Stars: 0
  • Watchers: 0
  • Forks: 0
  • Open Issues: 0
  • Releases: 0
Fork of n-mounier/bGWAS
Created over 3 years ago · Last pushed over 3 years ago

https://github.com/animesh/bGWAS/blob/master/

[![](https://travis-ci.org/n-mounier/bGWAS.svg?branch=master)](https://travis-ci.org/n-mounier/bGWAS)
[![](https://img.shields.io/badge/version-1.0.2-informational.svg)](https://github.com/n-mounier/bGWAS)
[![](https://img.shields.io/badge/lifecycle-maturing-9cf.svg)](https://www.tidyverse.org/lifecycle/#maturing)
[![](https://img.shields.io/github/last-commit/n-mounier/bGWAS.svg)](https://github.com/n-mounier/bGWAS/commits/master)
[![](https://img.shields.io/badge/license-GPL--2.0-lightgrey.svg)](https://www.gnu.org/licenses/old-licenses/gpl-2.0.en.html)

# bGWAS 



:arrow\_right: ESHG poster is available
[here](doc/P17.051.A_NinonMounier.pdf).

:information\_source: `bGWAS` has been updated to version 1.0.2.  
:warning: 28/10/2019 : The variance of the prior effects has been
modified. If you used a previous version of the package, please re-run
yout analysis using this new version to get more accurate results.  
Check the [NEWS](NEWS.md) to learn more about what has been modified\!

:warning: If you downloaded the Z-Matrix files before 20/08/2019, they
are now obsolete and you will not be able to use them with the newest
version of the package.  
Note: some Prior GWASs have been removed, you can find more details
[here](doc/ZMatrices.md).

## Overview

bGWAS is an R-package to perform a Bayesian GWAS (Genome Wide
Association Study), using summary statistics from a conventional GWAS as
input. The aim of the approach is to increase power by leveraging
information from related traits and by comparing the observed Z-scores
from the focal phenotype (provided as input) to prior effects. These
prior effects are directly estimated from publicly available GWASs
(currently, a set of 38 studies, last update 20-08-2019 - hereinafter
referred to as prior GWASs or risk factors). Only prior GWASs having
a significant causal effect on the focal phenotype, identified using a
multivariable Mendelian Randomization (MR) approach, are used to
calculate the prior effects. Causal effects are estimated masking the
focal chromosome to ensure independence, and the prior effects are
estimated as described in the figure below.  


Observed and prior effects are compared using Bayes Factors.
Significance is assessed by calculating the probability of observing a
value larger than the observed BF (P-value) given the prior
distribution. This is done by decomposing the analytical form of the BFs
and using an approximation for most BFs to make the computation faster.
Prior, posterior and direct effects, alongside BFs and p-values are
returned. Note that prior, posterior and direct effects are estimated on
the Z-score scale, but are automatically rescaled to beta scale if
possible.

The principal functions available are:

  - **`bGWAS()`**  
    main function that calculates prior effects from prior GWASs,
    compares them to observed Z-scores and returns an object of class
    *bGWAS*

  - **`list_priorGWASs()`**  
    directly returns information about the prior GWASs that can be used
    to calculate prior effects

  - **`select_priorGWASs()`**  
    allows a quick selection of prior GWASs (to include/exclude specific
    studies when calculating prior effects)

  - **`extract_results_bGWAS()`**  
    returns results (prior, posterior and direct estimate /
    standard-error + p-value from BF for SNPs) from an object of class
    *bGWAS*

  - **`manhattan_plot_bGWAS()`**  
    creates a Manhattan Plot from an object of class *bGWAS*

  - **`extract_MRcoeffs_bGWAS()`**  
    returns multivariable MR coefficients (1 estimate using all
    chromosomes + 22 estimates with 1 chromosome masked) from an object
    of class *bGWAS*

  - **`coefficients_plot_bGWAS()`**  
    creates a Coefficients Plot (causal effect of each prior GWASs on
    the focal phenotype) from an object of class *bGWAS*

  - **`heatmap_bGWAS()`**  
    creates a heatmap to represent, for each significant SNP, the
    contribution of each prior GWAS to the estimated prior effect from
    an object of class *bGWAS*

All the functions available and more details about their usage can be
found in the [manual](doc/bGWAS-manual.pdf).

## Installation

You can install the current version of `bGWAS` with:

``` r
# Directly install the package from github
# install.packages("remotes")
remotes::install_github("n-mounier/bGWAS")
library(bGWAS)
```

    ## Warning: package 'dplyr' was built under R version 3.6.2



## Usage

To run the analysis with `bGWAS` two inputs are needed:

#### 1\. The *GWAS* results to be tested

Can be a regular (space/tab/comma-separated) file or a gzipped file
(.gz) or a `data.frame`. Must contain the following columns, which can
have alternative names:  

    SNP-identifier: `rs` or `rsid`, `snp`, `snpid`, `rnpid` Alternate (effect) allele: `a1` or `alt`, `alts` Reference allele: `a2` or `a0`, `ref` Z-statistics: `z` or `Z`, `zscore`
If the Z-statistics is not present, it will be automatically calculated from effect size and standard error, in which case the following columns should be provided:
    Effect-size: `b` or `beta`, `beta1` Standard error: `se` or `std`
If you want the prior/posterior/corrected effects to be rescaled, please make sure to provide effect sizes and standard errors instead of (or in addition to) Z-statistics. #### 2\. Prior *GWASs* - Z-Matrix files These files should be downloaded separately and stored in `~/ZMatrices` or in the folder specified when launching the analysis. These files contains the Z-scores for all prior GWASs :
      *ZMatrix\_MR.csv.gz*: Z-scores (strong instruments only) used for multivariable MR, *ZMatrix\_Full.csv.gz*: Z-scores (all SNPs) used to calculate the prior Z-scores, *AvailableStudies.tsv*: A file containing information about the prior GWASs available.
You can download these files using this [link](https://drive.switch.ch/index.php/s/jvSwoIxRgCKUSI8) or following the instructions below. Please note that your input GWAS will be merged with the Z-Matrix files (using rsid and alleles to align effects), and that the results reported will use the Z-Matrix files chr:pos information (GRCh37 - since UK10K data has been used to imputed the prior GWASs). - On UNIX/MACOSX, from a terminal: ``` bash wget https://drive.switch.ch/index.php/s/jvSwoIxRgCKUSI8/download -O ZMatrices.tar.gz tar xzvf ZMatrices.tar.gz ``` If you want to use your own set of prior GWASs, please have a look [here](doc/ZMatrices.md) to see how you can modify the files. ### Study Selection Before running your analysis, you can select the prior GWASs you want to include. You can use the function **`list_priorGWASs()`** to get some information about the prior GWASs available. You should remove traits that by definition are not independent from your trait of interest. For example, before analysing BMI results, make sure to exclude Height from the prior GWASs used. You can use the function **`select_priorGWASs()`** to automatically exclude/include some traits or some files. You should also check for sample overlap, and remove prior GWASs that come from the same consortium as your data. If there are individuals in common between your conventional GWAS and prior GWASs, it might induce some bias. ``` r # Obtain the list of prior GWASs AllStudies = list_priorGWASs() # Select only the ones for specific traits # select_priorGWASs will return the IDs of the files that are kept MyStudies = select_priorGWASs(include_traits=c("Heart Rate", "Body Mass Index", "Smoking")) # Match these IDs against the ones in the list of prior GWASs AllStudies[AllStudies$ID %in% MyStudies, ] ``` ## # A tibble: 6 x 10 ## File ## ## 1 All_ancestries_SNP_gwas_mc_merge_nogc.tbl.uniq.gz ## 2 META_STAGE1_GWASHR_SUMSTATS.txt ## 3 tag.cpd.tbl.gz ## 4 tag.evrsmk.tbl.gz ## 5 tag.former.tbl.gz ## 6 tag.logonset.tbl.gz ## Name ID Trait Consortium ## ## 1 Body Mass Index (GIANT) 1 Body Mass Index GIANT ## 2 Heart Rate (HRgene) 23 Heart Rate HRgene ## 3 Smoking - cigarettes per day (TAG) 35 Smoking TAG ## 4 Smoking - ever smoked (TAG) 36 Smoking TAG ## 5 Smoking - former smoker (TAG) 37 Smoking TAG ## 6 Smoking - age of onset (TAG) 38 Smoking TAG ## Reference Download Remarks N_SNPs N_Instruments ## ## 1 https://www.ncb https://portals.broad 6.81e6 10052 ## 2 https://www.ncb http://walker05.u.hpc 6.81e6 3229 ## 3 https://www.ncb https://www.med.unc.e Other GWAS Data 6.63e6 558 ## 4 https://www.ncb https://www.med.unc.e Other GWAS Data 6.78e6 94 ## 5 https://www.ncb https://www.med.unc.e Other GWAS Data 6.78e6 184 ## 6 https://www.ncb https://www.med.unc.e Other GWAS Data 6.77e6 34 ### Analysis - **Example A** ``` r # Using a small GWAS (400,000 SNPs, Timmers et al data - stored as a data.frame) # Please, not that this example is only for illustration, the method is designed # to be used genome-wide, and using such a low number of SNPs can not yield # interpretable results. data("SmallGWAS_Timmers2019") MyStudies = select_priorGWASs(include_traits=c("Blood Pressure", "Education"), include_files=c("cardiogram_gwas_results.txt", "All_ancestries_SNP_gwas_mc_merge_nogc.tbl.uniq.gz")) # 6 Prior GWASs used list_priorGWASs(MyStudies)[,c("Name", "Trait", "Reference")] ``` ## # A tibble: 6 x 3 ## Name Trait ## ## 1 Body Mass Index (GIANT) Body Mass Index ## 2 Coronary Artery Disease (CARDIoGRAM) Coronary Artery Disease ## 3 Years of Schooling (SSGAC) Education ## 4 Systolic Blood Pressure (ICBP) Blood Pressure ## 5 Diastolic Blood Pressure (ICBP) Blood Pressure ## 6 College Completion (SSGAC) Education ## Reference ## ## 1 https://www.ncbi.nlm.nih.gov/pubmed/25673413 ## 2 https://www.ncbi.nlm.nih.gov/pubmed/21378990 ## 3 https://www.ncbi.nlm.nih.gov/pubmed/27225129 ## 4 https://www.ncbi.nlm.nih.gov/pubmed/21909115 ## 5 https://www.ncbi.nlm.nih.gov/pubmed/21909115 ## 6 https://www.ncbi.nlm.nih.gov/pubmed/23722424 ``` r A = bGWAS(name="Test_UsingSmallDataFrame", GWAS = SmallGWAS_Timmers2019, prior_studies = MyStudies, stepwise_threshold = 0.05) # MR instruments will be selected using default parameter (1e-6) and distance-pruned (500Kb), # MR will be performed using a threshold of 0.05 to select studies, and the default shrinkage thresholds (1) will be used, # A subset of prior GWASs (MyStudies) will be used, and only the ones with at least 3 strong instruments will be kept to be tested and create the prior, # Significant SNPs will be identified using default parameters (p<5e-8) and distance-pruned (500kb), # No file will be saved. ``` - **Example B** ``` r # Using a GWAS from our list our prior GWASs # Using all other (37) GWASs to built the prior MyGWAS = 3 list_priorGWASs(MyGWAS)[, c("Name", "Trait", "Reference")] ``` ## # A tibble: 1 x 3 ## Name Trait ## ## 1 Coronary Artery Disease (CARDIoGRAM) Coronary Artery Disease ## Reference ## ## 1 https://www.ncbi.nlm.nih.gov/pubmed/21378990 ``` r B = bGWAS(name = "Test_UsingGWASfromPriorGWASs", GWAS = MyGWAS) # MR instruments will be selected using default parameters, # MR will be performed using default parameters (stepwise selection / shrinkage threshold) # All Prior GWASs except the one use as "GWAS" will be used to create the prior, # Significant SNPs will be identified using default parameters (p<5e-8) and distance-pruned (500kb) # No file will be saved. ``` ### Results **`bGWAS()`** returns an object of class bGWAS than can be handled in `R`. ``` r class(A) ``` ## [1] "bGWAS" ``` r print(A) ``` ## -_-_-_-_-_-_-_-_-_-_-_-_-_-_-_ ## ## Analysis : "Test_UsingSmallDataFrame" ## bGWAS performed on 286,807 SNPs ## ## -_-_-_-_-_-_-_-_-_-_-_-_-_-_-_ ## ## 4 studies used to build the prior : ## # A tibble: 4 x 3 ## study estimate std_error ## ## 1 EDUyears_2016_sumstat.txt 0.187 0.0255 ## 2 All_ancestries_SNP_gwas_mc_merge_nogc.tbl.uniq.gz -0.122 0.0248 ## 3 cardiogram_gwas_results.txt -0.439 0.0623 ## 4 DBP -0.248 0.0623 ## ## -_-_-_-_-_-_-_-_-_-_-_-_-_-_-_ ## ## 14 significant SNPs identified : ## rs429358, rs56219465, rs7857118, rs12828640, rs56179563, rs9457925, rs1573644, rs3794695, rs9636202, rs9379844, rs4845625, rs59613878, rs34348584, rs4762753 ## ## -_-_-_-_-_-_-_-_-_-_-_-_-_-_-_ ``` r print_log_bGWAS(A) ```
Show log ``` ## <<< Preparation of analysis >>> ## ## > Checking parameters ## The name of your analysis is: "Test_UsingSmallDataFrame". ## The Z-Matrix files are stored in "/Users/nmounier/ZMatrices". ## # Preparation of the data... ## The conventional GWAS used as input the object: "GWAS". ## SNPID column, ok - ALT column, ok - REF column, ok - BETA column, ok - SE column, ok ## Posterior effects will be rescaled using BETA and SE. ## The analysis will be run in the folder: "/Users/nmounier/Documents/SGG/Projects/Packaging/bGWAS". ## The p-value threshold used for selecting MR instruments is: 1e-06. ## The minimum number instruments required for each trait is: 3. ## The distance used for pruning MR instruments is: 500Kb. ## Distance-based pruning will be used for MR instruments. ## No shrinkage applied before performing MR. ## The p-value threshold used for stepwise selection is 0.05. ## Using MR_shrinkage as default for prior_shrinkage: ## No shrinkage applied before performing calculating the prior. ## The p-value threshold used for stepwise selection is 0.05. ## Significant SNPs will be identified according to p-value. The threshold used is :5e-08. ## The distance used for pruning results is: 500Kb. ## Distance-based pruning will be used for results. ## ## ## <><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><> ## <<< Identification of significant prior GWASs for MR >>> ## ## > Creating the Z-Matrix of strong instruments ## # Loading the ZMatrix... ## Selecting studies : ## 6 studies ## 209,840 SNPs ## # Adding data from the conventional GWAS : "GWAS" ## Done! ## 8,813 SNPs in common between prior studies and the conventional GWAS ## # Thresholding... ## 767 SNPs left after thresholding ## 6 studies left after thresholding ## Pruning MR instruments... ## distance : 500Kb ## 159 SNPs left after pruning ## 6 studies left after thresholding+pruning ## ## > Performing MR ## #Preparation of the MR analyses to identify significant studies... ## Conventionnal GWAS of interest : GWAS ## # Univariable regressions for each trait... ## Number of trait-specific instruments per univariable regression: ## . Body Mass Index (GIANT) : 59 ## . Coronary Artery Disease (CARDIoGRAM) : 6 ## . Years of Schooling (SSGAC) : 83 ## . Systolic Blood Pressure (ICBP) : 8 ## . Diastolic Blood Pressure (ICBP) : 8 ## . College Completion (SSGAC) : 4 ## Done! ## # Stepwise selection (all traits)... ## Studies tested (reaching p<0.05 in univariable models) : Years of Schooling (SSGAC) Body Mass Index (GIANT) Coronary Artery Disease (CARDIoGRAM) Systolic Blood Pressure (ICBP) Diastolic Blood Pressure (ICBP) ## Adding the first study :Years of Schooling (SSGAC) ## #Test if any study can be added with p<0.05 ## Adding one study :Systolic Blood Pressure (ICBP) ## Done! ## #Test if any study has p>0.05 now ## #Test if any study can be added with p<0.05 ## Adding one study :Body Mass Index (GIANT) ## Done! ## #Test if any study has p>0.05 now ## #Test if any study can be added with p<0.05 ## Adding one study :Coronary Artery Disease (CARDIoGRAM) ## Done! ## #Test if any study has p>0.05 now ## #Test if any study can be added with p<0.05 ## Adding one study :Diastolic Blood Pressure (ICBP) ## Done! ## #Test if any study has p>0.05 now ## Excluding one study :Systolic Blood Pressure (ICBP) ## Done! ## #Test if any study can be added with p<0.05 ## #Test if any study has p>0.05 now ## It converged! ## # Final regression... ## The studies used are: ## - Years of Schooling (SSGAC)- Body Mass Index (GIANT)- Coronary Artery Disease (CARDIoGRAM)- Diastolic Blood Pressure (ICBP) ## ## Estimating adjusted R-squared: ## - in-sample adjusted R-squared for the all-chromosomes multivariable regression is 0.5534 ## - out-of-sample R-squared (masking one chromosome at a time), for the multivariable regression will be estimated when calculating the prior. ## ## ## <><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><> ## <<< Estimation of the prior >>> ## ## > Creating the full Z-Matrix ## # Loading the ZMatrix... ## Selecting studies : ## 4 studies ## 6,811,310 SNPs ## # Adding data from the conventional GWAS : "GWAS" ## Done! ## 286,807 SNPs in common between prior studies and the conventional GWAS ## ## > Computing prior ## # Calculating the prior chromosome by chromosome... ## Chromosome 1 ## Running regression, ## Calculating prior estimates for SNPs on this chromosome, ## Calculating prior standard errors for SNPs on this chromosome, ## Chromosome 2 ## Running regression, ## Calculating prior estimates for SNPs on this chromosome, ## Calculating prior standard errors for SNPs on this chromosome, ## Chromosome 3 ## Running regression, ## Calculating prior estimates for SNPs on this chromosome, ## Calculating prior standard errors for SNPs on this chromosome, ## Chromosome 4 ## Running regression, ## Calculating prior estimates for SNPs on this chromosome, ## Calculating prior standard errors for SNPs on this chromosome, ## Chromosome 5 ## Running regression, ## Calculating prior estimates for SNPs on this chromosome, ## Calculating prior standard errors for SNPs on this chromosome, ## Chromosome 6 ## Running regression, ## Calculating prior estimates for SNPs on this chromosome, ## Calculating prior standard errors for SNPs on this chromosome, ## Chromosome 7 ## Running regression, ## Calculating prior estimates for SNPs on this chromosome, ## Calculating prior standard errors for SNPs on this chromosome, ## Chromosome 8 ## Running regression, ## Calculating prior estimates for SNPs on this chromosome, ## Calculating prior standard errors for SNPs on this chromosome, ## Chromosome 9 ## Running regression, ## Calculating prior estimates for SNPs on this chromosome, ## Calculating prior standard errors for SNPs on this chromosome, ## Chromosome 10 ## Running regression, ## Calculating prior estimates for SNPs on this chromosome, ## Calculating prior standard errors for SNPs on this chromosome, ## Chromosome 11 ## Running regression, ## Calculating prior estimates for SNPs on this chromosome, ## Calculating prior standard errors for SNPs on this chromosome, ## Chromosome 12 ## Running regression, ## Calculating prior estimates for SNPs on this chromosome, ## Calculating prior standard errors for SNPs on this chromosome, ## Chromosome 13 ## Running regression, ## Calculating prior estimates for SNPs on this chromosome, ## Calculating prior standard errors for SNPs on this chromosome, ## Chromosome 14 ## Running regression, ## Calculating prior estimates for SNPs on this chromosome, ## Calculating prior standard errors for SNPs on this chromosome, ## Chromosome 15 ## Running regression, ## Calculating prior estimates for SNPs on this chromosome, ## Calculating prior standard errors for SNPs on this chromosome, ## Chromosome 16 ## Running regression, ## Calculating prior estimates for SNPs on this chromosome, ## Calculating prior standard errors for SNPs on this chromosome, ## Chromosome 17 ## Running regression, ## Calculating prior estimates for SNPs on this chromosome, ## Calculating prior standard errors for SNPs on this chromosome, ## Chromosome 18 ## Running regression, ## Calculating prior estimates for SNPs on this chromosome, ## Calculating prior standard errors for SNPs on this chromosome, ## Chromosome 19 ## Running regression, ## Calculating prior estimates for SNPs on this chromosome, ## Calculating prior standard errors for SNPs on this chromosome, ## Chromosome 20 ## Running regression, ## Calculating prior estimates for SNPs on this chromosome, ## Calculating prior standard errors for SNPs on this chromosome, ## Chromosome 21 ## Running regression, ## Calculating prior estimates for SNPs on this chromosome, ## Calculating prior standard errors for SNPs on this chromosome, ## Chromosome 22 ## Running regression, ## Calculating prior estimates for SNPs on this chromosome, ## Calculating prior standard errors for SNPs on this chromosome, ## ## Out-of-sample R-squared for MR instruments across all chromosomes is 0.5206 ## ## Out-of-sample squared correlation for MR instruments across all chromosome is 0.5271 ## ## Correlation between prior and observed effects for all SNPs is 0.1855 ## ## Correlation between prior and observed effects for SNPs with GWAS p-value < 0.001 is 0.5944 ## Done! ## ## ## <><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><> ## <<< Calculation of Bayes Factors and p-values >>> ## ## > Calculating them for all SNPs ## # Computing observed Bayes Factor for all SNPs... ## Done! ## # Computing BF p-values... ## using a distribution approach: ## ... getting approximated p-values using non-linear quantiles ## ... checking p-values near significance threshold ## everything is ok! ## # Estimating p-values for posterior effects... ## Done! ## # Estimating p-values for direct effects... ## Done! ## > Pruning and identifying significant SNPs ## Identification based on BFs ## Starting with 286,807 SNPs ## # Selecting significant SNPs according to p-values... ## 30 SNPs left ## Done! ## # Pruning significant SNPs... ## distance : 500Kb ## 14 SNPs left ## Done! ## Identification based on posterior effects ## Starting with 286,807 SNPs ## # Selecting significant SNPs according to p-values... ## 44 SNPs left ## Done! ## # Pruning significant SNPs... ## distance : 500Kb ## 17 SNPs left ## Done! ## Identification based on direct effects ## Starting with 286,807 SNPs ## # Selecting significant SNPs according to p-values... ## 4 SNPs left ## Done! ## # Pruning significant SNPs... ## distance : 500Kb ## 2 SNPs left ## Done! ## ## ## <><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><> ## Time of the analysis: 2 minute(s) and 16 second(s). ```
- Functions to extract results from an object of class *bGWAS*: ``` r # by default, extract "BF" results... hits = extract_results_bGWAS(A, SNPs = "significant") hits ``` ## # A tibble: 14 x 14 ## rsid chrm_UK10K pos_UK10K alt ref beta se z_obs ## ## 1 rs429358 19 45411941 T C 0.106 0.00546 19.3 ## 2 rs56219465 15 78742579 A G 0.0380 0.00410 9.27 ## 3 rs7857118 9 22124140 A T 0.0240 0.00387 6.20 ## 4 rs12828640 12 111361298 A G -0.0229 0.00399 -5.74 ## 5 rs56179563 7 129685597 A G 0.0211 0.00406 5.19 ## 6 rs9457925 6 160848743 A G -0.0875 0.0145 -6.03 ## 7 rs1573644 15 91421283 T C 0.0201 0.00412 4.88 ## 8 rs3794695 16 72097827 T C -0.0260 0.00491 -5.30 ## 9 rs9636202 19 18449238 A G 0.0194 0.00441 4.40 ## 10 rs9379844 6 26291527 A G 0.0219 0.00399 5.50 ## 11 rs4845625 1 154422067 T C -0.0180 0.00389 -4.63 ## 12 rs59613878 4 3139152 T C 0.0206 0.00420 4.90 ## 13 rs34348584 3 27532704 T C 0.0201 0.00450 4.46 ## 14 rs4762753 12 20579969 T G 0.0259 0.00487 5.31 ## mu_prior_estimate mu_prior_std_error beta_prior_estimate beta_prior_std_error ## ## 1 -0.351 0.611 -0.00192 0.00334 ## 2 0.790 0.622 0.00324 0.00255 ## 3 2.61 0.876 0.0101 0.00339 ## 4 -2.34 0.680 -0.00934 0.00271 ## 5 3.37 0.737 0.0137 0.00299 ## 6 -1.72 0.631 -0.0250 0.00915 ## 7 2.43 0.696 0.0100 0.00287 ## 8 -1.69 0.608 -0.00831 0.00298 ## 9 3.12 0.630 0.0137 0.00278 ## 10 1.43 0.564 0.00571 0.00225 ## 11 -2.29 0.665 -0.00891 0.00259 ## 12 1.78 0.607 0.00750 0.00255 ## 13 2.11 0.698 0.00950 0.00314 ## 14 1.16 0.611 0.00564 0.00298 ## BF BF_p ## ## 1 7.00e19 5.90e-37 ## 2 2.11e 7 1.51e-14 ## 3 4.32e 6 1.77e-13 ## 4 2.27e 5 1.78e-11 ## 5 1.95e 5 2.28e-11 ## 6 8.63e 4 8.38e-11 ## 7 1.60e 4 1.35e- 9 ## 8 9.34e 3 3.36e- 9 ## 9 7.60e 3 4.79e- 9 ## 10 6.04e 3 7.12e- 9 ## 11 5.64e 3 8.03e- 9 ## 12 4.07e 3 1.42e- 8 ## 13 2.72e 3 2.89e- 8 ## 14 2.14e 3 4.43e- 8 ``` r all_results = extract_results_bGWAS(A, SNPs = "all") nrow(all_results) ``` ## [1] 286807 ``` r # but also possible to extract SNPs with significant posterior/direct effects extract_results_bGWAS(A, SNPs = "significant", results = "direct") ``` ## # A tibble: 2 x 15 ## rsid chrm_UK10K pos_UK10K alt ref beta se z_obs ## ## 1 rs429358 19 45411941 T C 0.106 0.00546 19.3 ## 2 rs11633958 15 78862064 T C -0.0418 0.00410 -10.2 ## mu_direct_estimate mu_direct_std_error beta_direct_estimate ## ## 1 19.7 1.17 0.108 ## 2 -9.99 1.18 -0.0410 ## beta_direct_std_error z_direct p_direct CRR ## ## 1 0.00640 16.8 2.96e-63 1.02 ## 2 0.00482 -8.50 1.90e-17 0.981 ``` r extract_MRcoeffs_bGWAS(A)[,1:12] ``` ## # A tibble: 4 x 12 ## name ## ## 1 Years of Schooling (SSGAC) ## 2 Body Mass Index (GIANT) ## 3 Coronary Artery Disease (CARDIoGRAM) ## 4 Diastolic Blood Pressure (ICBP) ## study estimate std_error Tstat P chrm1_estimate chrm1_std_error ## ## 1 EDUyears_201 0.187 0.0255 7.33 1.32e-11 0.187 0.0258 ## 2 All_ancestri -0.122 0.0248 -4.93 2.12e- 6 -0.108 0.0259 ## 3 cardiogram_g -0.439 0.0623 -7.05 6.18e-11 -0.469 0.0616 ## 4 DBP -0.248 0.0623 -3.98 1.07e- 4 -0.264 0.0619 ## chrm1_P chrm2_estimate chrm2_std_error chrm2_P ## ## 1 2.93e-11 0.186 0.0273 3.01e-10 ## 2 5.55e- 5 -0.124 0.0268 8.72e- 6 ## 3 4.27e-12 -0.439 0.0685 2.24e- 9 ## 4 3.78e- 5 -0.250 0.0653 1.93e- 4 - Functions for graphic representations: ``` r # Coefficients plot coefficients_plot_bGWAS(A) ``` ``` r # Manhattan plot using BFs p-values manhattan_plot_bGWAS(A) ``` ``` r # Manhattan plot using posterior p-values manhattan_plot_bGWAS(A, results="posterior") ``` ##### Aditionnaly, if `save_files=TRUE`, several files are created in the folder `.//` : - **.log** - log file - **PriorGWASs.tsv** - contains information about all prior GWASs (general info + status (used/removed) + univariable/multivariable MR estimates) - **CoefficientsByChromosome.csv** - contains the multivariable MR estimates when masking the focal chromosome (22 coefficients for each prior GWASs used for prior estimation) - **PriorBFp.csv** - contains BF and p-values, prior, posterior and direct effects estimates for all SNPs - **SignificantSNPs.csv** - contains BF and p-values, prior, posterior and direct effects estimates for a subset of significant SNPs (identified according to specified parameters) A detailed description of these files can be found [here](doc/OutputFiles.md). ## Runtime Analysis using all the 38 prior GWASs available, for a conventional GWAS containing \~7M SNPs in common with the prior studies \~ 25 minutes. Analysis using 6 prior GWASs, for a conventional GWAS containing \~ 300,000 SNPs in common with prior studies (see example A) \~ 2 minutes. Results from analyses performed on a MacBook Pro (Early 2015) - Processor : 2.9 GHz Intel Core i5 - Memory : 8 GB 1867 MHz DDR3. ## Application to Lifespan analysis This method has been applied to Lifespan analysis in [McDaid et al](https://www.ncbi.nlm.nih.gov/pubmed/28748955) and [Timmers et al](https://www.ncbi.nlm.nih.gov/pubmed/30642433). The most recent results (obtained using `bGWAS` version 1.0.2) are available [here](doc/Lifespan_Analysis.md) and summarised in [this poster](doc/P17.051.A_NinonMounier.pdf). ## Citation If you use the `bGWAS` package, please cite: [Ninon Mounier, Zoltn Kutalik, bGWAS: an R package to perform Bayesian Genome Wide Association Studies, Bioinformatics](https://doi.org/10.1093/bioinformatics/btaa549) ## Contact

Owner

  • Name: Ani
  • Login: animesh
  • Kind: user
  • Location: Norway
  • Company: Norwegian University of Science and Technology

A medical graduate from Delhi University with post-graduation in bioinformatics from Jawaharlal Nehru University, India.

GitHub Events

Total
Last Year