imds_wes

Quality control, annotation, association test, and post analysis

https://github.com/sirius-yang/imds_wes

Science Score: 44.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
    Found CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
  • Academic publication links
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (8.7%) to scientific vocabulary

Repository

Basic Info
  • Host: GitHub
  • Owner: Sirius-Yang
  • Language: R
  • Default Branch: main
  • Size: 327 KB
Statistics
  • Stars: 9
  • Watchers: 1
  • Forks: 1
  • Open Issues: 0
  • Releases: 1
Created about 2 years ago · Last pushed about 2 years ago
Metadata Files
Readme Citation

README.md

Large-scale whole exome sequencing analyses identified protein-coding variants associated with immune-mediated diseases in 350770 adults

This repository contains code for quality control, annotation, and association tests for common (logistic regression) and rare (SKAT) variants.

Each directory includes a README.txt file with more detailed information. We also provide important input and result files (excluding those in bed/bim/fam formats) to make the analyses easier to visualize and understand.


Workflow: QC -> Annotation -> Association -> PostAnalysis

The complete analysis workflow begins with quality control (QC) from Step1 to Step4.

Annotation follows QC: the primary analysis uses SnpEff to annotate rare variants and ANNOVAR to annotate common variants, while the case-control enrichment analysis uses VEP annotations.

Subsequently, variant-level and gene-based association tests are conducted separately for common and rare variants, and sensitivity analyses are performed to check the robustness of the results.

Finally, several post-analyses are performed: BHR heritability and genetic correlation analysis, Cox survival analysis, gene expression analysis, MR analysis, annotation of amino acid alterations, proteome-wide analysis, and PheWAS analysis. The key code for this section has been uploaded.
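As an illustration of the MR step listed among the post-analyses, the sketch below computes a fixed-effect inverse-variance weighted (IVW) estimate from per-variant summary statistics. This is only one common MR estimator and is an assumption here: the README does not say which estimator MR.R uses, and the function name `ivw_estimate` and the input values are hypothetical.

```python
from math import sqrt


def ivw_estimate(beta_exposure, beta_outcome, se_outcome):
    """Fixed-effect IVW estimate of the causal effect and its standard error.

    Hypothetical helper for illustration; not taken from MR.R.
    Each list holds one entry per genetic instrument.
    """
    # Weight each instrument by beta_exposure^2 / se_outcome^2.
    denom = sum(bx ** 2 / se ** 2 for bx, se in zip(beta_exposure, se_outcome))
    numer = sum(
        bx * by / se ** 2
        for bx, by, se in zip(beta_exposure, beta_outcome, se_outcome)
    )
    return numer / denom, sqrt(1 / denom)


# Made-up summary statistics for two instruments (not real data).
estimate, std_err = ivw_estimate([0.1, 0.2], [0.05, 0.1], [0.01, 0.02])
print(estimate, std_err)
```

Each instrument contributes its ratio `beta_outcome / beta_exposure`, weighted by how precisely that ratio is estimated, so strong and precisely measured instruments dominate the result.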


Plot 1:

Study Design. Created with Adobe, no analysis or code involved.

Plot 2:

  • (A) Results of gene-based collapsing analysis for rare variants, main result 1. All code provided (QC, SnpEff, GRM.sh, SAIGE.sh).
  • (B) Results of case-control enrichment. All code provided (QC, VEP, Case_Control.py).
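To make the idea of case-control enrichment in panel (B) concrete, here is a minimal sketch that compares carrier counts between cases and controls with a one-sided Fisher's exact test. Whether Case_Control.py uses this test is not stated in this README, so treat the choice of test as an assumption; the function name `fisher_enrichment` and the counts are hypothetical.

```python
from math import comb


def fisher_enrichment(case_carriers, case_total, control_carriers, control_total):
    """One-sided Fisher's exact test for carrier enrichment in cases.

    Hypothetical helper for illustration; not taken from Case_Control.py.
    Returns (odds_ratio, p_value), where p_value is the hypergeometric
    probability of seeing at least `case_carriers` carriers among the cases.
    """
    carriers = case_carriers + control_carriers
    total = case_total + control_total
    upper = min(carriers, case_total)
    # Sum the hypergeometric tail from the observed count upward.
    tail = sum(
        comb(carriers, k) * comb(total - carriers, case_total - k)
        for k in range(case_carriers, upper + 1)
    )
    p_value = tail / comb(total, case_total)

    case_non = case_total - case_carriers
    control_non = control_total - control_carriers
    denom = case_non * control_carriers
    odds_ratio = (case_carriers * control_non) / denom if denom else float("inf")
    return odds_ratio, p_value


# Made-up counts: 8 of 10 cases and 2 of 10 controls carry a qualifying variant.
print(fisher_enrichment(8, 10, 2, 10))
```

An exact test is a natural fit for rare variants because carrier counts are often too small for large-sample approximations to be reliable.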

Plot 3:

  • (A) Results of variant-level association test for common variants. All code provided (QC, Annovar, common.sh, clump.sh).
  • (B) Convergence of GWAS signals. Main GWAS code provided (GWAS.sh).
  • (C) Pleiotropy effects of common variants. Summarizes results from 3A, no additional code provided.

Plot 4:

  • (A) Burden heritability. All code provided (Heritage.R).
  • (B) Genetic correlations of IMDs. All code provided (Correlation.R).

Plot 5:

  • (A) Protein expression levels compared between mutation carriers and non-carriers using a simple t-test.
  • (B) MR analysis. All code provided (MR.R).
  • (C) Annotation of amino acid alterations.
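Since no code is listed for panel (A), the sketch below shows what a carrier versus non-carrier t-test could look like. It computes the Welch (unequal-variance) t statistic and its degrees of freedom using only the standard library. The README only says "a simple t-test", so the Welch variant is an assumption, and the function name `welch_t` and the expression values are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(carriers, non_carriers):
    """Welch t statistic and Welch-Satterthwaite degrees of freedom.

    Hypothetical helper for illustration; each argument is a list of
    protein expression values for one group (at least two values each).
    """
    n1, n2 = len(carriers), len(non_carriers)
    # Squared standard error of each group mean.
    v1 = variance(carriers) / n1
    v2 = variance(non_carriers) / n2
    t_stat = (mean(carriers) - mean(non_carriers)) / sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t_stat, df


# Made-up expression values (not real data).
t_stat, df = welch_t([1.0, 2.0, 3.0], [2.0, 3.0, 4.0, 5.0])
print(t_stat, df)
```

Turning the statistic into a p-value requires the t distribution, which the standard library does not provide; with SciPy installed, `scipy.stats.ttest_ind(carriers, non_carriers, equal_var=False)` performs the same test and returns the p-value directly.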

Plot 6:

  • (A) PheWAS analysis of rare variants. Similar to Plot 2A, no additional code provided.
  • (B) PheWAS analysis of common variants. Similar to Plot 3A, no additional code provided.
  • (C) PPI and clusters performed by STRING. Web-based API, no code used.
  • (D) Single-cell expression analysis.

Owner

  • Login: Sirius-Yang
  • Kind: user

Citation (CITATION.cff)

# This CITATION.cff file was generated with cffinit.
# Visit https://bit.ly/cffinit to generate yours today!

cff-version: 1.2.0
title: Large-scale whole exome sequencing analyses identified protein-coding variants associated with immune-mediated diseases in 350770 adults
message: >-
  If you are using code or pipelines from this
  repository, please consider citing our associated article
type: dataset
authors:
  - given-names: Liu
    family-names: Yang
    orcid: 'https://orcid.org/0000-0002-2435-1484'
  - given-names: Yanan
    family-names: Ou
  - given-names: Bangsheng
    family-names: Wu
    orcid: 'https://orcid.org/0000-0002-8807-6354'
identifiers:
  - type: doi
    value: 10.5281/zenodo.11307851
repository-code: 'https://github.com/Sirius-Yang/IMDs_WES'
abstract: >-
  This repository contains the code used to perform quality
  control, annotation, and association tests for the
  manuscript titled "Large-scale whole exome sequencing
  analyses identified protein-coding variants associated
  with immune-mediated diseases in 350770 adults"
keywords:
  - whole exome sequencing
  - immune-mediated disease
  - autoimmune disease
  - protein-coding variants
license: MIT
version: '1.0'
date-released: '2024-05-23'

GitHub Events

Total
  • Issues event: 2
  • Watch event: 8
  • Issue comment event: 3
  • Fork event: 1
Last Year
  • Issues event: 2
  • Watch event: 8
  • Issue comment event: 3
  • Fork event: 1

Issues and Pull Requests

Last synced: 10 months ago

All Time
  • Total issues: 1
  • Total pull requests: 0
  • Average time to close issues: 5 days
  • Average time to close pull requests: N/A
  • Total issue authors: 1
  • Total pull request authors: 0
  • Average comments per issue: 2.0
  • Average comments per pull request: 0
  • Merged pull requests: 0
  • Bot issues: 0
  • Bot pull requests: 0
Past Year
  • Issues: 1
  • Pull requests: 0
  • Average time to close issues: 5 days
  • Average time to close pull requests: N/A
  • Issue authors: 1
  • Pull request authors: 0
  • Average comments per issue: 2.0
  • Average comments per pull request: 0
  • Merged pull requests: 0
  • Bot issues: 0
  • Bot pull requests: 0
Top Authors
Issue Authors
  • Lin-zikai (1)