imds_wes
Quality control, annotation, association test, and post analysis
Repository
Basic Info
- Host: GitHub
- Owner: Sirius-Yang
- Language: R
- Default Branch: main
- Size: 327 KB
Statistics
- Stars: 9
- Watchers: 1
- Forks: 1
- Open Issues: 0
- Releases: 1
Metadata Files
README.md
Large-scale whole exome sequencing analyses identified protein-coding variants associated with immune-mediated diseases in 350770 adults
This repository contains code for quality control, annotation, and association tests for common (logistic regression) and rare (SKAT) variants.
Each directory includes a README.txt file with more detailed information. We also provide important input and result files (excluding those in bed/bim/fam formats) to aid visualization and understanding.
Workflow: QC -> Annotation -> Association -> PostAnalysis
The complete analysis workflow begins with quality control (QC), Step1 through Step4.
After QC, variants are annotated: the primary analysis uses SnpEff for rare variants and ANNOVAR for common variants, while the case-control enrichment analysis uses VEP annotations.
Next, variant-level and gene-based association tests are run separately for common and rare variants, and sensitivity analyses are performed to check the robustness of the results.
Finally, several post-analyses are performed, including BHR heritability analysis and correlation analysis, Cox survival analysis, gene expression analysis, MR analysis, annotation of amino acid alterations, proteome-wide analysis, and PheWAS analysis. We have uploaded the key code for this section.
Plot 1:
Study Design. Created with Adobe, no analysis or code involved.
Plot 2:
- (A) Results of gene-based collapsing analysis for rare variants (main result 1). All code provided (QC, SnpEff, GRM.sh, SAIGE.sh).
- (B) Results of case-control enrichment. All code provided (QC, VEP, Case_Control.py).
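To illustrate the idea behind the case-control enrichment in Plot 2B, here is a minimal sketch of a one-sided Fisher's exact test on carrier counts. It is an assumption that enrichment is tested this way; the statistic actually used in Case_Control.py is not described here, and the function names and counts below are hypothetical.

```python
from math import comb


def fisher_exact_greater(case_carriers, case_total, control_carriers, control_total):
    """One-sided Fisher's exact test (illustrative, not the repository's code).

    Returns P(X >= case_carriers), where X is the number of carriers that fall
    among the cases under the hypergeometric null of no enrichment.
    """
    carriers = case_carriers + control_carriers
    total = case_total + control_total
    denom = comb(total, case_total)
    upper = min(carriers, case_total)
    numer = sum(
        comb(carriers, k) * comb(total - carriers, case_total - k)
        for k in range(case_carriers, upper + 1)
    )
    return numer / denom


def odds_ratio(case_carriers, case_total, control_carriers, control_total):
    """Odds ratio with a 0.5 continuity correction so empty cells do not divide by zero."""
    a = case_carriers + 0.5
    b = case_total - case_carriers + 0.5
    c = control_carriers + 0.5
    d = control_total - control_carriers + 0.5
    return (a * d) / (b * c)


# Hypothetical counts: 3 of 4 cases and 1 of 4 controls carry a qualifying variant.
p_value = fisher_exact_greater(3, 4, 1, 4)
effect = odds_ratio(3, 4, 1, 4)
```

With biobank-scale counts the exact sum is still computable because Python integers are unbounded, though a library routine would normally be faster.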
Plot 3:
- (A) Results of variant-level association tests for common variants. All code provided (QC, Annovar, common.sh, clump.sh).
- (B) Convergence of GWAS signals. Main GWAS code provided (GWAS.sh).
- (C) Pleiotropic effects of common variants. Summarizes results from 3A, no additional code provided.
Plot 4:
- (A) Burden heritability. All code provided (Heritage.R).
- (B) Genetic correlations of IMDs. All code provided (Correlation.R).
Plot 5:
- (A) Protein expression levels between mutation carriers and non-carriers, compared with a simple t-test.
- (B) MR analysis. All code provided (MR.R).
- (C) Annotation of amino acid alterations.
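Since no script is listed for the t-test in Plot 5A, the comparison can be sketched as follows. This assumes Welch's unequal-variance form, which is a guess: the source only says "a simple t-test". The function name and the expression values are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(carriers, non_carriers):
    """Welch's t statistic and degrees of freedom for two independent samples.

    Illustrative only; each group needs at least two observations.
    """
    n1, n2 = len(carriers), len(non_carriers)
    # Per-group squared standard errors, using sample variances (n - 1 denominator).
    s1 = variance(carriers) / n1
    s2 = variance(non_carriers) / n2
    t = (mean(carriers) - mean(non_carriers)) / sqrt(s1 + s2)
    # Welch-Satterthwaite approximation to the degrees of freedom.
    df = (s1 + s2) ** 2 / (s1 ** 2 / (n1 - 1) + s2 ** 2 / (n2 - 1))
    return t, df


# Hypothetical normalized protein expression values.
t_stat, dof = welch_t([1.0, 2.0, 3.0], [4.0, 5.0, 6.0])
```

The p-value would then come from the t distribution with `dof` degrees of freedom, for example via R's `t.test` or a statistics library.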
Plot 6:
- (A) PheWAS analysis of rare variants. Similar to Plot 2A, no additional code provided.
- (B) PheWAS analysis of common variants. Similar to Plot 3A, no additional code provided.
- (C) PPI network and clusters generated with STRING. Web-based API, no code used.
- (D) Single-cell expression analysis.
Owner
- Login: Sirius-Yang
- Kind: user
- Repositories: 1
- Profile: https://github.com/Sirius-Yang
Citation (CITATION.cff)
# This CITATION.cff file was generated with cffinit.
# Visit https://bit.ly/cffinit to generate yours today!
cff-version: 1.2.0
title: Large-scale whole exome sequencing analyses identified protein-coding variants associated with immune-mediated diseases in 350770 adults
message: >-
  If you are using code or pipelines from this
  repository, please consider citing our associated article
type: dataset
authors:
  - given-names: Liu
    family-names: Yang
    orcid: 'https://orcid.org/0000-0002-2435-1484'
  - given-names: Yanan
    family-names: Ou
  - given-names: Bangsheng
    family-names: Wu
    orcid: 'https://orcid.org/0000-0002-8807-6354'
identifiers:
  - type: doi
    value: 10.5281/zenodo.11307851
repository-code: 'https://github.com/Sirius-Yang/IMDs_WES'
abstract: >-
  This repository contains the code used to perform quality
  control, annotation, and association tests for the
  manuscript titled "Large-scale whole exome sequencing
  analyses identified protein-coding variants associated
  with immune-mediated diseases in 350770 adults"
keywords:
  - whole exome sequencing
  - immune-mediated disease
  - autoimmune disease
  - protein-coding variants
license: MIT
version: '1.0'
date-released: '2024-05-23'