qgg

Statistical tools for Quantitative Genetic Analyses

https://github.com/psoerensen/qgg

Science Score: 49.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
    Found 14 DOI reference(s) in README
  • Academic publication links
  • Committers with academic emails
    1 of 8 committers (12.5%) from academic institutions
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (16.3%) to scientific vocabulary
Last synced: 7 months ago

Repository

Statistics
  • Stars: 38
  • Watchers: 5
  • Forks: 8
  • Open Issues: 1
  • Releases: 0
Created about 10 years ago · Last pushed 8 months ago

README.Rmd

---
output: github_document
---



```{r, echo = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  fig.path = "README-"
)
```

### An R package for Quantitative Genetic and Genomic analyses

The **qgg** package was developed based on the hypothesis that certain regions of the genome, so-called *genomic features*, may be enriched for causal variants affecting a trait. Genomic feature classes can be formed from previous studies and from different sources of information such as genes, chromosomes or biological pathways.  
  
**qgg** provides an infrastructure for efficient processing of large-scale genetic and phenotypic data including core functions for:  
 
* fitting linear mixed models  
* construction of genomic relationship matrices  
* estimating genetic parameters (heritability and correlation)  
* genomic prediction  
* single marker association analysis  
* gene set enrichment analysis   
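
As a rough illustration of the second item, the sketch below builds a genomic relationship matrix in base R from simulated genotypes. It assumes the common centred-and-scaled form G = WW'/m and does not call qgg. The names `M`, `W` and `G`, the sample sizes and the simulated allele frequencies are made up for this example, and the package's own implementation may differ.

```r
set.seed(1)
n <- 50    # number of individuals (hypothetical)
m <- 200   # number of markers (hypothetical)

# Simulate allele counts (0, 1 or 2) with marker-specific allele frequencies
p <- runif(m, 0.1, 0.5)
M <- matrix(rbinom(n * m, size = 2, prob = rep(p, each = n)), nrow = n, ncol = m)

# Drop monomorphic markers, which cannot be scaled
M <- M[, apply(M, 2, var) > 0, drop = FALSE]
m <- ncol(M)

# Centre and scale each marker, then average the cross-products over markers
W <- scale(M)
G <- tcrossprod(W) / m

dim(G)          # n x n
mean(diag(G))   # equals (n - 1) / n because scale() uses the sample variance
```

Off-diagonal elements of `G` measure how similar two individuals are across the markers; in this simulation the individuals are unrelated, so those values scatter around zero.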
  
**qgg** handles large-scale data by taking advantage of:  
  
* multi-core processing using [OpenMP](https://www.openmp.org/)  
* multithreaded matrix operations implemented in BLAS libraries (e.g. [OpenBLAS](https://www.openblas.net/), [ATLAS](https://math-atlas.sourceforge.net/) or [MKL](https://en.wikipedia.org/wiki/Math_Kernel_Library))  
* fast and memory-efficient batch processing of genotype data stored in binary files (e.g. [PLINK](https://www.cog-genomics.org/plink2) bedfiles)  
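
The batch-processing idea in the last item can be sketched in base R as follows. An in-memory matrix stands in for a genotype file, and the batch size of 100 markers is an arbitrary assumption. This is not how qgg reads PLINK files; it only illustrates that accumulating per-batch cross-products gives the same relationship matrix as processing all markers at once, while only one batch of markers needs to be held in scaled form at a time.

```r
set.seed(42)
n <- 40    # number of individuals (hypothetical)
m <- 500   # number of markers (hypothetical)

# Simulated allele counts; in practice each batch would be read from disk
M <- matrix(rbinom(n * m, size = 2, prob = 0.3), nrow = n, ncol = m)
M <- M[, apply(M, 2, var) > 0, drop = FALSE]   # remove monomorphic markers
m <- ncol(M)

# Split the marker indices into batches of at most 100 markers
batches <- split(seq_len(m), ceiling(seq_len(m) / 100))

# Accumulate W_b W_b' over batches
G_batch <- matrix(0, n, n)
for (cols in batches) {
  Wb <- scale(M[, cols, drop = FALSE])   # centre and scale this batch only
  G_batch <- G_batch + tcrossprod(Wb)
}
G_batch <- G_batch / m

# Reference: the same matrix computed from all markers in one step
G_full <- tcrossprod(scale(M)) / m
max(abs(G_batch - G_full))   # difference is at the level of rounding error
```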
  
The **qgg** package provides a range of genomic feature modeling approaches, including genomic feature best linear unbiased prediction (GFBLUP) models, implemented using likelihood or Bayesian methods. Multiple features and multiple traits can be included in these models, and different genetic models (e.g. additive, dominance, gene by gene and gene by environment interactions) can be used. Further extensions include a weighted GFBLUP model that weights the individual genetic marker relationships differentially. Computationally fast marker set tests are also available. They allow rapid analysis of different layers of genomic feature classes to discover genomic features potentially enriched for causal variants, which in turn can facilitate more accurate prediction models.
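
To make the marker set test idea concrete, here is a minimal base R sketch under stated assumptions. It computes single-marker test statistics on simulated data, sums them over a hypothetical feature, and compares that sum with sums for randomly drawn marker sets of the same size. The data, the feature (markers 1 to 20), the effect size and the permutation scheme are all invented for illustration; qgg's own tests may use different statistics and null distributions.

```r
set.seed(2)
n <- 500    # number of individuals (hypothetical)
m <- 1000   # number of markers (hypothetical)

# Simulated, centred and scaled genotypes
p <- runif(m, 0.1, 0.5)
W <- scale(matrix(rbinom(n * m, size = 2, prob = rep(p, each = n)), n, m))

# Hypothetical genomic feature: the first 20 markers carry all causal effects
feature <- 1:20
y <- drop(W[, feature] %*% rep(0.3, length(feature))) + rnorm(n)

# Single-marker association: t statistic derived from the marker-trait correlation
r <- drop(cor(W, y))
tstat <- r * sqrt((n - 2) / (1 - r^2))
stat <- tstat^2

# Set statistic: sum of squared t statistics within the feature
obs <- sum(stat[feature])

# Empirical null: the same sum for random marker sets of equal size
nperm <- 1000
null <- replicate(nperm, sum(stat[sample(m, length(feature))]))
p_set <- (1 + sum(null >= obs)) / (1 + nperm)
p_set
```

Because the test only needs the single-marker statistics, many candidate features can be screened without refitting a model for each one.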

### Install

You can install qgg from CRAN with:

```{r,  eval=FALSE, echo=TRUE}
install.packages("qgg")
```

The most recent version of `qgg` can be obtained from GitHub:

```{r,  eval=FALSE, echo=TRUE}
# install.packages("devtools")  # run once if devtools is not yet installed
devtools::install_github("psoerensen/qgg")
```
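
A quick way to confirm that either installation route worked is to check whether the package namespace can be found. This is a small sketch using only base R functions; the variable name `has_qgg` is made up for the example.

```r
# TRUE if qgg is installed in the current library paths, FALSE otherwise
has_qgg <- requireNamespace("qgg", quietly = TRUE)

if (has_qgg) {
  message("qgg version ", as.character(utils::packageVersion("qgg")), " is installed")
} else {
  message("qgg is not installed; see the install commands above")
}
```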

### Tutorials
Below is a set of tutorials for the qgg package:  

This tutorial provides a brief introduction to the R package qgg using small simulated data examples.  
[Practicals_brief_introduction](https://psoerensen.github.io/qgtutorials/Quick-tutorials-for-qgg-package.pdf)  


This tutorial provides an introduction to the R package qgg using 1000G data.  
[Practicals_1000G_tutorials](https://psoerensen.github.io/qgtutorials/1000G-tutorials-for-qgg-package.pdf)  


This tutorial provides a simple introduction to polygenic risk scoring (PRS) of complex traits and diseases using simulated data. It mixes theoretical and practical exercises in R that illustrate and apply the theory presented in the corresponding lecture notes on polygenic risk scoring.  
[Practicals_human_example](https://psoerensen.github.io/qgtutorials/Practicals_human_example.pdf)  


In this tutorial we analyse quantitative traits observed in a mouse population. The mouse data consist of phenotypes for traits related to growth and obesity (e.g. body weight, glucose levels in blood), pedigree information, and genetic marker data.  
[Practicals_mouse_example](https://psoerensen.github.io/qgtutorials/Practicals_mouse_example.pdf)  

### Notes
Below is a set of notes for the quantitative genetic theory, statistical models and methods implemented in the qgg package:  

[Quantitative Genetics Theory](https://psoerensen.github.io/qgnotes/Quantitative-Genetics-Theory.pdf)  

[Estimation of Genetic Predisposition](https://psoerensen.github.io/qgnotes/Estimation-of-Genetic-Predisposition.pdf)  

[Estimation of Genetic Parameters](https://psoerensen.github.io/qgnotes/Estimation-of-Genetic-Parameters.pdf)  


[Linear Mixed Models](https://psoerensen.github.io/qgnotes/LMM.pdf)  

[Best Linear Unbiased Prediction Models](https://psoerensen.github.io/qgnotes/BLUP.pdf)  

[REstricted Maximum Likelihood Methods](https://psoerensen.github.io/qgnotes/REML.pdf)  

[Gene Set Enrichment Analysis](https://psoerensen.github.io/qgnotes/GSEA.pdf)  

[Bayesian Linear Regression Models](https://psoerensen.github.io/qgnotes/BLR.pdf)  



Owner

  • Name: Peter Sørensen
  • Login: psoerensen
  • Kind: user

My current research focus is to further develop and implement computational methods to exploit and integrate multiple layers of genome-wide experimental data.

GitHub Events

Total
  • Watch event: 3
  • Push event: 95
  • Fork event: 1
Last Year
  • Watch event: 3
  • Push event: 95
  • Fork event: 1

Committers

Last synced: over 2 years ago

All Time
  • Total Commits: 2,161
  • Total Committers: 8
  • Avg Commits per committer: 270.125
  • Development Distribution Score (DDS): 0.545
Past Year
  • Commits: 216
  • Committers: 2
  • Avg Commits per committer: 108.0
  • Development Distribution Score (DDS): 0.014
Top Committers
Name               Email            Commits
Peter Sørensen     a****6@u****k    983
psoerensen         p****o@m****k    976
IzelFourie         i****n@g****m    153
Maria Adonay       m****y@n****u    19
Palle Duun Rohde   p****r@b****k    18
psoerensen         p****n@r****g    8
mesh-007           1****7           3
Palle Duun Rohde   p****r@h****k    1

Issues and Pull Requests

Last synced: 8 months ago

All Time
  • Total issues: 16
  • Total pull requests: 1
  • Average time to close issues: 8 months
  • Average time to close pull requests: over 2 years
  • Total issue authors: 13
  • Total pull request authors: 1
  • Average comments per issue: 2.0
  • Average comments per pull request: 0.0
  • Merged pull requests: 0
  • Bot issues: 0
  • Bot pull requests: 0
Past Year
  • Issues: 1
  • Pull requests: 0
  • Average time to close issues: N/A
  • Average time to close pull requests: N/A
  • Issue authors: 1
  • Pull request authors: 0
  • Average comments per issue: 0.0
  • Average comments per pull request: 0
  • Merged pull requests: 0
  • Bot issues: 0
  • Bot pull requests: 0
Top Authors
Issue Authors
  • mishaploid (2)
  • CuihuaXia (2)
  • ShiLiangyu94 (2)
  • frucelee (1)
  • ben6uw (1)
  • aneumann-science (1)
  • benanders (1)
  • camult (1)
  • nlapier2 (1)
  • sofifagro (1)
  • dleopold (1)
  • wuqianniandelaok (1)
  • washjake (1)
Pull Request Authors
  • vplagnol (1)

Packages

  • Total packages: 1
  • Total downloads:
    • cran 364 last-month
  • Total dependent packages: 0
  • Total dependent repositories: 1
  • Total versions: 10
  • Total maintainers: 1
cran.r-project.org: qgg

  • Versions: 10
  • Dependent Packages: 0
  • Dependent Repositories: 1
  • Downloads: 364 Last month
Rankings
Forks count: 8.0%
Stargazers count: 11.6%
Average: 18.1%
Downloads: 19.0%
Dependent repos count: 24.3%
Dependent packages count: 27.8%
Maintainers (1)
Last synced: over 1 year ago

Dependencies

DESCRIPTION cran
  • MASS * imports
  • MCMCpack * imports
  • Rcpp * imports
  • data.table * imports
  • parallel * imports
  • statmod * imports
  • stats * imports