SNPRelate
R package: parallel computing toolset for relatedness and principal component analysis of SNP data (Development version only)
Science Score: 39.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ○ CITATION.cff file
- ✓ codemeta.json file: found
- ✓ .zenodo.json file: found
- ✓ DOI references: found 8 DOI reference(s) in README
- ○ Academic publication links
- ○ Committers with academic emails
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: low similarity (9.5%) to scientific vocabulary
Repository
R package: parallel computing toolset for relatedness and principal component analysis of SNP data (Development version only)
Basic Info
- Host: GitHub
- Owner: zhengxwen
- Language: C++
- Default Branch: master
- Homepage: http://www.bioconductor.org/packages/SNPRelate
- Size: 17.8 MB
Statistics
- Stars: 108
- Watchers: 12
- Forks: 25
- Open Issues: 43
- Releases: 7
Metadata Files
README.md
SNPRelate: Parallel computing toolset for relatedness and principal component analysis of SNP data
GNU General Public License, GPLv3
Features
Genome-wide association studies are widely used to investigate the genetic basis of diseases and traits, but they pose many computational challenges. We developed SNPRelate, an R package for multi-core symmetric multiprocessing computer architectures, to accelerate two key computations on SNP data: principal component analysis (PCA) and relatedness analysis using identity-by-descent measures. The kernels of our algorithms are written in C/C++ and are highly optimized.
The GDS format offers efficient operations designed specifically for two-bit integers, since a SNP genotype can be stored in only two bits. The SNP GDS format in this package is also used by the GWASTools package, which adds support for S4 classes and generic functions. The extended GDS format is implemented in the SeqArray package to support the storage of single nucleotide variation (SNV), insertion/deletion polymorphism (indel) and structural variation calls. For large-scale whole-exome and whole-genome sequencing variant data, SeqArray is strongly recommended instead of SNPRelate.
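To make the two-bit idea concrete, here is a minimal C++ sketch of packing four genotypes into a single byte. It is an illustration only, not the gdsfmt implementation or its on-disk layout. The helper names `pack_genotypes` and `unpack_genotype`, the bit order, and the encoding (0, 1, 2 for an allele count, 3 for missing) are assumptions made for this example.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper (not part of gdsfmt or SNPRelate): pack genotype
// codes in the range 0..3 four per byte, lowest-order bits first.
std::vector<std::uint8_t> pack_genotypes(const std::vector<std::uint8_t>& geno)
{
    // Round up so a trailing partial group still gets its own byte.
    std::vector<std::uint8_t> packed((geno.size() + 3) / 4, 0);
    for (std::size_t i = 0; i < geno.size(); ++i) {
        const unsigned shift = static_cast<unsigned>(i % 4) * 2;
        packed[i / 4] |= static_cast<std::uint8_t>((geno[i] & 0x03u) << shift);
    }
    return packed;
}

// Hypothetical helper: read the i-th genotype code back from the buffer.
std::uint8_t unpack_genotype(const std::vector<std::uint8_t>& packed, std::size_t i)
{
    const unsigned shift = static_cast<unsigned>(i % 4) * 2;
    return static_cast<std::uint8_t>((packed[i / 4] >> shift) & 0x03u);
}
```

With this layout, a matrix of genotypes needs roughly a quarter of the memory of a one-byte-per-genotype array, which is the kind of saving the two-bit representation is aimed at.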
Bioconductor
Release Version: v1.42.1
http://www.bioconductor.org/packages/SNPRelate
News
- See package news.
Tutorials
http://www.bioconductor.org/packages/release/bioc/vignettes/SNPRelate/inst/doc/SNPRelate.html
Citations
Zheng X, Levine D, Shen J, Gogarten SM, Laurie C, Weir BS (2012). A High-performance Computing Toolset for Relatedness and Principal Component Analysis of SNP Data. Bioinformatics. DOI: 10.1093/bioinformatics/bts606.
Zheng X, Gogarten S, Lawrence M, Stilp A, Conomos M, Weir BS, Laurie C, Levine D (2017). SeqArray -- A storage-efficient high-performance data format for WGS variant calls. Bioinformatics. DOI: 10.1093/bioinformatics/btx145.
Installation
Bioconductor repository:
```R
if (!requireNamespace("BiocManager", quietly=TRUE))
    install.packages("BiocManager")
BiocManager::install("SNPRelate")
```

Development version from Github (for developers/testers only):

```R
library("devtools")
install_github("zhengxwen/gdsfmt")
install_github("zhengxwen/SNPRelate")
```

The `install_github()` approach requires that you build from source, i.e. `make` and compilers must be installed on your system; see the R FAQ for your operating system. You may also need to install dependencies manually.
Implementation with Intel Intrinsics
| Functions             | No SIMD | SSE2 | AVX | AVX2 | AVX-512 |
|:----------------------|:-------:|:----:|:---:|:----:|:-------:|
| snpgdsDiss            | X       |      |     |      |         |
| snpgdsEIGMIX          | X       | X    | X   |      |         |
| snpgdsGRM             | X       | X    | X   | .    |         |
| snpgdsIBDKING         | X       | X    |     | X    |         |
| snpgdsIBDMoM          | X       |      |     |      |         |
| snpgdsIBS             | X       | X    |     |      |         |
| snpgdsIBSNum          | X       | X    |     |      |         |
| snpgdsIndivBeta       | X       | X    | P   | X    |         |
| snpgdsPCA             | X       | X    | X   |      |         |
| snpgdsPCACorr         | X       |      |     |      |         |
| snpgdsPCASampLoading  | X       |      |     |      |         |
| snpgdsPCASNPLoading   | X       |      |     |      |         |
| ...                   |         |      |     |      |         |
X: fully supported; .: partially supported; P: POPCNT instruction.
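As an illustration of why a population-count instruction is useful on two-bit genotype data, the sketch below counts how many genotype slots differ between two packed 64-bit words. This is not SNPRelate's kernel; the function name `count_differing_genotypes` and the layout (32 genotypes per word, two bits each) are assumptions for this example. It uses `std::bitset<64>::count()`, which compilers typically lower to a hardware POPCNT when that instruction is enabled.

```cpp
#include <bitset>
#include <cstdint>

// Hypothetical helper (not SNPRelate's code): count how many of the
// 32 two-bit slots differ between two packed words.
unsigned count_differing_genotypes(std::uint64_t x, std::uint64_t y)
{
    // A slot differs if either of its two bits differs.
    const std::uint64_t diff = x ^ y;
    // Mask selecting the low bit of every two-bit slot.
    const std::uint64_t low_bits = 0x5555555555555555ULL;
    // Fold each slot's high bit onto its low bit, so every differing
    // slot contributes exactly one set bit.
    const std::uint64_t per_slot = (diff | (diff >> 1)) & low_bits;
    return static_cast<unsigned>(std::bitset<64>(per_slot).count());
}
```

One population count replaces a loop over 32 genotype pairs, which is why bit-level tricks like this pair well with compact genotype storage.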
Install the package from the source code with the support of Intel SIMD Intrinsics:
You have to customize the package compilation, see: CRAN: Customizing-package-compilation
Assuming the GNU compilers (gcc/g++) or the Clang compiler (clang++) are installed, change `~/.R/Makevars` to:
```sh
## for C code
CFLAGS=-g -O3 -march=native -mtune=native
## for C++ code
CXXFLAGS=-g -O3 -march=native -mtune=native
```
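To check what `-march=native` enables on a given machine, a small C++ probe can report the instruction-set macros that GCC and Clang predefine at compile time. This is a sketch for diagnostics only and is not part of SNPRelate; the function name `enabled_simd_features` is hypothetical, and the macros tested are the compilers' standard predefined ones.

```cpp
#include <string>

// Hypothetical diagnostic helper (not part of SNPRelate): list the
// instruction-set macros the compiler defined for this translation unit.
// The result depends on the flags used, e.g. -march=native.
std::string enabled_simd_features()
{
    std::string features;
#ifdef __SSE2__
    features += "SSE2 ";
#endif
#ifdef __POPCNT__
    features += "POPCNT ";
#endif
#ifdef __AVX__
    features += "AVX ";
#endif
#ifdef __AVX2__
    features += "AVX2 ";
#endif
#ifdef __AVX512F__
    features += "AVX512F ";
#endif
    // Report explicitly when none of the macros above is defined.
    if (features.empty())
        features = "none";
    return features;
}
```

Compiling this with the same flags as in `~/.R/Makevars` and printing the result shows which code paths a source build could take advantage of on that machine.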
Owner
- Name: Xiuwen Zheng
- Login: zhengxwen
- Kind: user
- Location: Chicago
- Repositories: 13
- Profile: https://github.com/zhengxwen
GitHub Events
Total
- Issues event: 3
- Watch event: 7
- Issue comment event: 2
- Push event: 5
Last Year
- Issues event: 3
- Watch event: 7
- Issue comment event: 2
- Push event: 5
Committers
Last synced: 9 months ago
Top Committers
| Name | Email | Commits |
|---|---|---|
| Xiuwen Zheng | z****n@g****m | 362 |
| Xiuwen Zheng | x****g@a****m | 3 |
| Bioconductor Git-SVN Bridge | b****c@b****g | 3 |
| Kevin Murray | k****1 | 2 |
| Stephanie M. Gogarten | s****n@g****m | 1 |
| NikNakk | n****k@n****m | 1 |
| Dr. K. D. Murray | 1****9 | 1 |
| Dan Bolser | 5****k | 1 |
| Billsfriend | 4****d | 1 |
Issues and Pull Requests
Last synced: 6 months ago
All Time
- Total issues: 184
- Total pull requests: 12
- Average time to close issues: about 2 months
- Average time to close pull requests: 20 days
- Total issue authors: 78
- Total pull request authors: 4
- Average comments per issue: 2.11
- Average comments per pull request: 0.67
- Merged pull requests: 10
- Bot issues: 0
- Bot pull requests: 0
Past Year
- Issues: 2
- Pull requests: 0
- Average time to close issues: 4 months
- Average time to close pull requests: N/A
- Issue authors: 2
- Pull request authors: 0
- Average comments per issue: 0.5
- Average comments per pull request: 0
- Merged pull requests: 0
- Bot issues: 0
- Bot pull requests: 0
Top Authors
Issue Authors
- zhengxwen (4)
- thierrygosselin (4)
- nottwy (3)
- kforner (3)
- AAvalos82 (3)
- Tman3 (2)
- elizeng (2)
- yangli-ai (2)
- smgogarten (2)
- jgx65 (2)
- evigorito (2)
- kroluk (2)
- rafalcode (1)
- jane-edgeloe (1)
- linsson (1)
Pull Request Authors
- kdm9 (2)
- CholoTook (2)
- Billsfriend (1)
- smgogarten (1)
- NikNakk (1)
Packages
- Total packages: 1
- Total downloads: 218,346 total (bioconductor)
- Total dependent packages: 11
- Total dependent repositories: 0
- Total versions: 7
- Total maintainers: 1
bioconductor.org: SNPRelate
Parallel Computing Toolset for Relatedness and Principal Component Analysis of SNP Data
- Homepage: https://github.com/zhengxwen/SNPRelate
- Documentation: https://bioconductor.org/packages/release/bioc/vignettes/SNPRelate/inst/doc/SNPRelate.pdf
- License: GPL-3
- Latest release: 1.42.0 (published 9 months ago)
Dependencies
- R >= 2.15 depends
- gdsfmt >= 1.8.3 depends
- SeqArray >= 1.12.0 enhances
- methods * imports
- BiocGenerics * suggests
- MASS * suggests
- Matrix * suggests
- RUnit * suggests
- knitr * suggests
- markdown * suggests
- parallel * suggests
- rmarkdown * suggests
- actions/checkout v3 composite
- r-lib/actions/setup-r f57f1301a053485946083d7a45022b278929a78a composite