Science Score: 46.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ○ CITATION.cff file
- ✓ codemeta.json file: found codemeta.json file
- ✓ .zenodo.json file: found .zenodo.json file
- ○ DOI references
- ✓ Academic publication links: links to nature.com
- ✓ Committers with academic emails: 6 of 30 committers (20.0%) from academic institutions
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: low similarity (15.7%) to scientific vocabulary
Repository
R package for DNA methylation analysis
Basic Info
- Host: GitHub
- Owner: al2na
- Language: R
- Default Branch: master
- Homepage: https://bioconductor.org/packages/release/bioc/html/methylKit.html
- Size: 18.5 MB
Statistics
- Stars: 235
- Watchers: 17
- Forks: 101
- Open Issues: 69
- Releases: 3
Metadata Files
README.md
methylKit
Introduction
methylKit is an R package for DNA methylation analysis and annotation from high-throughput bisulfite sequencing. The package is designed to handle sequencing data from RRBS and its variants, as well as target-capture methods such as Agilent SureSelect methyl-seq. In addition, methylKit can handle base-pair resolution data for 5hmC obtained from Tab-seq or oxBS-seq. It can also handle whole-genome bisulfite sequencing data if the proper input format is provided.
Current Features
- Coverage statistics
- Methylation statistics
- Sample correlation and clustering
- Differential methylation analysis
- Feature annotation and accessor/coercion functions
- Multiple visualization options
- Regional and tiling windows analysis
- (Almost) proper documentation
- Reading methylation calls directly from Bismark (Bowtie/Bowtie2) alignment files
- Batch effect control
- Multithreading support (for faster differential methylation calculations)
- Coercion to objects from Bioconductor package GenomicRanges
- Reading methylation percentage data from generic text files
Staying up-to-date
You can subscribe to our Google Groups page to get the latest information about new releases and features (low-frequency; only updates are posted):
- https://groups.google.com/forum/#!forum/methylkit
To ask questions, please use the methylKit_discussion forum:
- https://groups.google.com/forum/#!forum/methylkit_discussion
You can also check out our blog posts on using methylKit:
- http://zvfak.blogspot.de/search/label/methylKit
Installation
In the R console:

```r
library(devtools)
install_github("al2na/methylKit", build_vignettes=FALSE,
  repos=BiocManager::repositories(),
  dependencies=TRUE)
```

If this doesn't work, you might need to add the type="source" argument.
Install the development version
```r
library(devtools)
install_github("al2na/methylKit", build_vignettes=FALSE,
  repos=BiocManager::repositories(), ref="development",
  dependencies=TRUE)
```

If this doesn't work, you might need to add the type="source" argument.
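Since the homepage listed for this repository is the Bioconductor package page, the release version can presumably also be installed from Bioconductor rather than from GitHub. A minimal sketch, assuming a working internet connection and an R version supported by the current Bioconductor release:

```r
# Install BiocManager from CRAN if it is not already available.
if (!requireNamespace("BiocManager", quietly = TRUE)) {
  install.packages("BiocManager")
}

# Install the Bioconductor release of methylKit and its dependencies.
BiocManager::install("methylKit")
```

This route installs the released version; the GitHub commands above remain the way to get the master or development branch.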
How to Use
Typically, bisulfite-converted reads are aligned to the genome, and the % methylation value per base is calculated by processing the alignments. methylKit takes that per-base % methylation information as input. Such an input file can be obtained from the AMP pipeline for aligning RRBS reads. A typical input file looks like this:
```
chrBase        chr    base     strand  coverage  freqC   freqT
chr21.9764539  chr21  9764539  R       12        25.00   75.00
chr21.9764513  chr21  9764513  R       12        0.00    100.00
chr21.9820622  chr21  9820622  F       13        0.00    100.00
chr21.9837545  chr21  9837545  F       11        0.00    100.00
chr21.9849022  chr21  9849022  F       124       72.58   27.42
chr21.9853326  chr21  9853326  F       17        70.59   29.41
```
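To make the column layout concrete, here is a minimal base-R sketch that parses the example rows above without methylKit. It assumes that freqC and freqT are the percentages of reads showing C and T at each base; the numCs column is a hypothetical helper added for illustration and is not part of the input format.

```r
# Example rows copied from the table above, one string per line.
input_lines <- c(
  "chrBase chr base strand coverage freqC freqT",
  "chr21.9764539 chr21 9764539 R 12 25.00 75.00",
  "chr21.9764513 chr21 9764513 R 12 0.00 100.00",
  "chr21.9820622 chr21 9820622 F 13 0.00 100.00",
  "chr21.9837545 chr21 9837545 F 11 0.00 100.00",
  "chr21.9849022 chr21 9849022 F 124 72.58 27.42",
  "chr21.9853326 chr21 9853326 F 17 70.59 29.41"
)

# Fix the column types explicitly: a strand column holding only "F"
# would otherwise be converted to the logical value FALSE.
calls <- read.table(
  text = input_lines, header = TRUE,
  colClasses = c("character", "character", "integer", "character",
                 "integer", "numeric", "numeric")
)

# Sanity check under the stated assumption: the two percentages sum to 100.
stopifnot(all(abs(calls$freqC + calls$freqT - 100) < 0.01))

# Approximate number of reads supporting methylation at each base.
calls$numCs <- round(calls$coverage * calls$freqC / 100)
print(calls[, c("chr", "base", "strand", "coverage", "numCs")])
```

Real input files are usually tab-separated; whitespace-separated text is used here only to keep the example readable.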
methylKit reads in those files and performs basic statistical analysis and annotation for differentially methylated regions/bases. A tab-separated text file with a generic format can also be read in, such as methylation ratio files from BSMAP; see here for an example. Alternatively, the read.bismark function (renamed processBismarkAln in newer methylKit releases) can read SAM file(s) output by the Bismark aligner (using Bowtie/Bowtie2). The SAM file must be sorted by chromosome and read start. The sorting must be done with Unix sort or samtools; sorting with other tools may change the column order of the SAM file, which will cause an error.
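As a rough illustration of the two reading routes described above, the sketch below follows the pattern used in the package vignette. It assumes a recent methylKit release (function names methRead and processBismarkAln) and that the example files named here ship in the package's extdata folder; with an older installation the function names or file names may differ.

```r
library(methylKit)

# Per-base methylation call files bundled with the package (assumed names).
file.list <- list(
  system.file("extdata", "test1.myCpG.txt", package = "methylKit"),
  system.file("extdata", "test2.myCpG.txt", package = "methylKit"),
  system.file("extdata", "control1.myCpG.txt", package = "methylKit"),
  system.file("extdata", "control2.myCpG.txt", package = "methylKit")
)

# Read the four samples; treatment marks test (1) versus control (0).
myobj <- methRead(
  file.list,
  sample.id = list("test1", "test2", "ctrl1", "ctrl2"),
  assembly  = "hg18",
  treatment = c(1, 1, 0, 0),
  context   = "CpG",
  mincov    = 10
)

# Alternative route: read calls from a sorted Bismark SAM file.
sam.file <- system.file("extdata", "test.fastq_bismark.sorted.min.sam",
                        package = "methylKit")
my.methRaw <- processBismarkAln(
  location     = sam.file,
  sample.id    = "test1",
  assembly     = "hg18",
  read.context = "CpG"
)
```

For your own data, replace the system.file() calls with paths to your call files or sorted SAM files.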
Below, there are several options showing how to do basic analysis with methylKit.
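For orientation, a basic analysis might look like the sketch below, which mirrors the steps in the package vignette and uses the small example data set bundled with methylKit. The cutoffs (coverage of at least 10, 25% difference, q-value 0.01) are illustrative assumptions, not recommendations.

```r
library(methylKit)

# Load bundled example objects, including methylRawList.obj (4 samples).
data(methylKit)

# Per-sample descriptive statistics, printed rather than plotted.
getMethylationStats(methylRawList.obj[[1]], plot = FALSE, both.strands = FALSE)
getCoverageStats(methylRawList.obj[[1]], plot = FALSE, both.strands = FALSE)

# Drop bases with low coverage or suspiciously high coverage.
filtered <- filterByCoverage(methylRawList.obj, lo.count = 10, lo.perc = NULL,
                             hi.count = NULL, hi.perc = 99.9)

# Keep only bases covered in all samples, then check sample correlation.
meth <- unite(filtered, destrand = FALSE)
getCorrelation(meth, plot = FALSE)

# Test each base for differential methylation; raise mc.cores to parallelize.
myDiff <- calculateDiffMeth(meth, mc.cores = 1)

# Select bases passing the difference and q-value cutoffs.
myDiff25p <- getMethylDiff(myDiff, difference = 25, qvalue = 0.01)

# Regional analysis: summarize counts over 1000 bp tiling windows.
tiles <- tileMethylCounts(methylRawList.obj, win.size = 1000, step.size = 1000)

# Coerce results to a GenomicRanges object for use with other packages.
diff.gr <- as(myDiff25p, "GRanges")
```

The vignette covers each of these steps in more detail, including plotting and batch-effect control.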
Documentation
- You can look at the vignette here. This is the primary source of documentation. It includes detailed examples.
- You can check out the slides for a tutorial at EpiWorkshop 2013. These work with older versions of methylKit; you may need to update the function names.
- You can check out the tutorial prepared for EpiWorkshop 2012. This works with older versions of methylKit; you may need to update the function names.
- You can check out the slides prepared for EuroBioc 2018. These include more recent features of methylKit and are meant to give you a quick overview of what you can do with the package.
Downloading Annotation Files
Annotation files in BED format are needed for annotating your differentially methylated regions. You can download annotation files from the UCSC Table Browser for your genome of interest. Go to http://genome.ucsc.edu/cgi-bin/hgGateway. On the top menu, click on "tools" then "table browser". Select your "genome" and "assembly" of interest from the drop-down menus. Make sure you select the correct genome and assembly; selecting the wrong genome and/or assembly will produce unintelligible results in downstream analysis.
From here on you can either download gene annotation or CpG island annotation.
- For gene annotation, select "Genes and Gene prediction tracks" from the "group" drop-down menu. Following that, select "Refseq Genes" from the "track" drop-down menu. Select "BED - browser extensible data" for the "output format". Click "get output" and on the following page click "get BED" without changing any options. Save the output as a text file.
- For CpG island annotation, select "Regulation" from the "group" drop-down menu. Following that, select "CpG islands" from the "track" drop-down menu. Select "BED - browser extensible data" for the "output format". Click "get output" and on the following page click "get BED" without changing any options. Save the output as a text file.
In addition, you can check this tutorial to learn how to download any track from UCSC in BED format (http://www.openhelix.com/cgi/tutorialInfo.cgi?id=28).
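Once downloaded, the BED files can be used for annotation roughly as sketched below. This follows the package vignette and assumes the separate genomation package (listed under the suggested dependencies) is installed, and that the hg18 example BED files named here ship with methylKit; for your own analysis, substitute the paths of the files you saved from UCSC.

```r
library(methylKit)
library(genomation)

# Bundled example results; methylDiff.obj holds differential methylation calls.
data(methylKit)
diff.gr <- as(methylDiff.obj, "GRanges")

# Gene annotation: read promoters, exons, introns and TSSes from a BED file.
gene.obj <- readTranscriptFeatures(
  system.file("extdata", "refseq.hg18.bed.txt", package = "methylKit")
)
gene.ann <- annotateWithGeneParts(diff.gr, gene.obj)

# CpG island annotation: read islands and derive flanking "shores".
cpg.obj <- readFeatureFlank(
  system.file("extdata", "cpgi.hg18.bed.txt", package = "methylKit"),
  feature.flank.name = c("CpGi", "shores")
)
cpg.ann <- annotateWithFeatureFlank(
  diff.gr, cpg.obj$CpGi, cpg.obj$shores,
  feature.name = "CpGi", flank.name = "shores"
)

# Percentage of differentially methylated bases per gene part.
getTargetAnnotationStats(gene.ann, percentage = TRUE, precedence = TRUE)
```

The annotation and the methylation data must use the same genome assembly (hg18 in this example).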
R script for Genome Biology publication
The most recent version of the R script in the Genome Biology manuscript is here.
Citing methylKit
If you used methylKit please cite:
- Altuna Akalin, Matthias Kormaksson, Sheng Li, Francine E. Garrett-Bakelman, Maria E. Figueroa, Ari Melnick, Christopher E. Mason. (2012). "methylKit: A comprehensive R package for the analysis of genome-wide DNA methylation profiles." Genome Biology, 13:R87.
If you used flat-file objects or over-dispersion corrected tests please consider citing:
- Wreczycka K, Gosdschan A, Yusuf D, Grüning B, Assenov Y, Akalin A. "Strategies for analyzing bisulfite sequencing data." J Biotechnol., 2017
and also consider citing the following publication as a use-case with specific cutoffs:
- Altuna Akalin, Francine E. Garrett-Bakelman, Matthias Kormaksson, Jennifer Busuttil, Lu Zhang, Irina Khrebtukova, Thomas A. Milne, Yongsheng Huang, Debabrata Biswas, Jay L. Hess, C. David Allis, Robert G. Roeder, Peter J. M. Valk, Bob Löwenberg, Ruud Delwel, Hugo F. Fernandez, Elisabeth Paietta, Martin S. Tallman, Gary P. Schroth, Christopher E. Mason, Ari Melnick, Maria E. Figueroa. (2012). "Base-Pair Resolution DNA Methylation Sequencing Reveals Profoundly Divergent Epigenetic Landscapes in Acute Myeloid Leukemia." PLoS Genetics 8(6).
Contact & Questions
E-mail methylkit_discussion@googlegroups.com or post a question using the web interface.
If you are going to submit bug reports or ask questions, please include the sessionInfo() output from the R console as well.
Questions are very welcome, although we suggest you read the paper, the documentation (function help pages and the vignette), and the blog entries first. The answer to your question might be there already.
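One convenient way to gather the sessionInfo() output requested above is to capture it to a text file that can be pasted into or attached to the report. A small sketch using only base R; the file name and location are arbitrary choices:

```r
# Capture the printed session information as a character vector.
info <- capture.output(sessionInfo())

# Write it to a text file in the temporary directory.
out_file <- file.path(tempdir(), "sessionInfo.txt")
writeLines(info, out_file)

# Show where the file was written.
message("Session info written to: ", out_file)
```

Load methylKit before running this so that its version appears in the output.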
Contribute to the development
See the Trello board for methylKit development. You can contribute to methylKit development via GitHub (http://github.com/al2na/methylKit/) by opening an issue and discussing what you want to contribute; we will guide you from there. In addition, you should:
- Bump up the third number of the version in the DESCRIPTION file. For example, if the master branch has version "X.Y.1" and you make a change to the master branch, bump the version in the DESCRIPTION file to "X.Y.2".
- Add your changes to the NEWS file under the correct version and appropriate section. Attribute the changes to yourself, such as "Contributed by X".
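The version bump described above can be illustrated with a few lines of base R. The bump_patch helper below is hypothetical and not part of methylKit or its tooling; it only shows the "X.Y.1" to "X.Y.2" rule applied to a version string.

```r
# Hypothetical helper: increment the third component of an "X.Y.Z" version.
bump_patch <- function(version) {
  # package_version() validates the string; unlist() yields integer parts.
  parts <- unlist(package_version(version))
  if (length(parts) < 3) {
    stop("expected a version of the form X.Y.Z")
  }
  parts[3] <- parts[3] + 1L
  paste(parts, collapse = ".")
}

bump_patch("1.34.1")  # returns "1.34.2"
```

In practice the new value is written to the Version field of the DESCRIPTION file by hand or with your editor of choice.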
License
Artistic License/GPL
Owner
- Name: Altuna Akalin
- Login: al2na
- Kind: user
- Location: Berlin
- Company: Berlin Institute for Medical Systems Biology
- Website: http://al2na.co
- Twitter: AltunaAkalin
- Repositories: 25
- Profile: https://github.com/al2na
doing stuff
GitHub Events
Total
- Issues event: 46
- Watch event: 25
- Delete event: 5
- Issue comment event: 41
- Push event: 25
- Pull request review event: 1
- Pull request event: 11
- Fork event: 4
- Create event: 8
Last Year
- Issues event: 46
- Watch event: 25
- Delete event: 5
- Issue comment event: 41
- Push event: 25
- Pull request review event: 1
- Pull request event: 11
- Fork event: 4
- Create event: 8
Committers
Last synced: over 2 years ago
Top Committers
| Name | Email | Commits |
|---|---|---|
| alexg9010 | a****s@z****e | 258 |
| al2na | a****n@g****m | 245 |
| alexg9010 | a****0@g****m | 242 |
| Alexander Gosdschan | a****n@m****e | 58 |
| Sheng Li | s****6@g****m | 19 |
| Adrian Bierling | a****g@f****e | 18 |
| alexg9010 | a****0 | 17 |
| Altuna Akalin | a****n@a****l | 17 |
| a.akalin | a****n@b****8 | 15 |
| Nitesh Turaga | n****a@g****m | 14 |
| Sheng Li | S****6@g****m | 8 |
| Hervé Pagès | h****s@f****g | 7 |
| Herve Pages | h****s@f****g | 6 |
| Marcin Kosiński | m****i@g****m | 3 |
| hpages@fhcrc.org | h****s@f****g@b****8 | 3 |
| J Wokaty | j****y@s****u | 2 |
| Kasia Wreczycka | k****a@m****e | 2 |
| Jonas Daniel | j****s@o****o | 2 |
| vobencha | v****n@r****g | 2 |
| vobencha | v****a@g****m | 2 |
| J Wokaty | j****y | 2 |
| ala2027 | a****7@m****u | 1 |
| karl616 | k****m@g****m | 1 |
| mtmorgan@fhcrc.org | m****n@f****g@b****8 | 1 |
| lshep | s****l@g****m | 1 |
| Katarzyna Wreczycka | k****e@g****m | 1 |
| Martin Morgan | m****n@f****g | 1 |
| biobonnie | b****l@u****u | 1 |
| Sheng Li | s****8@m****u | 1 |
| Gosdschan | a****c@T****l | 1 |
Issues and Pull Requests
Last synced: 6 months ago
All Time
- Total issues: 114
- Total pull requests: 64
- Average time to close issues: 4 months
- Average time to close pull requests: 24 days
- Total issue authors: 68
- Total pull request authors: 17
- Average comments per issue: 2.71
- Average comments per pull request: 1.08
- Merged pull requests: 33
- Bot issues: 0
- Bot pull requests: 0
Past Year
- Issues: 30
- Pull requests: 12
- Average time to close issues: 10 days
- Average time to close pull requests: 2 days
- Issue authors: 19
- Pull request authors: 1
- Average comments per issue: 1.23
- Average comments per pull request: 0.08
- Merged pull requests: 7
- Bot issues: 0
- Bot pull requests: 0
Top Authors
Issue Authors
- alexg9010 (14)
- avilella (12)
- J-Moravec (5)
- al2na (4)
- BenTPTseng (3)
- DengEr-1993 (3)
- lvl2001 (3)
- desmodus1984 (3)
- yhj-j (2)
- CathG (2)
- igordot (2)
- QuietgraceH (2)
- gevro (2)
- Ge0rges (2)
- DiegoZavallo (2)
Pull Request Authors
- alexg9010 (41)
- karl616 (3)
- MarcinKosinski (3)
- bbarrilleaux (2)
- arsenew (2)
- abierling (2)
- dc1340 (1)
- ShengLi (1)
- avilella (1)
- CathG (1)
- GhislainFievet (1)
- robsyme (1)
- al2na (1)
- therealgenna (1)
- rekado (1)
Packages
- Total packages: 3
- Total downloads: bioconductor 121,222 total
- Total dependent packages: 3 (may contain duplicates)
- Total dependent repositories: 0 (may contain duplicates)
- Total versions: 12
- Total maintainers: 2
bioconductor.org: methylKit
DNA methylation analysis from high-throughput bisulfite sequencing results
- Homepage: https://github.com/al2na/methylKit
- Documentation: https://bioconductor.org/packages/release/bioc/vignettes/methylKit/inst/doc/methylKit.pdf
- License: Artistic-2.0
- Latest release: 1.34.0 (published 10 months ago)
Rankings
Maintainers (2)
proxy.golang.org: github.com/al2na/methylkit
- Documentation: https://pkg.go.dev/github.com/al2na/methylkit#section-documentation
- Latest release: v0.99.2 (published over 9 years ago)
Rankings
proxy.golang.org: github.com/al2na/methylKit
- Documentation: https://pkg.go.dev/github.com/al2na/methylKit#section-documentation
- Latest release: v0.99.2 (published over 9 years ago)
Rankings
Dependencies
- GenomicRanges >= 1.18.1 depends
- R >= 3.5.0 depends
- methods * depends
- GenomeInfoDb * imports
- IRanges * imports
- KernSmooth * imports
- R.utils * imports
- Rcpp * imports
- Rsamtools * imports
- S4Vectors >= 0.13.13 imports
- data.table >= 1.9.6 imports
- emdbook * imports
- fastseg * imports
- grDevices * imports
- graphics * imports
- gtools * imports
- limma * imports
- mclust * imports
- mgcv * imports
- parallel * imports
- qvalue * imports
- rtracklayer * imports
- stats * imports
- utils * imports
- BiocManager * suggests
- genomation * suggests
- knitr * suggests
- rmarkdown * suggests
- testthat >= 2.1.0 suggests
- actions/checkout v3 composite
- r-lib/actions/check-r-package v2 composite
- r-lib/actions/setup-pandoc v2 composite
- r-lib/actions/setup-r v2 composite
- r-lib/actions/setup-r-dependencies v2 composite
- actions/checkout v3 composite
- actions/upload-artifact v3 composite
- r-lib/actions/setup-r v2 composite
- r-lib/actions/setup-r-dependencies v2 composite