gpeters-fetchnome
R package for fetching NOMe-seq data from BAM files
Repository
Basic Info
- Host: GitHub
- Owner: fmi-basel
- License: other
- Language: C++
- Default Branch: main
- Size: 198 KB
Statistics
- Stars: 0
- Watchers: 3
- Forks: 0
- Open Issues: 0
- Releases: 1
Created over 2 years ago
· Last pushed about 1 year ago
Metadata Files
Readme
License
Citation
README.Rmd
---
output: github_document
---
```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  fig.path = "man/figures/README-",
  out.width = "100%"
)
```
# fetchNOMe
[DOI](https://doi.org/10.5281/zenodo.8402785)
[R-CMD-check](https://github.com/fmi-basel/gpeters-fetchNOMe/actions/workflows/R-CMD-check.yaml)
## Overview
**fetchNOMe** is an R package for fast and efficient retrieval of NOMe-seq data from BAM files. NOMe-seq (Nucleosome Occupancy and Methylome Sequencing) simultaneously captures endogenous CpG methylation and M.CviPI-induced GpC methylation, providing a high-resolution map of chromatin accessibility and DNA methylation.
The package processes BAM files produced by aligners for bisulfite-converted DNA, such as Bismark, QuasR, and BISCUIT.
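The usage examples in this README select cytosines with the context labels `GCH` and `WCG`. As a rough illustration of what those labels conventionally mean in NOMe-seq analyses (IUPAC codes: `H` is A, C, or T; `W` is A or T), the base R sketch below locates such cytosines on the top strand of a made-up sequence. The helper `find_context` and the example sequence are hypothetical, and the sketch assumes the usual convention rather than reproducing the package's internal logic.

```r
# Return 1-based positions of cytosines matching a context pattern.
# Lookarounds keep the match to the single "C", so the reported
# position is that of the cytosine itself.
find_context <- function(seq, pattern) {
  hits <- gregexpr(pattern, seq, perl = TRUE)[[1]]
  pos <- as.integer(hits)
  pos[pos > 0]  # gregexpr returns -1 when nothing matches
}

seq <- "AGCATCGGCGTACGGCC"  # made-up sequence, top strand only

# GCH: a C preceded by G and followed by A, C, or T (excludes GCG)
gch <- find_context(seq, "(?<=G)C(?=[ACT])")

# WCG: a C preceded by A or T and followed by G (excludes GCG)
wcg <- find_context(seq, "(?<=[AT])C(?=G)")

print(gch)  # 3 16
print(wcg)  # 6 13
```

The cytosine at position 9 sits in a `GCG` trinucleotide, which is ambiguous between the two signals and is therefore matched by neither pattern.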
## Features
- Extract GpC (M.CviPI-induced) and CpG (endogenous) methylation directly from BAM files (`get_data_matrix_from_bams`, `get_molecule_data_list_from_bams`)
- Extract co-occurrence statistics from BAM files for footprint spectral analysis by the **nomeR** package (`get_ctable_from_bams`)
- Extract statistics of protected states (`get_protect_stats_from_bams`)
## Requirements
- Indexed BAM files from NOMe-seq experiments
- Index (.bai) file in the same location as the BAM
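One possible way to check these requirements before calling the package is sketched below. The helper `has_bam_index` and the path `sample.bam` are hypothetical. The sketch assumes the index is named either `<file>.bam.bai` or `<file>.bai`; which of the two names the package accepts is not stated here. `Rsamtools::indexBam` can create an index for a coordinate-sorted BAM file.

```r
# TRUE if an index file sits next to the BAM file.
# Checks both common naming styles: "x.bam.bai" and "x.bai".
has_bam_index <- function(bam) {
  candidates <- c(paste0(bam, ".bai"), sub("\\.bam$", ".bai", bam))
  any(file.exists(candidates))
}

bam <- "sample.bam"  # hypothetical path, replace with your own file

# Build the index only when the BAM exists, has no index yet,
# and Rsamtools is installed.
if (file.exists(bam) && !has_bam_index(bam) &&
    requireNamespace("Rsamtools", quietly = TRUE)) {
  Rsamtools::indexBam(bam)  # writes sample.bam.bai next to the BAM
}
```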
## Installation
You can install the development version of fetchNOMe from [GitHub](https://github.com/) with:
``` r
# install.packages("devtools")
devtools::install_github("fmi-basel/gpeters-fetchNOMe")
```
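Several of the packages that fetchNOMe imports (GenomeInfoDb, GenomicRanges, IRanges, Rsamtools) are distributed through Bioconductor rather than CRAN. If the installation above cannot resolve them, a sketch like the following may help identify what is missing. The variable names are illustrative, and the install commands are left commented out so that nothing is installed without your say-so.

```r
# Bioconductor packages listed among the imports of fetchNOMe
deps <- c("GenomeInfoDb", "GenomicRanges", "IRanges", "Rsamtools")

# Check which of them can be loaded in the current library
installed <- vapply(deps, requireNamespace, logical(1), quietly = TRUE)
missing_deps <- deps[!installed]

if (length(missing_deps) > 0) {
  message("Missing packages: ", paste(missing_deps, collapse = ", "))
  # Uncomment to install them via BiocManager:
  # if (!requireNamespace("BiocManager", quietly = TRUE)) {
  #   install.packages("BiocManager")
  # }
  # BiocManager::install(missing_deps)
}
```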
## Usage
``` r
library(fetchNOMe)

# Define BAM file path
bam <- system.file("extdata", "QuasR_test.bam", package = "fetchNOMe")

# Define genome FASTA file path
genome <- system.file("extdata", "random_genome_700bp.fa", package = "fetchNOMe")

# Extract GCH protection states for a region of interest (ROI)
GCH_tbl_mat <- get_data_matrix_from_bams(
  bamfiles = bam,
  samplenames = "quasr_pe",
  regions = GenomicRanges::GRanges(
    seqnames = "random_genome_700bp",
    strand = "+",
    IRanges::IRanges(start = 200, end = 500)
  ),
  genome = genome,
  whichContext = "GCH",
  remove_nonunique = FALSE,
  clip_until_nbg = 0L,
  max_bisC_meth = 1
)

# Extract WCG endogenous methylation states for a region of interest (ROI)
WCG_tbl_mat <- get_data_matrix_from_bams(
  bamfiles = bam,
  samplenames = "quasr_pe",
  regions = GenomicRanges::GRanges(
    seqnames = "random_genome_700bp",
    strand = "+",
    IRanges::IRanges(start = 200, end = 500)
  ),
  genome = genome,
  whichContext = "WCG",
  remove_nonunique = FALSE,
  clip_until_nbg = 0L,
  max_bisC_meth = 1
)

# Extract GCH protection for all molecules within ROI
GCH_tbl_methlist <- get_molecule_data_list_from_bams(
  bamfiles = bam,
  samplenames = "quasr_pe",
  regions = GenomicRanges::GRanges(
    seqnames = "random_genome_700bp",
    strand = "+",
    IRanges::IRanges(start = 200, end = 500)
  ),
  genome = genome,
  whichContext = "GCH",
  remove_nonunique = FALSE,
  clip_until_nbg = 0L,
  max_bisC_meth = 1,
  min_frag_data_len = 50L,
  min_frag_data_dens = 0.05
)

# Extract co-occurrence count tables for footprint spectral analysis
ctable <- get_ctable_from_bams(
  bamfiles = bam,
  samplenames = "quasr_pe",
  regions = GenomicRanges::GRanges(
    seqnames = "random_genome_700bp",
    strand = "+",
    IRanges::IRanges(start = 200, end = 500)
  ),
  genome = genome,
  alignerUsed = "QuasR",
  min_frag_data_len = 0L,
  min_frag_data_dens = 0.0,
  max_spacing = 300,
  remove_nonunique = FALSE,
  clip_until_nbg = 0L,
  max_bisC_meth = 1
)
```
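The calls above repeat the same region and filtering arguments. One way to cut that repetition, sketched here as a suggestion rather than as the package's recommended workflow, is to collect the shared arguments in a list and loop over the contexts with `do.call`. The names `common_args`, `contexts`, `roi`, and `results` are illustrative. The extraction runs only if fetchNOMe, GenomicRanges, and IRanges are installed, and the result is inspected with `str` instead of assuming a particular output layout.

```r
# Settings shared by the get_data_matrix_from_bams calls, written once
common_args <- list(
  samplenames      = "quasr_pe",
  remove_nonunique = FALSE,
  clip_until_nbg   = 0L,
  max_bisC_meth    = 1
)
contexts <- c("GCH", "WCG")

# Run the extraction only when the required packages are available
have_pkgs <- all(vapply(
  c("fetchNOMe", "GenomicRanges", "IRanges"),
  requireNamespace, logical(1), quietly = TRUE
))

if (have_pkgs) {
  bam    <- system.file("extdata", "QuasR_test.bam", package = "fetchNOMe")
  genome <- system.file("extdata", "random_genome_700bp.fa", package = "fetchNOMe")

  # Region of interest, defined once and reused for every context
  roi <- GenomicRanges::GRanges(
    seqnames = "random_genome_700bp",
    ranges   = IRanges::IRanges(start = 200, end = 500),
    strand   = "+"
  )

  # One result per context, named "GCH" and "WCG"
  results <- lapply(stats::setNames(contexts, contexts), function(ctx) {
    do.call(
      fetchNOMe::get_data_matrix_from_bams,
      c(common_args,
        list(bamfiles = bam, regions = roi, genome = genome, whichContext = ctx))
    )
  })

  # Look at the top-level structure before relying on any column names
  str(results, max.level = 1)
}
```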
## Citation
If you use `fetchNOMe` in your work, please cite:
> Evgeniy A. Ozonov. **fetchNOMe**: fast retrieval of NOMe-seq data from BAM files (https://doi.org/10.5281/zenodo.8402785). Available at: https://github.com/fmi-basel/gpeters-fetchNOMe
## License
This package is licensed under the MIT License.
Copyright (c) 2023 Friedrich Miescher Institute for Biomedical Research
Owner
- Name: Friedrich Miescher Institute for Biomedical Research
- Login: fmi-basel
- Kind: organization
- Location: Basel, Switzerland
- Website: http://www.fmi.ch/
- Repositories: 28
- Profile: https://github.com/fmi-basel
Citation (CITATION.cff)
# This CITATION.cff file was generated with cffinit.
# Visit https://bit.ly/cffinit to generate yours today!
cff-version: 1.2.0
title: 'fetchNOMe: fast retrieval of NOMe-seq data from BAM files'
message: >-
  If you use this software, please cite it using the
  metadata from this file.
type: software
authors:
  - given-names: Evgeniy A.
    family-names: Ozonov
    email: evgeniy.ozonov@fmi.ch
    affiliation: >-
      FMI, Basel, Switzerland (research group of Antoine
      Peters)
    orcid: 'https://orcid.org/0000-0003-4584-4939'
identifiers:
  - type: doi
    value: 10.5281/zenodo.8402785
repository-code: >-
  https://github.com/fmi-basel/gpeters-fetchNOMe/releases/tag/v0.1.0
abstract: >-
  fetchNOMe R package provides functionality for fast
  retrieval of NOMe-seq data (M.CviPI-induced GpC
  methylation as well as endogenous CpG methylation) from
  BAM files generated by pipelines for alignment of
  bisulfite converted cytosine methylation data.
license: MIT
GitHub Events
Total
- Member event: 2
- Push event: 7
- Create event: 1
Last Year
- Member event: 2
- Push event: 7
- Create event: 1
Dependencies
.github/workflows/R-CMD-check.yaml
actions
- actions/checkout v3 composite
- r-lib/actions/check-r-package v2 composite
- r-lib/actions/setup-pandoc v2 composite
- r-lib/actions/setup-r v2 composite
- r-lib/actions/setup-r-dependencies v2 composite
DESCRIPTION
cran
- R >= 3.4.0 depends
- GenomeInfoDb * imports
- GenomicRanges * imports
- IRanges * imports
- Rcpp >= 0.12.17 imports
- Rsamtools >= 2.13.1 imports
- parallel * imports
- tibble * imports
- testthat >= 3.0.0 suggests