gpeters-fetchnome

R package for fetching NOMe-seq data from BAM files

https://github.com/fmi-basel/gpeters-fetchnome

Science Score: 67.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
    Found CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
    Found 5 DOI reference(s) in README
  • Academic publication links
    Links to: zenodo.org
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (12.0%) to scientific vocabulary

Repository

R package for fetching NOMe-seq data from BAM files

Basic Info
  • Host: GitHub
  • Owner: fmi-basel
  • License: other
  • Language: C++
  • Default Branch: main
  • Size: 198 KB
Statistics
  • Stars: 0
  • Watchers: 3
  • Forks: 0
  • Open Issues: 0
  • Releases: 1
Created over 2 years ago · Last pushed about 1 year ago
Metadata Files
Readme License Citation

README.Rmd

---
output: github_document
---



```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  fig.path = "man/figures/README-",
  out.width = "100%"
)
```

# fetchNOMe




[![DOI](https://zenodo.org/badge/DOI/10.5281/zenodo.8402785.svg)](https://doi.org/10.5281/zenodo.8402785)



[![R-CMD-check](https://github.com/fmi-basel/gpeters-fetchNOMe/actions/workflows/R-CMD-check.yaml/badge.svg)](https://github.com/fmi-basel/gpeters-fetchNOMe/actions/workflows/R-CMD-check.yaml)





## Overview

**fetchNOMe** is an R package for fast and efficient retrieval of NOMe-seq data from BAM files. NOMe-seq (Nucleosome Occupancy and Methylome Sequencing) simultaneously captures endogenous CpG methylation and M.CviPI-induced GpC methylation, providing a high-resolution map of chromatin accessibility and DNA methylation.

The package is designed for BAM files produced by pipelines that align bisulfite-converted reads, such as Bismark, QuasR, and BISCUIT.

## Features

- Extract GpC (M.CviPI-induced) and CpG (endogenous) methylation directly from BAM files (`get_data_matrix_from_bams`, `get_molecule_data_list_from_bams`)
- Extract co-occurrence statistics from BAM files for footprint spectral analysis by the **nomeR** package (`get_ctable_from_bams`)
- Extract statistics of protected states (`get_protect_stats_from_bams`)
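
The functions in this package select cytosines by sequence context (`"GCH"` or `"WCG"`, as used in the usage example). As an illustration of what these contexts mean under the standard IUPAC codes (H = A, C, or T; W = A or T), the base-R sketch below locates such cytosines on the top strand of a short sequence. `find_context_cytosines` is a hypothetical helper written for this README, not a fetchNOMe function, and it does not claim to mirror the package's internal implementation.

```r
# Hypothetical helper, not part of fetchNOMe: returns 1-based positions of
# cytosines in GCH or WCG context on the given (top) strand only.
find_context_cytosines <- function(seq, context = c("GCH", "WCG")) {
  context <- match.arg(context)
  # Zero-width lookaheads so that overlapping sites are all reported.
  pattern <- switch(context,
    GCH = "(?=GC[ACT])",  # GpC not followed by G
    WCG = "(?=[AT]CG)"    # CpG preceded by A or T
  )
  starts <- gregexpr(pattern, toupper(seq), perl = TRUE)[[1]]
  if (starts[1] == -1L) {
    return(integer(0))
  }
  # In both trinucleotides the cytosine is the second base.
  as.integer(starts) + 1L
}

gch_sites <- find_context_cytosines("AGCATCGGCG", "GCH")
wcg_sites <- find_context_cytosines("AGCATCGGCG", "WCG")
```

The cytosine in the trailing `GCG` is reported in neither context, which is the point of using GCH and WCG: sites where M.CviPI-induced and endogenous methylation cannot be told apart are left out.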

## Requirements

- BAM files from NOMe-seq experiments
- A BAM index (`.bai`) file located in the same directory as each BAM file
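
A quick way to confirm the index requirement before calling the package is to check for the index files directly. The sketch below uses only base R; `has_bam_index` is a hypothetical helper, and the two file-naming conventions it checks (`sample.bam.bai` and `sample.bai`) are assumptions about how the index may be named.

```r
# Hypothetical helper, not part of fetchNOMe: TRUE for each BAM file that has
# an index next to it, named either <file>.bam.bai or <file>.bai.
has_bam_index <- function(bamfiles) {
  with_suffix <- paste0(bamfiles, ".bai")
  replaced    <- sub("\\.bam$", ".bai", bamfiles)
  # Only trust the replaced name when the substitution actually changed it.
  file.exists(with_suffix) | (replaced != bamfiles & file.exists(replaced))
}

# Small demonstration with empty placeholder files in a temporary directory.
demo_dir    <- tempfile("bam_index_demo_")
dir.create(demo_dir)
indexed_bam <- file.path(demo_dir, "indexed.bam")
plain_bam   <- file.path(demo_dir, "plain.bam")
invisible(file.create(indexed_bam, paste0(indexed_bam, ".bai"), plain_bam))

index_status <- has_bam_index(c(indexed_bam, plain_bam))
```

If an index is missing, it can usually be created with `Rsamtools::indexBam()`, provided the BAM file is coordinate-sorted.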

## Installation

You can install the development version of fetchNOMe from [GitHub](https://github.com/) with:

``` r
# install.packages("devtools")
devtools::install_github("fmi-basel/gpeters-fetchNOMe")
```
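
The package imports several Bioconductor packages (GenomeInfoDb, GenomicRanges, IRanges, Rsamtools). If these are not resolved automatically during installation, one option is to install them beforehand with BiocManager. The helper names below (`missing_packages`, `install_bioc_deps`) are hypothetical and only sketch the idea; nothing is installed until `install_bioc_deps()` is called.

```r
# Bioconductor packages imported by fetchNOMe.
bioc_deps <- c("GenomeInfoDb", "GenomicRanges", "IRanges", "Rsamtools")

# Hypothetical helper: returns the subset of `pkgs` that is not installed.
missing_packages <- function(pkgs) {
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!installed]
}

# Hypothetical helper: installs only the missing packages via BiocManager.
install_bioc_deps <- function(pkgs = bioc_deps) {
  to_install <- missing_packages(pkgs)
  if (length(to_install) > 0) {
    if (!requireNamespace("BiocManager", quietly = TRUE)) {
      install.packages("BiocManager")
    }
    BiocManager::install(to_install)
  }
  invisible(to_install)
}
```

Calling `install_bioc_deps()` before `devtools::install_github()` should leave only the GitHub package itself to be built.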

## Usage

``` r
library(fetchNOMe)

# Define BAM file path
bam <- system.file("extdata", "QuasR_test.bam", package = "fetchNOMe")

# Define genome FASTA file path
genome <- system.file("extdata", "random_genome_700bp.fa", package = "fetchNOMe")

# Extract GCH protection states for a region of interest (ROI)
GCH_tbl_mat <- get_data_matrix_from_bams(
  bamfiles          = bam,
  samplenames       = "quasr_pe",
  regions           = GenomicRanges::GRanges(
                        seqnames = "random_genome_700bp",
                        strand   = "+",
                        IRanges::IRanges(start = 200, end = 500)
                      ),
  genome            = genome,
  whichContext      = "GCH",
  remove_nonunique  = FALSE,
  clip_until_nbg    = 0L,
  max_bisC_meth     = 1
)

# Extract WCG endogenous methylation states for a region of interest (ROI)
WCG_tbl_mat <- get_data_matrix_from_bams(
  bamfiles          = bam,
  samplenames       = "quasr_pe",
  regions           = GenomicRanges::GRanges(
                        seqnames = "random_genome_700bp",
                        strand   = "+",
                        IRanges::IRanges(start = 200, end = 500)
                      ),
  genome            = genome,
  whichContext      = "WCG",
  remove_nonunique  = FALSE,
  clip_until_nbg    = 0L,
  max_bisC_meth     = 1
)

# Extract GCH protection for all molecules within ROI
GCH_tbl_methlist <- get_molecule_data_list_from_bams(
  bamfiles           = bam,
  samplenames        = "quasr_pe",
  regions            = GenomicRanges::GRanges(
                         seqnames = "random_genome_700bp",
                         strand   = "+",
                         IRanges::IRanges(start = 200, end = 500)
                       ),
  genome             = genome,
  whichContext       = "GCH",
  remove_nonunique   = FALSE,
  clip_until_nbg     = 0L,
  max_bisC_meth      = 1,
  min_frag_data_len  = 50L,
  min_frag_data_dens = 0.05
)

# Extract co-occurrence count tables for footprint spectral analysis
ctable <- get_ctable_from_bams(
  bamfiles           = bam,
  samplenames        = "quasr_pe",
  regions            = GenomicRanges::GRanges(
                         seqnames = "random_genome_700bp",
                         strand   = "+",
                         IRanges::IRanges(start = 200, end = 500)
                       ),
  genome             = genome,
  alignerUsed        = "QuasR",
  min_frag_data_len  = 0L,
  min_frag_data_dens = 0.0,
  max_spacing        = 300,
  remove_nonunique   = FALSE,
  clip_until_nbg     = 0L,
  max_bisC_meth      = 1
)

```
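
The calls above repeat the same region and filtering arguments. As a sketch, the shared pieces can be defined once and reused. `fetch_context`, `roi`, and `results` are hypothetical names introduced here; the function arguments are the ones shown in the usage example, and the example only runs when fetchNOMe and GenomicRanges are installed.

```r
# Hypothetical wrapper, not part of fetchNOMe: only the sequence context
# changes between invocations, all other arguments follow the example above.
fetch_context <- function(context, bam, genome, roi, samplename = "quasr_pe") {
  fetchNOMe::get_data_matrix_from_bams(
    bamfiles         = bam,
    samplenames      = samplename,
    regions          = roi,
    genome           = genome,
    whichContext     = context,
    remove_nonunique = FALSE,
    clip_until_nbg   = 0L,
    max_bisC_meth    = 1
  )
}

# Run the example only when the required packages are available.
if (requireNamespace("fetchNOMe", quietly = TRUE) &&
    requireNamespace("GenomicRanges", quietly = TRUE)) {
  bam    <- system.file("extdata", "QuasR_test.bam", package = "fetchNOMe")
  genome <- system.file("extdata", "random_genome_700bp.fa", package = "fetchNOMe")

  # Define the region of interest once and reuse it for every call.
  roi <- GenomicRanges::GRanges(
    seqnames = "random_genome_700bp",
    ranges   = IRanges::IRanges(start = 200, end = 500),
    strand   = "+"
  )

  # One result per context, stored in a list named "GCH" and "WCG".
  results <- lapply(
    c(GCH = "GCH", WCG = "WCG"),
    fetch_context,
    bam = bam, genome = genome, roi = roi
  )
}
```

Keeping the region in a single object avoids the four copies drifting apart when the coordinates change.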


## Citation

If you use `fetchNOMe` in your work, please cite:

> Evgeniy A. Ozonov. **fetchNOMe**: fast retrieval of NOMe-seq data from BAM files (https://doi.org/10.5281/zenodo.8402785). Available at: https://github.com/fmi-basel/gpeters-fetchNOMe


## License

This package is licensed under the MIT License.

Copyright (c) 2023 Friedrich Miescher Institute for Biomedical Research

Owner

  • Name: Friedrich Miescher Institute for Biomedical Research
  • Login: fmi-basel
  • Kind: organization
  • Location: Basel, Switzerland

Citation (CITATION.cff)

# This CITATION.cff file was generated with cffinit.
# Visit https://bit.ly/cffinit to generate yours today!

cff-version: 1.2.0
title: 'fetchNOMe: fast retrieval of NOMe-seq data from BAM files'
message: >-
  If you use this software, please cite it using the
  metadata from this file.
type: software
authors:
  - given-names: Evgeniy A.
    family-names: Ozonov
    email: evgeniy.ozonov@fmi.ch
    affiliation: >-
      FMI, Basel, Switzerland (research group of Antoine
      Peters)
    orcid: 'https://orcid.org/0000-0003-4584-4939'
identifiers:
  - type: doi
    value: 10.5281/zenodo.8402785
repository-code: >-
  https://github.com/fmi-basel/gpeters-fetchNOMe/releases/tag/v0.1.0
abstract: >-
  fetchNOMe R package provides functionality for fast
  retrieval of NOMe-seq data (M.CviPI-induced GpC
  methylation as well as endogenous CpG methylation) from
  BAM files generated by pipelines for alignment of
  bisulfite converted cytosine methylation data.
license: MIT

GitHub Events

Total
  • Member event: 2
  • Push event: 7
  • Create event: 1
Last Year
  • Member event: 2
  • Push event: 7
  • Create event: 1

Dependencies

.github/workflows/R-CMD-check.yaml actions
  • actions/checkout v3 composite
  • r-lib/actions/check-r-package v2 composite
  • r-lib/actions/setup-pandoc v2 composite
  • r-lib/actions/setup-r v2 composite
  • r-lib/actions/setup-r-dependencies v2 composite
DESCRIPTION cran
  • R >= 3.4.0 depends
  • GenomeInfoDb * imports
  • GenomicRanges * imports
  • IRanges * imports
  • Rcpp >= 0.12.17 imports
  • Rsamtools >= 2.13.1 imports
  • parallel * imports
  • tibble * imports
  • testthat >= 3.0.0 suggests