RAMClustR

Assigning precursor-product ion relationships in indiscriminant MS/MS data

https://github.com/cbroeckl/ramclustr

Science Score: 49.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
    Found 2 DOI reference(s) in README
  • Academic publication links
  • Committers with academic emails
    2 of 10 committers (20.0%) from academic institutions
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (14.4%) to scientific vocabulary

Keywords from Contributors

mass-spectrometry metabolomics usegalaxy gass-chromatography annotations fuzzy-matching fuzzy-search similarity-measures genomics
Last synced: 7 months ago

Repository

Assigning precursor-product ion relationships in indiscriminant MS/MS data

Basic Info
  • Host: GitHub
  • Owner: cbroeckl
  • License: other
  • Language: R
  • Default Branch: master
  • Size: 89.6 MB
Statistics
  • Stars: 14
  • Watchers: 8
  • Forks: 17
  • Open Issues: 10
  • Releases: 0
Created over 12 years ago · Last pushed 9 months ago
Metadata Files
Readme Changelog License

README.md


RAMClustR: Mass Spectrometry Metabolomics Feature Clustering and Interpretation

A feature clustering algorithm for non-targeted mass spectrometric metabolomics data. This method is compatible with gas and liquid chromatography coupled mass spectrometry, including indiscriminant tandem mass spectrometry data.

Documentation for users

Installation

The newest version of the package can be installed through conda from the bioconda channel:

    conda install -c bioconda r-ramclustr

Alternatively, install from the R console:

    install.packages("devtools", repos="http://cran.us.r-project.org", dependencies=TRUE)
    library(devtools)
    install_github("cbroeckl/RAMClustR", build_vignettes = TRUE, dependencies = TRUE)
    library(RAMClustR)
    vignette("RAMClustR")

Introduction

For the main clustering function, see the references for a description of the algorithm, or vignette('RAMClustR') for a walkthrough.

The batch.qc normalization requires three input vectors: (1) batch, (2) order, and (3) qc. It is a feature-centric approach that adjusts signal intensities one feature at a time, in two steps. First, the median QC signal intensity of the feature within each batch is compared to the full-dataset median to correct for systematic batch effects. Second, a local QC median versus global median correction is applied to correct for run-order effects.
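The first step of the normalization described above (the batch-level correction) can be sketched in base R. This is a minimal illustration, not the package's implementation: the helper `normalize_feature_by_batch` is hypothetical, it assumes the "full dataset median" is the median of all QC injections, and it omits the second, run-order step.

```R
# Hypothetical helper, NOT part of RAMClustR: batch-median correction
# for a single feature, scaling each batch so that its QC median matches
# the QC median of the full dataset (an assumption about the reference).
normalize_feature_by_batch <- function(intensity, batch, qc) {
  global_qc_median <- median(intensity[qc], na.rm = TRUE)
  corrected <- intensity
  for (b in unique(batch)) {
    in_batch <- batch == b
    batch_qc_median <- median(intensity[in_batch & qc], na.rm = TRUE)
    corrected[in_batch] <- intensity[in_batch] * global_qc_median / batch_qc_median
  }
  corrected
}

# Toy data: batch B reads roughly twice as high as batch A.
intensity <- c(100, 110, 90, 105, 200, 220, 180, 210)
batch <- rep(c("A", "B"), each = 4)
qc <- rep(c(TRUE, FALSE, FALSE, TRUE), times = 2)

corrected <- normalize_feature_by_batch(intensity, batch, qc)
print(round(corrected, 1))
```

After the correction, the QC medians of both batches coincide, which is the property the batch step is meant to enforce.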

There are two ways to use RAMClustR: you can use either the main ramclustR function or the individual stepwise workflow.

Below is a small example using the main ramclustR function.

```R
# Choose input file with feature column names `mz_rt` (expected by default).
# Column with sample name is expected to be first (by default).
# These can be adjusted with the `featdelim` and `sampNameCol` parameters.
wd <- getwd()
filename <- file.path(wd, "testdata/peaks.csv")
pheno <- file.path(wd, "testdata/phenoData.csv")
print(filename)
head(data.frame(read.csv(filename)), c(6L, 5L))

# If the file contains features from MS1, assign those to the `ms` parameter.
# If the file contains features from MS2, assign those to the `idmsms` parameter.
# If you ran `xcms` for the feature detection, then assign the output to the `xcmsObj` parameter.
# In this example we use an MS1 feature table stored in a csv file.
setwd(tempdir())
ramclustobj <- ramclustR(
  ms = filename,
  pheno_csv = pheno,
  st = 5,
  maxt = 1,
  blocksize = 1000
)

# Investigate the deconvoluted features in the `spectra` folder in MSP format
# or inspect the `ramclustobj` for feature retention times, annotations etc.
print(ramclustobj$ann)
print(ramclustobj$nfeat)
print(ramclustobj$SpecAbund[,1:6])
setwd(wd)
```
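To make the expected input layout concrete, the following base R sketch writes a toy feature table with the sample name in the first column and feature columns named `mz_rt`, then splits the names back into m/z and retention time. The file name `toy_peaks.csv` and all values are made up for illustration; the sketch does not call RAMClustR.

```R
# Toy table: first column = sample name, remaining columns = features
# named "<mz>_<rt>". All values are invented for illustration.
peaks <- data.frame(
  sample = c("QC_1", "sample_1", "sample_2"),
  `100.05_60.2` = c(1200, 1100, 1350),
  `150.10_60.3` = c(800, 760, 910),
  `200.20_120.7` = c(300, 320, 280),
  check.names = FALSE
)
csv_path <- file.path(tempdir(), "toy_peaks.csv")
write.csv(peaks, csv_path, row.names = FALSE)

# Read back without letting R rewrite the numeric-looking column names.
tab <- read.csv(csv_path, check.names = FALSE)
feature_names <- colnames(tab)[-1]

# Split each "<mz>_<rt>" name on the delimiter.
parts <- strsplit(feature_names, "_", fixed = TRUE)
mz <- as.numeric(vapply(parts, `[`, character(1), 1))
rt <- as.numeric(vapply(parts, `[`, character(1), 2))
print(data.frame(feature = feature_names, mz = mz, rt = rt))
```

Reading with `check.names = FALSE` matters here: by default `read.csv` would prefix names that start with a digit, which would break the `mz_rt` pattern.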

Individual stepwise workflow


Below is a small example using the individual stepwise workflow.

```R
set.seed(123) # to get reproducible results with jitters
wd <- getwd()
tmp <- tempdir()
load(file.path("testdata", "test.rc.ramclustr.fillpeaks"))

setwd(tmp)

ramclustObj <- rc.get.xcms.data(xcmsObj = xdata)
ramclustObj <- rc.expand.sample.names(ramclustObj = ramclustObj)
ramclustObj <- rc.feature.replace.na(ramclustObj = ramclustObj)
ramclustObj <- rc.feature.filter.blanks(ramclustObj = ramclustObj, blank.tag = "Blanc")
ramclustObj <- rc.feature.normalize.qc(ramclustObj = ramclustObj, qc.tag = "QC")
ramclustObj <- rc.feature.filter.cv(ramclustObj = ramclustObj)
ramclustObj <- rc.ramclustr(ramclustObj = ramclustObj)
ramclustObj <- rc.qc(ramclustObj = ramclustObj)
ramclustObj <- do.findmain(ramclustObj = ramclustObj)

# Investigate the deconvoluted features in the `spectra` folder in MSP format
# or inspect the `ramclustObj` for feature retention times, annotations etc.
print(ramclustObj$ann)
print(ramclustObj$nfeat)
print(ramclustObj$SpecAbund[,1:6])
setwd(wd)
```
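As a rough illustration of what a QC-based filtering step in such a workflow can look like, the base R sketch below drops features whose coefficient of variation (CV) across QC injections exceeds a threshold. It is not the code behind `rc.feature.filter.cv`: the helper `filter_features_by_qc_cv`, the `max_cv` default, and the toy matrix are all assumptions made for this example.

```R
# Hypothetical helper, NOT the package's rc.feature.filter.cv():
# keep features whose CV across QC rows is at or below max_cv.
filter_features_by_qc_cv <- function(intensities, is_qc, max_cv = 0.5) {
  qc_rows <- intensities[is_qc, , drop = FALSE]
  cv <- apply(qc_rows, 2, sd, na.rm = TRUE) / colMeans(qc_rows, na.rm = TRUE)
  keep <- !is.na(cv) & cv <= max_cv
  intensities[, keep, drop = FALSE]
}

# Toy matrix (samples in rows, features in columns), filled column-wise.
intensities <- matrix(
  c(100, 102, 98, 150,  # f1: stable across QC injections
     10, 200, 50,  80), # f2: unstable across QC injections
  nrow = 4,
  dimnames = list(c("QC_1", "QC_2", "QC_3", "sample_1"), c("f1", "f2"))
)
is_qc <- grepl("QC", rownames(intensities))

filtered <- filter_features_by_qc_cv(intensities, is_qc)
print(colnames(filtered))
```

The idea is that features which vary strongly across repeated injections of the same QC material are unlikely to be measured reliably in the study samples.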

Documentation for developers

Installation

Developing with conda

    git clone https://github.com/cbroeckl/RAMClustR.git
    cd RAMClustR
    conda env create -n ramclustr-dev -f=conda/environment-dev.yaml
    conda activate ramclustr-dev

Developing with docker

    git clone https://github.com/cbroeckl/RAMClustR.git
    cd RAMClustR
    docker-compose build                           # Build the container
    docker-compose up -d                           # Start the container in detached mode
    docker exec -it ramclustr_container /bin/bash  # Open a shell inside the container
    docker-compose down                            # Stop and remove the container along with its network

Testing

```R
# Activate the `ramclustr-dev` environment
# Run the below command on R console
devtools::test()
```
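For contributors unfamiliar with testthat (listed under the package's suggested dependencies), the sketch below shows the general shape of a unit test that `devtools::test()` would pick up from the `tests/testthat` directory. The helper `split_feature_name` is hypothetical and exists only to give the test something to check; it is not a RAMClustR function.

```R
library(testthat)

# Hypothetical helper used only to demonstrate the test structure.
split_feature_name <- function(x, delim = "_") {
  parts <- strsplit(x, delim, fixed = TRUE)[[1]]
  c(mz = as.numeric(parts[1]), rt = as.numeric(parts[2]))
}

# A test groups related expectations under a descriptive name.
test_that("feature names split into mz and rt", {
  result <- split_feature_name("100.05_60.2")
  expect_equal(result[["mz"]], 100.05)
  expect_equal(result[["rt"]], 60.2)
})
```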

References

Broeckling CD, Afsar FA, Neumann S, Ben-Hur A, Prenni JE. RAMClust: a novel feature clustering method enables spectral-matching-based annotation for metabolomics data. Anal Chem. 2014 Jul 15;86(14):6812-7. doi: 10.1021/ac501530d. Epub 2014 Jun 26. PubMed PMID: 24927477.

Broeckling CD, Ganna A, Layer M, Brown K, Sutton B, Ingelsson E, Peers G, Prenni JE. Enabling Efficient and Confident Annotation of LC-MS Metabolomics Data through MS1 Spectrum and Time Prediction. Anal Chem. 2016 Sep 20;88(18):9226-34. doi: 10.1021/acs.analchem.6b02479. Epub 2016 Sep 8. PubMed PMID: 27560453.

Owner

  • Name: Corey Broeckling
  • Login: cbroeckl
  • Kind: user
  • Location: Fort Collins, CO, USA
  • Company: Colorado State University

GitHub Events

Total
  • Issue comment event: 16
  • Push event: 10
  • Pull request review comment event: 2
  • Pull request review event: 3
  • Pull request event: 6
  • Fork event: 1
Last Year
  • Issue comment event: 16
  • Push event: 10
  • Pull request review comment event: 2
  • Pull request review event: 3
  • Pull request event: 6
  • Fork event: 1

Committers

Last synced: over 2 years ago

All Time
  • Total Commits: 638
  • Total Committers: 10
  • Avg Commits per committer: 63.8
  • Development Distribution Score (DDS): 0.453
Past Year
  • Commits: 188
  • Committers: 4
  • Avg Commits per committer: 47.0
  • Development Distribution Score (DDS): 0.298
Top Committers
Name Email Commits
Corey Broeckling c****l 349
Zargham Ahmad z****2@g****m 132
cbroeckl c****g@c****u 46
Corey Broeckling c****l@c****u 42
hechth h****t@r****z 40
Steffen Neumann s****n@i****e 12
Matej Trojak t****k@m****z 9
Zargham Ahmad 4****d 5
rickhelmus 3****s 2
maximskorik m****k@g****m 1
Committer Domains (Top 20 + Academic)

Issues and Pull Requests

Last synced: about 1 year ago

All Time
  • Total issues: 40
  • Total pull requests: 14
  • Average time to close issues: 12 months
  • Average time to close pull requests: 23 days
  • Total issue authors: 16
  • Total pull request authors: 7
  • Average comments per issue: 3.93
  • Average comments per pull request: 1.64
  • Merged pull requests: 13
  • Bot issues: 0
  • Bot pull requests: 0
Past Year
  • Issues: 0
  • Pull requests: 1
  • Average time to close issues: N/A
  • Average time to close pull requests: 8 days
  • Issue authors: 0
  • Pull request authors: 1
  • Average comments per issue: 0
  • Average comments per pull request: 8.0
  • Merged pull requests: 1
  • Bot issues: 0
  • Bot pull requests: 0
Top Authors
Issue Authors
  • hechth (15)
  • sneumann (6)
  • rickhelmus (3)
  • Slycopersicum (2)
  • martenson (2)
  • jsaintvanne (2)
  • cbroeckl (1)
  • dwalke04 (1)
  • priyanka-1802 (1)
  • arpita-007 (1)
  • tsufz (1)
  • Phylloxera (1)
  • stanstrup (1)
  • eterlova (1)
  • brooklynnrm (1)
Pull Request Authors
  • hechth (8)
  • cbroeckl (3)
  • acquayefrank (3)
  • maximskorik (1)
  • xtrojak (1)
  • zargham-ahmad (1)
  • rickhelmus (1)
Top Labels
Issue Labels
enhancement (1)
Pull Request Labels

Packages

  • Total packages: 1
  • Total downloads:
    • cran 248 last-month
  • Total docker downloads: 42,221
  • Total dependent packages: 0
  • Total dependent repositories: 3
  • Total versions: 12
  • Total maintainers: 1
cran.r-project.org: RAMClustR

Mass Spectrometry Metabolomics Feature Clustering and Interpretation

  • Versions: 12
  • Dependent Packages: 0
  • Dependent Repositories: 3
  • Downloads: 248 Last month
  • Docker Downloads: 42,221
Rankings
Forks count: 4.6%
Stargazers count: 15.1%
Dependent repos count: 16.6%
Average: 19.5%
Docker downloads count: 23.1%
Dependent packages count: 28.9%
Downloads: 28.9%
Maintainers (1)
Last synced: 9 months ago

Dependencies

DESCRIPTION cran
  • R >= 3.5.0 depends
  • RCurl * imports
  • Spectra * imports
  • dynamicTreeCut * imports
  • e1071 * imports
  • fastcluster * imports
  • ff * imports
  • ggplot2 * imports
  • gplots * imports
  • httr * imports
  • jsonlite * imports
  • pcaMethods * imports
  • preprocessCore * imports
  • readxl * imports
  • stringr * imports
  • utils * imports
  • webchem * imports
  • BiocManager * suggests
  • InterpretMSSpectrum * suggests
  • MSnbase * suggests
  • knitr * suggests
  • rmarkdown * suggests
  • stringi * suggests
  • testthat * suggests
  • xcms * suggests
  • xml2 * suggests
.github/workflows/codecov.yml actions
  • actions/checkout v3 composite
  • conda-incubator/setup-miniconda v2 composite