Science Score: 23.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ○ CITATION.cff file
- ✓ codemeta.json file: Found codemeta.json file
- ○ .zenodo.json file
- ○ DOI references
- ○ Academic publication links
- ✓ Committers with academic emails: 1 of 5 committers (20.0%) from academic institutions
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: Low similarity (17.6%) to scientific vocabulary
Repository
R-based access to Mass-Spectrometry data
Basic Info
Statistics
- Stars: 24
- Watchers: 6
- Forks: 7
- Open Issues: 5
- Releases: 0
Topics
Metadata Files
README.md
R-based access to Mass-Spec data (RaMS) 
Table of contents: Overview - Installation - Usage - File types - Contact
Overview
RaMS is a lightweight package that provides rapid and tidy access to
mass-spectrometry data. This package is lightweight because it’s built
from the ground up rather than relying on an extensive network of
external libraries. No Rcpp, no Bioconductor, no long load times and
strange startup warnings. Just XML parsing provided by xml2 and data
handling provided by data.table. Access is rapid because an absolute
minimum of data processing occurs. Unlike other packages, RaMS makes
no assumptions about what you’d like to do with the data and simply
provides access to the encoded information in an intuitive and
R-friendly way. Finally, the access is tidy in the philosophy of tidy
data. Tidy data neatly resolves
the ragged arrays that mass spectrometers produce and plays nicely with
other tidy data packages.
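To make the "ragged array" point concrete, here is a minimal base-R sketch. The `scans` list is invented for illustration (it is neither RaMS output nor RaMS code): each scan holds a different number of m/z and intensity values, and stacking them into one long table with `rt`, `mz`, and `int` columns gives the kind of tidy layout described above.

```r
# Hypothetical scans: one retention time each, but a different number of peaks
scans <- list(
  list(rt = 4.009, mz = c(136.0618, 139.0503, 148.0967), int = c(71907, 1800550, 206311)),
  list(rt = 4.025, mz = c(139.0504, 148.0966),           int = c(1750000, 198000))
)

# Ragged: the per-scan vectors have unequal lengths
lengths(lapply(scans, `[[`, "mz"))

# Tidy: one row per (rt, mz, int) observation, with rt repeated within a scan
tidy_scans <- do.call(rbind, lapply(scans, function(scan) {
  data.frame(rt = scan$rt, mz = scan$mz, int = scan$int)
}))
tidy_scans
```

Because every row is a complete observation, ordinary subsetting, grouping, and plotting tools work on the result without any scan-by-scan looping.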
Installation
To install the stable version on CRAN:
``` r
install.packages('RaMS')
```
To install the current development version:
``` r
devtools::install_github("wkumler/RaMS", build_vignettes = TRUE)
```
Finally, load RaMS like every other package:
``` r
library(RaMS)
```
Usage
There’s only one main function in RaMS: the aptly named grabMSdata.
This function accepts the names of mass-spectrometry files as well as
the data you’d like to extract (e.g. MS1, MS2, BPC, etc.) and produces a
list of data tables. Each table is intuitively named within the list and
formatted tidily:
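``` r
# Locate the demo files that ship with the package
msdata_dir <- system.file("extdata", package = "RaMS")
msdata_files <- list.files(msdata_dir, pattern = "mzML", full.names = TRUE)

# Extract base peak chromatograms and MS1 data from three of the files
msdata <- grabMSdata(files = msdata_files[2:4], grab_what = c("BPC", "MS1"))
```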
Some additional examples can be found below, but a more thorough
introduction can be found in the
vignette
or by typing vignette("Intro-to-RaMS", package = "RaMS") in the R
console after installation.
BPC/TIC data:
Base peak chromatograms (BPCs) and total ion chromatograms (TICs) have three columns, making them super-simple to plot with either base R or the popular ggplot2 library:
``` r
knitr::kable(head(msdata$BPC, 3))
```

| rt | int | filename |
|---------:|---------:|:------------------|
| 4.009000 | 11141859 | LB12HL_AB.mzML.gz |
| 4.024533 | 9982309 | LB12HL_AB.mzML.gz |
| 4.040133 | 10653922 | LB12HL_AB.mzML.gz |

``` r
plot(msdata$BPC$rt, msdata$BPC$int, type = "l", ylab="Intensity")
```

``` r
library(ggplot2)
ggplot(msdata$BPC) + geom_line(aes(x = rt, y=int, color=filename)) +
  facet_wrap(~filename, scales = "free_y", ncol = 1) +
  labs(x="Retention time (min)", y="Intensity", color="File name: ") +
  theme(legend.position="top")
```
MS1 data:
MS1 data includes an additional dimension, the m/z of each ion measured, and has multiple entries per retention time:
``` r
knitr::kable(head(msdata$MS1, 3))
```

| rt | mz | int | filename |
|------:|---------:|-----------:|:------------------|
| 4.009 | 139.0503 | 1800550.12 | LB12HL_AB.mzML.gz |
| 4.009 | 148.0967 | 206310.81 | LB12HL_AB.mzML.gz |
| 4.009 | 136.0618 | 71907.15 | LB12HL_AB.mzML.gz |
This tidy format means that it plays nicely with other tidy data
packages. Here, we use
data.table and a few
tidyverse packages to compare a molecule’s 13C and
15N peak areas to that of the base peak, giving us some clue
as to its molecular formula. Note also the use of the trapz function
(available in v1.3.2+) to calculate the area of the peak given the
retention time and intensity values.
``` r
library(data.table)
library(tidyverse)

# Monoisotopic mass plus the 13C and 15N isotopologue masses
M <- 118.0865
M_13C <- M + 1.003355
M_15N <- M + 0.997035

# Extract each isotopologue's peak within the retention time window
iso_data <- imap_dfr(lst(M, M_13C, M_15N), function(mass, isotope){
  peak_data <- msdata$MS1[mz%between%pmppm(mass) & rt%between%c(7.6, 8.2)]
  cbind(peak_data, isotope)
})

# Integrate each peak, then express isotope areas relative to the base peak
iso_data %>%
  group_by(filename, isotope) %>%
  summarise(area=trapz(rt, int)) %>%
  pivot_wider(names_from = isotope, values_from = area) %>%
  mutate(ratio_13C_12C = M_13C/M) %>%
  mutate(ratio_15N_14N = M_15N/M) %>%
  select(filename, contains("ratio")) %>%
  pivot_longer(cols = contains("ratio"), names_to = "isotope") %>%
  group_by(isotope) %>%
  summarize(avg_ratio = mean(value), sd_ratio = sd(value), .groups="drop") %>%
  mutate(isotope=str_extract(isotope, "(?<=_).*(?=_)")) %>%
  knitr::kable()
```
| isotope | avg_ratio | sd_ratio |
|:--------|----------:|----------:|
| 13C | 0.0544072 | 0.0005925 |
| 15N | 0.0033611 | 0.0001578 |
With natural abundances for 13C and 15N of 1.11% and 0.36%, respectively, we can conclude that this molecule likely has five carbons and a single nitrogen.
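The arithmetic behind that conclusion can be sketched as follows. This is a rough first-order estimate that assumes each isotope ratio is approximately the number of atoms times the natural abundance; `avg_ratio` is copied from the table above and `nat_abund` from the percentages just quoted.

```r
# Average isotope-to-base peak area ratios, copied from the table above
avg_ratio <- c("13C" = 0.0544072, "15N" = 0.0033611)

# Natural abundances quoted in the text (1.11% and 0.36%)
nat_abund <- c("13C" = 0.0111, "15N" = 0.0036)

# First-order estimate: ratio is roughly (number of atoms) x (natural abundance)
est_atoms <- avg_ratio / nat_abund
round(est_atoms, 2)

# Nearest whole number of atoms
round(est_atoms)
```

The estimates land close to 5 for carbon and close to 1 for nitrogen, which is where the "five carbons and a single nitrogen" reading comes from.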
Of course, it’s always a good idea to plot the peaks and perform a manual check of data quality:
``` r
ggplot(iso_data) +
  geom_line(aes(x=rt, y=int, color=filename)) +
  facet_wrap(~isotope, scales = "free_y", ncol = 1)
```
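The peak areas above are computed with `trapz`. As a rough illustration of the idea, and not RaMS's actual implementation, the trapezoidal rule can be sketched in base R. The helper name `trapz_sketch` and the toy peak values are hypothetical.

```r
# Hypothetical helper: area under int vs. rt by the trapezoidal rule
trapz_sketch <- function(rt, int) {
  ord <- order(rt)                    # make sure points are in time order
  rt <- rt[ord]
  int <- int[ord]
  # Each trapezoid: width times the mean of its two bounding heights
  sum(diff(rt) * (head(int, -1) + tail(int, -1)) / 2)
}

# Toy triangular peak: base of 0.4 min and apex of 100,
# so the exact area is 0.5 * 0.4 * 100 = 20
rt  <- c(7.8, 7.9, 8.0, 8.1, 8.2)
int <- c(0, 50, 100, 50, 0)
trapz_sketch(rt, int)
```

Because the rule only needs paired retention time and intensity vectors, it slots naturally into a grouped `summarise()` call on tidy MS1 data.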
MS1 data typically consists of many individual chromatograms, so RaMS provides a small function that can bin it into chromatograms based on m/z windows.
``` r
msdata$MS1 %>%
  arrange(desc(int)) %>%
  mutate(mz_group=mz_group(mz, ppm=10, max_groups = 3)) %>%
  qplotMS1data(facet_col = "mz_group")
```
We also use the qplotMS1data function above, which wraps the typical
ggplot call to avoid needing to type out
ggplot() + geom_line(aes(x=rt, y=int, group=filename)) every time.
Both the mz_group and qplotMS1data functions were added in RaMS
version 1.3.2.
MS2 data:
DDA (fragmentation) data can also be extracted, allowing rapid and intuitive searches for fragments or neutral losses:
``` r
msdata <- grabMSdata(files = msdata_files[1], grab_what = "MS2")
```
For example, we may be interested in the major fragments of a specific molecule:
``` r
msdata$MS2[premz%between%pmppm(351.0817) & int>mean(int)] %>%
  plot(int~fragmz, type="h", data=., ylab="Intensity", xlab="Fragment m/z")
```
Or want to search for precursors with a specific neutral loss above a certain intensity:
``` r
msdata$MS2[, neutral_loss:=premz-fragmz][int>1e4] %>%
  filter(neutral_loss%between%pmppm(126.1408, 5)) %>%
  head(3) %>% knitr::kable()
```

| rt | premz | fragmz | int | voltage | filename | neutral_loss |
|---:|---:|---:|---:|---:|:---|---:|
| 47.27750 | 351.0817 | 224.9409 | 16333.23 | 40 | Blank_129I_1L_pos_20240207-MS3.mzML.gz | 126.1408 |
| 47.35267 | 351.0818 | 224.9410 | 27353.09 | 40 | Blank_129I_1L_pos_20240207-MS3.mzML.gz | 126.1408 |
| 47.42767 | 351.0818 | 224.9410 | 33843.92 | 40 | Blank_129I_1L_pos_20240207-MS3.mzML.gz | 126.1408 |
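The neutral-loss search boils down to a subtraction followed by a parts-per-million window test. The base-R sketch below illustrates that logic without RaMS or data.table. `ppm_window` is a hypothetical stand-in for the window that `pmppm` supplies to `%between%`, assuming a symmetric window of mass ± mass × ppm / 1e6. The first two rows of `ms2` are copied from the table above and the third is invented to show a non-match.

```r
# Hypothetical helper: symmetric m/z window of +/- `ppm` parts per million
ppm_window <- function(mass, ppm) {
  mass + c(-1, 1) * mass * ppm / 1e6
}

ms2 <- data.frame(
  premz  = c(351.0817, 351.0818, 351.0818),
  fragmz = c(224.9409, 224.9410, 200.0000),  # last fragment is invented
  int    = c(16333.23, 27353.09, 20000)
)

# Neutral loss is the mass lost between precursor and fragment
ms2$neutral_loss <- ms2$premz - ms2$fragmz

# Keep intense fragments whose neutral loss falls inside the 5 ppm window
win <- ppm_window(126.1408, ppm = 5)
hits <- ms2[ms2$neutral_loss >= win[1] &
              ms2$neutral_loss <= win[2] &
              ms2$int > 1e4, ]
hits
```

At this mass a 5 ppm window is only about ±0.0006 m/z wide, so the invented third row is excluded even though it passes the intensity filter.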
SRM/MRM data
Selected/multiple reaction monitoring files don’t have data stored in
the typical MSn format but instead encode their values as chromatograms.
To extract data in this format, include "chroms" in the grab_what
argument:
``` r
chromsdata <- grabMSdata(files = msdata_files[7], grab_what = "chroms", verbosity = 0)
```
The resulting table separates individual reactions by the chrom_type column (and
the associated index) and includes the relevant target/fragment data:
``` r
knitr::kable(head(chromsdata$chroms, 3))
```

| chrom_type | chrom_index | target_mz | product_mz | rt | int | filename |
|:-----------|:------------|----------:|-----------:|---------:|----:|:-----------------|
| TIC | 0 | NA | NA | 2.000000 | 0 | wk_chrom.mzML.gz |
| TIC | 0 | NA | NA | 2.048077 | 0 | wk_chrom.mzML.gz |
| TIC | 0 | NA | NA | 2.096154 | 0 | wk_chrom.mzML.gz |
Minifying MS files
As of version 1.1.0, RaMS has functions that allow irrelevant data to
be removed from the file to reduce file sizes. See the
vignette
for more details.
tmzML documents
Version 1.2.0 of RaMS introduced a new file type, the “transposed mzML” or “tmzML” file to resolve the large memory requirement when working with many files. See the vignette for more details, though note that I’ve largely deprecated this file type in favor of proper database solutions as in the speed & size comparison vignette.
File types
RaMS is currently limited to the modern mzML data format and the
slightly older mzXML format. Tools to convert data from other
formats are available through
Proteowizard’s
msconvert tool. Data can, however, be gzip compressed (file ending
.gz) and this compression actually speeds up data retrieval
significantly as well as reducing file sizes.
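As a rough illustration of the file-size side of that claim only (it says nothing about RaMS's read speed), base R can write the same text with and without gzip compression. The repetitive XML-like lines below are a made-up stand-in for a real mzML file, so the compression ratio here is not representative.

```r
# Made-up, highly repetitive XML-like text standing in for an mzML file
fake_lines <- rep('<cvParam name="ms level" value="1"/>', 5000)

raw_path <- tempfile(fileext = ".mzML")
gz_path  <- paste0(raw_path, ".gz")

# Write the plain-text version
writeLines(fake_lines, raw_path)

# Write the gzip-compressed version through a gzfile connection
gz_con <- gzfile(gz_path, open = "w")
writeLines(fake_lines, gz_con)
close(gz_con)

# Compare on-disk sizes in bytes (uncompressed first, compressed second)
file.size(c(raw_path, gz_path))

# The compressed file reads back to the same text
gz_in <- gzfile(gz_path, open = "r")
round_trip <- readLines(gz_in)
close(gz_in)
identical(round_trip, fake_lines)
```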
Currently, RaMS handles MS1, MS2, and
MS3 data. This should be easy enough to expand in the future,
but right now I haven’t observed a demonstrated need for higher
fragmentation level data collection.
Additionally, note that files can be streamed from the internet directly
if a URL is provided to grabMSdata, although this will usually take
longer than reading a file from disk:
``` r
## Not run:
# Find a file with a web browser:
browseURL("https://www.ebi.ac.uk/metabolights/MTBLS703/files")

# Copy link address by right-clicking "download" button:
sample_url <- paste0("https://www.ebi.ac.uk/metabolights/ws/studies/MTBLS703/",
                     "download/acefcd61-a634-4f35-9c3c-c572ade5acf3?file=",
                     "FILES/161024SmpLB12HLAB_pos.mzXML")

msdata <- grabMSdata(sample_url, grab_what = "everything", verbosity = 2)
msdata$metadata
```
For an analysis of how RaMS compares to other methods of MS data access and alternative file types, consider browsing the speed & size comparison vignette.
Contact
Feel free to submit questions, bugs, or feature requests on the GitHub Issues page.
README last built on 2025-07-29
Owner
- Name: William
- Login: wkumler
- Kind: user
- Location: University of Washington, Seattle, WA
- Repositories: 2
- Profile: https://github.com/wkumler
Graduate student at the University of Washington
GitHub Events
Total
- Issues event: 5
- Watch event: 2
- Issue comment event: 5
- Push event: 3
- Pull request event: 2
- Create event: 1
Last Year
- Issues event: 5
- Watch event: 2
- Issue comment event: 5
- Push event: 3
- Pull request event: 2
- Create event: 1
Committers
Last synced: about 2 years ago
Top Committers
| Name | Email | Commits |
|---|---|---|
| wkumler | w****r@u****u | 412 |
| William | 4****r | 11 |
| Ricardo Cunha | 6****a | 6 |
| ricardobachertdacunha | c****a@i****e | 3 |
| Ethan Bass | e****s@g****m | 3 |
Issues and Pull Requests
Last synced: 6 months ago
All Time
- Total issues: 29
- Total pull requests: 25
- Average time to close issues: 3 months
- Average time to close pull requests: 12 days
- Total issue authors: 8
- Total pull request authors: 3
- Average comments per issue: 2.03
- Average comments per pull request: 1.28
- Merged pull requests: 24
- Bot issues: 0
- Bot pull requests: 0
Past Year
- Issues: 5
- Pull requests: 7
- Average time to close issues: about 3 hours
- Average time to close pull requests: about 8 hours
- Issue authors: 3
- Pull request authors: 2
- Average comments per issue: 0.8
- Average comments per pull request: 1.14
- Merged pull requests: 6
- Bot issues: 0
- Bot pull requests: 0
Top Authors
Issue Authors
- wkumler (20)
- ethanbass (2)
- OmarAshkar (2)
- ricardobachertdacunha (1)
- YonghuiDong (1)
- RaynerQueiroz (1)
- plyush1993 (1)
- tentrillion (1)
Pull Request Authors
- wkumler (19)
- ethanbass (4)
- ricardobachertdacunha (2)
Packages
- Total packages: 1
- Total downloads:
  - cran: 622 last-month
- Total docker downloads: 21,613
- Total dependent packages: 1
- Total dependent repositories: 1
- Total versions: 6
- Total maintainers: 1
cran.r-project.org: RaMS
R Access to Mass-Spec Data
- Homepage: https://github.com/wkumler/RaMS
- Documentation: http://cran.r-project.org/web/packages/RaMS/RaMS.pdf
- License: MIT + file LICENSE
- Latest release: 1.4.3 (published over 1 year ago)
Rankings
Maintainers (1)
Dependencies
- base64enc * imports
- data.table * imports
- utils * imports
- xml2 * imports
- DBI * suggests
- RSQLite * suggests
- dplyr * suggests
- ggplot2 * suggests
- knitr * suggests
- openxlsx * suggests
- plotly * suggests
- reticulate * suggests
- rmarkdown * suggests
- testthat * suggests
- tidyverse * suggests
- actions/cache v2 composite
- actions/checkout v2 composite
- actions/upload-artifact main composite
- r-lib/actions/setup-pandoc v2 composite
- r-lib/actions/setup-r v2 composite