mzrecal

Recalibrate Mass Spectrometry data in mzML format

https://github.com/524d/mzrecal

Keywords

golang mass-spectrometry mzidentml mzml proteomics recalibration

Repository

Basic Info
  • Host: GitHub
  • Owner: 524D
  • License: mit
  • Language: Go
  • Default Branch: master
  • Size: 6.26 MB
Statistics
  • Stars: 9
  • Watchers: 3
  • Forks: 0
  • Open Issues: 4
  • Releases: 10
Created over 6 years ago · Last pushed over 1 year ago

README.md

mzRecal

What does mzRecal do?

mzRecal recalibrates mass spectrometry (MS1) data in mzML format, using peptide identifications in mzIdentML as internal calibrants. It uses calibration functions based on the physics of the mass analyzer (FTICR, Orbitrap, TOF). The recalibration procedure was originally developed by Magnus Palmblad [1][2]. See also msRecal and recal2 for more information on the predecessors of mzRecal. Because it consumes and produces data in the same open standard format (mzML), mzRecal can be inserted into virtually any modular proteomics data analysis workflow, similar to msRecal [3]. This latest iteration of the software was described in an Application Note by Marissen and Palmblad in 2021 [4].

See the Usage section for a more complete description.
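As a rough illustration of what recalibrating against identified peptides means, the following Go sketch fits the simplest calibration function listed under Usage (OFFSET, a constant m/z offset per spectrum). It assumes the offset is estimated as the mean difference between theoretical and measured m/z of the calibrants. The type `calibrant`, the function `fitOffset` and all numbers are hypothetical and are not taken from the mzRecal source; the instrument-specific functions (FTICR, Orbitrap, TOF) are not shown.

```go
package main

import "fmt"

// calibrant pairs a measured m/z with the theoretical m/z of an
// identified peptide (hypothetical type, for illustration only).
type calibrant struct {
	measured, theoretical float64
}

// fitOffset returns the mean of (theoretical - measured), which is the
// least-squares estimate of a constant m/z offset.
func fitOffset(cals []calibrant) float64 {
	if len(cals) == 0 {
		return 0
	}
	sum := 0.0
	for _, c := range cals {
		sum += c.theoretical - c.measured
	}
	return sum / float64(len(cals))
}

func main() {
	// Invented calibrants that all read about 0.001 too high.
	cals := []calibrant{
		{measured: 500.2510, theoretical: 500.2500},
		{measured: 750.3760, theoretical: 750.3750},
		{measured: 1000.5010, theoretical: 1000.5000},
	}
	offset := fitOffset(cals)
	fmt.Printf("offset: %.4f\n", offset)

	// Apply the fitted offset to another peak of the same spectrum.
	fmt.Printf("recalibrated: %.4f\n", 600.3010+offset)
}
```

A constant offset needs only one parameter, so a single calibrant would already determine it; functions with more parameters need correspondingly more calibrants per spectrum.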

Running mzRecal

Ready-to-run executables of mzRecal for Linux and Microsoft Windows can be downloaded from https://github.com/524D/mzrecal/releases/latest (under "assets"). These executables have no external dependencies.

Compiling

mzRecal is written in Go. The software was tested with Go version 1.16.

Linux

On any recent Ubuntu/Debian, to install the prerequisites and download/build the executable:

```bash
sudo apt install git
git clone https://github.com/524D/mzrecal
cd mzrecal; ./build.sh
```

The executables (both for Linux and for Windows) are placed in the directory ~/tools.

Windows

On Windows, to install the prerequisites and download/build the executable:

  • Install Go using the default install options
  • Install Git using the default install options
  • Restart Windows to add the newly installed software to the PATH
  • Open Git Bash (from the Windows Start menu)
  • Get mzRecal. From the Git Bash prompt: git clone https://github.com/524D/mzrecal
  • Build mzRecal. From the Git Bash prompt: cd mzrecal; ./build.sh. The executables (both for Windows and for Linux) are placed in the directory tools relative to the user's home directory.

Input and output

mzRecal uses file formats specified by the Proteomics Standards Initiative (PSI), notably mzML and mzIdentML.

For recalibration, a peak-picked mzML file and corresponding mzIdentML (file extension .mzid) file are needed as input. Running mzrecal produces a recalibrated mzML file, plus a file with recalibration parameters (.json format). The latter can be used to manually inspect the calibration for each spectrum.
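The usage examples further down describe which file names are used when only the mzML file is given on the command line (yeast.mzML is paired with yeast.mzid, yeast-recal.mzML and yeast-recal.json). The Go sketch below merely restates that naming pattern; `defaultNames` is a hypothetical helper written for this illustration and is not part of mzRecal.

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
)

// defaultNames derives the companion file names that the usage examples
// describe for an input such as yeast.mzML.
// Hypothetical helper for illustration; not mzRecal code.
func defaultNames(mzML string) (mzid, recal, cal string) {
	// Strip the extension, whatever its case, to get the base name.
	base := strings.TrimSuffix(mzML, filepath.Ext(mzML))
	return base + ".mzid", base + "-recal.mzML", base + "-recal.json"
}

func main() {
	mzid, recal, cal := defaultNames("yeast.mzML")
	fmt.Println("identifications:", mzid)
	fmt.Println("recalibrated:   ", recal)
	fmt.Println("parameters:     ", cal)
}
```

The options -mzid, -o and -cal listed under Usage set these three names explicitly.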

Note that the output mzML file will not contain the index wrapper (which is optional according to the mzML specification, but still required by some software). The msconvert program from the ProteoWizard toolkit is recommended to add the index.

Results

Recalibration affects the MS1 spectra as well as the precursor masses of the MS2 spectra. Search engines commonly report the difference between theoretical mass and measured mass for identified peptides. The following plot shows the improvement achieved by mzRecal on an Orbitrap and on a TOF dataset.

[Figure: ppm-histogram]

This plot was made by running plot-recal.R (included in the mzRecal repository).
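The difference between theoretical and measured mass is usually expressed as a relative error in parts per million (ppm), which is also the unit of the -ppmuncal and -ppmcal options. A minimal Go sketch of that computation follows; the function name `ppmError` and the example values are invented for illustration and do not come from either dataset.

```go
package main

import "fmt"

// ppmError returns the relative deviation of a measured value from its
// theoretical value, in parts per million.
func ppmError(measured, theoretical float64) float64 {
	return (measured - theoretical) / theoretical * 1e6
}

func main() {
	theoretical := 800.4000 // illustrative theoretical m/z
	before := 800.4040      // illustrative measured m/z before recalibration
	after := 800.4004       // illustrative m/z after recalibration

	fmt.Printf("before: %.2f ppm\n", ppmError(before, theoretical))
	fmt.Printf("after:  %.2f ppm\n", ppmError(after, theoretical))
}
```

A histogram of such ppm errors over all identified peptides becomes narrower and more centered on zero when recalibration works well.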

Go packages for mzML and mzIdentML

The current version of the code embeds two internal Go packages, one for reading mzIdentML and one for reading/writing mzML files. These packages will likely be split into a separate module at a later time.

Usage

The following is printed by running mzrecal -help:

```text
USAGE:
  mzrecal [options]

  This program can be used to recalibrate MS data in an mzML file
  using peptide identifications in an accompanying mzID file.

OPTIONS:
  -acceptprofile
        Accept non-peak picked (profile) input. This is a kludge,
        and will be removed when mzRecal can perform peak-picking.
        By setting "acceptprofile", the value of option "calmult" is
        automatically set to 0 and the default of "minpeak" is set to 100000
  -cal filename
        filename for output of computed calibration parameters
  -calmult int
        only the topmost ( * ) peaks are considered for computing
        the recalibration. <1 means all peaks. (default 10)
  -charge range
        charge range of calibrants, or the string "ident". If set to "ident",
        only the charge as found in the mzIdentMl file will be used
        for calibration. (default "1:5")
  -debug range
        Print debug output for given spectrum range e.g. 3:6
  -empty-non-calibrated
        Empty MS2 spectra for which the precursor was not recalibrated.
  -func function
        recalibration function to apply. If empty, a suitable function is
        determined from the instrument specified in the mzML file.
        Valid function names:
            FTICR, TOF, Orbitrap: Calibration function suitable for these instruments.
            POLY: Polynomial with degree (range 1:5)
            OFFSET: Constant m/z offset per spectrum.
  -mincals int
        minimum number of calibrants a spectrum should have to be recalibrated.
        If 0 (default), the minimum number of calibrants is set to the smallest
        number needed for the chosen recalibration function plus one.
        In any other case, if the specified number is too low for the
        calibration function, it is increased to the minimum needed value.
  -minpeak float
        minimum peak intensity to consider for computing the recalibration. (default 0)
  -mzid filename
        mzIdentMl filename
  -o filename
        filename of recalibrated mzML
  -ppmcal float
        0 (default): remove outlier calibrants according to HUPO-PSI mzQC,
           the rest is accepted.
        > 0: max mz error (ppm) for accepting a calibrant for calibration
  -ppmuncal float
        max mz error (ppm) for trying to use calibrant for calibration (default 10)
  -quiet
        Don't print any output except for errors
  -rt range
        rt window range(s) (default "-10.0:10.0")
  -scorefilter string
        filter for PSM scores to accept. Format: ([]:[])...
        When multiple score names/CV terms are specified, the first one on the
        list that matches a score in the input file will be used.
        The default contains reasonable values for some common search engines
        and post-search scoring software:
          MS:1002257 (Comet:expectation value)
          MS:1001330 (X!Tandem:expectation value)
          MS:1001159 (SEQUEST:expectation value)
          MS:1002466 (PeptideShaker PSM score)
        (default "MS:1002257(0.0:1e-2)MS:1001330(0.0:1e-2)MS:1001159(0.0:1e-2)MS:1002466(0.99:)")
  -specfilter range
        range of spectrum indices to calibrate (e.g. 1000:2000).
        Default is all spectra
  -stage int
        0 (default): do all calibration stages in one run
        1: only compute recalibration parameters
        2: perform recalibration using previously computed parameters
  -verbose
        Print more verbose progress information
  -version
        Show software version

BUILD-IN CALIBRANTS:
  In addition to the identified peptides, mzrecal will also use
  for recalibration a number of compounds that are commonly found in many samples.
  These compound are all assumed to have +1 charge. The following list shows
  the build-in compounds with their (uncharged) masses:
     cyclosiloxane6 (444.112748)
     cyclosiloxane7 (518.131539)
     cyclosiloxane8 (592.150331)
     cyclosiloxane9 (666.169122)
     cyclosiloxane10 (740.187913)
     cyclosiloxane11 (814.206705)
     cyclosiloxane12 (888.225496)

ENVIRONMENT VARIABLES:
  When environment variable MZRECAL_DEBUG=1, extra information is added to the
  JSON file that can help checking the performance of mzrecal.

USAGE EXAMPLES:
  mzrecal yeast.mzML
     Recalibrate yeast.mzML using identifications in yeast.mzid, write recalibrated
     result to yeast-recal.mzML and write recalibration coefficients yeast-recal.json.
     Default parameters are used.

  mzrecal -ppmuncal 20 -scorefilter 'MS:1002257(0.0:0.001)' yeast.mzML
     Idem, but accept peptides with 20 ppm mass error and Comet expectation value <0.001
     as potential calibrants

NOTES:
  The mzML file that is produced after recalibration does not contain an index. If an
  index is required, we recommend post-processing the output file with msconvert
  (http://proteowizard.sourceforge.net/download.html).
```
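The help text lists the built-in calibrants by uncharged mass and states that they are assumed to carry a +1 charge. The Go sketch below shows how such a mass maps to the m/z at which a peak would be expected, under the assumption (not stated in the help text) that the charge comes from adding one proton per charge. The function `mzFromMass` is hypothetical and is not taken from the mzRecal source; the masses are copied from the list above.

```go
package main

import "fmt"

// protonMass is the mass of a proton in unified atomic mass units (Da).
const protonMass = 1.007276466621

// mzFromMass converts an uncharged mass to the m/z of the ion with the
// given positive charge, assuming one added proton per charge.
func mzFromMass(mass float64, charge int) float64 {
	z := float64(charge)
	return (mass + z*protonMass) / z
}

func main() {
	// Uncharged masses as listed in the mzrecal help output.
	calibrants := []struct {
		name string
		mass float64
	}{
		{"cyclosiloxane6", 444.112748},
		{"cyclosiloxane7", 518.131539},
		{"cyclosiloxane8", 592.150331},
	}
	for _, c := range calibrants {
		fmt.Printf("%-15s m/z at +1: %.6f\n", c.name, mzFromMass(c.mass, 1))
	}
}
```

The same conversion, with charges from the -charge range, would give the m/z values at which identified peptides are expected.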

Acknowledgements

The authors gratefully acknowledge prior contributions from co-authors and collaborators in the development and testing of prior installments of the software.

Funding

mzRecal was made possible in part due to funding from the ELIXIR Implementation Study "Crowd-sourcing the annotation of public proteomics datasets to improve data reusability".

References

[1] Palmblad M, Bindschedler LV, Gibson TM, Cramer R (2006). Automatic internal calibration in liquid chromatography/Fourier transform ion cyclotron resonance mass spectrometry of protein digests. Rapid Commun. Mass Spectrom. 2006;20(20):3076-80.

[2] Palmblad M, van der Burgt YEM, Dalebout H, Derks RJE, Schoenmaker B, Deelder AM (2009). Improving mass measurement accuracy in mass spectrometry based proteomics by combining open source tools for chromatographic alignment and internal calibration. J. Proteomics. 2009;72(4):722-4.

[3] de Bruin JS, Deelder AM, Palmblad M (2012). Scientific workflow management in proteomics. Mol. Cell. Proteomics. 2012 Jul;11(7):M111.010595.

[4] Marissen R, Palmblad M (2021). mzRecal: universal MS1 recalibration in mzML using identified peptides in mzIdentML as internal calibrants. Bioinformatics. 2021 Feb 4;btab056.

Owner

  • Name: Rob Marissen
  • Login: 524D
  • Kind: user
  • Location: Leiden, the Netherlands
  • Company: Leiden University Medical Center

About 524D: Username "RM" was not available on GitHub, so I opted for the hexadecimal representation of the UTF-8/ASCII character codes.

Citation (CITATION.cff)

# This CITATION.cff file was generated with cffinit.
# Visit https://bit.ly/cffinit to generate yours today!

cff-version: 1.2.0
title: mzRecal
message: >-
  If you use this software, please cite it using the
  metadata from this file.
type: software
authors:
  - given-names: Rob
    family-names: Marissen
    email: r.j.marissen@lumc.nl
    affiliation: >-
      Center for Proteomics and Metabolomics, Leiden
      University Medical Center
    orcid: 'https://orcid.org/0000-0002-1220-9173'
  - given-names: Magnus
    family-names: Palmblad
    email: N.M.Palmblad@lumc.nl
    affiliation: >-
      Center for Proteomics and Metabolomics, Leiden
      University Medical Center
    orcid: 'https://orcid.org/0000-0002-5865-8994'
identifiers:
  - type: url
    value: >-
      https://github.com/524D/mzrecal/releases/tag/v1.1.0
    description: The URL of version 1.1.0 of the software
repository-code: 'https://github.com/524D/mzrecal'
abstract: >-
  mzRecal recalibrates mass spectrometry (MS1) data
  in mzML format, using peptide identifications in
  mzIdentML.
keywords:
  - proteomics
  - mass spectrometry
license: MIT

Issues and Pull Requests

Last synced: almost 3 years ago

All Time
  • Total issues: 3
  • Total pull requests: 0
  • Average time to close issues: 6 days
  • Average time to close pull requests: N/A
  • Total issue authors: 1
  • Total pull request authors: 0
  • Average comments per issue: 0.33
  • Average comments per pull request: 0
  • Merged pull requests: 0
  • Bot issues: 0
  • Bot pull requests: 0
Past Year
  • Issues: 1
  • Pull requests: 0
  • Average time to close issues: N/A
  • Average time to close pull requests: N/A
  • Issue authors: 1
  • Pull request authors: 0
  • Average comments per issue: 0.0
  • Average comments per pull request: 0
  • Merged pull requests: 0
  • Bot issues: 0
  • Bot pull requests: 0
Top Authors
Issue Authors
  • magnuspalmblad (3)
  • 524D (1)
Pull Request Authors
Top Labels
Issue Labels
  • enhancement (4)
  • documentation (1)

Packages

  • Total packages: 2
  • Total downloads: unknown
  • Total docker downloads: 30
  • Total dependent packages: 0
    (may contain duplicates)
  • Total dependent repositories: 0
    (may contain duplicates)
  • Total versions: 20
proxy.golang.org: github.com/524D/mzrecal
  • Versions: 10
  • Dependent Packages: 0
  • Dependent Repositories: 0
  • Docker Downloads: 30
Rankings
Stargazers count: 6.5%
Dependent packages count: 7.0%
Average: 8.0%
Forks count: 9.0%
Dependent repos count: 9.3%
Last synced: 6 months ago
proxy.golang.org: github.com/524d/mzrecal
  • Versions: 10
  • Dependent Packages: 0
  • Dependent Repositories: 0
Rankings
Dependent packages count: 7.0%
Average: 8.2%
Dependent repos count: 9.3%
Last synced: 6 months ago

Dependencies

go.mod go
  • golang.org/x/net v0.0.0-20201224014010-6772e930b67b
  • gonum.org/v1/gonum v0.9.1
go.sum go
  • dmitri.shuralyov.com/gpu/mtl v0.0.0-20190408044501-666a987793e9
  • gioui.org v0.0.0-20210308172011-57750fc8a0a6
  • github.com/BurntSushi/xgb v0.0.0-20160522181843-27f122750802
  • github.com/ajstarks/svgo v0.0.0-20180226025133-644b8db467af
  • github.com/boombuler/barcode v1.0.0
  • github.com/davecgh/go-spew v1.1.0
  • github.com/fogleman/gg v1.2.1-0.20190220221249-0403632d5b90
  • github.com/fogleman/gg v1.3.0
  • github.com/go-fonts/dejavu v0.1.0
  • github.com/go-fonts/latin-modern v0.2.0
  • github.com/go-fonts/liberation v0.1.1
  • github.com/go-fonts/stix v0.1.0
  • github.com/go-gl/glfw v0.0.0-20190409004039-e6da0acd62b1
  • github.com/go-latex/latex v0.0.0-20210118124228-b3d85cf34e07
  • github.com/golang/freetype v0.0.0-20170609003504-e2365dfdc4a0
  • github.com/jung-kurt/gofpdf v1.0.0
  • github.com/jung-kurt/gofpdf v1.0.3-0.20190309125859-24315acbbda5
  • github.com/phpdave11/gofpdf v1.4.2
  • github.com/phpdave11/gofpdi v1.0.12
  • github.com/pkg/errors v0.8.1
  • github.com/pkg/errors v0.9.1
  • github.com/pmezard/go-difflib v1.0.0
  • github.com/ruudk/golang-pdf417 v0.0.0-20181029194003-1af4ab5afa58
  • github.com/stretchr/testify v1.2.2
  • golang.org/x/crypto v0.0.0-20190308221718-c2843e01d9a2
  • golang.org/x/crypto v0.0.0-20190510104115-cbcb75029529
  • golang.org/x/exp v0.0.0-20180321215751-8460e604b9de
  • golang.org/x/exp v0.0.0-20180807140117-3d87b88a115f
  • golang.org/x/exp v0.0.0-20190125153040-c74c464bbbf2
  • golang.org/x/exp v0.0.0-20190306152737-a1d7652674e8
  • golang.org/x/exp v0.0.0-20191002040644-a1355ae1e2c3
  • golang.org/x/image v0.0.0-20180708004352-c73c2afc3b81
  • golang.org/x/image v0.0.0-20190227222117-0694c2d4d067
  • golang.org/x/image v0.0.0-20190802002840-cff245a6509b
  • golang.org/x/image v0.0.0-20190910094157-69e4b8554b2a
  • golang.org/x/image v0.0.0-20200119044424-58c23975cae1
  • golang.org/x/image v0.0.0-20200430140353-33d19683fad8
  • golang.org/x/image v0.0.0-20200618115811-c13761719519
  • golang.org/x/image v0.0.0-20201208152932-35266b937fa6
  • golang.org/x/image v0.0.0-20210216034530-4410531fe030
  • golang.org/x/mobile v0.0.0-20190719004257-d2bd2a29d028
  • golang.org/x/mod v0.1.0
  • golang.org/x/net v0.0.0-20190404232315-eb5bcb51f2a3
  • golang.org/x/net v0.0.0-20190620200207-3b0461eec859
  • golang.org/x/net v0.0.0-20201224014010-6772e930b67b
  • golang.org/x/sync v0.0.0-20190423024810-112230192c58
  • golang.org/x/sys v0.0.0-20190215142949-d0b11bdaac8a
  • golang.org/x/sys v0.0.0-20190312061237-fead79001313
  • golang.org/x/sys v0.0.0-20190412213103-97732733099d
  • golang.org/x/sys v0.0.0-20201119102817-f84b799fce68
  • golang.org/x/sys v0.0.0-20210304124612-50617c2ba197
  • golang.org/x/term v0.0.0-20201126162022-7de9c90e9dd1
  • golang.org/x/text v0.3.0
  • golang.org/x/text v0.3.3
  • golang.org/x/text v0.3.5
  • golang.org/x/tools v0.0.0-20180525024113-a5b4c53f6e8b
  • golang.org/x/tools v0.0.0-20180917221912-90fa682c2a6e
  • golang.org/x/tools v0.0.0-20190206041539-40960b6deb8e
  • golang.org/x/tools v0.0.0-20190927191325-030b2cf1153e
  • golang.org/x/xerrors v0.0.0-20190717185122-a985d3407aa7
  • gonum.org/v1/gonum v0.0.0-20180816165407-929014505bf4
  • gonum.org/v1/gonum v0.8.2
  • gonum.org/v1/gonum v0.9.1
  • gonum.org/v1/netlib v0.0.0-20190313105609-8cb42192e0e0
  • gonum.org/v1/plot v0.0.0-20190515093506-e2840ee46a6b
  • gonum.org/v1/plot v0.9.0
  • rsc.io/pdf v0.1.1
.github/workflows/codeql-analysis.yml actions
  • actions/checkout v3 composite
  • github/codeql-action/analyze v2 composite
  • github/codeql-action/autobuild v2 composite
  • github/codeql-action/init v2 composite
Dockerfile docker
  • scratch latest build