https://github.com/aalto-ics-kepaco/msms_rt_ssvm

Implementation of the LC-MS²Struct model published in the manuscript "Joint structural annotation of small molecules using liquid chromatography retention order and tandem mass spectrometry data" by Bach et al.

Basic Info
  • Host: GitHub
  • Owner: aalto-ics-kepaco
  • License: mit
  • Language: Python
  • Default Branch: master
  • Homepage:
  • Size: 276 MB
Statistics
  • Stars: 6
  • Watchers: 1
  • Forks: 5
  • Open Issues: 12
  • Releases: 0
Created about 5 years ago · Last pushed almost 4 years ago

# LC-MS²Struct

This package implements a [Structured Support Vector Machine (SSVM)](https://en.wikipedia.org/wiki/Structured_support_vector_machine)
model for predicting molecular structures from liquid chromatography (LC) tandem mass spectrometry (MS²) data. This
work is part of the publication:

**"Joint structural annotation of small molecules using liquid chromatography retention order and tandem mass spectrometry data"**,

*Eric Bach, Emma L. Schymanski and Juho Rousu*, 2022


We consider the output of an LC-MS² experiment as *structured* output. The structure is assumed to be
imposed by the observed *retention orders* (RO) of the MS features, where each feature comprises MS¹ information,
an MS² spectrum, and a retention time (RT). We assume that a set of potential molecular structures, the so-called
candidate set, can be generated for each MS feature. The goal is to predict a ranking of the candidate structures
associated with *each* feature. The SSVM framework allows us to predict rankings that are not independent of each
other but take the observed ROs into account. The ROs provide the *structure*, that is, additional constraints that
improve the ranking.
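
To make the idea of jointly ranked candidate sets concrete, here is a minimal toy sketch. It is not the package's API and not the SSVM training procedure. It enumerates all joint candidate assignments by brute force, scores each as the sum of per-feature MS² scores plus a reward or penalty for every feature pair whose predicted order agrees or disagrees with the observed retention order, and ranks each candidate by its max-marginal score. The function names, the `order_proxy` values (a stand-in for a learned retention order predictor), the `weight` parameter, and all numbers are assumptions made for illustration.

```python
from itertools import product


def max_marginals(ms_scores, order_proxy, rts, weight=1.0):
    """Brute-force max-marginals for a toy LC-MS² dataset (illustrative only).

    ms_scores[i][c]   : MS²-based score of candidate c for feature i
    order_proxy[i][c] : hypothetical predicted retention value of candidate c
    rts[i]            : observed retention time of feature i
    """
    n = len(ms_scores)
    best = [[float("-inf")] * len(scores) for scores in ms_scores]
    # Enumerate every joint assignment of one candidate per feature.
    for assignment in product(*(range(len(scores)) for scores in ms_scores)):
        score = sum(ms_scores[i][c] for i, c in enumerate(assignment))
        for i in range(n):
            for j in range(i + 1, n):
                if rts[i] == rts[j]:
                    continue  # no order information for tied retention times
                observed = rts[i] < rts[j]
                predicted = order_proxy[i][assignment[i]] < order_proxy[j][assignment[j]]
                score += weight if observed == predicted else -weight
        # Max-marginal: best joint score among assignments containing (i, c).
        for i, c in enumerate(assignment):
            best[i][c] = max(best[i][c], score)
    return best


def rank_candidates(marginals):
    """Return candidate indices per feature, best max-marginal first."""
    return [sorted(range(len(m)), key=lambda c: -m[c]) for m in marginals]


# Hypothetical data: two features, two candidates each.
ms_scores = [[0.6, 0.5], [0.9, 0.2]]
order_proxy = [[5.0, 1.0], [4.0, 0.5]]
rts = [3.0, 7.0]  # feature 0 elutes before feature 1

ms_only = rank_candidates(max_marginals(ms_scores, order_proxy, rts, weight=0.0))
joint = rank_candidates(max_marginals(ms_scores, order_proxy, rts, weight=1.0))
print(ms_only)  # [[0, 1], [0, 1]]
print(joint)    # [[1, 0], [0, 1]]
```

In this toy example the retention order term flips the ranking for feature 0: candidate 1 has a slightly lower MS² score, but it is the only candidate consistent with feature 0 eluting before feature 1. Brute-force enumeration grows exponentially with the number of features and is used here only to keep the sketch short.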

## Installation

To install the package:

1) Clone the package and change to the directory:
```bash
git clone https://github.com/aalto-ics-kepaco/msms_rt_ssvm
cd msms_rt_ssvm
```

2) Create a **conda** environment and install dependencies:
```bash
conda env create -f environment.yml
conda activate lcms2struct
```

3) Install the package:
```bash
pip install .
```

4) Leave the package directory:
```bash
cd ..  
```

5) Clone the package dependency "[msmsrt_scorer](https://github.com/aalto-ics-kepaco/msms_rt_score_integration)",
   which implements the max-marginal inference (see the paper), and change to the directory:
```bash
git clone https://github.com/aalto-ics-kepaco/msms_rt_score_integration
cd msms_rt_score_integration
```

6) Install the "msmsrt_scorer" package (it is assumed that the conda environment is active):
```bash
pip install .
```

7) (Optional) Change back to the msms_rt_ssvm directory and test the package:
```bash
cd ../msms_rt_ssvm

# Unpack test databases
gunzip --keep ssvm/tests/Bach2020_test_db.sqlite.gz
gunzip --keep ssvm/tests/Massbank_test_db.sqlite.gz

# Run the tests
python -m unittest discover -s ssvm/tests -p 'unittests*.py'

## Expected output ##
# .............s................s.....................s...................s.....s..................................s......
# ----------------------------------------------------------------------
# Ran 121 tests in 99.599s
# 
# OK (skipped=6)
```

All code was developed and tested in a Linux environment. Other operating systems are not supported. 
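
As a quick sanity check after installation, a short script along the following lines can confirm that both packages are visible to the Python interpreter of the active conda environment. This is a sketch under assumptions: the import names `ssvm` (inferred from the `ssvm/tests` path) and `msmsrt_scorer` (the name used for the dependency above) may differ from the actual module names, and `missing_modules` is a hypothetical helper that is not part of either package.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names that cannot be located for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Assumed import names; adjust if the installed modules are named differently.
REQUIRED = ["ssvm", "msmsrt_scorer"]

if __name__ == "__main__":
    missing = missing_modules(REQUIRED)
    if missing:
        print(f"Missing modules: {missing}")
    else:
        print("All required modules were found.")
```

The check only looks up the modules and does not import them, so it runs quickly and does not trigger any package-level side effects.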

## Usage

Usage examples can be found in the [repository of the experiments](https://github.com/aalto-ics-kepaco/lcms2struct_exp)
done for the manuscript.

## Cite the package

If you use this package, please cite our original publication:

```bibtex
@article {Bach2022,
  author = {Bach, Eric and Schymanski, Emma L. and Rousu, Juho},
  title = {Joint structural annotation of small molecules using liquid chromatography retention order and tandem mass spectrometry data},
  elocation-id = {2022.02.11.480137},
  year = {2022},
  doi = {10.1101/2022.02.11.480137}, 
  publisher = {Cold Spring Harbor Laboratory},
  URL = {https://www.biorxiv.org/content/early/2022/04/27/2022.02.11.480137},
  eprint = {https://www.biorxiv.org/content/early/2022/04/27/2022.02.11.480137.full.pdf},
  journal = {bioRxiv}
}
```
Software citation: [![DOI](https://zenodo.org/badge/357853378.svg)](https://zenodo.org/badge/latestdoi/357853378)

Owner

  • Name: KEPACO
  • Login: aalto-ics-kepaco
  • Kind: organization
  • Location: Espoo, Finland

Kernel Machines, Pattern Analysis and Computational Metabolomics - Research group at Aalto University
