https://github.com/bachi55/spec2vec_gnps_data_analysis

Analysis and benchmarking of mass spectra similarity measures using the GNPS dataset.

Repository

Basic Info
  • Host: GitHub
  • Owner: bachi55
  • License: apache-2.0
  • Language: Jupyter Notebook
  • Default Branch: master
  • Size: 20.8 MB
Statistics
  • Stars: 0
  • Watchers: 1
  • Forks: 0
  • Open Issues: 0
  • Releases: 0
Fork of iomega/spec2vec_gnps_data_analysis
Created over 5 years ago · Last pushed almost 5 years ago

![GitHub](https://img.shields.io/github/license/iomega/spec2vec_gnps_data_analysis) ![GitHub Workflow Status](https://img.shields.io/github/workflow/status/iomega/spec2vec_gnps_data_analysis/CI%20Build)

# spec2vec_gnps_data_analysis
Analysis and benchmarking of mass spectra similarity measures using the GNPS dataset.

If you use **spec2vec** for your research, please cite the following references:

F Huber, L Ridder, S Verhoeven, JH Spaaks, F Diblen, S Rogers, JJJ van der Hooft, "Spec2Vec: Improved mass spectral similarity scoring through learning of structural relationships", bioRxiv, https://doi.org/10.1101/2020.08.11.245928 

If you also use **matchms**, please cite:

F. Huber, S. Verhoeven, C. Meijer, H. Spreeuw, E. M. Villanueva Castilla, C. Geng, J.J.J. van der Hooft, S. Rogers, A. Belloum, F. Diblen, J.H. Spaaks, (2020). matchms - processing and similarity evaluation of mass spectrometry data. Journal of Open Source Software, 5(52), 2411, https://doi.org/10.21105/joss.02411

Thanks!

## Tutorial on matchms and Spec2Vec
Possibly the easiest way to learn how to run Spec2Vec is to follow our tutorial on `matchms` and `Spec2Vec`.

+ [Part I - Import and process MS/MS data using matchms](https://blog.esciencecenter.nl/build-your-own-mass-spectrometry-analysis-pipeline-in-python-using-matchms-part-i-d96c718c68ee)
+ [Part II - Compute spectral similarity using Spec2Vec](https://blog.esciencecenter.nl/build-a-mass-spectrometry-analysis-pipeline-in-python-using-matchms-part-ii-spec2vec-8aa639571018)
+ [Part III - Create molecular networks from Spec2Vec similarities](https://blog.esciencecenter.nl/build-a-mass-spectrometry-analysis-pipeline-in-python-using-matchms-part-iii-molecular-91891248ee34)


## Create environment
The current spec2vec release works with Python 3.7 or 3.8. It might also work with earlier versions, but we have not tested that.
```
conda create --name spec2vec_analysis python=3.7  # or 3.8 if you prefer
conda activate spec2vec_analysis
conda install --channel nlesc --channel bioconda --channel conda-forge spec2vec
pip install jupyter
```
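
After installation, a quick check can confirm that the interpreter version and the expected packages are visible from the new environment. The following is a minimal sketch and not part of the original repository: the helper names `python_version_ok` and `missing_packages` are hypothetical, and the package list is an assumption based on the install commands above.

```python
import importlib.util
import sys

# Python versions named in this README as working with spec2vec.
SUPPORTED_VERSIONS = {(3, 7), (3, 8)}

# Assumed package list: spec2vec comes from the conda command above, and
# matchms and gensim are expected to be installed as its dependencies.
# "notebook" is assumed to be provided by `pip install jupyter`.
REQUIRED_PACKAGES = ["spec2vec", "matchms", "gensim", "notebook"]


def python_version_ok(version_info=sys.version_info):
    """Return True if the major.minor version is one of the supported ones."""
    return (version_info[0], version_info[1]) in SUPPORTED_VERSIONS


def missing_packages(names):
    """Return the names from `names` that cannot be located for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    print("Python version supported:", python_version_ok())
    print("Missing packages:", missing_packages(REQUIRED_PACKAGES) or "none")
```

Run the script inside the activated `spec2vec_analysis` environment. An empty list of missing packages suggests the installation succeeded, although it does not verify package versions.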

## Clone this repository and run notebooks
```
git clone https://github.com/iomega/spec2vec_gnps_data_analysis
cd spec2vec_gnps_data_analysis
jupyter notebook
```

## Download data
- Original data was obtained from GNPS: https://gnps-external.ucsd.edu/gnpslibrary/ALL_GNPS.json
- The cleaned and processed GNPS dataset for positive mode spectra (raw data accessed on 2020-05-11) can be found on Zenodo: https://zenodo.org/record/3978072
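
The download can also be scripted. The sketch below uses only the Python standard library and the GNPS URL listed above; the function name `download_file` and the local path in the example call are hypothetical, and the notebooks in this repository may obtain the data differently.

```python
import shutil
import urllib.request
from pathlib import Path

# URL taken from the list above.
GNPS_LIBRARY_URL = "https://gnps-external.ucsd.edu/gnpslibrary/ALL_GNPS.json"


def download_file(url, destination):
    """Stream `url` to `destination`; skip the download if the file exists."""
    destination = Path(destination)
    if destination.exists():
        return destination
    destination.parent.mkdir(parents=True, exist_ok=True)
    # Copy in chunks so the whole file is never held in memory at once.
    with urllib.request.urlopen(url) as response, open(destination, "wb") as out:
        shutil.copyfileobj(response, out)
    return destination


# Example call (hypothetical local path; the file is large):
# download_file(GNPS_LIBRARY_URL, "data/ALL_GNPS.json")
```

Because an existing file is never overwritten, a partially downloaded file left behind by an interrupted run has to be deleted by hand before retrying.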

## Download pre-trained models
Pre-trained Word2Vec models for use with Spec2Vec are available on Zenodo.
- Model trained on __UniqueInchikey__ subset (12,797 spectra): https://zenodo.org/record/3978054
- Model trained on __AllPositive__ set of all positive ionization mode spectra (after filtering): https://zenodo.org/record/4173596
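
Once a model has been downloaded, it can be loaded with gensim. The following is a minimal sketch: the helper name `load_word2vec_model` and the path in the example call are hypothetical, and it assumes the downloaded file was saved with gensim's standard `save` method. gensim may store large arrays in separate `.npy` files next to the main model file, so it is safest to keep all files from a Zenodo record in the same folder.

```python
from pathlib import Path


def load_word2vec_model(model_path):
    """Load a gensim Word2Vec model after checking that the file exists."""
    model_path = Path(model_path)
    if not model_path.is_file():
        raise FileNotFoundError(f"Model file not found: {model_path}")
    # Imported here so the path check above works even without gensim installed.
    from gensim.models import Word2Vec
    return Word2Vec.load(str(model_path))


# Example call (hypothetical path to a downloaded model file):
# model = load_word2vec_model("models/downloaded_spec2vec.model")
```

The loaded model can then be passed to the `Spec2Vec` similarity class from spec2vec. The exact constructor arguments depend on the installed spec2vec version, so check the documentation for that version.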

Owner

  • Name: Eric Bach
  • Login: bachi55
  • Kind: user
  • Location: Espoo, Finland
  • Company: Aalto University

Doctoral student in the field of Machine Learning, Bioinformatics and Computational Metabolomics.
