metaboigniter

Pre-processing of mass spectrometry-based metabolomics data with quantification and identification based on MS1 and MS2 data.

https://github.com/nf-core/metaboigniter

Science Score: 31.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
    Found CITATION.cff file
  • codemeta.json file
  • .zenodo.json file
  • DOI references
    Found 10 DOI reference(s) in README
  • Academic publication links
  • Committers with academic emails
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (9.6%) to scientific vocabulary

Keywords

identification mass-spectrometry metabolomics ms1 ms2 nextflow nf-core pipeline quantification workflow

Keywords from Contributors

metagenomics bioinformatics annotation
Last synced: 6 months ago

Repository

Statistics
  • Stars: 15
  • Watchers: 29
  • Forks: 14
  • Open Issues: 3
  • Releases: 0
Created almost 6 years ago · Last pushed almost 2 years ago
Metadata Files
Readme, Changelog, Contributing, License, Code of conduct, Citation

README.md

nf-core/metaboigniter

Introduction

nf-core/metaboigniter is a bioinformatics pipeline that ingests raw mass spectrometry data in mzML format, typically peak lists and MS2 spectra, for comprehensive metabolomics analysis. The key stages are centroiding, feature detection, adduct detection, alignment, and linking, which progressively refine and align the data. The pipeline can also requantify features to compensate for missing values, and it uses MS2Query and SIRIUS to identify compounds from MS2 data. The output is a list of detected and, where possible, identified metabolites.

nf-core/metaboigniter workflow

  1. Centroiding: Converts continuous (profile-mode) mass spectra into discrete peaks, each described by a single m/z value and intensity.
  2. Feature Detection: Identifies unique signals or 'features' in the spectra.
  3. Adduct Detection: Identifies adduct ions, which form when an analyte molecule associates with another ion (for example H+ or Na+) during ionization.
  4. Alignment: Ensures that the same features across different samples are matched together.
  5. Linking: Establishes connections between features across different ionization modes or adducts.
  6. Requantification: Fills in missing values in the data set for a more complete analysis.
  7. Identification: Uses MS2Query and SIRIUS to identify compounds based on their MS2 spectral data.
  8. Output Generation: Produces a comprehensive list of detected and potentially identified metabolites.
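
To make the first step concrete, here is a toy sketch of centroiding. It is not the algorithm the pipeline or OpenMS uses; it assumes a simplified spectrum in which peaks are separated by zero-intensity points, and the function names `centroid_spectrum` and `_collapse` are hypothetical. Each run of non-zero profile points is collapsed into one peak at its intensity-weighted mean m/z.

```python
def _collapse(run):
    """Reduce one run of (m/z, intensity) points to a single centroided peak."""
    total = sum(intensity for _, intensity in run)
    weighted_mz = sum(mz * intensity for mz, intensity in run) / total
    apex = max(intensity for _, intensity in run)
    return (round(weighted_mz, 4), apex)


def centroid_spectrum(mz_values, intensities):
    """Collapse each run of non-zero profile points into one (m/z, apex intensity) peak.

    Toy assumption: neighbouring peaks are separated by at least one
    zero-intensity point, which real profile data does not guarantee.
    """
    peaks = []
    run = []
    for mz, intensity in zip(mz_values, intensities):
        if intensity > 0:
            run.append((mz, intensity))
        elif run:
            peaks.append(_collapse(run))
            run = []
    if run:  # flush a peak that reaches the end of the spectrum
        peaks.append(_collapse(run))
    return peaks


# Made-up profile data: two small peaks near m/z 100.02 and 200.01.
mz_values = [100.00, 100.01, 100.02, 100.03, 100.04, 200.00, 200.01, 200.02]
intensities = [0, 10, 40, 10, 0, 0, 5, 0]
print(centroid_spectrum(mz_values, intensities))
```

Production centroiding typically also handles noise, overlapping peaks, and peak-shape fitting, none of which this sketch attempts.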

[Figure: nf-core/metaboigniter metro map]

Usage

Note: If you are new to Nextflow and nf-core, please refer to this page on how to set up Nextflow. Make sure to test your setup with -profile test before running the workflow on actual data.

First, prepare a samplesheet with your input data that looks as follows:

samplesheet.csv:

    sample,level,type,msfile
    CONTROL_REP1,MS1,normal,mzML_POS_Quant/X2_Rep1.mzML
    CONTROL_REP2,MS1,normal,mzML_POS_Quant/X2_Rep2.mzML
    POOL_MS2,MS2,normal,mzML_POS_ID/POOL_MS2.mzML

Each row in this CSV file represents a unique sample, with the details provided in the columns.

  1. sample: This column should contain unique names for each sample. No two samples should share the same name in this column.
  2. level: This column should specify the level of mass spectrometry data contained in each sample file. This can be 'MS1' for files containing only MS1 data, 'MS2' for files containing only MS2 data, and 'MS12' for files containing both MS1 and MS2 data.
  3. type: This column can contain any descriptor of your choice, such as 'normal', 'disease', etc. This is usually used to provide some classification or group identification to your samples.
  4. msfile: This column should contain the path to the mzML file for each sample.
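
The column rules above can be checked before launching a run. The following is a minimal sketch, not part of the pipeline: `validate_samplesheet` is a hypothetical helper, and it only tests what the descriptions state (required columns, unique sample names, a level of MS1, MS2, or MS12, and a non-empty msfile path). The pipeline's own validation may be stricter.

```python
import csv
import io

# Levels and columns named in the column descriptions above.
ALLOWED_LEVELS = {"MS1", "MS2", "MS12"}
REQUIRED_COLUMNS = ["sample", "level", "type", "msfile"]


def validate_samplesheet(text):
    """Return a list of human-readable problems found in samplesheet CSV text."""
    reader = csv.DictReader(io.StringIO(text))
    missing = [c for c in REQUIRED_COLUMNS if c not in (reader.fieldnames or [])]
    if missing:
        return ["missing column(s): " + ", ".join(missing)]

    problems = []
    seen = set()
    for row_number, row in enumerate(reader, start=2):  # row 1 is the header
        sample = (row["sample"] or "").strip()
        level = (row["level"] or "").strip()
        msfile = (row["msfile"] or "").strip()
        if not sample:
            problems.append(f"row {row_number}: empty sample name")
        elif sample in seen:
            problems.append(f"row {row_number}: duplicate sample name {sample!r}")
        seen.add(sample)
        if level not in ALLOWED_LEVELS:
            problems.append(f"row {row_number}: level {level!r} is not MS1, MS2, or MS12")
        if not msfile:
            problems.append(f"row {row_number}: empty msfile path")
    return problems


example = """sample,level,type,msfile
CONTROL_REP1,MS1,normal,mzML_POS_Quant/X2_Rep1.mzML
CONTROL_REP2,MS1,normal,mzML_POS_Quant/X2_Rep2.mzML
POOL_MS2,MS2,normal,mzML_POS_ID/POOL_MS2.mzML
"""
print(validate_samplesheet(example))  # an empty list means no problems were found
```

The check works on text rather than a file path so it can be tried without any mzML files on disk; it does not verify that the listed files exist.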

Now, you can run the pipeline using:

    nextflow run nf-core/metaboigniter \
       -profile <docker/singularity/.../institute> \
       --input samplesheet.csv \
       --outdir <OUTDIR>

Warning: Please provide pipeline parameters via the CLI or the Nextflow -params-file option. Custom config files, including those provided by the -c Nextflow option, can be used to provide any configuration except for parameters; see docs.
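
As an illustration of the -params-file route mentioned above, the sketch below writes the two parameters from the example command to a JSON file and assembles the matching command line. The file name `params.json`, the output directory `results`, and the `docker` profile are placeholders chosen for this example, not pipeline defaults.

```python
import json
import shlex

# --input and --outdir are the two parameters shown in the run command above.
# "results" is a placeholder output directory.
params = {
    "input": "samplesheet.csv",
    "outdir": "results",
}

# Nextflow reads parameter files in JSON or YAML; JSON is used here.
with open("params.json", "w", encoding="utf-8") as handle:
    json.dump(params, handle, indent=2)

# Build the command as an argument list; "docker" stands in for your profile.
command = [
    "nextflow", "run", "nf-core/metaboigniter",
    "-profile", "docker",
    "-params-file", "params.json",
]
print(shlex.join(command))
```

The printed line can be pasted into a shell, or the `command` list can be handed to `subprocess.run` once Nextflow is installed.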

For more details and further functionality, please refer to the usage documentation and the parameter documentation.

Pipeline output

To see the results of an example test run with a full size dataset refer to the results tab on the nf-core website pipeline page. For more details about the output files and reports, please refer to the output documentation.

Credits

nf-core/metaboigniter was originally written by Payam Emami. The DSL2 version was developed with significant contributions from Axel Walter and Efi Kontou.

Contributions and Support

If you would like to contribute to this pipeline, please see the contributing guidelines.

For further information or help, don't hesitate to get in touch on the Slack #metaboigniter channel (you can join with this invite).

Citations

If you use nf-core/metaboigniter for your analysis, please cite it using the following doi: 10.5281/zenodo.4743790

An extensive list of references for the tools used by the pipeline can be found in the CITATIONS.md file.

You can cite the nf-core publication as follows:

The nf-core framework for community-curated bioinformatics pipelines.

Philip Ewels, Alexander Peltzer, Sven Fillinger, Harshil Patel, Johannes Alneberg, Andreas Wilm, Maxime Ulysse Garcia, Paolo Di Tommaso & Sven Nahnsen.

Nat Biotechnol. 2020 Feb 13. doi: 10.1038/s41587-020-0439-x.

Owner

  • Name: nf-core
  • Login: nf-core
  • Kind: organization
  • Email: core@nf-co.re

A community effort to collect a curated set of analysis pipelines built using Nextflow.

Citation (CITATIONS.md)

# nf-core/metaboigniter: Citations

## [nf-core](https://pubmed.ncbi.nlm.nih.gov/32055031/)

> Ewels PA, Peltzer A, Fillinger S, Patel H, Alneberg J, Wilm A, Garcia MU, Di Tommaso P, Nahnsen S. The nf-core framework for community-curated bioinformatics pipelines. Nat Biotechnol. 2020 Mar;38(3):276-278. doi: 10.1038/s41587-020-0439-x. PubMed PMID: 32055031.

## [Nextflow](https://pubmed.ncbi.nlm.nih.gov/28398311/)

> Di Tommaso P, Chatzou M, Floden EW, Barja PP, Palumbo E, Notredame C. Nextflow enables reproducible computational workflows. Nat Biotechnol. 2017 Apr 11;35(4):316-319. doi: 10.1038/nbt.3820. PubMed PMID: 28398311.

## Pipeline tools

- [OpenMS](https://openms.de/)

  > Röst, H.L., Sachsenberg, T., Aiche, S., Bielow, C., Weisser, H., Aicheler, F., Andreotti, S., Ehrlich, H.-C., Gutenbrunner, P., Kenar, E., Liang, X., Nahnsen, S., Nilse, L., Pfeuffer, J., Rosenberger, G., Rurik, M., Schmitt, U., Veit, J., Walzer, M., Wojnar, D., Wolski, W.E., Schilling, O., Choudhary, J.S., Malmström, L., Aebersold, R., Reinert, K., Kohlbacher, O. OpenMS: A flexible open-source software platform for mass spectrometry data analysis. Nature Methods, vol. 13, 2016. doi:10.1038/nmeth.3959

- [SIRIUS](https://bio.informatik.uni-jena.de/software/sirius/)

  > Kai Dührkop, Markus Fleischauer, Marcus Ludwig, Alexander A. Aksenov, Alexey V. Melnik, Marvin Meusel, Pieter C. Dorrestein, Juho Rousu, and Sebastian Böcker, SIRIUS 4: Turning tandem mass spectra into metabolite structure information. Nature Methods 16, 299–302, 2019.

- [MS2Query](https://github.com/iomega/ms2query)

  > de Jonge NF, Louwen JJR, Chekmeneva E, Camuzeaux S, Vermeir FJ, Jansen RS, Huber F, van der Hooft JJJ. MS2Query: reliable and scalable MS2 mass spectra-based analogue search. Nat Commun. 2023 Mar 29;14(1):1752. doi: 10.1038/s41467-023-37446-4. PMID: 36990978; PMCID: PMC10060387.

## Software packaging/containerisation tools

- [Anaconda](https://anaconda.com)

  > Anaconda Software Distribution. Computer software. Vers. 2-2.4.0. Anaconda, Nov. 2016. Web.

- [Bioconda](https://pubmed.ncbi.nlm.nih.gov/29967506/)

  > Grüning B, Dale R, Sjödin A, Chapman BA, Rowe J, Tomkins-Tinch CH, Valieris R, Köster J; Bioconda Team. Bioconda: sustainable and comprehensive software distribution for the life sciences. Nat Methods. 2018 Jul;15(7):475-476. doi: 10.1038/s41592-018-0046-7. PubMed PMID: 29967506.

- [BioContainers](https://pubmed.ncbi.nlm.nih.gov/28379341/)

  > da Veiga Leprevost F, Grüning B, Aflitos SA, Röst HL, Uszkoreit J, Barsnes H, Vaudel M, Moreno P, Gatto L, Weber J, Bai M, Jimenez RC, Sachsenberg T, Pfeuffer J, Alvarez RV, Griss J, Nesvizhskii AI, Perez-Riverol Y. BioContainers: an open-source and community-driven framework for software standardization. Bioinformatics. 2017 Aug 15;33(16):2580-2582. doi: 10.1093/bioinformatics/btx192. PubMed PMID: 28379341; PubMed Central PMCID: PMC5870671.

- [Docker](https://dl.acm.org/doi/10.5555/2600239.2600241)

  > Merkel, D. (2014). Docker: lightweight linux containers for consistent development and deployment. Linux Journal, 2014(239), 2. doi: 10.5555/2600239.2600241.

- [Singularity](https://pubmed.ncbi.nlm.nih.gov/28494014/)

  > Kurtzer GM, Sochat V, Bauer MW. Singularity: Scientific containers for mobility of compute. PLoS One. 2017 May 11;12(5):e0177459. doi: 10.1371/journal.pone.0177459. eCollection 2017. PubMed PMID: 28494014; PubMed Central PMCID: PMC5426675.

GitHub Events

Total
  • Issues event: 5
  • Watch event: 4
  • Issue comment event: 6
  • Push event: 6
  • Pull request event: 8
  • Fork event: 1
  • Create event: 4
Last Year
  • Issues event: 5
  • Watch event: 4
  • Issue comment event: 6
  • Push event: 6
  • Pull request event: 8
  • Fork event: 1
  • Create event: 4

Committers

Last synced: over 2 years ago

All Time
  • Total Commits: 123
  • Total Committers: 5
  • Avg Commits per committer: 24.6
  • Development Distribution Score (DDS): 0.577
Past Year
  • Commits: 0
  • Committers: 0
  • Avg Commits per committer: 0.0
  • Development Distribution Score (DDS): 0.0
Top Committers
Name              Email           Commits
Payam             p****i@s****e   52
payam             p****m@p****l   42
Phil Ewels        p****s@s****e   18
nf-core-bot       c****e@n****e   10
Egon Willighagen  e****n@g****m   1
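
The all-time figures above can be reproduced from the per-committer counts. This sketch assumes the usual definition of the Development Distribution Score, 1 minus the top committer's share of all commits; the page itself does not state the formula, but this definition matches the listed value.

```python
# Commit counts copied from the Top Committers table.
commits_by_committer = {
    "Payam": 52,
    "payam": 42,
    "Phil Ewels": 18,
    "nf-core-bot": 10,
    "Egon Willighagen": 1,
}

total_commits = sum(commits_by_committer.values())
average_commits = total_commits / len(commits_by_committer)

# Assumed definition: DDS = 1 - (commits by the top committer / total commits).
dds = 1 - max(commits_by_committer.values()) / total_commits

print(total_commits, round(average_commits, 1), round(dds, 3))  # 123 24.6 0.577
```

Note that "Payam" and "payam" are counted as separate committers in the table, which is why the score is computed with 52 rather than 94 as the top count.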
Issues and Pull Requests

Last synced: 6 months ago

All Time
  • Total issues: 11
  • Total pull requests: 62
  • Average time to close issues: 7 months
  • Average time to close pull requests: 22 days
  • Total issue authors: 8
  • Total pull request authors: 6
  • Average comments per issue: 1.0
  • Average comments per pull request: 1.27
  • Merged pull requests: 35
  • Bot issues: 0
  • Bot pull requests: 0
Past Year
  • Issues: 2
  • Pull requests: 4
  • Average time to close issues: about 24 hours
  • Average time to close pull requests: 7 days
  • Issue authors: 2
  • Pull request authors: 1
  • Average comments per issue: 0.0
  • Average comments per pull request: 0.25
  • Merged pull requests: 0
  • Bot issues: 0
  • Bot pull requests: 0
Top Authors
Issue Authors
  • PayamEmami (4)
  • plashmore (3)
  • maxvincent24 (2)
  • xuel12 (1)
  • famosab (1)
  • nvnieuwk (1)
  • maxulysse (1)
  • jordeu (1)
  • jen-reeve (1)
  • felipe-mansoldo (1)
  • apeltzer (1)
  • liangyong1991 (1)
  • ewels (1)
Pull Request Authors
  • nf-core-bot (33)
  • PayamEmami (32)
  • ewels (7)
  • axelwalter (1)
  • jordeu (1)
  • KevinMenden (1)
Top Labels
Issue Labels
  • bug (11)
  • enhancement (5)
Pull Request Labels
  • WIP (2)