https://github.com/bigbio/pgatk-io
High performance io library for proteogenomics
Science Score: 13.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ○ CITATION.cff file
- ○ codemeta.json file
- ○ .zenodo.json file
- ✓ DOI references: found 1 DOI reference(s) in README
- ○ Academic publication links
- ○ Academic email domains
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: low similarity (12.4%) to scientific vocabulary
Statistics
- Stars: 1
- Watchers: 3
- Forks: 0
- Open Issues: 1
- Releases: 0
README.md
pgatk-io
About pgatk-io
The pgatk-io library is a Java framework for manipulating mass spectrometry and proteomics file formats. It has a special focus on novel file formats for proteomics, such as Apache Spark Parquet and JSON.
Support Matrix
This table summarizes the current level of support for each feature across the different file formats. See discussion below for details on each feature.
| Feature | MGF | APL (Maxquant) | mzXML | mzML | PRIDE Json | Pep Avro |
| --- | --- | --- | --- | --- | --- | --- |
| Random Access | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | |
| Fast Iterable Access | :heavy_check_mark: | :white_check_mark: | :heavy_check_mark: | :x: | :x: | :heavy_check_mark: |
| Gzip Support | :x: | :x: | :x: | :x: | :x: | |
| Numpress Support | :x: | :x: | :white_check_mark: | :white_check_mark: | :x: | |
File formats
- MGF - http://www.matrixscience.com/help/datafilehelp.html
- mzXML - http://tools.proteomecenter.org/wiki/index.php?title=Formats:mzXML#:~:text=mzXML%20is%20an%20open%20data,foundation%20of%20our%20proteomic%20pipelines
- mzML - https://www.psidev.info/mzML
- ArchiveSpectrum (PRIDE Json) - http://bigbio.xyz/pgatk-io/io/github/bigbio/pgatk/io/pride/ArchiveSpectrum.html
- AnnotatedSpectrum (Pep Avro) - http://bigbio.xyz/pgatk-io/io/github/bigbio/pgatk/io/pride/AnnotatedSpectrum.html
License
pgatk-io is licensed under Apache License 2.0.
Main Features
All parsers are built on a custom class that efficiently parses text files line by line. This lets them handle arbitrarily large files with minimal memory and makes it easy to process peak list files efficiently in Java.
For every implementation, a Random Access reader and an Iterable Access reader are provided.
- With random access, developers can retrieve any individual spectrum by its identifier or by its index.
- With iterable access, developers can step through the spectra one at a time using the next function.
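The difference between the two access modes can be illustrated with a minimal sketch. It does not use the pgatk-io API: the class `SpectrumAccessSketch` and its methods `iterate`, `buildIndex`, and `readAt` are hypothetical names invented for this example. It assumes a plain-text MGF file in which each spectrum is delimited by `BEGIN IONS` and `END IONS` lines, and it returns the raw lines of a spectrum rather than a parsed object.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

// Hypothetical illustration only; not part of pgatk-io.
class SpectrumAccessSketch {

    /** Iterable access: stream spectra one by one, keeping only the current block in memory. */
    static Iterator<List<String>> iterate(Path mgf) throws IOException {
        BufferedReader reader = Files.newBufferedReader(mgf, StandardCharsets.UTF_8);
        return new Iterator<List<String>>() {
            private List<String> pending = advance();

            // Reads forward to the next complete BEGIN IONS ... END IONS block, or null at end of file.
            private List<String> advance() {
                try {
                    List<String> block = null;
                    String line;
                    while ((line = reader.readLine()) != null) {
                        if (line.startsWith("BEGIN IONS")) {
                            block = new ArrayList<>();
                        } else if (line.startsWith("END IONS") && block != null) {
                            return block;
                        } else if (block != null) {
                            block.add(line);
                        }
                    }
                    reader.close();
                    return null;
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
            }

            @Override public boolean hasNext() { return pending != null; }

            @Override public List<String> next() {
                if (pending == null) throw new NoSuchElementException();
                List<String> current = pending;
                pending = advance();
                return current;
            }
        };
    }

    /** Scans the file once and records the byte offset of every "BEGIN IONS" line. */
    static List<Long> buildIndex(Path mgf) throws IOException {
        List<Long> offsets = new ArrayList<>();
        try (RandomAccessFile raf = new RandomAccessFile(mgf.toFile(), "r")) {
            long lineStart = raf.getFilePointer();
            String line;
            while ((line = raf.readLine()) != null) {
                if (line.startsWith("BEGIN IONS")) offsets.add(lineStart);
                lineStart = raf.getFilePointer();
            }
        }
        return offsets;
    }

    /** Random access: seek directly to an indexed offset and read a single spectrum. */
    static List<String> readAt(Path mgf, long offset) throws IOException {
        List<String> block = new ArrayList<>();
        try (RandomAccessFile raf = new RandomAccessFile(mgf.toFile(), "r")) {
            raf.seek(offset);
            raf.readLine(); // skip the "BEGIN IONS" line itself
            String line;
            while ((line = raf.readLine()) != null && !line.startsWith("END IONS")) {
                block.add(line);
            }
        }
        return block;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java SpectrumAccessSketch <file.mgf>");
            return;
        }
        Path mgf = Paths.get(args[0]);
        List<Long> index = buildIndex(mgf);
        System.out.println("spectra indexed: " + index.size());
        if (!index.isEmpty()) {
            System.out.println("last spectrum lines: " + readAt(mgf, index.get(index.size() - 1)));
        }
    }
}
```

The iterator never holds more than one spectrum, which is what keeps memory use flat on large files. The index costs one full pass up front, after which any spectrum can be fetched by position without rereading the file. A real reader would typically also map spectrum identifiers to offsets and use buffered reads when building the index.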
Getting Help
If you have questions or need additional help, please create an issue in the library's GitHub repository (https://github.com/bigbio/pgatk-io/issues). We welcome feedback of any kind, including error reports, improvement suggestions, and new feature requests.
Similar libraries:
- ms-data-core-api: Perez-Riverol Y, Uszkoreit J, Sanchez A, Ternent T, Del Toro N, Hermjakob H, Vizcaíno JA, Wang R. ms-data-core-api: an open-source, metadata-oriented library for computational proteomics. Bioinformatics. 2015 Sep 1;31(17):2903-5.
- jmzReader: Griss J, Reisinger F, Hermjakob H, Vizcaíno JA. jmzReader: A Java parser library to process and visualize multiple text and XML-based mass spectrometry data formats. Proteomics. 2012 Mar;12(6):795-8. doi: 10.1002/pmic.201100578.
Owner
- Name: BigBio Stack
- Login: bigbio
- Kind: organization
- Email: proteomicsstack@gmail.com
- Location: Cambridge, UK
- Website: http://bigbio.xyz
- Repositories: 24
- Profile: https://github.com/bigbio
Provides big data solutions for bioinformatics
Dependencies
- actions/checkout v2 composite
- actions/setup-java v1 composite
- com.fasterxml.jackson.core:jackson-annotations 2.11.1
- com.fasterxml.jackson.core:jackson-core 2.11.1
- com.fasterxml.jackson.core:jackson-databind 2.11.1
- com.fasterxml.jackson.module:jackson-module-paranamer 2.11.1
- com.spotify.sparkey:sparkey 3.0.1
- com.sun.xml.bind:jaxb-core 2.3.0.1
- com.sun.xml.bind:jaxb-impl 2.3.2
- io.github.bigbio.pgatk:pgatk-utilities 1.0.1-SNAPSHOT
- junit:junit 4.13.1
- net.openhft:chronicle-map 3.17.1
- org.apache.avro:avro 1.10.2
- org.ehcache:ehcache 3.8.1
- org.hamcrest:hamcrest-core 1.3
- org.hamcrest:hamcrest-library 1.3
- org.iq80.leveldb:leveldb 0.10
- org.mapdb:mapdb 3.0.8
- org.mockito:mockito-core 1.10.19
- org.projectlombok:lombok 1.18.2
- org.slf4j:jcl-over-slf4j 1.7.25
- org.slf4j:slf4j-api 1.7.25
- org.zeroturnaround:zt-zip 1.13
- org.zoodb:zoodb 0.5.2
- uk.ac.ebi.jmzml:jmzml 1.7.11