sugarremoval

The Sugar Removal Utility - An algorithmic approach for in silico removal of circular and linear sugars from molecular structures.

https://github.com/jonasschaub/sugarremoval

Science Score: 67.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
    Found CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
    Found 41 DOI reference(s) in README
  • Academic publication links
    Links to: zenodo.org
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (11.0%) to scientific vocabulary
Last synced: 6 months ago

Repository

Basic Info
  • Host: GitHub
  • Owner: JonasSchaub
  • License: MIT
  • Language: Java
  • Default Branch: main
  • Homepage:
  • Size: 126 MB
Statistics
  • Stars: 8
  • Watchers: 2
  • Forks: 2
  • Open Issues: 1
  • Releases: 8
Created almost 6 years ago · Last pushed 12 months ago
Metadata Files
Readme License Citation

README.md

Sugar Removal Utility (SRU)

An algorithmic approach for in silico removal of circular and linear sugars from molecular structures

Overview

  • The Sugar Removal Utility (SRU), an algorithmic approach for in silico removal of circular and linear sugars from molecular structures, is described in this scientific publication: Schaub, J., Zielesny, A., Steinbeck, C. et al. Too sweet: cheminformatics for deglycosylation in natural products. J Cheminform 12, 67 (2020). There, you can find all necessary details about the algorithm and its various configuration options. We also published a follow-up article where we used the SRU to analyse sugar moieties in the Collection of Open Natural products (COCONUT) database.
  • This repository used to host the SRU source code, but the code has since moved to the Chemistry Development Kit (CDK) Java library for cheminformatics. To use the SRU as a Java library, you now need CDK version 2.10 or higher. Information on how to install and use the CDK can be found in the GitHub repository linked above. You can then use the SRU via CDK's SugarRemovalUtility class.
  • This repository now hosts only the SRU command-line application and its source code, and it serves as a place for documentation about the algorithm.
  • The SRU's functionalities can also be used in other software tools:
  • Every software tool listed above is open and free (of charge) to use!
  • The repository wiki contains code examples and some additional notes on sugar moiety detection and removal using the SRU.

Contents of this repository

Sources

The sources available in /src/main/java/de/unijena/cheminf/deglycosylation/ belong to the SRU command-line application. It makes the various settings for fine-tuning the sugar detection and removal process available through command-line arguments. But using the CDK SugarRemovalUtility class directly in your own software project offers some additional configuration options and functionalities:
  • Adding and removing circular and linear sugar patterns for the initial detection steps
  • Sugar detection without removal
  • Detecting only the number of sugar moieties of a molecule

The class SugarRemovalUtilityTest can be found in the directory /src/test/java/de/unijena/cheminf/deglycosylation/. It is a JUnit test class that tests the performance of the Sugar Removal Utility on multiple specific molecular structures of natural products hand-picked from public databases (see article linked above). Code examples of how to use and configure the SugarRemovalUtility class can be found here.

SugarRemovalUtility CMD App

The sub-folder "SugarRemovalUtility CMD App" contains the sugar removal command-line application, downloadable as a Java archive. The JAR file "SugarRemovalUtility-jar-with-dependencies.jar" can be executed from the command line using Java version 17 or higher. A detailed explanation of how to use the application can be found in "Usage instructions.txt". An example input file named "smilestestfile.txt" is also provided.

Natural product test sets

The test resources folder contains the reviewglycosylatedNPsbacteriadata.sdf file which was published and provided by Elshahawi et al. (2015) and contains bacterial glycosylated natural products used for testing the SRU algorithm (see SRU paper linked above).
The text file "handpickednp.txt" contains a list of SMILES codes serving as a natural product test set for the performance of the Sugar Removal Utility. They were hand-picked from public databases via the COlleCtion of Open NatUral producTs (COCONUT). More details can be found in the test class (see above) and the Sugar Removal Utility publication.
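
To work with such a list programmatically, the SMILES codes first have to be read from the text file. The following is a minimal sketch that assumes one entry per line, with the SMILES code as the first whitespace-separated token and blank lines ignored. The actual layout of "handpickednp.txt" is not specified here, so this is an assumption. The class `SmilesListReader` and its methods are hypothetical and use only the Java standard library; parsing the strings into molecules (for example with CDK) is deliberately left out.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

/**
 * Minimal sketch (not part of the SRU): reads SMILES codes from a text file.
 * Assumption: one entry per line, SMILES code first, optional text after whitespace.
 */
final class SmilesListReader {

    /** Extracts the first whitespace-separated token of every non-blank line. */
    static List<String> extractSmiles(List<String> lines) {
        List<String> smiles = new ArrayList<>();
        for (String line : lines) {
            String trimmed = line.strip();
            if (trimmed.isEmpty()) {
                continue; // skip blank lines
            }
            smiles.add(trimmed.split("\\s+")[0]);
        }
        return smiles;
    }

    /** Reads the given file (UTF-8) and returns the SMILES codes it contains. */
    static List<String> readSmilesFile(Path file) throws IOException {
        return extractSmiles(Files.readAllLines(file));
    }

    public static void main(String[] args) throws IOException {
        // The default file name is taken from the text above; pass another path as first argument.
        Path file = Path.of(args.length > 0 ? args[0] : "handpickednp.txt");
        List<String> smiles = readSmilesFile(file);
        System.out.println("Read " + smiles.size() + " SMILES codes from " + file);
    }
}
```

Each returned string could then be handed to a SMILES parser before applying the sugar detection or removal.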

Installation

As stated above, the Sugar Removal Utility is now part of the Chemistry Development Kit. So, if you are already using CDK, you do not need to install the SRU separately; you can use it via CDK's SugarRemovalUtility class. If not, please follow the installation description in the CDK repository linked above.
The Sugar Removal Utility command-line application in this repository is hosted as a package/artifact on the Sonatype Maven Central repository. See the artifact page for installation guidelines using build tools like Maven or Gradle. To install it via its JAR archive, you can get it from the releases. Note that, this way, its other dependencies will need to be installed via JAR archives as well.

Command line application JAR

The command-line application JAR has to be downloaded. After that, it can be executed from the command-line as described in the usage instructions. Java version 17 or higher has to be installed on your machine.
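
Because the JAR requires Java 17 or higher, checking the installed JVM up front can save a confusing startup error. The following is a minimal sketch and not part of the SRU: the class name `JavaVersionCheck` and its method are hypothetical, and it relies only on the standard `Runtime.Version` API (whose `feature()` method needs Java 10 or later).

```java
/**
 * Minimal sketch (not part of the SRU): checks whether a JVM version
 * satisfies the "Java 17 or higher" requirement stated above.
 * The class and method names are illustrative assumptions.
 */
final class JavaVersionCheck {

    /** Minimum Java feature release required by the command-line JAR. */
    static final int REQUIRED_FEATURE_VERSION = 17;

    /** Returns true if the given version is at least the required feature release. */
    static boolean isSupported(Runtime.Version version) {
        return version.feature() >= REQUIRED_FEATURE_VERSION;
    }

    public static void main(String[] args) {
        Runtime.Version current = Runtime.version();
        if (isSupported(current)) {
            System.out.println("Java " + current + " is sufficient.");
        } else {
            System.out.println("Java " + current + " is too old; version "
                    + REQUIRED_FEATURE_VERSION + " or higher is required.");
        }
    }
}
```

On Java 11 or later, the file can be run directly with `java JavaVersionCheck.java`, without a separate compilation step.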

Source code

This is a Maven project. In order to use the source code for your own software, download or clone the repository, open it in a Maven-supporting IDE (e.g. IntelliJ) as a Maven project, and run a Maven build based on the pom.xml file. Maven will then take care of installing all dependencies.

Dependencies

Needs to be pre-installed:
  • Java Development Kit (JDK) version 17 or higher
    • Adoptium OpenJDK (as one possible source of the JDK)
    • Eclipse Public License v2.0
  • Apache Maven version 4
    • Apache Maven
    • Apache License, version 2.0

Managed by Maven:
  • Chemistry Development Kit (CDK) version 2.10
    • Chemistry Development Kit on GitHub
    • License: GNU Lesser General Public License 2.1
  • JUnit version 5.10.0
    • JUnit 5
    • License: Eclipse Public License v2.0
  • Apache Commons CLI version 1.4
    • Apache Commons CLI
    • Apache License, version 2.0

References and useful links

Sugar Removal Utility
  • Schaub, J., Zielesny, A., Steinbeck, C., Sorokina, M. Too sweet: cheminformatics for deglycosylation in natural products. J Cheminform 12, 67 (2020). https://doi.org/10.1186/s13321-020-00467-y
  • Sugar Removal Web Application
  • Source Code of Web Application

Glycosylation statistics of COCONUT publication (Using the SRU)
  • Schaub, J., Zielesny, A., Steinbeck, C., Sorokina, M. Description and Analysis of Glycosidic Residues in the Largest Open Natural Products Database. Biomolecules 2021, 11, 486. https://doi.org/10.3390/biom11040486

Chemistry Development Kit (CDK)
  • Chemistry Development Kit on GitHub
  • Steinbeck C, Han Y, Kuhn S, Horlacher O, Luttmann E, Willighagen EL. The Chemistry Development Kit (CDK): An Open-Source Java Library for Chemo- and Bioinformatics. J Chem Inform Comput Sci. 2003;43(2):493-500.
  • Steinbeck C, Hoppe C, Kuhn S, Floris M, Guha R, Willighagen EL. Recent Developments of the Chemistry Development Kit (CDK) - An Open-Source Java Library for Chemo- and Bioinformatics. Curr Pharm Des. 2006; 12(17):2111-2120.
  • May JW and Steinbeck C. Efficient ring perception for the Chemistry Development Kit. J. Cheminform. 2014; 6:3.
  • Willighagen EL, Mayfield JW, Alvarsson J, Berg A, Carlsson L, Jeliazkova N, Kuhn S, Pluska T, Rojas-Chertó M, Spjuth O, Torrance G, Evelo CT, Guha R, Steinbeck C, The Chemistry Development Kit (CDK) v2.0: atom typing, depiction, molecular formulas, and substructure searching. J Cheminform. 2017; 9:33.
  • Groovy Cheminformatics with the Chemistry Development Kit

COlleCtion of Open NatUral producTs (COCONUT)
  • COCONUT Online home page
  • Sorokina, M., Merseburger, P., Rajan, K. et al. COCONUT online: Collection of Open Natural Products database. J Cheminform 13, 2 (2021). https://doi.org/10.1186/s13321-020-00478-9
  • Sorokina, M., Steinbeck, C. Review on natural products databases: where to find data in 2020. J Cheminform 12, 20 (2020).
  • Venkata Chandrasekhar, Kohulan Rajan, Sri Ram Sagar Kanakam, Nisha Sharma, Viktor Weißenborn, Jonas Schaub, Christoph Steinbeck, COCONUT 2.0: a comprehensive overhaul and curation of the collection of open natural products database, Nucleic Acids Research, Volume 53, Issue D1, 6 January 2025, Pages D634–D643, https://doi.org/10.1093/nar/gkae1063

Owner

  • Name: Jonas Schaub
  • Login: JonasSchaub
  • Kind: user
  • Location: Jena, Germany
  • Company: Friedrich-Schiller-University

Doctoral candidate of Steinbeck research group for cheminformatics and computational metabolomics. ORCID: 0000-0003-1554-6666

Citation (CITATION.cff)

cff-version: 1.2.0
title: Sugar Removal Utility (SRU)
version: 1.5.0.0
message: "If you use this software, please cite it as below and also cite the accompanying scientific publication referenced below."
type: software
authors:
  - family-names: "Schaub"
    given-names: "Jonas"
    orcid: "https://orcid.org/0000-0003-1554-6666"
  - family-names: "Zielesny"
    given-names: "Achim"    
    orcid: "https://orcid.org/0000-0003-0722-4229"
  - family-names: "Steinbeck"
    given-names: "Christoph"
    orcid: "https://orcid.org/0000-0001-6966-0814"
  - family-names: "Sorokina"
    given-names: "Maria"
    orcid: "https://orcid.org/0000-0001-9359-7149"
doi: "10.5281/zenodo.7082113"
date-released: 2025-03-19
url: "https://github.com/JonasSchaub/SugarRemoval"
license: MIT
references:
  - authors:
      - family-names: "Schaub"
        given-names: "Jonas"
        orcid: "https://orcid.org/0000-0003-1554-6666"
      - family-names: "Zielesny"
        given-names: "Achim"
        orcid: "https://orcid.org/0000-0003-0722-4229"
      - family-names: "Steinbeck"
        given-names: "Christoph"
        orcid: "https://orcid.org/0000-0001-6966-0814"
      - family-names: "Sorokina"
        given-names: "Maria"
        orcid: "https://orcid.org/0000-0001-9359-7149"
    doi: "10.1186/s13321-020-00467-y"
    issue: 1
    journal: "J Cheminform"
    scope: "Cite this paper if you want to reference the general concepts of the software."
    title: "Too sweet: cheminformatics for deglycosylation in natural products"
    type: article
    volume: 12
    year: 2020

GitHub Events

Total
  • Create event: 3
  • Release event: 2
  • Issues event: 7
  • Watch event: 2
  • Delete event: 1
  • Issue comment event: 13
  • Push event: 23
  • Gollum event: 3
  • Pull request event: 4
Last Year
  • Create event: 3
  • Release event: 2
  • Issues event: 7
  • Watch event: 2
  • Delete event: 1
  • Issue comment event: 13
  • Push event: 23
  • Gollum event: 3
  • Pull request event: 4

Issues and Pull Requests

Last synced: 6 months ago

All Time
  • Total issues: 8
  • Total pull requests: 5
  • Average time to close issues: 5 months
  • Average time to close pull requests: about 10 hours
  • Total issue authors: 1
  • Total pull request authors: 2
  • Average comments per issue: 0.5
  • Average comments per pull request: 0.8
  • Merged pull requests: 5
  • Bot issues: 0
  • Bot pull requests: 1
Past Year
  • Issues: 4
  • Pull requests: 4
  • Average time to close issues: 3 months
  • Average time to close pull requests: about 10 hours
  • Issue authors: 1
  • Pull request authors: 1
  • Average comments per issue: 0.75
  • Average comments per pull request: 1.0
  • Merged pull requests: 4
  • Bot issues: 0
  • Bot pull requests: 0
Top Authors
Issue Authors
  • JonasSchaub (8)
Pull Request Authors
  • JonasSchaub (4)
  • dependabot[bot] (1)
Top Labels
Issue Labels
  • enhancement (3)
  • documentation (1)
Pull Request Labels
  • dependencies (1)

Dependencies

.github/workflows/maven.yml actions
  • actions/checkout v3 composite
  • actions/setup-java v3 composite
pom.xml maven
  • commons-cli:commons-cli 1.4
  • junit:junit 4.13.2
  • org.openscience.cdk:cdk-core 2.8
  • org.openscience.cdk:cdk-data 2.8
  • org.openscience.cdk:cdk-depict 2.8
  • org.openscience.cdk:cdk-inchi 2.8
  • org.openscience.cdk:cdk-model 2.8
  • org.openscience.cdk:cdk-silent 2.8
  • org.openscience.cdk:cdk-smiles 2.8
.github/workflows/SonarCloud.yml actions
  • actions/cache v3 composite
  • actions/checkout v3 composite
  • actions/setup-java v3 composite
  • crazy-max/ghaction-import-gpg v5.0.0 composite
.github/workflows/publish-javadoc.yml actions
  • MathieuSoysal/Javadoc-publisher.yml v2.4.0 composite
.github/workflows/publish-to-maven-central.yml actions
  • actions/checkout v4 composite
  • actions/setup-java v3 composite
  • crazy-max/ghaction-import-gpg v5.0.0 composite