uk.ac.cam.ch.wwmm.oscar

OSCAR (Open Source Chemistry Analysis Routines) is an open source extensible system for the automated annotation of chemistry in scientific articles.

https://github.com/blueobelisk/oscar4

Science Score: 49.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
    Found 3 DOI reference(s) in README
  • Academic publication links
  • Committers with academic emails
    4 of 17 committers (23.5%) from academic institutions
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (14.3%) to scientific vocabulary

Keywords

blueobelisk chemistry text-mining

Keywords from Contributors

mesh annotation sequences interactive hacking network-simulation
Last synced: 6 months ago

Repository

Basic Info
  • Host: GitHub
  • Owner: BlueObelisk
  • License: artistic-2.0
  • Language: Java
  • Default Branch: main
  • Homepage:
  • Size: 125 MB
Statistics
  • Stars: 34
  • Watchers: 6
  • Forks: 5
  • Open Issues: 1
  • Releases: 0
Topics
blueobelisk chemistry text-mining
Created about 6 years ago · Last pushed 7 months ago
Metadata Files
Readme License Citation

README.md

OSCAR4

OSCAR (Open Source Chemistry Analysis Routines) is an open source extensible system for the automated annotation of chemistry in scientific articles. It can be used to identify chemical names, reaction names, ontology terms, enzymes, chemical prefixes and adjectives, and chemical data such as state, yield, IR, NMR and mass spectra, and elemental analyses. In addition, where possible, any chemical names detected will be annotated either with structures derived by lookup or by name-to-structure parsing using OPSIN, or with identifiers from the ChEBI (Chemical Entities of Biological Interest) ontology.

OSCAR has been under development since 2002. The current version, OSCAR4, focuses on providing a core library that facilitates integration with other tools. Its simple-to-use API is modularised to promote extension into other domains and allows it to be used within workflow systems such as Taverna and U-Compare.

OSCAR is developed by the Murray-Rust research group at the Unilever Centre for Molecular Science Informatics, University of Cambridge. The corresponding publication can be found here and the authors would appreciate it if this is cited in any work that makes use of the code.

Examples

The following code will identify chemical named entities in text, and output a list of them together with their Standard InChI, when available.

```java
String s = "....";

Oscar oscar = new Oscar();
// Find chemical named entities and resolve them to structures where possible.
List<ResolvedNamedEntity> entities = oscar.findAndResolveNamedEntities(s);
for (ResolvedNamedEntity ne : entities) {
    System.out.println(ne.getSurface());
    // The Standard InChI is null when the name could not be resolved.
    ChemicalStructure stdInchi = ne.getFirstChemicalStructure(FormatType.STD_INCHI);
    if (stdInchi != null) {
        System.out.println(stdInchi);
    }
    System.out.println();
}
```

Deployment to the Maven central repository

1) Create a GPG key:

    gpg --full-generate-key --pinentry-mode=loopback

   Note, I think it must be RSA and the largest you can create. Remember to protect it with a password.

2) Upload the public key to http://keyserver.ubuntu.com/ by exporting it first:

    gpg --armor --export mjw@mjw.name

   Paste the output of this command into the submission form at that URL.

3) Create an account on https://central.sonatype.com/

4) Log in and make sure you have access to the Namespace you want to deploy to: https://central.sonatype.com/publishing/namespaces

For this repo, the namespace is uk.ac.cam.ch.wwmm. If you do not have access to it, request access from someone who does.

5) You will need to create a token for deployment via https://central.sonatype.com/account. This needs to be pasted into your ~/.m2/settings.xml, e.g.:

    <settings xmlns="http://maven.apache.org/SETTINGS/1.0.0"
              xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
              xsi:schemaLocation="http://maven.apache.org/SETTINGS/1.0.0 https://maven.apache.org/xsd/settings-1.0.0.xsd">
      <servers>
        <server>
          <id>central</id>
          <username>foo</username>
          <password>bar</password>
        </server>
      </servers>
    </settings>

6) Build, package and sign. Note, this assumes you have an SSH key to access GitHub.

    mvn -Dusername=git release:prepare -DautoVersionSubmodules=true -DreleaseVersion=5.3.0 -DdevelopmentVersion=5.4-SNAPSHOT

  • Set the tag label as 5.3.0 when requested
  • Enter your GPG password

7) Upload it to central.sonatype.com:

    mvn -Psonatype-oss-release release:perform -DconnectionUrl=scm:git:https://github.com/BlueObelisk/oscar4 -Dtag=5.3.0

  • Enter your GPG password

8) Log into https://central.sonatype.com/publishing/deployments. The deployment should be listed there as pending; if everything is green, hit Publish.
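
The two Maven invocations in steps 6 and 7 differ between releases only in their version numbers. As a minimal sketch, the helper below assembles those argument lists from the release version, the next development version and the tag. The class and method names (`ReleaseCommands`, `prepareArgs`, `performArgs`) are hypothetical and not part of OSCAR; the flags themselves are copied from the steps above. Nothing is executed here, the lists are only built and printed.

```java
import java.util.List;

/** Hypothetical helper that rebuilds the Maven command lines from steps 6 and 7. */
class ReleaseCommands {

    /** Arguments for step 6 (release:prepare), with the versions filled in. */
    static List<String> prepareArgs(String releaseVersion, String developmentVersion) {
        return List.of(
                "mvn", "-Dusername=git", "release:prepare",
                "-DautoVersionSubmodules=true",
                "-DreleaseVersion=" + releaseVersion,
                "-DdevelopmentVersion=" + developmentVersion);
    }

    /** Arguments for step 7 (release:perform) against the tag created in step 6. */
    static List<String> performArgs(String tag) {
        return List.of(
                "mvn", "-Psonatype-oss-release", "release:perform",
                "-DconnectionUrl=scm:git:https://github.com/BlueObelisk/oscar4",
                "-Dtag=" + tag);
    }

    public static void main(String[] args) {
        // Values taken from the walkthrough above; adjust them for a later release.
        System.out.println(String.join(" ", prepareArgs("5.3.0", "5.4-SNAPSHOT")));
        System.out.println(String.join(" ", performArgs("5.3.0")));
    }
}
```

Either list could be handed to `java.lang.ProcessBuilder` if the release were to be scripted, although both commands prompt interactively for the tag label and the GPG password.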

Support

Issue/Feature Request Tracker

Mailing List (Google Group)

Owner

  • Name: Blue Obelisk
  • Login: BlueObelisk
  • Kind: organization

GitHub Events

Total
  • Issues event: 2
  • Watch event: 7
  • Delete event: 1
  • Issue comment event: 13
  • Push event: 8
  • Pull request event: 1
  • Fork event: 1
  • Create event: 2
Last Year
  • Issues event: 2
  • Watch event: 7
  • Delete event: 1
  • Issue comment event: 13
  • Push event: 8
  • Pull request event: 1
  • Fork event: 1
  • Create event: 2

Committers

Last synced: about 2 years ago

All Time
  • Total Commits: 824
  • Total Committers: 17
  • Avg Commits per committer: 48.471
  • Development Distribution Score (DDS): 0.762
Past Year
  • Commits: 14
  • Committers: 2
  • Avg Commits per committer: 7.0
  • Development Distribution Score (DDS): 0.071
Top Committers
Name Email Commits
Egon Willighagen e****w 196
Mark J. Williamson m****w@m****e 191
dmj30 d****0@l****t 164
Sam Adams s****s@g****m 126
Daniel Lowe d****7@c****k 64
lh359 l****9@l****t 20
Mark J. Williamson m****9@c****k 19
Daniel Lowe d****l@n****m 11
dma_k d****k@l****t 7
lh359 l****9@c****k 7
dan2097 d****7@l****t 7
Egon Willighagen e****n@g****m 5
keybo k****o@B****e 3
petermr p****6@c****k 1
Oliver Stueker o****r@g****m 1
Daniel Lowe 7****7 1
dependabot[bot] 4****] 1
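
The summary figures above can be reproduced from the raw counts. The sketch below assumes that the Development Distribution Score is defined as 1 minus the share of commits made by the most active committer; that definition is an assumption, but it matches the listed values. The class name `CommitStats` is hypothetical. The past-year input of 13 commits for the top committer is inferred from the reported score and does not appear in the table.

```java
import java.util.Locale;

/** Recomputes the committer summary figures from raw commit counts. */
class CommitStats {

    /** Mean number of commits per committer. */
    static double averageCommits(int totalCommits, int committers) {
        return (double) totalCommits / committers;
    }

    /**
     * Development Distribution Score, assumed here to be
     * 1 - (commits by the most active committer / total commits).
     */
    static double dds(int topCommitterCommits, int totalCommits) {
        return 1.0 - (double) topCommitterCommits / totalCommits;
    }

    public static void main(String[] args) {
        // All time: 824 commits, 17 committers, top committer has 196 commits.
        System.out.printf(Locale.ROOT, "avg=%.3f dds=%.3f%n",
                averageCommits(824, 17), dds(196, 824));
        // prints: avg=48.471 dds=0.762

        // Past year: 14 commits, 2 committers; 13 for the top committer is inferred.
        System.out.printf(Locale.ROOT, "avg=%.1f dds=%.3f%n",
                averageCommits(14, 2), dds(13, 14));
        // prints: avg=7.0 dds=0.071
    }
}
```

Committers are counted per email address in the table, so a person who committed under two addresses appears twice and the score treats them as two committers.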

Issues and Pull Requests

Last synced: 7 months ago

All Time
  • Total issues: 7
  • Total pull requests: 9
  • Average time to close issues: 5 months
  • Average time to close pull requests: 1 day
  • Total issue authors: 5
  • Total pull request authors: 4
  • Average comments per issue: 4.43
  • Average comments per pull request: 1.33
  • Merged pull requests: 9
  • Bot issues: 0
  • Bot pull requests: 1
Past Year
  • Issues: 0
  • Pull requests: 0
  • Average time to close issues: N/A
  • Average time to close pull requests: N/A
  • Issue authors: 0
  • Pull request authors: 0
  • Average comments per issue: 0
  • Average comments per pull request: 0
  • Merged pull requests: 0
  • Bot issues: 0
  • Bot pull requests: 0
Top Authors
Issue Authors
  • egonw (4)
  • biotech7 (1)
  • merkys (1)
  • mjw99 (1)
  • sathiyabalu89 (1)
Pull Request Authors
  • egonw (5)
  • mjw99 (3)
  • dependabot[bot] (1)
  • dan2097 (1)
Top Labels
Issue Labels
help wanted (2) good first issue (1) question (1)
Pull Request Labels
dependencies (1)

Packages

  • Total packages: 15
  • Total downloads: unknown
  • Total dependent packages: 55
    (may contain duplicates)
  • Total dependent repositories: 44
    (may contain duplicates)
  • Total versions: 150
repo1.maven.org: uk.ac.cam.ch.wwmm.oscar:oscar4-api

Parent POM for Maven managed projects in the Unilever Centre for Molecular Science Informatics

  • Versions: 10
  • Dependent Packages: 5
  • Dependent Repositories: 15
Rankings
Dependent repos count: 6.2%
Dependent packages count: 11.3%
Average: 26.1%
Stargazers count: 41.1%
Forks count: 45.8%
Last synced: 6 months ago
repo1.maven.org: uk.ac.cam.ch.wwmm.oscar:oscar4-chemnamedict

Parent POM for Maven managed projects in the Unilever Centre for Molecular Science Informatics

  • Versions: 10
  • Dependent Packages: 8
  • Dependent Repositories: 3
Rankings
Dependent packages count: 7.3%
Dependent repos count: 13.7%
Average: 27.0%
Stargazers count: 41.1%
Forks count: 45.8%
Last synced: 6 months ago
repo1.maven.org: uk.ac.cam.ch.wwmm.oscar:oscar4-core

Parent POM for Maven managed projects in the Unilever Centre for Molecular Science Informatics

  • Versions: 10
  • Dependent Packages: 12
  • Dependent Repositories: 2
Rankings
Dependent packages count: 5.1%
Dependent repos count: 16.0%
Average: 27.0%
Stargazers count: 41.1%
Forks count: 45.8%
Last synced: 6 months ago
repo1.maven.org: uk.ac.cam.ch.wwmm.oscar:oscar4-tokenizer

Parent POM for Maven managed projects in the Unilever Centre for Molecular Science Informatics

  • Versions: 10
  • Dependent Packages: 7
  • Dependent Repositories: 2
Rankings
Dependent packages count: 8.3%
Dependent repos count: 16.0%
Average: 27.8%
Stargazers count: 41.1%
Forks count: 45.8%
Last synced: 6 months ago
repo1.maven.org: uk.ac.cam.ch.wwmm.oscar:oscar4-obo

Parent POM for Maven managed projects in the Unilever Centre for Molecular Science Informatics

  • Versions: 10
  • Dependent Packages: 7
  • Dependent Repositories: 2
Rankings
Dependent packages count: 8.3%
Dependent repos count: 16.0%
Average: 27.8%
Stargazers count: 41.1%
Forks count: 45.8%
Last synced: 6 months ago
repo1.maven.org: uk.ac.cam.ch.wwmm.oscar:oscar4-recogniser-core

Parent POM for Maven managed projects in the Unilever Centre for Molecular Science Informatics

  • Versions: 10
  • Dependent Packages: 5
  • Dependent Repositories: 2
Rankings
Dependent packages count: 11.3%
Dependent repos count: 16.0%
Average: 28.6%
Stargazers count: 41.1%
Forks count: 45.8%
Last synced: 6 months ago
repo1.maven.org: uk.ac.cam.ch.wwmm.oscar:oscar4-data

Parent POM for Maven managed projects in the Unilever Centre for Molecular Science Informatics

  • Versions: 10
  • Dependent Packages: 3
  • Dependent Repositories: 5
Rankings
Dependent repos count: 10.8%
Dependent packages count: 17.3%
Average: 28.8%
Stargazers count: 41.1%
Forks count: 45.8%
Last synced: 6 months ago
repo1.maven.org: uk.ac.cam.ch.wwmm.oscar:oscar4-memmrecogniser

Parent POM for Maven managed projects in the Unilever Centre for Molecular Science Informatics

  • Versions: 10
  • Dependent Packages: 3
  • Dependent Repositories: 2
Rankings
Dependent repos count: 16.0%
Dependent packages count: 17.3%
Average: 30.1%
Stargazers count: 41.1%
Forks count: 45.8%
Last synced: 6 months ago
repo1.maven.org: uk.ac.cam.ch.wwmm.oscar:oscar4-opsin

Parent POM for Maven managed projects in the Unilever Centre for Molecular Science Informatics

  • Versions: 10
  • Dependent Packages: 2
  • Dependent Repositories: 4
Rankings
Dependent repos count: 12.0%
Dependent packages count: 22.9%
Average: 30.5%
Stargazers count: 41.1%
Forks count: 45.8%
Last synced: 6 months ago
repo1.maven.org: uk.ac.cam.ch.wwmm.oscar:oscar4-patternrecogniser

Parent POM for Maven managed projects in the Unilever Centre for Molecular Science Informatics

  • Versions: 10
  • Dependent Packages: 2
  • Dependent Repositories: 2
Rankings
Dependent repos count: 16.0%
Dependent packages count: 22.9%
Average: 31.5%
Stargazers count: 41.1%
Forks count: 45.8%
Last synced: 6 months ago
repo1.maven.org: uk.ac.cam.ch.wwmm.oscar:oscar4-all

Parent POM for Maven managed projects in the Unilever Centre for Molecular Science Informatics

  • Versions: 10
  • Dependent Packages: 1
  • Dependent Repositories: 2
Rankings
Dependent repos count: 16.0%
Dependent packages count: 32.7%
Average: 33.8%
Stargazers count: 40.5%
Forks count: 45.8%
Last synced: 6 months ago
repo1.maven.org: uk.ac.cam.ch.wwmm.oscar:oscar4-preprocessor

Parent POM for Maven managed projects in the Unilever Centre for Molecular Science Informatics

  • Versions: 10
  • Dependent Packages: 0
  • Dependent Repositories: 0
Rankings
Stargazers count: 30.3%
Forks count: 31.6%
Dependent repos count: 32.0%
Average: 35.7%
Dependent packages count: 48.9%
Last synced: 6 months ago
repo1.maven.org: uk.ac.cam.ch.wwmm.oscar:oscar4

Parent POM for Maven managed projects in the Unilever Centre for Molecular Science Informatics

  • Versions: 10
  • Dependent Packages: 0
  • Dependent Repositories: 1
Rankings
Dependent repos count: 20.7%
Average: 39.4%
Stargazers count: 41.1%
Forks count: 45.8%
Dependent packages count: 49.9%
Last synced: 6 months ago
repo1.maven.org: uk.ac.cam.ch.wwmm.oscar:oscar4-formatter

Provides classes to format Oscar output into a certain format.

  • Versions: 10
  • Dependent Packages: 0
  • Dependent Repositories: 1
Rankings
Dependent repos count: 20.7%
Average: 39.4%
Stargazers count: 41.1%
Forks count: 45.8%
Dependent packages count: 49.9%
Last synced: 6 months ago
repo1.maven.org: uk.ac.cam.ch.wwmm.oscar:oscar4-memmrecogniser-train

Parent POM for Maven managed projects in the Unilever Centre for Molecular Science Informatics

  • Versions: 10
  • Dependent Packages: 0
  • Dependent Repositories: 1
Rankings
Dependent repos count: 20.7%
Average: 39.4%
Stargazers count: 41.1%
Forks count: 45.8%
Dependent packages count: 49.9%
Last synced: 6 months ago
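
The "Average" line in each Rankings block appears to be the arithmetic mean of the four percentile rankings listed with it. That reading is an inference from the numbers, not something the listing states. A minimal check, using the values for oscar4-api and oscar4-preprocessor and a hypothetical class name `RankingAverage`:

```java
import java.util.Locale;

/** Checks that a package's "Average" ranking is the mean of its four rankings. */
class RankingAverage {

    /** Arithmetic mean of the given percentile rankings. */
    static double mean(double... percentiles) {
        double sum = 0.0;
        for (double p : percentiles) {
            sum += p;
        }
        return sum / percentiles.length;
    }

    public static void main(String[] args) {
        // oscar4-api: dependent repos, dependent packages, stargazers, forks.
        System.out.printf(Locale.ROOT, "%.1f%n", mean(6.2, 11.3, 41.1, 45.8));
        // prints: 26.1

        // oscar4-preprocessor: stargazers, forks, dependent repos, dependent packages.
        System.out.printf(Locale.ROOT, "%.1f%n", mean(30.3, 31.6, 32.0, 48.9));
        // prints: 35.7
    }
}
```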

Dependencies

oscar4-all/pom.xml maven
  • uk.ac.cam.ch.wwmm.oscar:oscar4-api
  • uk.ac.cam.ch.wwmm.oscar:oscar4-data
oscar4-api/pom.xml maven
  • com.google.guava:guava
  • org.apache.logging.log4j:log4j-api
  • org.apache.logging.log4j:log4j-core
  • org.apache.logging.log4j:log4j-slf4j-impl
  • uk.ac.cam.ch.wwmm.oscar:oscar4-chemnamedict
  • uk.ac.cam.ch.wwmm.oscar:oscar4-core
  • uk.ac.cam.ch.wwmm.oscar:oscar4-memmrecogniser
  • uk.ac.cam.ch.wwmm.oscar:oscar4-obo
  • uk.ac.cam.ch.wwmm.oscar:oscar4-opsin
  • uk.ac.cam.ch.wwmm.oscar:oscar4-patternrecogniser
  • uk.ac.cam.ch.wwmm.oscar:oscar4-recogniser-core
  • uk.ac.cam.ch.wwmm.oscar:oscar4-tokenizer
  • xom:xom
  • junit:junit test
oscar4-chemnamedict/pom.xml maven
  • commons-lang:commons-lang
  • org.apache.logging.log4j:log4j-api
  • org.apache.logging.log4j:log4j-core
  • org.apache.logging.log4j:log4j-slf4j-impl
  • uk.ac.cam.ch.wwmm.oscar:oscar4-core
  • xom:xom
  • junit:junit test
oscar4-core/pom.xml maven
  • commons-io:commons-io
  • org.apache.logging.log4j:log4j-api
  • org.apache.logging.log4j:log4j-core
  • org.apache.logging.log4j:log4j-slf4j-impl
  • xom:xom
  • junit:junit test
oscar4-data/pom.xml maven
  • net.sourceforge.jregex:jregex
  • org.apache.logging.log4j:log4j-api
  • org.apache.logging.log4j:log4j-core
  • org.apache.logging.log4j:log4j-slf4j-impl
  • uk.ac.cam.ch.wwmm.oscar:oscar4-core
  • uk.ac.cam.ch.wwmm.oscar:oscar4-tokenizer
  • xom:xom
  • junit:junit test
oscar4-formatter/pom.xml maven
  • uk.ac.cam.ch.wwmm.oscar:oscar4-core 5.3-SNAPSHOT
oscar4-memmrecogniser/pom.xml maven
  • com.google.guava:guava
  • org.apache.commons:commons-math
  • org.apache.logging.log4j:log4j-api
  • org.apache.logging.log4j:log4j-core
  • org.apache.logging.log4j:log4j-slf4j-impl
  • org.apache.opennlp:opennlp-maxent
  • uk.ac.cam.ch.wwmm.oscar:oscar4-chemnamedict
  • uk.ac.cam.ch.wwmm.oscar:oscar4-core
  • uk.ac.cam.ch.wwmm.oscar:oscar4-obo
  • uk.ac.cam.ch.wwmm.oscar:oscar4-recogniser-core
  • uk.ac.cam.ch.wwmm.oscar:oscar4-tokenizer
  • xom:xom
  • junit:junit test
oscar4-memmrecogniser-train/pom.xml maven
  • commons-io:commons-io
  • org.apache.logging.log4j:log4j-api
  • org.apache.logging.log4j:log4j-core
  • org.apache.logging.log4j:log4j-slf4j-impl
  • org.apache.opennlp:opennlp-maxent
  • uk.ac.cam.ch.wwmm.oscar:oscar4-chemnamedict
  • uk.ac.cam.ch.wwmm.oscar:oscar4-core
  • uk.ac.cam.ch.wwmm.oscar:oscar4-memmrecogniser
  • uk.ac.cam.ch.wwmm.oscar:oscar4-obo
  • uk.ac.cam.ch.wwmm.oscar:oscar4-recogniser-core
  • uk.ac.cam.ch.wwmm.oscar:oscar4-tokenizer
  • xom:xom
  • junit:junit test
oscar4-obo/pom.xml maven
  • com.google.guava:guava
  • commons-io:commons-io
  • org.apache.logging.log4j:log4j-api
  • org.apache.logging.log4j:log4j-core
  • org.apache.logging.log4j:log4j-slf4j-impl
  • uk.ac.cam.ch.wwmm.oscar:oscar4-chemnamedict
  • uk.ac.cam.ch.wwmm.oscar:oscar4-core
  • junit:junit test
  • org.mockito:mockito-core test
oscar4-opsin/pom.xml maven
  • commons-io:commons-io
  • uk.ac.cam.ch.opsin:opsin-core
  • uk.ac.cam.ch.opsin:opsin-inchi
  • uk.ac.cam.ch.wwmm.oscar:oscar4-chemnamedict
  • uk.ac.cam.ch.wwmm.oscar:oscar4-core
  • xom:xom
  • junit:junit test
oscar4-patternrecogniser/pom.xml maven
  • com.google.guava:guava
  • uk.ac.cam.ch.wwmm.oscar:oscar4-chemnamedict
  • uk.ac.cam.ch.wwmm.oscar:oscar4-core
  • uk.ac.cam.ch.wwmm.oscar:oscar4-obo
  • uk.ac.cam.ch.wwmm.oscar:oscar4-recogniser-core
  • uk.ac.cam.ch.wwmm.oscar:oscar4-tokenizer
  • xom:xom
  • junit:junit test
oscar4-preprocessor/pom.xml maven
  • com.ibm.icu:icu4j
  • junit:junit test
oscar4-recogniser-core/pom.xml maven
  • com.google.guava:guava
  • commons-io:commons-io
  • dk.brics.automaton:automaton
  • org.apache.logging.log4j:log4j-api
  • org.apache.logging.log4j:log4j-core
  • org.apache.logging.log4j:log4j-slf4j-impl
  • uk.ac.cam.ch.wwmm.oscar:oscar4-chemnamedict
  • uk.ac.cam.ch.wwmm.oscar:oscar4-core
  • uk.ac.cam.ch.wwmm.oscar:oscar4-obo
  • uk.ac.cam.ch.wwmm.oscar:oscar4-tokenizer
  • xom:xom
  • junit:junit test
  • org.mockito:mockito-core test
.github/workflows/maven.yml actions
  • actions/checkout v3 composite
  • actions/setup-java v3 composite
oscar4-tokeniser/pom.xml maven
  • org.apache.logging.log4j:log4j-api
  • org.apache.logging.log4j:log4j-core
  • org.apache.logging.log4j:log4j-slf4j-impl
  • uk.ac.cam.ch.wwmm.oscar:oscar4-core
  • uk.ac.cam.ch.wwmm.oscar:oscar4-obo
  • xom:xom
  • junit:junit test
pom.xml maven
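
The inter-module dependencies listed above form a directed acyclic graph, so a valid build order can be derived with a topological sort. The sketch below transcribes only the `uk.ac.cam.ch.wwmm.oscar` dependencies from the listing (third-party libraries are left out) and applies Kahn's algorithm. Modules are keyed by artifact ID, and the class name `ModuleBuildOrder` is hypothetical. Maven's reactor computes its own order; this only illustrates the layering.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Derives one valid build order for the OSCAR4 modules via Kahn's algorithm. */
class ModuleBuildOrder {

    /** Internal dependencies per module, transcribed from the pom.xml listing. */
    static Map<String, List<String>> moduleDependencies() {
        Map<String, List<String>> deps = new LinkedHashMap<>();
        deps.put("oscar4-core", List.of());
        deps.put("oscar4-preprocessor", List.of());
        deps.put("oscar4-formatter", List.of("oscar4-core"));
        deps.put("oscar4-chemnamedict", List.of("oscar4-core"));
        deps.put("oscar4-obo", List.of("oscar4-chemnamedict", "oscar4-core"));
        deps.put("oscar4-tokenizer", List.of("oscar4-core", "oscar4-obo"));
        deps.put("oscar4-opsin", List.of("oscar4-chemnamedict", "oscar4-core"));
        deps.put("oscar4-data", List.of("oscar4-core", "oscar4-tokenizer"));
        deps.put("oscar4-recogniser-core", List.of(
                "oscar4-chemnamedict", "oscar4-core", "oscar4-obo", "oscar4-tokenizer"));
        deps.put("oscar4-memmrecogniser", List.of(
                "oscar4-chemnamedict", "oscar4-core", "oscar4-obo",
                "oscar4-recogniser-core", "oscar4-tokenizer"));
        deps.put("oscar4-patternrecogniser", List.of(
                "oscar4-chemnamedict", "oscar4-core", "oscar4-obo",
                "oscar4-recogniser-core", "oscar4-tokenizer"));
        deps.put("oscar4-memmrecogniser-train", List.of(
                "oscar4-chemnamedict", "oscar4-core", "oscar4-memmrecogniser",
                "oscar4-obo", "oscar4-recogniser-core", "oscar4-tokenizer"));
        deps.put("oscar4-api", List.of(
                "oscar4-chemnamedict", "oscar4-core", "oscar4-memmrecogniser",
                "oscar4-obo", "oscar4-opsin", "oscar4-patternrecogniser",
                "oscar4-recogniser-core", "oscar4-tokenizer"));
        deps.put("oscar4-all", List.of("oscar4-api", "oscar4-data"));
        return deps;
    }

    /** Returns the modules ordered so that each one follows all of its dependencies. */
    static List<String> buildOrder(Map<String, List<String>> deps) {
        Map<String, Integer> unmet = new LinkedHashMap<>();
        Map<String, List<String>> dependents = new HashMap<>();
        for (Map.Entry<String, List<String>> e : deps.entrySet()) {
            unmet.put(e.getKey(), e.getValue().size());
            for (String dependency : e.getValue()) {
                dependents.computeIfAbsent(dependency, k -> new ArrayList<>()).add(e.getKey());
            }
        }
        // Start with the modules that have no internal dependencies.
        Deque<String> ready = new ArrayDeque<>();
        unmet.forEach((module, count) -> {
            if (count == 0) {
                ready.add(module);
            }
        });
        List<String> order = new ArrayList<>();
        while (!ready.isEmpty()) {
            String module = ready.remove();
            order.add(module);
            // Building this module satisfies one dependency of each dependent.
            for (String dependent : dependents.getOrDefault(module, List.of())) {
                if (unmet.merge(dependent, -1, Integer::sum) == 0) {
                    ready.add(dependent);
                }
            }
        }
        if (order.size() != deps.size()) {
            throw new IllegalStateException("Dependency cycle or unknown module");
        }
        return order;
    }

    public static void main(String[] args) {
        buildOrder(moduleDependencies()).forEach(System.out::println);
    }
}
```

In any order this produces, oscar4-core comes before every module that depends on it and oscar4-all comes after both oscar4-api and oscar4-data.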