https://github.com/blueobelisk/chemicaltagger
ChemicalTagger is a tool for semantic text-mining in chemistry.
Science Score: 39.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ○ CITATION.cff file
- ✓ codemeta.json file: found codemeta.json file
- ✓ .zenodo.json file: found .zenodo.json file
- ✓ DOI references: found 3 DOI reference(s) in README
- ○ Academic publication links
- ○ Academic email domains
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: low similarity (7.1%) to scientific vocabulary
Repository
Statistics
- Stars: 43
- Watchers: 6
- Forks: 10
- Open Issues: 1
- Releases: 0
Metadata Files
README.md
ChemicalTagger Overview
ChemicalTagger is a tool for semantic text-mining in chemistry; the associated publication can be found here.
A. Components:
This package marks up experimental sections in chemistry papers. It has three main classes:
I. ChemistryPOSTagger:
This class takes a sentence and, by default, runs it through three taggers:
- OSCAR4 (for chemical entities)
- Regex (for recognising key words)
- OpenNLP (for English parts of speech)
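The README does not say how the outputs of the three taggers are merged into a single tag per token. One plausible scheme, sketched below purely as an assumption, is a priority order: the chemical-entity tagger wins, then the keyword tagger, then the general part-of-speech tagger. The class `TaggerPrioritySketch`, the `nil` marker, the three stand-in methods, and the tag names are all hypothetical; none of them is the OSCAR4, regex, or OpenNLP tagger that ChemicalTagger actually uses.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

// Illustrative only: hypothetical stand-ins, not ChemicalTagger's real taggers.
class TaggerPrioritySketch {

    // Hypothetical marker meaning "this tagger has no opinion on the token".
    static final String NO_TAG = "nil";

    // Stand-in for a chemical-entity tagger: flags tokens shaped like simple formulas.
    static String chemicalTag(String token) {
        return token.matches("([A-Z][a-z]?\\d*){2,}") ? "OSCAR-CM" : NO_TAG;
    }

    // Stand-in for a keyword tagger: maps a few action verbs to illustrative tags.
    static String keywordTag(String token) {
        switch (token.toLowerCase()) {
            case "added": return "VB-ADD";
            case "heated": return "VB-HEAT";
            case "extracted": return "VB-EXTRACT";
            default: return NO_TAG;
        }
    }

    // Stand-in for a general part-of-speech tagger: crude rules that always answer.
    static String posTag(String token) {
        if (token.equalsIgnoreCase("the") || token.equalsIgnoreCase("a")) return "DT";
        if (token.endsWith("ed")) return "VBD";
        return "NN";
    }

    // For each token, keep the first tag that is not NO_TAG, in priority order.
    static List<String> tag(List<String> tokens) {
        List<Function<String, String>> taggers = List.of(
                TaggerPrioritySketch::chemicalTag,
                TaggerPrioritySketch::keywordTag,
                TaggerPrioritySketch::posTag);
        List<String> tags = new ArrayList<>();
        for (String token : tokens) {
            String chosen = NO_TAG;
            for (Function<String, String> tagger : taggers) {
                chosen = tagger.apply(token);
                if (!chosen.equals(NO_TAG)) break;
            }
            tags.add(chosen);
        }
        return tags;
    }

    public static void main(String[] args) {
        List<String> tokens = List.of("The", "mixture", "was", "heated", "with", "H2SO4");
        System.out.println(tag(tokens));
    }
}
```

The point of the ordering is that a domain-specific tagger can override a generic one: `H2SO4` is caught by the chemical stand-in before the fallback tagger would label it a plain noun.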
II. ChemistrySentenceParser:
This class converts a tagged sentence into a parseTree, using a lexer and parser generated from an ANTLR grammar.
III. ASTtoXML:
This class converts a parseTree into an XML document.
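Once an XML document has been produced, a quick way to get a feel for its structure is to count how often each element occurs. The sketch below does this with the JDK's built-in DOM parser rather than XOM, so it runs without extra dependencies. The README does not document the output schema, so the sample XML and its element names (`Document`, `Sentence`, `ActionPhrase`, `VB-HEAT`, `VB-ADD`) are invented for illustration, and `XmlElementCountSketch` is a hypothetical helper.

```java
import java.io.StringReader;
import java.util.Map;
import java.util.TreeMap;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper: counts element names in an XML string.
class XmlElementCountSketch {

    static Map<String, Integer> countElements(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            // "*" selects every element in the document, in document order.
            NodeList all = doc.getElementsByTagName("*");
            Map<String, Integer> counts = new TreeMap<>();
            for (int i = 0; i < all.getLength(); i++) {
                counts.merge(all.item(i).getNodeName(), 1, Integer::sum);
            }
            return counts;
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        // Invented sample; real ChemicalTagger output may look different.
        String xml = "<Document><Sentence>"
                + "<ActionPhrase type=\"Heat\"><VB-HEAT>heated</VB-HEAT></ActionPhrase>"
                + "<ActionPhrase type=\"Add\"><VB-ADD>added</VB-ADD></ActionPhrase>"
                + "</Sentence></Document>";
        countElements(xml).forEach((name, n) -> System.out.println(name + ": " + n));
    }
}
```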
B. Running ChemicalTagger:
```java
import uk.ac.cam.ch.wwmm.chemicaltagger.POSContainer;
import uk.ac.cam.ch.wwmm.chemicaltagger.ChemistryPOSTagger;
import uk.ac.cam.ch.wwmm.chemicaltagger.ChemistrySentenceParser;
import uk.ac.cam.ch.wwmm.chemicaltagger.Utils;
import nu.xom.Document;

public class ChemicalTaggerTest {

    public static void main(String[] args) {
        String text = "A solution of 124C (7.0 g, 32.4 mmol) in concentrate H2SO4 "
                + "(9.5 mL) was added to a solution of concentrate H2SO4 (9.5 mL) "
                + "and fuming HNO3 (13 mL) and the mixture was heated at 60°C for "
                + "30 min. After cooling to room temperature, the reaction mixture "
                + "was added to iced 6M solution of NaOH (150 mL) and neutralized "
                + "to pH 6 with 1N NaOH solution. The reaction mixture was extracted "
                + "with dichloromethane (4x100 mL). The combined organic phases were "
                + "dried over Na2SO4, filtered and concentrated to give 124D as a solid.";

        // Calling ChemistryPOSTagger
        POSContainer posContainer = ChemistryPOSTagger.getDefaultInstance().runTaggers(text);

        // Returns a string of TAG TOKEN format (e.g.: DT The NN cat VB sat IN on DT the NN mat)
        // Call ChemistrySentenceParser either by passing the POSContainer or by InputStream
        ChemistrySentenceParser chemistrySentenceParser = new ChemistrySentenceParser(posContainer);

        // Create a parseTree of the tagged input
        chemistrySentenceParser.parseTags();

        // Return an XMLDoc
        Document doc = chemistrySentenceParser.makeXMLDocument();
        Utils.writeXMLToFile(doc, "target/file1.xml");
    }
}
```
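The comment in the example above describes the tagger output as a string in TAG TOKEN format. As a minimal sketch of what that format implies, the snippet below splits such a string into (tag, token) pairs using only the JDK. It assumes tags and tokens strictly alternate and that neither contains whitespace; `TagTokenSketch` is a hypothetical helper, not part of ChemicalTagger, and how the string is obtained from the `POSContainer` is not shown here.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: parses "TAG token TAG token ..." into [tag, token] pairs.
class TagTokenSketch {

    static List<String[]> parse(String tagged) {
        List<String[]> pairs = new ArrayList<>();
        String trimmed = tagged.trim();
        if (trimmed.isEmpty()) {
            return pairs; // nothing to parse
        }
        String[] parts = trimmed.split("\\s+");
        if (parts.length % 2 != 0) {
            throw new IllegalArgumentException("Expected alternating TAG TOKEN pairs");
        }
        for (int i = 0; i < parts.length; i += 2) {
            pairs.add(new String[] {parts[i], parts[i + 1]}); // {tag, token}
        }
        return pairs;
    }

    public static void main(String[] args) {
        for (String[] pair : parse("DT The NN cat VB sat IN on DT the NN mat")) {
            System.out.println(pair[1] + " -> " + pair[0]);
        }
    }
}
```

A strict alternation check is used so that a token containing a space, which would silently shift every later pair, fails loudly instead.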
Owner
- Name: Blue Obelisk
- Login: BlueObelisk
- Kind: organization
- Website: http://www.blueobelisk.org/
- Repositories: 59
- Profile: https://github.com/BlueObelisk
GitHub Events
Total
- Watch event: 5
- Push event: 1
- Fork event: 2
Last Year
- Watch event: 5
- Push event: 1
- Fork event: 2
Issues and Pull Requests
Last synced: 11 months ago
All Time
- Total issues: 8
- Total pull requests: 4
- Average time to close issues: 17 days
- Average time to close pull requests: about 8 hours
- Total issue authors: 7
- Total pull request authors: 3
- Average comments per issue: 2.63
- Average comments per pull request: 0.0
- Merged pull requests: 4
- Bot issues: 0
- Bot pull requests: 2
Past Year
- Issues: 1
- Pull requests: 0
- Average time to close issues: 2 months
- Average time to close pull requests: N/A
- Issue authors: 1
- Pull request authors: 0
- Average comments per issue: 1.0
- Average comments per pull request: 0
- Merged pull requests: 0
- Bot issues: 0
- Bot pull requests: 0
Top Authors
Issue Authors
- biotech7 (2)
- abhibha1807 (1)
- CreamyLong (1)
- lo2aayy (1)
- mjw99 (1)
- tongyey (1)
- sathiyabalu89 (1)
Pull Request Authors
- dependabot[bot] (3)
- lo2aayy (1)
- dan2097 (1)
Packages
- Total packages: 1
- Total downloads: unknown
- Total dependent packages: 1
- Total dependent repositories: 4
- Total versions: 6
repo1.maven.org: uk.ac.cam.ch.wwmm:chemicalTagger
ChemicalTagger
- Homepage: http://chemicaltagger.ch.cam.ac.uk
- Documentation: https://appdoc.app/artifact/uk.ac.cam.ch.wwmm/chemicalTagger/
- License: Apache License, Version 2.0
- Latest release: 1.6.2 (published over 4 years ago)
Dependencies
- com.tunnelvisionlabs:antlr4 4.5.3
- com.tunnelvisionlabs:antlr4-annotations 4.5.3
- com.tunnelvisionlabs:antlr4-runtime 4.5.3
- commons-io:commons-io 2.11.0
- commons-lang:commons-lang 2.6
- org.apache.logging.log4j:log4j-1.2-api 2.18.0
- org.apache.opennlp:opennlp-tools 1.9.4
- org.jsoup:jsoup 1.15.2
- uk.ac.cam.ch.wwmm.oscar:oscar4-api 5.2.0
- uk.ac.cam.ch.wwmm.oscar:oscar4-data 5.2.0
- xom:xom 1.3.7
- junit:junit test
- org.xml-cml:jumbo-testutil 1.0.1 test
- actions/checkout v3 composite
- actions/setup-java v3 composite