https://github.com/clulab/edin-data
Biomolecular events mined by Reach from PubMed Central
Keywords
information-extraction
nlp-datasets
pubmed-central
Repository
Statistics
- Stars: 1
- Watchers: 10
- Forks: 1
- Open Issues: 0
- Releases: 0
Created over 6 years ago
· Last pushed over 6 years ago
https://github.com/clulab/edin-data/blob/master/
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.

# Description

This repository contains "silver" biomolecular event mentions mined from the full text of a subset of [PubMed Central](https://www.ncbi.nlm.nih.gov/pmc/) publications using [Reach](https://github.com/clulab/reach/tree/d6a84be099a155b2bedb69f2102493d5db9399fb) (commit [`d6a84be099a155b2bedb69f2102493d5db9399fb`](https://github.com/clulab/reach/tree/d6a84be099a155b2bedb69f2102493d5db9399fb)). A portion of this data was used to supplement the training of [`edin`](https://github.com/ZhengTang1120/Interpretation-Decoder-for-Neural-Network/tree/master).

## Rules

[`rules.yml`](./data/rules.yml) contains the [Odin rules](https://arxiv.org/abs/1509.07513) referenced in the event JSON data.

## Events

| File                                        | Event           |
|---------------------------------------------|-----------------|
| [`ph_events.jsonl`](./data/ph_events.jsonl) | Phosphorylation |
| [`lo_events.jsonl`](./data/lo_events.jsonl) | Localization    |
| [`ge_events.jsonl`](./data/ge_events.jsonl) | Gene Expression |

### Structure

Each `.jsonl` file is in [JSON lines](http://jsonlines.org/) format: every line holds one JSON object, and each object represents an event mention following the schema defined in [`oas3-data-schema.yml`](./data/oas3-data-schema.yml). [Click this link](https://editor.swagger.io/?url=https://raw.githubusercontent.com/clulab/edin-data/dev/data/oas3-data-schema.yml) to render the schema in an editor.

#### Quick view

This command pretty-prints the first gene expression event listed in [`ge_events.jsonl`](./data/ge_events.jsonl):

```bash
head -n 1 ge_events.jsonl | python -m json.tool
```
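For working with more than one event at a time, the same file can be read from Python. The following is a minimal sketch, not part of the repository: the helper name `read_jsonl` and the relative path `data/ge_events.jsonl` are assumptions for illustration, and the code relies only on the one-object-per-line layout, not on any particular field of the schema.

```python
import json
from itertools import islice
from pathlib import Path
from typing import Any, Dict, Iterator, Optional


def read_jsonl(path: str, limit: Optional[int] = None) -> Iterator[Dict[str, Any]]:
    """Yield one parsed JSON object per non-empty line, up to `limit` objects.

    Hypothetical helper: it is not provided by this repository.
    """
    with Path(path).open(encoding="utf-8") as handle:
        # Skip blank lines so a trailing newline does not raise a parse error.
        records = (json.loads(line) for line in handle if line.strip())
        # islice with limit=None iterates over every record.
        yield from islice(records, limit)


if __name__ == "__main__":
    # Assumed path: adjust to wherever the repository is checked out.
    for event in read_jsonl("data/ge_events.jsonl", limit=1):
        print(json.dumps(event, indent=4, sort_keys=True))
```

Because the file is streamed line by line, only the requested events are held in memory, which helps if the event files are large.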
Owner
- Name: Computational Language Understanding Lab (CLU Lab) at University of Arizona
- Login: clulab
- Kind: organization
- Location: Tucson, AZ
- Website: http://clulab.org
- Repositories: 72
- Profile: https://github.com/clulab
