Science Score: 44.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
    Found CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
  • Academic publication links
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (5.4%) to scientific vocabulary
Last synced: 6 months ago

Repository

Basic Info
  • Host: GitHub
  • Owner: mo-arvan
  • Language: Python
  • Default Branch: main
  • Size: 5.22 MB
Statistics
  • Stars: 0
  • Watchers: 1
  • Forks: 0
  • Open Issues: 0
  • Releases: 0
Created over 2 years ago · Last pushed 9 months ago
Metadata Files
Readme Citation

README.md

Clinical NLP

% Transfer Learning is not a Silver Bullet: A Case Study on Medical Relation Extraction

Datasets

i2b2 2010

This dataset provides a corpus of assertions in clinical discharge summaries. The task is split into six classes, namely present, possible, absent, hypothetical, conditional and associated with someone else. However, the distribution is highly skewed, such that only 6% of the assertions belong to the latter three classes. Hence we only use the present, possible, and absent assertions for our evaluation as they present the most important information for doctors.

From [1].
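
The class restriction described above can be sketched as a simple label filter. This is only an illustration, not the repository's actual preprocessing: the record layout (dicts with `text` and `label` keys), the label spellings, the helper names `filter_assertions` and `label_distribution`, and the toy records are all assumptions made for this example.

```python
from collections import Counter

# The three assertion classes retained for evaluation, per the description above.
# The exact label strings used in the dataset files are an assumption.
KEPT_CLASSES = {"present", "possible", "absent"}


def filter_assertions(examples):
    """Return only the examples whose label is one of the retained classes."""
    return [ex for ex in examples if ex["label"] in KEPT_CLASSES]


def label_distribution(examples):
    """Return the relative frequency of each label (empty dict for no examples)."""
    counts = Counter(ex["label"] for ex in examples)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {label: count / total for label, count in counts.items()}


# Invented toy records, for illustration only (not taken from any dataset).
toy_examples = [
    {"text": "chest pain", "label": "present"},
    {"text": "pneumonia", "label": "possible"},
    {"text": "fever", "label": "absent"},
    {"text": "bleeding", "label": "hypothetical"},
    {"text": "wheezing", "label": "conditional"},
]

kept = filter_assertions(toy_examples)
print(len(kept), "of", len(toy_examples), "examples kept")
print(label_distribution(kept))
```

Checking the label distribution before and after filtering is a quick way to see how much of a skewed corpus the dropped classes account for.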

BioScope

This is a corpus of assertions in biomedical publications. It was specifically curated for the study of negation and speculation (or absent and possible in this paper) scope and does not contain present annotations. The BioScope dataset does not completely match the information need of health professionals and the i2b2 corpus lacks varied medical text types.

From [1].

MIMIC-III

MIMIC-III provides texts from discharge summaries as well as other clinical notes (physician letters, nurse letters, and radiology reports), representing a promising source of varied medical text. Therefore, two annotators followed the annotation guidelines from the i2b2 challenge and labelled 5,000 assertions, i.e. word spans of entities and their corresponding present / possible / absent class.

From [1].

*SEM 2012 - Sherlock

Taken from stories by Sir Arthur Conan Doyle (literary work)

SFU Review Corpus

A collection of product reviews (free text by human users)

References

1

Owner

  • Name: Mo Arvan
  • Login: mo-arvan
  • Kind: user
  • Location: Chicago
  • Company: University of Illinois at Chicago

Computer Scientist

Citation (CITATION.cff)

# This CITATION.cff file was generated with cffinit.
# Visit https://bit.ly/cffinit to generate yours today!

cff-version: 1.2.0
title: clinical-nlp
message: >-
  If you use this software, please cite it using the
  metadata from this file.
type: software
authors:
  - given-names: Mohammad
    family-names: Arvan
    email: marvan3@uic.edu
    affiliation: University of Illinois at Chicago
    orcid: 'https://orcid.org/0000-0001-9155-4559'
  - given-names: Hunter
    family-names: Holt
    email: hholt2@uic.edu
    affiliation: University of Illinois at Chicago
    orcid: 'https://orcid.org/0000-0001-6833-8372'
  - given-names: Natalie
    family-names: Parde
    email: parde@uic.edu
    affiliation: University of Illinois at Chicago
    orcid: 'https://orcid.org/0000-0003-0072-7499'
    
repository-code: 'https://github.com/mo-arvan/clinical-nlp'
license: CC-BY-NC-4.0

GitHub Events

Total
  • Push event: 2
Last Year
  • Push event: 2