snap-umls-clusters

Master's thesis project at the Arab American University, Palestine, with the Palestinian Neuro Initiative Educational Research Center: clustering medical sentences based on the Unified Medical Language System (UMLS) terms and expanded UMLS terms present in them

https://github.com/zainasaadeddin/snap-umls-clusters

Science Score: 36.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
  • Academic publication links
    Links to: scholar.google, zenodo.org
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (7.1%) to scientific vocabulary

Keywords

deep-neural-networks knowledge-graph language-model machine-learning natural-language-processing text-mining
Last synced: 6 months ago

Repository

Basic Info
Statistics
  • Stars: 0
  • Watchers: 1
  • Forks: 0
  • Open Issues: 0
  • Releases: 1
Topics
deep-neural-networks knowledge-graph language-model machine-learning natural-language-processing text-mining
Created over 4 years ago · Last pushed about 3 years ago
Metadata Files
Readme Citation

README.md

snap-umls-clusters

Project Google Site

Clusters the SNAP-IV ADHD questions based on the Unified Medical Language System (UMLS) terms and expanded UMLS terms present in them.

Thesis Supervisors:

Clustering

  • Feature matrix
  • The silhouette score determines the best number of clusters N, after evaluating every value where 2 <= N < 12
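
The two steps above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: it assumes a binary sentence-by-CUI feature matrix and uses scikit-learn's KMeans as the clustering algorithm, neither of which is stated in this README. The helper names `build_feature_matrix` and `choose_n_clusters` and the concept IDs in the demo are hypothetical placeholders.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score


def build_feature_matrix(sentence_cuis):
    """Return (matrix, vocabulary): one row per sentence, one binary column per CUI.

    Assumption: a 0/1 presence encoding; the project may weight terms differently.
    """
    vocabulary = sorted(set().union(*sentence_cuis))
    column = {cui: j for j, cui in enumerate(vocabulary)}
    matrix = np.zeros((len(sentence_cuis), len(vocabulary)), dtype=float)
    for i, cuis in enumerate(sentence_cuis):
        for cui in cuis:
            matrix[i, column[cui]] = 1.0
    return matrix, vocabulary


def choose_n_clusters(features, n_min=2, n_max=12, random_state=0):
    """Try every N with n_min <= N < n_max; return (best_n, best_score, best_labels)."""
    n_samples = features.shape[0]
    best = (None, -1.0, None)
    # The silhouette score needs at most n_samples - 1 clusters, so cap the range.
    for n in range(n_min, min(n_max, n_samples)):
        model = KMeans(n_clusters=n, n_init=10, random_state=random_state)
        labels = model.fit_predict(features)
        if len(set(labels)) < 2:
            continue  # the score is undefined for a single cluster
        score = silhouette_score(features, labels)
        if score > best[1]:
            best = (n, score, labels)
    return best


if __name__ == "__main__":
    # Hypothetical placeholder concept IDs, not real UMLS CUIs.
    sentence_cuis = [
        {"CUI_ATTENTION", "CUI_TASK"},
        {"CUI_ATTENTION", "CUI_LISTENING"},
        {"CUI_ATTENTION", "CUI_TASK", "CUI_LISTENING"},
        {"CUI_FIDGET", "CUI_SEAT"},
        {"CUI_FIDGET", "CUI_RUNNING"},
        {"CUI_FIDGET", "CUI_SEAT", "CUI_RUNNING"},
    ]
    features, vocabulary = build_feature_matrix(sentence_cuis)
    n, score, labels = choose_n_clusters(features)
    print(f"best N = {n}, silhouette = {score:.3f}")
```

The upper bound is exclusive, matching 2 <= N < 12, and it is capped so that small inputs never request as many clusters as there are sentences.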

Visualization

  • NetworkX graphs depicting CUI (UMLS Concept Unique Identifier) terms that tend to appear in the same sentences.
  • Distance matrices: reordered to match clusters.
  • Feature matrices: reordered to match clusters.
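
As a rough sketch of the visualizations listed above, the snippet below builds a CUI co-occurrence graph with NetworkX and reorders a matrix so that members of the same cluster sit next to each other. The function names `cooccurrence_graph` and `reorder_by_cluster` are hypothetical, and the edge weighting (a count of shared sentences) is an assumption rather than the project's confirmed method.

```python
from itertools import combinations

import networkx as nx
import numpy as np


def cooccurrence_graph(sentence_cuis):
    """Nodes are CUIs; an edge's weight counts the sentences containing both CUIs."""
    graph = nx.Graph()
    for cuis in sentence_cuis:
        graph.add_nodes_from(cuis)
        for a, b in combinations(sorted(cuis), 2):
            if graph.has_edge(a, b):
                graph[a][b]["weight"] += 1
            else:
                graph.add_edge(a, b, weight=1)
    return graph


def reorder_by_cluster(matrix, labels, square=False):
    """Sort rows by cluster label; for a square distance matrix, sort columns too.

    Returns the reordered matrix and the row order that was applied.
    """
    matrix = np.asarray(matrix)
    order = np.argsort(labels, kind="stable")  # stable keeps input order within a cluster
    if square:
        return matrix[np.ix_(order, order)], order
    return matrix[order], order
```

For a distance matrix, pass `square=True` so rows and columns are permuted together and clusters show up as blocks along the diagonal; for a feature matrix, only the rows (sentences) are reordered.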

Usage

Setup

  • Install requirements: pip install -r requirements.txt

  • Install the scispaCy model:

pip install https://s3-us-west-2.amazonaws.com/ai2-s2-scispacy/releases/v0.4.0/en_core_sci_md-0.4.0.tar.gz

  • Download the UMLS Metathesaurus (login required) and place it in the repository's base directory.

https://download.nlm.nih.gov/umls/kss/2021AA/umls-2021AA-metathesaurus.zip

Execute

python main.py SNAP_IV_Long_with_Scoring.pdf

Owner

  • Name: ZainaSaadeddin
  • Login: zainasaadeddin
  • Kind: user
  • Location: Palestine
  • Company: Fatima Fellowship 2022

Mainly into: (ML/DL/NLP) modeling technologies.

Dependencies

requirements.txt pypi
  • networkx *
  • numpy *
  • owlready2 *
  • pypdf2 *
  • requests *
  • scispacy *
  • sklearn *
  • spacy *