snap-umls-clusters
Master's thesis project at Arab American University Palestine with the Palestinian Neuro Initiative Educational Research Center - clustering medical sentences based on the Unified Medical Language System (UMLS) terms and expanded UMLS terms present in them
Science Score: 36.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ○ CITATION.cff file
- ✓ codemeta.json file: found
- ✓ .zenodo.json file: found
- ○ DOI references
- ✓ Academic publication links: scholar.google, zenodo.org
- ○ Academic email domains
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: low (7.1%) similarity to scientific vocabulary
Repository
Basic Info
- Host: GitHub
- Owner: zainasaadeddin
- Language: Python
- Default Branch: main
- Homepage: https://sites.google.com/view/zeina-master-project2021
- Size: 209 MB
Statistics
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
- Releases: 1
Metadata Files
README.md
snap-umls-clusters
Clusters the SNAP-IV ADHD questions based on the Unified Medical Language System (UMLS) terms and expanded UMLS terms present in them.
Thesis Supervisors:
Clustering
- Feature matrix
- The silhouette score determines the best number of clusters N after evaluating multiple values, where 2 <= N < 12
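The two clustering steps above can be sketched as follows. This is a minimal illustration, not the repository's code: the function names `build_feature_matrix` and `best_cluster_count` are hypothetical, the binary sentence-by-CUI encoding is an assumption, and KMeans is swapped in as the clustering algorithm because the README does not name one.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score


def build_feature_matrix(sentence_cuis):
    """Binary sentence-by-CUI matrix (assumed encoding).

    Entry (i, j) is 1.0 if sentence i contains the j-th CUI, else 0.0.
    """
    vocabulary = sorted(set().union(*sentence_cuis))
    column = {cui: j for j, cui in enumerate(vocabulary)}
    matrix = np.zeros((len(sentence_cuis), len(vocabulary)), dtype=float)
    for i, cuis in enumerate(sentence_cuis):
        for cui in cuis:
            matrix[i, column[cui]] = 1.0
    return matrix, vocabulary


def best_cluster_count(features, n_min=2, n_max=12, seed=0):
    """Score every N with n_min <= N < n_max and return (best_n, scores)."""
    # The silhouette score is only defined for at most n_samples - 1 clusters.
    upper = min(n_max, features.shape[0])
    scores = {}
    for n in range(n_min, upper):
        # KMeans is an assumption; any algorithm that yields labels would do.
        model = KMeans(n_clusters=n, n_init=10, random_state=seed)
        labels = model.fit_predict(features)
        scores[n] = silhouette_score(features, labels)
    return max(scores, key=scores.get), scores
```

With the CUI sets extracted per question, `build_feature_matrix` would produce the input for `best_cluster_count`, which returns the N with the highest silhouette score together with all evaluated scores.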
Visualization
- NetworkX graphs depicting CUI terms that frequently co-occur in the same sentence.
- Distance matrices: reordered to match clusters.
- Feature matrices: reordered to match clusters.
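The data behind these visualizations can be prepared roughly as below. This is a hedged sketch under stated assumptions: `cooccurrence_edges` and `reorder_by_cluster` are hypothetical helper names, and counting each unordered CUI pair once per sentence is an assumed definition of co-occurrence, not necessarily the one used in the project.

```python
from collections import Counter
from itertools import combinations

import numpy as np


def cooccurrence_edges(sentence_cuis):
    """Weighted edges (cui_a, cui_b, count) for CUIs sharing a sentence."""
    counts = Counter()
    for cuis in sentence_cuis:
        # Sort so each unordered pair always maps to the same key.
        for a, b in combinations(sorted(cuis), 2):
            counts[(a, b)] += 1
    return [(a, b, w) for (a, b), w in sorted(counts.items())]


def reorder_by_cluster(matrix, labels, square=True):
    """Reorder rows (and columns, if square) so cluster members are adjacent."""
    # A stable sort keeps the original order of items within each cluster.
    order = np.argsort(labels, kind="stable")
    matrix = np.asarray(matrix)
    if square:
        # Distance matrix: permute rows and columns identically.
        return matrix[np.ix_(order, order)], order
    # Feature matrix: permute rows (sentences) only.
    return matrix[order], order
```

The edge list can be loaded into a graph with `networkx.Graph.add_weighted_edges_from`. Call `reorder_by_cluster` with `square=True` for a distance matrix and `square=False` for a feature matrix.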
Usage
Setup
Install requirements:
pip install -r requirements.txt
Install scispacy model:
pip install https://s3-us-west-2.amazonaws.com/ai2-s2-scispacy/releases/v0.4.0/en_core_sci_md-0.4.0.tar.gz
- Download the UMLS Metathesaurus (login required) and place it in the base directory.
https://download.nlm.nih.gov/umls/kss/2021AA/umls-2021AA-metathesaurus.zip
Execute
python main.py SNAP_IV_Long_with_Scoring.pdf
Owner
- Name: ZainaSaadeddin
- Login: zainasaadeddin
- Kind: user
- Location: Palestine
- Company: Fatima Fellowship 2022
- Website: https://www.linkedin.com/in/jszeina/
- Twitter: jszeina
- Repositories: 1
- Profile: https://github.com/zainasaadeddin
Mainly into: (ML/DL/NLP) modeling technologies.
Dependencies
- networkx *
- numpy *
- owlready2 *
- pypdf2 *
- requests *
- scispacy *
- sklearn *
- spacy *