https://github.com/bayesomicslab/oud-risk-prediction
OUD-Risk-Prediction
Science Score: 10.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ○ CITATION.cff file
- ○ codemeta.json file
- ○ .zenodo.json file
- ○ DOI references
- ✓ Academic publication links (links to: arxiv.org)
- ○ Academic email domains
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: low similarity (12.0%) to scientific vocabulary
Repository
OUD-Risk-Prediction
Basic Info
- Host: GitHub
- Owner: bayesomicslab
- Language: Jupyter Notebook
- Default Branch: main
- Size: 38.3 MB
Statistics
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
- Releases: 0
Metadata Files
README.md
OUD-Risk-Prediction
Repository for the project Opioid Use Disorder Risk Modelling through Mobility and Genetic Feature Integration
About The Project

Overview of our integrative approach for estimating disease risk. The mobility trace and genetic data are preprocessed, then augmented to balance the genetic and mobility trace sample sizes. The augmented data is merged using a disease co-occurrence parameter ($C$), genetic relative risk ($G$), and mobility relative risk ($M$). In the modelling step, features and models are selected, and classifiers are trained to estimate OUD risk.
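To make the role of a relative-risk parameter such as $G$ or $M$ concrete, the sketch below applies the standard definition, relative risk = P(disease | exposed) / P(disease | unexposed), to split an overall prevalence into per-group risks. It only illustrates the quantity and is not the repository's merging procedure; the function name and the `prevalence` and `exposed_fraction` inputs are assumptions introduced here.

```python
def risk_by_exposure(prevalence, exposed_fraction, relative_risk):
    """Split an overall prevalence into (unexposed_risk, exposed_risk).

    Solves the two textbook identities
        prevalence    = e * p_exposed + (1 - e) * p_unexposed
        relative_risk = p_exposed / p_unexposed
    where e is the fraction of the population that is exposed.
    All argument names are hypothetical, not taken from the repository.
    """
    if not 0.0 <= prevalence <= 1.0:
        raise ValueError("prevalence must lie in [0, 1]")
    if not 0.0 <= exposed_fraction <= 1.0:
        raise ValueError("exposed_fraction must lie in [0, 1]")
    if relative_risk <= 0.0:
        raise ValueError("relative_risk must be positive")

    # The denominator is positive whenever relative_risk > 0.
    denominator = exposed_fraction * relative_risk + (1.0 - exposed_fraction)
    p_unexposed = prevalence / denominator
    p_exposed = relative_risk * p_unexposed
    if p_exposed > 1.0:
        raise ValueError("inputs imply an exposed-group risk above 1")
    return p_unexposed, p_exposed
```

For example, `risk_by_exposure(0.1, 0.5, 3.0)` describes a population where half of the samples carry the risk factor and carriers are three times as likely to be cases; the two returned risks average back to the 10% prevalence.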
Data Sources
- The preprocessed mobility and genetic data can be found under ./data/preprocessed_raw/
- The extracted genetic variants data can be found under ./data/data_variants/
Getting Started
To set up the project locally, generate the hybrid datasets, perform feature and model selection, train on the datasets, and evaluate model performance, follow the instructions below.
Prerequisites
The following Python version is required for the steps below:
- Python >= 3.8
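A quick way to confirm the interpreter meets this requirement before installing anything is sketched below. The helper name `meets_minimum` is an assumption made for illustration and is not part of the repository.

```python
import sys

MIN_PYTHON = (3, 8)  # minimum version stated in the Prerequisites section


def meets_minimum(version_info=None, minimum=MIN_PYTHON):
    """Return True if (major, minor) of version_info is at least `minimum`.

    Defaults to the running interpreter when version_info is not given.
    """
    if version_info is None:
        version_info = sys.version_info
    return tuple(version_info[:2]) >= tuple(minimum)
```

Calling `meets_minimum()` with no arguments checks the interpreter that is currently running; passing a tuple such as `(3, 7, 9)` checks an arbitrary version.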
Installation
Setup steps for running the simulation and modelling mechanisms.
- Clone the OUD-Risk-Prediction repository

      git clone https://github.com/bayesomicslab/OUD-Risk-Prediction.git

- Install dependencies

      pip install -r requirements.txt

- Run the commands described in the "Usage" section.
Usage
- Create synthetic hybrid datasets (mobility trace + genetic)

      python create_synthetic_datasets.py \
        --comorbidity=CO_OCCURRENCE_LEVEL \
        --rr_geno=GENOTYPE_RISK_RATIO \
        --rr_mt=MOBILITY_TRACE_RISK_RATIO \
        --n_sets=NUM_SETS_PER_CO_OCCURRENCE_RR_CONFIG \
        --out

- Feature selection on genotype data

      python feature_selection.py \
        --filename=FILENAME_FOR_MERGED_DATA \
        --out

- Model selection

      python model_selection.py \
        --merged_file=FILENAME_FOR_MERGED_DATA \
        --var_features=FILENAME_FOR_SELECTED_VARIANTS \
        --out

- Training and evaluation of models

      python select_train_test_f1.py \
        --dataset=CO_OCCURRENCE_LEVEL \
        --rr_geno=GENOTYPE_RISK_RATIO \
        --rr_mt=MOBILITY_TRACE_RISK_RATIO \
        --hyperparams=HYPERPARAMS_FILEPATH \
        --out
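When several co-occurrence and risk-ratio configurations are needed, the first command can be generated programmatically. The sketch below only builds argument lists from the flags shown above and does not run anything. The helper names and the numeric example values are assumptions, and because the `--out` argument appears without a value in the command above, it is left to the caller through `extra_args` rather than guessed.

```python
import itertools
import shlex


def build_dataset_command(comorbidity, rr_geno, rr_mt, n_sets, extra_args=()):
    """Return the argv list for one create_synthetic_datasets.py invocation.

    Only flags shown in the Usage section are emitted; anything else
    (for example the output argument) must be supplied via extra_args.
    """
    return [
        "python",
        "create_synthetic_datasets.py",
        f"--comorbidity={comorbidity}",
        f"--rr_geno={rr_geno}",
        f"--rr_mt={rr_mt}",
        f"--n_sets={n_sets}",
        *extra_args,
    ]


def build_sweep(comorbidities, rr_genos, rr_mts, n_sets, extra_args=()):
    """Build one command per (comorbidity, rr_geno, rr_mt) combination."""
    return [
        build_dataset_command(c, g, m, n_sets, extra_args)
        for c, g, m in itertools.product(comorbidities, rr_genos, rr_mts)
    ]


if __name__ == "__main__":
    # Hypothetical grid; the accepted value ranges are not documented here.
    for command in build_sweep([0.1, 0.5], [1.5, 2.0], [1.5], n_sets=5):
        print(shlex.join(command))
```

Each generated list can be passed to `subprocess.run(command, check=True)` from the repository root once the remaining arguments are known. Building argument lists instead of shell strings avoids quoting problems with file paths.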
Authors
- Derek Aguiar, Ph.D. (PI)
- Sybille M. Légitime
- Bing Wang, Ph.D.
- Dipak Dey, Ph.D.
- Kaustubh Prabhu
- Devin J. McConnell
Acknowledgments
D.A. and S.L. were supported in part by the University of Connecticut's Institute for Collaboration on Health, Intervention, and Policy (awarded to D.A.); B.W. was supported in part by the National Science Foundation grant IIS-1407205 (awarded to B.W.).
Owner
- Name: bayesomicslab
- Login: bayesomicslab
- Kind: organization
- Repositories: 12
- Profile: https://github.com/bayesomicslab