https://github.com/aspuru-guzik-group/long-acting-injectables

Code and results for Machine Learning Models to Accelerate the Design of Polymeric Long-Acting Injectables

Science Score: 13.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
  • codemeta.json file
  • .zenodo.json file
  • DOI references
    Found 2 DOI reference(s) in README
  • Academic publication links
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (6.8%) to scientific vocabulary
Last synced: 9 months ago

Repository

Basic Info
  • Host: GitHub
  • Owner: aspuru-guzik-group
  • License: MIT
  • Language: Jupyter Notebook
  • Default Branch: main
  • Size: 51.5 MB
Statistics
  • Stars: 11
  • Watchers: 5
  • Forks: 6
  • Open Issues: 2
  • Releases: 0
Created about 4 years ago · Last pushed over 3 years ago
Metadata Files
  • Readme
  • License

README.md

long-acting-injectables

Data and code used for the research article titled "Machine Learning Models to Accelerate the Design of Polymeric Long-Acting Injectables".

There are directories for (i) few-shot models and (ii) zero-shot models.

The few-shot models directory contains the data file used to train these models (Dataset17feat) in several file types (xlsx, csv, and tsv). The zero-shot models directory contains the data file used to train its models (Dataset14feat) in the same file types (xlsx, csv, and tsv).
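As an illustration of reading the plain-text variants of these data files, here is a minimal sketch using only the Python standard library. The function name `load_table` and the example file name in the usage note are assumptions made for illustration; they are not taken from the repository, and the repository's own scripts may load the data differently.

```python
import csv
from pathlib import Path

# Delimiters for the two plain-text formats mentioned above.
DELIMITERS = {".csv": ",", ".tsv": "\t"}


def load_table(path):
    """Read a .csv or .tsv file into a list of dicts keyed by column header.

    Hypothetical helper for illustration; not part of the repository.
    """
    path = Path(path)
    try:
        delimiter = DELIMITERS[path.suffix.lower()]
    except KeyError:
        raise ValueError(f"unsupported file type: {path.suffix!r}") from None
    with path.open(newline="", encoding="utf-8") as handle:
        return list(csv.DictReader(handle, delimiter=delimiter))
```

A call such as `load_table("Dataset17feat.csv")` would then return one dict per row, with every value as a string. The exact file name is an assumption, so check the directory listing first.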

Each directory contains a Python class (NESTEDCV) that performs the nested cross-validation of either the few-shot or zero-shot machine learning models. This class is used to train all of the machine learning models in this study except the neural networks. When called, it runs a 10-fold nested cross-validation on the specified model. The results of this nested cross-validation are stored in a sub-directory (NESTEDCVRESULTS) as a pickle file (.pkl). The best model hyperparameter configuration is stored in a sub-directory (Trained_models), also as a pickle file (.pkl).
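To make the idea of nested cross-validation concrete, the following is a minimal standard-library sketch. It is not the NESTEDCV class from the repository. The toy k-nearest-neighbours regressor, the synthetic data, the 5-fold inner loop, and every function name here are assumptions chosen for illustration. The outer loop estimates generalisation error, while the inner loop selects a hyperparameter using only the outer-training data.

```python
import pickle
import random
from statistics import mean


def k_folds(n_samples, n_splits, seed=0):
    """Shuffle indices 0..n_samples-1 and deal them into n_splits folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::n_splits] for i in range(n_splits)]


def knn_predict(train, x, k):
    """Predict y at x as the mean target of the k nearest training points."""
    nearest = sorted(train, key=lambda pair: abs(pair[0] - x))[:k]
    return mean(y for _, y in nearest)


def mae(train, test, k):
    """Mean absolute error of the k-NN model built on train, scored on test."""
    return mean(abs(knn_predict(train, x, k) - y) for x, y in test)


def cv_score(data, k, n_splits, seed):
    """Average validation MAE of one hyperparameter value across the folds."""
    scores = []
    for fold in k_folds(len(data), n_splits, seed):
        held_out = set(fold)
        train = [data[i] for i in range(len(data)) if i not in held_out]
        scores.append(mae(train, [data[i] for i in fold], k))
    return mean(scores)


def nested_cv(data, k_grid, outer_splits=10, inner_splits=5, seed=0):
    """Outer folds estimate error; inner folds pick k on outer-train data only."""
    results = []
    for fold in k_folds(len(data), outer_splits, seed):
        held_out = set(fold)
        outer_train = [data[i] for i in range(len(data)) if i not in held_out]
        outer_test = [data[i] for i in fold]
        best_k = min(
            k_grid, key=lambda k: cv_score(outer_train, k, inner_splits, seed + 1)
        )
        results.append(
            {"best_k": best_k, "test_mae": mae(outer_train, outer_test, best_k)}
        )
    return results


# Toy data: y = 2x plus small noise (synthetic, not from the study).
rng = random.Random(42)
data = [(x / 10, 2 * x / 10 + rng.gauss(0, 0.1)) for x in range(100)]
results = nested_cv(data, k_grid=[1, 3, 5, 9])

# Round-trip the per-fold results through pickle, the format named above.
restored = pickle.loads(pickle.dumps(results))
print(f"mean outer-fold MAE: {mean(r['test_mae'] for r in restored):.3f}")
```

Keeping the hyperparameter search inside each outer-training split means the outer test folds are never used for model selection, which is the reason for nesting the two loops.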

Each directory also contains the Python scripts used to (i) run the preliminary model evaluation, (ii) refine and re-train the "best" model, and (iii) call the trained model to make predictions.
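The last step, calling a stored model to make predictions, might look roughly like the sketch below. It assumes a model saved as a pickle file and uses a simple least-squares line as a stand-in. The functions `fit_line` and `predict`, the file name `demo_model.pkl`, and the toy numbers are all hypothetical and do not reflect the models or scripts in the repository.

```python
import pickle
import tempfile
from pathlib import Path


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept (stand-in model)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    return {"slope": slope, "intercept": mean_y - slope * mean_x}


def predict(model, x):
    """Apply the stored parameters to a new input."""
    return model["slope"] * x + model["intercept"]


with tempfile.TemporaryDirectory() as tmp:
    model_path = Path(tmp) / "demo_model.pkl"  # hypothetical file name
    # Save the fitted parameters, as a training script would.
    with model_path.open("wb") as handle:
        pickle.dump(fit_line([0, 1, 2, 3], [1, 3, 5, 7]), handle)
    # Load them again, as a separate prediction script would.
    with model_path.open("rb") as handle:
        loaded = pickle.load(handle)

prediction = predict(loaded, 4)
print(prediction)
```

Only unpickle files from a source you trust, since loading a pickle can execute arbitrary code.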

Each directory also contains all of the code needed to replicate the figures used in the research article (e.g., Figure1, Figure2, etc.). An additional sub-directory (Figures) stores all of the figures generated by these Python scripts.

Link to preprint: https://doi.org/10.26434/chemrxiv-2021-mxrxw-v2

Owner

  • Name: Aspuru-Guzik group repo
  • Login: aspuru-guzik-group
  • Kind: organization

GitHub Events

Total
  • Watch event: 3
  • Fork event: 1
Last Year
  • Watch event: 3
  • Fork event: 1