https://github.com/cthoyt/sfi

An implementation of the Solubility Forecast Index (SFI)

Science Score: 13.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
  • codemeta.json file
  • .zenodo.json file
  • DOI references
    Found 2 DOI reference(s) in README
  • Academic publication links
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (8.2%) to scientific vocabulary
Last synced: 6 months ago

Repository

Basic Info
  • Host: GitHub
  • Owner: cthoyt
  • License: mit
  • Default Branch: main
  • Size: 184 KB
Statistics
  • Stars: 0
  • Watchers: 1
  • Forks: 0
  • Open Issues: 0
  • Releases: 0
Fork of PatWalters/sfi
Created about 4 years ago · Last pushed over 3 years ago


## sfi

An implementation of the Solubility Forecast Index (SFI)
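
For orientation, Hill and Young (2010, see the Bibliography) define SFI as cLogD at pH 7.4 plus the number of aromatic rings. The snippet below is a minimal sketch of that arithmetic only, not the code from this repository: the `Compound` class, the function name, and the example values are hypothetical and purely illustrative. In practice the cLogD value would come from a predictive model such as the stored one used in **example_sfi.ipynb**, and the ring count could be obtained with a cheminformatics toolkit (for example RDKit's `rdMolDescriptors.CalcNumAromaticRings`, which is an assumption about tooling rather than something this README states).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Compound:
    """Hypothetical container; field names are illustrative only."""

    name: str
    clogd: float         # calculated or predicted logD at pH 7.4
    aromatic_rings: int  # number of aromatic rings in the molecule


def solubility_forecast_index(clogd: float, aromatic_rings: int) -> float:
    """Return SFI = cLogD(pH 7.4) + number of aromatic rings."""
    if aromatic_rings < 0:
        raise ValueError("aromatic_rings must be non-negative")
    return clogd + aromatic_rings


if __name__ == "__main__":
    # Made-up inputs, not measured or predicted data.
    compounds = [
        Compound("example_a", clogd=1.5, aromatic_rings=2),
        Compound("example_b", clogd=3.0, aromatic_rings=4),
    ]
    for compound in compounds:
        sfi = solubility_forecast_index(compound.clogd, compound.aromatic_rings)
        print(f"{compound.name}: SFI = {sfi}")
```

Keeping the cLogD prediction separate from the SFI arithmetic means the model can be swapped without touching the index calculation.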

## Installation and Use
### Installation
The necessary libraries can be installed using the command below.

    pip install -r requirements.txt

### The fast, easy way to calculate SFI
The notebook **example_sfi.ipynb** shows how to use a stored cLogD model to predict and plot SFI.
### The slow way, if you want to see how the sausage is made
1. The script **dask_descriptors.py** extracts structures and cLogD data for more than 2 million molecules from the ChEMBL database and generates molecular descriptors. The script uses [chembl-downloader](https://github.com/cthoyt/chembl-downloader), so it's not necessary to have ChEMBL installed to run these scripts and notebooks. Note that running this script takes a while: about 7 hours on my MacBook Pro. **You have been warned**.

       dask_descriptors.py logd_descriptors.pkl

2. The notebook **build_logd_model.ipynb** generates a machine learning model for predicting cLogD based on the data in the ChEMBL database. This provides a nice example of how we can use LightGBM to generate a machine learning model for a large dataset.
3. The notebook **example_sfi.ipynb** shows how to use a stored cLogD model to predict and plot the Solubility Forecast Index (SFI).

## Bibliography
Hill, A. P., & Young, R. J. (2010). Getting physical in drug discovery: a contemporary perspective on solubility and hydrophobicity. *Drug Discovery Today*, 15(15-16), 648-655. https://doi.org/10.1016/j.drudis.2010.05.016

Owner

  • Name: Charles Tapley Hoyt
  • Login: cthoyt
  • Kind: user
  • Location: Bonn, Germany
  • Company: RWTH Aachen University
