https://github.com/cthoyt/sfi
An implementation of the Solubility Forecast Index (SFI)
Science Score: 13.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ○ CITATION.cff file
- ○ codemeta.json file
- ○ .zenodo.json file
- ✓ DOI references: Found 2 DOI reference(s) in README
- ○ Academic publication links
- ○ Academic email domains
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: Low similarity (8.2%) to scientific vocabulary
Last synced: 6 months ago
Repository
Basic Info
- Host: GitHub
- Owner: cthoyt
- License: mit
- Default Branch: main
- Size: 184 KB
Statistics
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
- Releases: 0
Fork of PatWalters/sfi
Created about 4 years ago · Last pushed over 3 years ago
https://github.com/cthoyt/sfi/blob/main/
## sfi

An implementation of the Solubility Forecast Index (SFI)

## Installation and Use

### Installation

The necessary libraries can be installed using the command below.

    pip install -r requirements.txt

### The fast, easy way to calculate SFI

The notebook **example_sfi.ipynb** shows how to use a stored cLogD model to predict and plot SFI.

### The slow way, if you want to see how the sausage is made

1. The script **dask_descriptors.py** extracts structures and cLogD data for more than 2 million molecules from the ChEMBL database and generates molecular descriptors. The script uses [chembl-downloader](https://github.com/cthoyt/chembl-downloader), so it's not necessary to have ChEMBL installed to run these scripts and notebooks. Note that this script takes a while to run: about 7 hours on my MacBook Pro. **You have been warned**.

       dask_descriptors.py logd_descriptors.pkl

2. The notebook **build_logd_model.ipynb** generates a machine learning model for predicting cLogD based on the data in the ChEMBL database. This provides a nice example of how we can use LightGBM to generate a machine learning model for a large dataset.
3. The notebook **example_sfi.ipynb** shows how to use a stored cLogD model to predict and plot the Solubility Forecast Index (SFI).

## Bibliography

Hill, A. P., & Young, R. J. (2010). Getting physical in drug discovery: a contemporary perspective on solubility and hydrophobicity. Drug Discovery Today, 15(15-16), 648-655. https://doi.org/10.1016/j.drudis.2010.05.016
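For readers who want the arithmetic behind the index: Hill and Young define SFI as the cLogD at pH 7.4 plus the number of aromatic rings in the molecule. The sketch below is not taken from this repository's notebooks. The function name `solubility_forecast_index` and the example values are hypothetical, and it assumes the cLogD value and the aromatic ring count have already been obtained elsewhere (for example, from a stored cLogD model and a cheminformatics toolkit).

```python
def solubility_forecast_index(clogd_ph74: float, n_aromatic_rings: int) -> float:
    """Return SFI = cLogD (pH 7.4) + number of aromatic rings.

    Hypothetical helper for illustration; both inputs are assumed to be
    computed upstream.
    """
    if n_aromatic_rings < 0:
        raise ValueError("n_aromatic_rings must be non-negative")
    return clogd_ph74 + n_aromatic_rings


# Made-up (cLogD, aromatic ring count) pairs, not real measurements.
examples = {
    "compound_a": (2.5, 2),
    "compound_b": (-0.5, 1),
    "compound_c": (4.0, 3),
}

sfi_values = {
    name: solubility_forecast_index(clogd, rings)
    for name, (clogd, rings) in examples.items()
}

for name, value in sfi_values.items():
    print(f"{name}: SFI = {value}")
```

Lower SFI values correspond to compounds that are less lipophilic, less aromatic, or both, which is the trend the index is meant to capture.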
Owner
- Name: Charles Tapley Hoyt
- Login: cthoyt
- Kind: user
- Location: Bonn, Germany
- Company: RWTH Aachen University
- Website: https://cthoyt.com
- Repositories: 489
- Profile: https://github.com/cthoyt