uvvisml
Predict optical properties of molecules with machine learning.
Statistics
- Stars: 32
- Watchers: 3
- Forks: 9
- Open Issues: 1
- Releases: 2
README.md
UVVisML
Predict optical properties of molecules with machine learning.
Colab Examples
A Google Colab notebook with examples of the various model types and predictions is available. Alternatively, use the command line instructions below.
Command Line Setup
- Install Anaconda or Miniconda if you have not yet done so.
- git clone git@github.com:learningmatter-mit/uvvisml.git
- cd uvvisml
- conda env create -f environment.yml
- cd uvvisml
- bash get_model_files.sh (This downloads trained model files from Zenodo.)
- conda activate uvvisml
- pip install chemprop
Making Predictions
Test file
To make predictions, specify a --test_file with the dyes or dye-solvent pairs for which you wish to predict properties. This should be a CSV with one dye (for vacuum TD-DFT predictions) or dye-solvent pair (for experimental predictions) per line. For example, the test file for vacuum TD-DFT predictions could be:
smiles
CCN(CC)c1ccc2c(C(F)(F)F)cc(=O)oc2c1
CCN(CC)c1ccc2c(C(F)(F)F)cc(=O)oc2c1
CCN(CC)c1ccc2c(C(F)(F)F)cc(=O)oc2c1
CCN(CC)c1ccc2c(C(F)(F)F)cc(=O)oc2c1
CCN(CC)c1ccc2c(C(F)(F)F)cc(=O)oc2c1
CCN(CC)c1ccc2cc(-c3nc4ccccc4n3C)c(=O)oc2c1
C[SiH](C)c1cccc2ccccc12
The test file for experimental predictions could be:
smiles,solvent
CCN(CC)c1ccc2c(C(F)(F)F)cc(=O)oc2c1,C1CCCCC1
CCN(CC)c1ccc2c(C(F)(F)F)cc(=O)oc2c1,CCOC(C)=O
CCN(CC)c1ccc2c(C(F)(F)F)cc(=O)oc2c1,CC#N
CCN(CC)c1ccc2c(C(F)(F)F)cc(=O)oc2c1,CCO
CCN(CC)c1ccc2c(C(F)(F)F)cc(=O)oc2c1,OCC(O)CO
CCN(CC)c1ccc2cc(-c3nc4ccccc4n3C)c(=O)oc2c1,CC#N
C[SiH](C)c1cccc2ccccc12,C1CCCCC1
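Test files in either layout can also be generated programmatically. The following is a minimal sketch using only the Python standard library; the helper name `write_test_file` is hypothetical and not part of this repository.

```python
import csv
import io


def write_test_file(handle, dyes, solvents=None):
    """Write a test CSV: one dye per row, or one dye-solvent pair per row.

    Hypothetical helper for illustration; not part of uvvisml.
    """
    writer = csv.writer(handle, lineterminator="\n")
    if solvents is None:
        # Layout for vacuum TD-DFT predictions: a single "smiles" column.
        writer.writerow(["smiles"])
        writer.writerows([dye] for dye in dyes)
    else:
        # Layout for experimental predictions: "smiles" and "solvent" columns.
        if len(dyes) != len(solvents):
            raise ValueError("each dye needs exactly one solvent")
        writer.writerow(["smiles", "solvent"])
        writer.writerows(zip(dyes, solvents))


dyes = ["CCN(CC)c1ccc2c(C(F)(F)F)cc(=O)oc2c1", "C[SiH](C)c1cccc2ccccc12"]
solvents = ["CCO", "C1CCCCC1"]

# An in-memory buffer stands in for a file opened with open(path, "w", newline="").
buffer = io.StringIO()
write_test_file(buffer, dyes, solvents)
print(buffer.getvalue())
```

Passing only `dyes` produces the single-column layout used for vacuum TD-DFT predictions.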
Property
- Experimental peak wavelength of maximum absorption: --property absorption_peak_nm_expt
- Vertical excitation energy with maximum oscillator strength in vacuum TD-DFT: --property vertical_excitation_eV_tddft
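The two properties are reported in different units (nm for the experimental peak, eV for the TD-DFT excitation energy). When comparing them, the standard photon energy-wavelength relation E = hc/λ can be applied. The sketch below is a unit conversion only; the function names are illustrative and not part of this repository, and a vacuum TD-DFT energy is not expected to match an experimental peak measured in solvent.

```python
# hc in eV*nm, rounded from the CODATA value.
HC_EV_NM = 1239.841984


def ev_to_nm(energy_ev: float) -> float:
    """Convert a photon energy in eV to a wavelength in nm."""
    if energy_ev <= 0:
        raise ValueError("energy must be positive")
    return HC_EV_NM / energy_ev


def nm_to_ev(wavelength_nm: float) -> float:
    """Convert a wavelength in nm to a photon energy in eV."""
    if wavelength_nm <= 0:
        raise ValueError("wavelength must be positive")
    return HC_EV_NM / wavelength_nm


# A 3.0 eV excitation corresponds to a wavelength of roughly 413 nm.
print(round(ev_to_nm(3.0), 1))
```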
Method
- Single-fidelity (experiment or TD-DFT): --method chemprop
- Multi-fidelity (experiment only): --method chemprop_tddft
Train dataset
- Experiment: --train_dataset combined (default) or --train_dataset deep4chem
- TD-DFT: --train_dataset all_wb97xd3
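The property, method, and training-set options above combine under a few constraints (for example, the multi-fidelity method applies to experimental predictions only). The following sketch assembles a predict.py command line and checks those constraints first. The function `build_predict_command` is hypothetical, the validation rules are a reading of the lists above rather than a copy of the checks in predict.py, and the `--preds_file` spelling is assumed from the examples in this README.

```python
import shlex

# Training sets listed above for each property (a reading of the README lists).
VALID_TRAIN_DATASETS = {
    "absorption_peak_nm_expt": {"combined", "deep4chem"},
    "vertical_excitation_eV_tddft": {"all_wb97xd3"},
}
VALID_METHODS = {"chemprop", "chemprop_tddft"}


def build_predict_command(test_file, preds_file, prop,
                          method="chemprop", train_dataset=None):
    """Return the argument list for a predict.py call (hypothetical helper)."""
    if prop not in VALID_TRAIN_DATASETS:
        raise ValueError(f"unknown property: {prop}")
    if method not in VALID_METHODS:
        raise ValueError(f"unknown method: {method}")
    # Multi-fidelity predictions are described as experiment-only.
    if method == "chemprop_tddft" and prop != "absorption_peak_nm_expt":
        raise ValueError("chemprop_tddft applies to experimental predictions only")
    command = [
        "python", "uvvisml/predict.py",
        "--test_file", test_file,
        "--property", prop,
        "--method", method,
        "--preds_file", preds_file,  # flag spelling assumed from the examples
    ]
    if train_dataset is not None:
        if train_dataset not in VALID_TRAIN_DATASETS[prop]:
            raise ValueError(f"{train_dataset} is not listed for {prop}")
        command += ["--train_dataset", train_dataset]
    return command


command = build_predict_command(
    "dyes.csv", "preds.csv", "absorption_peak_nm_expt",
    method="chemprop_tddft", train_dataset="deep4chem",
)
print(shlex.join(command))
```

The returned list can be passed to `subprocess.run(command, check=True)` from the repository root once the environment is set up.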
Cluster
The cluster on which the script will be run. Options include the Supercloud and Engaging clusters at MIT. The default of None runs the script on the local machine.
Uncertainty in Predictions
Output the ensemble variance (a measure of epistemic uncertainty) in predictions using --uncertainty_method ensemble_variance.
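To illustrate what an ensemble variance represents, the sketch below computes the per-molecule mean and variance across the predictions of several ensemble members. The numbers are made up, the function name is hypothetical, and the use of population variance is an assumption; the exact convention and ensemble size used by the tool are not stated here.

```python
from statistics import fmean, pvariance


def ensemble_mean_and_variance(member_predictions):
    """Return per-molecule means and population variances across ensemble members.

    member_predictions holds one list per model, each with one value per molecule.
    """
    # Transpose so that each tuple holds all members' predictions for one molecule.
    per_molecule = list(zip(*member_predictions))
    means = [fmean(values) for values in per_molecule]
    variances = [pvariance(values) for values in per_molecule]
    return means, variances


# Hypothetical peak wavelengths (nm) from three ensemble members for two dyes.
predictions = [
    [400.0, 520.0],
    [410.0, 520.0],
    [405.0, 520.0],
]
means, variances = ensemble_mean_and_variance(predictions)
print(means, variances)
```

A larger variance means the ensemble members disagree more on that molecule, which is why it serves as a proxy for epistemic uncertainty.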
Examples
```
python uvvisml/predict.py --test_file uvvisml/data/splits/lambda_max_abs/deep4chem/group_by_smiles/smiles_target_test.csv --property absorption_peak_nm_expt --method chemprop --preds_file test_preds.csv

python uvvisml/predict.py --test_file uvvisml/data/splits/lambda_max_abs/deep4chem/group_by_smiles/smiles_target_test.csv --property vertical_excitation_eV_tddft --method chemprop --preds_file test_preds.csv

python uvvisml/predict.py --test_file uvvisml/data/splits/lambda_max_abs/deep4chem/group_by_smiles/smiles_target_test.csv --property absorption_peak_nm_expt --method chemprop --preds_file test_preds.csv --train_dataset deep4chem

python uvvisml/predict.py --test_file uvvisml/data/splits/lambda_max_abs/deep4chem/group_by_smiles/smiles_target_test.csv --property absorption_peak_nm_expt --method chemprop_tddft --preds_file test_preds.csv --log_level info
```
Data
Please see the Data README for details on the sources and processing of the data used in this repository.
Citation
If you use this code, please cite the following manuscript:
@article{greenman2022multi,
title={Multi-fidelity prediction of molecular optical peaks with deep learning},
author={Greenman, Kevin P. and Green, William H. and G{\'{o}}mez-Bombarelli, Rafael},
journal={Chemical Science},
year={2022},
volume={13},
issue={4},
pages={1152-1162},
publisher={The Royal Society of Chemistry},
doi={10.1039/D1SC05677H},
url={http://dx.doi.org/10.1039/D1SC05677H}
}
The code for reproducing the results and figures from the above paper is available on Zenodo.
Owner
- Name: Learning Matter @ MIT
- Login: learningmatter-mit
- Kind: organization
- Email: rafagb@mit.edu
- Website: https://gomezbombarelli.mit.edu/
- Repositories: 33
- Profile: https://github.com/learningmatter-mit
Rafael Gomez-Bombarelli Group @ MIT
GitHub Events
Total
- Watch event: 9
- Push event: 3
- Pull request event: 1
- Fork event: 1
Last Year
- Watch event: 9
- Push event: 3
- Pull request event: 1
- Fork event: 1