uvvisml

Predict optical properties of molecules with machine learning.

https://github.com/learningmatter-mit/uvvisml

Science Score: 57.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
    Found 10 DOI reference(s) in README
  • Academic publication links
    Links to: zenodo.org
  • Academic email domains
  • Institutional organization owner
    Organization learningmatter-mit has institutional domain (gomezbombarelli.mit.edu)
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (13.6%) to scientific vocabulary

Keywords

cheminformatics deep-learning machine-learning molecular-property-prediction spectroscopy uvvis-spectroscopy
Last synced: 4 months ago

Repository

Basic Info
  • Host: GitHub
  • Owner: learningmatter-mit
  • License: MIT
  • Language: Jupyter Notebook
  • Default Branch: main
  • Homepage:
  • Size: 9.12 MB
Statistics
  • Stars: 32
  • Watchers: 3
  • Forks: 9
  • Open Issues: 1
  • Releases: 2
Created about 4 years ago · Last pushed 5 months ago
Metadata Files
Readme License Citation

README.md

UVVisML

Predict optical properties of molecules with machine learning.

Colab Examples

A Google Colab notebook is available with examples of using the various types of models and predictions. Alternatively, you may use the command line instructions below.

Command Line Setup

  1. Install Anaconda or Miniconda if you have not yet done so.
  2. git clone git@github.com:learningmatter-mit/uvvisml.git
  3. cd uvvisml
  4. conda env create -f environment.yml
  5. cd uvvisml
  6. bash get_model_files.sh (This downloads trained model files from Zenodo.)
  7. conda activate uvvisml
  8. pip install chemprop

Making Predictions

Test file

To make predictions, specify a --test_file with the dyes or dye-solvent pairs for which you wish to predict properties. This should be a CSV with one dye (for vacuum TD-DFT predictions) or dye-solvent pair (for experimental predictions) per line. For example, the test file for vacuum TD-DFT predictions could be:

    smiles
    CCN(CC)c1ccc2c(C(F)(F)F)cc(=O)oc2c1
    CCN(CC)c1ccc2c(C(F)(F)F)cc(=O)oc2c1
    CCN(CC)c1ccc2c(C(F)(F)F)cc(=O)oc2c1
    CCN(CC)c1ccc2c(C(F)(F)F)cc(=O)oc2c1
    CCN(CC)c1ccc2c(C(F)(F)F)cc(=O)oc2c1
    CCN(CC)c1ccc2cc(-c3nc4ccccc4n3C)c(=O)oc2c1
    C[SiH](C)c1cccc2ccccc12

The test file for experimental predictions could be:

    smiles,solvent
    CCN(CC)c1ccc2c(C(F)(F)F)cc(=O)oc2c1,C1CCCCC1
    CCN(CC)c1ccc2c(C(F)(F)F)cc(=O)oc2c1,CCOC(C)=O
    CCN(CC)c1ccc2c(C(F)(F)F)cc(=O)oc2c1,CC#N
    CCN(CC)c1ccc2c(C(F)(F)F)cc(=O)oc2c1,CCO
    CCN(CC)c1ccc2c(C(F)(F)F)cc(=O)oc2c1,OCC(O)CO
    CCN(CC)c1ccc2cc(-c3nc4ccccc4n3C)c(=O)oc2c1,CC#N
    C[SiH](C)c1cccc2ccccc12,C1CCCCC1
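If the dyes already live in a Python script or notebook, both layouts can be generated with the standard csv module. The following is a minimal sketch, not part of the repository: the helper name `build_test_csv` and the output file names are hypothetical, and only the column headers (`smiles`, `solvent`) come from the examples above.

```python
import csv
import io

# SMILES strings taken from the example test files above.
dyes = [
    "CCN(CC)c1ccc2c(C(F)(F)F)cc(=O)oc2c1",
    "C[SiH](C)c1cccc2ccccc12",
]
dye_solvent_pairs = [
    ("CCN(CC)c1ccc2c(C(F)(F)F)cc(=O)oc2c1", "CC#N"),
    ("C[SiH](C)c1cccc2ccccc12", "C1CCCCC1"),
]


def build_test_csv(header, rows):
    """Return CSV text: one header line, then one dye or dye-solvent pair per line."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(header)
    writer.writerows(rows)
    return buffer.getvalue()


# Vacuum TD-DFT layout: a single "smiles" column.
tddft_csv = build_test_csv(["smiles"], [[smiles] for smiles in dyes])
# Experimental layout: "smiles" and "solvent" columns.
expt_csv = build_test_csv(["smiles", "solvent"], dye_solvent_pairs)

# Hypothetical output names; pass whichever file you write to --test_file.
with open("tddft_test.csv", "w") as handle:
    handle.write(tddft_csv)
with open("expt_test.csv", "w") as handle:
    handle.write(expt_csv)
```

None of the characters that occur in these SMILES strings (parentheses, `=`, `#`, brackets) trigger CSV quoting, so the rows are written exactly as typed.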

Property

  • Experimental peak wavelength of maximum absorption: --property absorption_peak_nm_expt
  • Vertical excitation energy with maximum oscillator strength in vacuum TD-DFT: --property vertical_excitation_eV_tddft
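The two properties are reported in different units: nanometres for the experimental peak and electronvolts for the TD-DFT excitation energy. To put them on the same axis, a photon energy can be converted to a wavelength with λ = hc / E, where hc ≈ 1239.84 eV·nm. The helper names below are hypothetical and not part of the repository. The conversion only changes units; it does not turn a vacuum TD-DFT value into an experimental prediction.

```python
# Planck constant times speed of light, expressed in eV*nm.
HC_EV_NM = 1239.841984


def ev_to_nm(energy_ev: float) -> float:
    """Convert a photon energy in eV to the corresponding wavelength in nm."""
    if energy_ev <= 0:
        raise ValueError("energy must be positive")
    return HC_EV_NM / energy_ev


def nm_to_ev(wavelength_nm: float) -> float:
    """Convert a wavelength in nm to the corresponding photon energy in eV."""
    if wavelength_nm <= 0:
        raise ValueError("wavelength must be positive")
    return HC_EV_NM / wavelength_nm


# A 3.0 eV vertical excitation corresponds to a wavelength of roughly 413 nm.
wavelength = ev_to_nm(3.0)
```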

Method

  • Single-fidelity (experiment or TD-DFT): --method chemprop
  • Multi-fidelity (experiment only): --method chemprop_tddft

Train dataset

  • Experiment: --train_dataset combined (default) or --train_dataset deep4chem
  • TD-DFT: --train_dataset all_wb97xd3
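The property, method, and training dataset options constrain one another: the multi-fidelity method applies to experimental predictions only, and each property has its own training datasets. The sketch below encodes those combinations in a small helper that assembles the argument list. The helper `build_predict_args` is hypothetical and not part of the repository; it uses only the flags and values listed above, and whether `predict.py` performs the same checks itself is not stated here.

```python
EXPERIMENT = "absorption_peak_nm_expt"
TDDFT = "vertical_excitation_eV_tddft"

# Combinations described in the Property, Method, and Train dataset sections.
ALLOWED_METHODS = {
    EXPERIMENT: {"chemprop", "chemprop_tddft"},
    TDDFT: {"chemprop"},
}
ALLOWED_DATASETS = {
    EXPERIMENT: {"combined", "deep4chem"},
    TDDFT: {"all_wb97xd3"},
}
# Dataset assumed when the caller does not pick one explicitly.
FALLBACK_DATASET = {EXPERIMENT: "combined", TDDFT: "all_wb97xd3"}


def build_predict_args(test_file, prop, method="chemprop", train_dataset=None):
    """Return an argv list for uvvisml/predict.py after checking the option combination."""
    if prop not in ALLOWED_METHODS:
        raise ValueError(f"unknown property: {prop}")
    if method not in ALLOWED_METHODS[prop]:
        raise ValueError(f"method {method} is not available for {prop}")
    if train_dataset is None:
        train_dataset = FALLBACK_DATASET[prop]
    if train_dataset not in ALLOWED_DATASETS[prop]:
        raise ValueError(f"dataset {train_dataset} is not available for {prop}")
    return [
        "python", "uvvisml/predict.py",
        "--test_file", test_file,
        "--property", prop,
        "--method", method,
        "--train_dataset", train_dataset,
    ]


# Multi-fidelity experimental prediction on a hypothetical input file.
args = build_predict_args("expt_test.csv", EXPERIMENT, method="chemprop_tddft")
```

The resulting list can be passed to `subprocess.run(args, check=True)` from the repository root, with any further flags appended as needed.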

Cluster

Specifies the cluster on which the script will run. Options are available for the Supercloud and Engaging clusters at MIT. The default, None, runs the script on the local machine.

Uncertainty in Predictions

Output the ensemble variance (a measure of epistemic uncertainty) in predictions using --uncertainty_method ensemble_variance.
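As an illustration of the quantity itself, ensemble variance is the spread of the predictions made by the individual models in an ensemble for the same input. The sketch below computes it for made-up numbers. The function name and the data are hypothetical, and the use of population variance (dividing by the number of models) is an assumption; the exact estimator used by the underlying code is not stated here.

```python
from statistics import fmean, pvariance


def ensemble_summary(member_predictions):
    """Return per-molecule mean and variance across ensemble members.

    member_predictions holds one list of predictions per model,
    each with one value per molecule, in the same molecule order.
    """
    n_molecules = len(member_predictions[0])
    if any(len(preds) != n_molecules for preds in member_predictions):
        raise ValueError("every model must predict the same molecules")
    # Transpose so that each tuple holds all model outputs for one molecule.
    per_molecule = list(zip(*member_predictions))
    means = [fmean(values) for values in per_molecule]
    variances = [pvariance(values) for values in per_molecule]
    return means, variances


# Made-up peak wavelengths (nm) from three models for two molecules.
predictions = [
    [400.0, 520.0],
    [410.0, 520.0],
    [420.0, 520.0],
]
means, variances = ensemble_summary(predictions)
```

In this toy data the models disagree on the first molecule and agree exactly on the second, so only the first has a nonzero variance. A large variance flags inputs on which the ensemble members disagree.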

Examples

```
python uvvisml/predict.py --test_file uvvisml/data/splits/lambda_max_abs/deep4chem/group_by_smiles/smiles_target_test.csv --property absorption_peak_nm_expt --method chemprop --preds_file test_preds.csv

python uvvisml/predict.py --test_file uvvisml/data/splits/lambda_max_abs/deep4chem/group_by_smiles/smiles_target_test.csv --property vertical_excitation_eV_tddft --method chemprop --preds_file test_preds.csv

python uvvisml/predict.py --test_file uvvisml/data/splits/lambda_max_abs/deep4chem/group_by_smiles/smiles_target_test.csv --property absorption_peak_nm_expt --method chemprop --preds_file test_preds.csv --train_dataset deep4chem

python uvvisml/predict.py --test_file uvvisml/data/splits/lambda_max_abs/deep4chem/group_by_smiles/smiles_target_test.csv --property absorption_peak_nm_expt --method chemprop_tddft --preds_file test_preds.csv --log_level info
```

Data

Please see the Data README for details on the sources and processing of the data used in this repository.

Citation

If you use this code, please cite the following manuscript:

    @article{greenman2022multi,
      title={Multi-fidelity prediction of molecular optical peaks with deep learning},
      author={Greenman, Kevin P. and Green, William H. and G{\'{o}}mez-Bombarelli, Rafael},
      journal={Chemical Science},
      year={2022},
      volume={13},
      issue={4},
      pages={1152-1162},
      publisher={The Royal Society of Chemistry},
      doi={10.1039/D1SC05677H},
      url={http://dx.doi.org/10.1039/D1SC05677H}
    }

The code for reproducing the results and figures from the above paper is available on Zenodo.

Owner

  • Name: Learning Matter @ MIT
  • Login: learningmatter-mit
  • Kind: organization
  • Email: rafagb@mit.edu

Rafael Gomez-Bombarelli Group @ MIT

GitHub Events

Total
  • Watch event: 9
  • Push event: 3
  • Pull request event: 1
  • Fork event: 1
Last Year
  • Watch event: 9
  • Push event: 3
  • Pull request event: 1
  • Fork event: 1

Dependencies

environment.yml pypi