https://github.com/chemprop/chemprop_benchmark
Chemprop benchmarking scripts and data for v1
Science Score: 49.0%
This score indicates how likely this project is to be science-related, based on the following indicators:
- ○ CITATION.cff file
- ✓ codemeta.json file: found
- ✓ .zenodo.json file: found
- ✓ DOI references: found 4 DOI reference(s) in README
- ✓ Academic publication links: links to zenodo.org
- ○ Academic email domains
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: low similarity (12.0%) to scientific vocabulary
Repository
Basic Info
Statistics
- Stars: 26
- Watchers: 5
- Forks: 6
- Open Issues: 1
- Releases: 0
Metadata Files
README.md
Chemprop benchmarking scripts and data
This repository contains benchmarking scripts and data for Chemprop, a message passing neural network for molecular property prediction, as described in the paper Chemprop: Machine Learning Package for Chemical Property Prediction. Please have a look at the Chemprop repository for installation and usage instructions.
Data
All datasets used in the study can be downloaded from Zenodo. You can either download and extract the file data.tar.gz yourself, or run
wget https://zenodo.org/records/10078142/files/data.tar.gz
tar -xzvf data.tar.gz
The data folder should be placed within the chemprop_benchmark folder (i.e. where this README and the scripts folder are located).
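A quick way to confirm the layout is to check that data and scripts sit side by side inside chemprop_benchmark. The sketch below mocks the expected folder structure with mkdir purely for illustration; in a real setup, data/ comes from extracting data.tar.gz next to this README:

```shell
# Mock the expected layout for illustration only -- after a real download,
# data/ is created by extracting data.tar.gz inside chemprop_benchmark/.
mkdir -p chemprop_benchmark/scripts chemprop_benchmark/data

# Verify that both folders are present inside chemprop_benchmark/.
for d in scripts data; do
  if [ -d "chemprop_benchmark/$d" ]; then
    echo "$d: present"
  else
    echo "$d: missing"
  fi
done
```

If either line reports "missing", the archive was extracted in the wrong place and the benchmark scripts will not find the data.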
Benchmarks
The paper reports a large number of benchmarks, which can be run individually by executing one of the shell scripts in the scripts folder. For example, to run the barriers_e2 reaction benchmark, activate your Chemprop environment as described in the Chemprop repository, and then run (after adapting the path to your Chemprop folder):
cd scripts
./barriers_e2.sh
This will run a hyperparameter search followed by a training run with the best hyperparameters, and produce the folder results_barriers_e2 containing all output. Specifically, the file results_barriers_e2/test_scores.csv lists the test set errors. If you installed Chemprop via pip, use chemprop_train etc. instead of python $chemprop_dir/train.py in the script.
Available benchmarking systems:
* hiv HIV replication inhibition from MoleculeNet and OGB with scaffold splits
* pcba_random Biological activities from MoleculeNet with random splits
* pcba_random_nans Biological activities from MoleculeNet with missing targets NOT set to zero (to be comparable to the OGB version) with random splits
* pcba_scaffold Biological activities from MoleculeNet and OGB with scaffold splits
* qm9_multitask DFT calculated properties from MoleculeNet and OGB, trained as a multi-task model
* qm9_u0 DFT calculated properties from MoleculeNet and OGB, trained as a single-task model on the target U0 only
* qm9_gap DFT calculated properties from MoleculeNet and OGB, trained as a single-task model on the target gap only
* sampl Water-octanol partition coefficients, used to predict molecules from the SAMPL6, 7 and 9 challenges
* atom_bond_137k Quantum-mechanical atom and bond descriptors
* bde Bond dissociation enthalpies trained as single-task model
* bde_charges Bond dissociation enthalpies trained as multi-task model together with atomic partial charges
* charges_eps_4 Partial charges at a dielectric constant of 4 (in protein)
* charges_eps_78 Partial charges at a dielectric constant of 78 (in water)
* barriers_e2 Reaction barrier heights of E2 reactions
* barriers_sn2 Reaction barrier heights of SN2 reactions
* barriers_cycloadd Reaction barrier heights of cycloaddition reactions
* barriers_rdb7 Reaction barrier heights in the RDB7 dataset
* barriers_rgd1 Reaction barrier heights in the RGD1-CNHO dataset
* multi_molecule UV/Vis peak absorption wavelengths in different solvents
* ir IR spectra
* pcqm4mv2 HOMO-LUMO gaps of the PCQM4Mv2 dataset
* uncertainty_ensemble Uncertainty estimation using an ensemble using the QM9 gap dataset
* uncertainty_evidential Uncertainty estimation using evidential learning using the QM9 gap dataset
* uncertainty_mve Uncertainty estimation using mean-variance estimation using the QM9 gap dataset
* timing Timing benchmark using subsets of QM9 gap
The benchmarks were run on the master branch of Chemprop v1.6.1. The only exception is the timing benchmark, which was run on the benchmark_timing branch that includes timing printouts. It can also be run on the master branch, although with less verbose printouts. If you want to recreate the exact environment this study was run in, you can use the environment.yml file to set up a conda environment.
Owner
- Name: chemprop
- Login: chemprop
- Kind: organization
- Location: MIT
- Repositories: 1
- Profile: https://github.com/chemprop
Home of the official chemprop project
GitHub Events
Total
- Watch event: 4
- Fork event: 1
Last Year
- Watch event: 4
- Fork event: 1
Issues and Pull Requests
Last synced: over 1 year ago
All Time
- Total issues: 2
- Total pull requests: 3
- Average time to close issues: 3 months
- Average time to close pull requests: 3 days
- Total issue authors: 2
- Total pull request authors: 2
- Average comments per issue: 0.5
- Average comments per pull request: 1.0
- Merged pull requests: 3
- Bot issues: 0
- Bot pull requests: 0
Past Year
- Issues: 2
- Pull requests: 0
- Average time to close issues: 3 months
- Average time to close pull requests: N/A
- Issue authors: 2
- Pull request authors: 0
- Average comments per issue: 0.5
- Average comments per pull request: 0
- Merged pull requests: 0
- Bot issues: 0
- Bot pull requests: 0
Top Authors
Issue Authors
- SoodabehGhaffari (1)
- Nick-Mul (1)
Pull Request Authors
- hesther (2)
- shihchengli (1)
Top Labels
Issue Labels
Pull Request Labels
Dependencies
- alabaster ==0.7.12
- babel ==2.9.0
- chardet ==4.0.0
- descriptastorus ==2.3.0.2
- docutils ==0.16
- idna ==2.10
- imagesize ==1.2.0
- mypy-extensions ==0.4.3
- packaging ==20.8
- pandas-flavor ==0.2.0
- pygments ==2.7.4
- pyqt5-sip ==4.19.18
- pyqtchart ==5.12
- pyqtwebengine ==5.12.1
- rdkit ==2022.9.5
- requests ==2.25.1
- scipy ==1.7.3
- snowballstemmer ==2.1.0
- sphinx ==3.4.3
- sphinxcontrib-applehelp ==1.0.2
- sphinxcontrib-devhelp ==1.0.2
- sphinxcontrib-htmlhelp ==1.0.3
- sphinxcontrib-jsmath ==1.0.1
- sphinxcontrib-qthelp ==1.0.3
- sphinxcontrib-serializinghtml ==1.1.4
- torch ==1.7.1
- typed-argument-parser ==1.6.1
- typing-extensions ==3.7.4.2
- typing-inspect ==0.6.0
- urllib3 ==1.26.3
- xarray ==0.16.2