https://github.com/chemprop/chemprop_benchmark_v2

Chemprop benchmarking scripts and data for v2

Science Score: 59.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
    Found 4 DOI reference(s) in README
  • Academic publication links
    Links to: zenodo.org
  • Committers with academic emails
    2 of 6 committers (33.3%) from academic institutions
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (12.0%) to scientific vocabulary

Keywords from Contributors

chemistry drug-discovery
Last synced: 7 months ago

Repository

Basic Info
  • Host: GitHub
  • Owner: chemprop
  • License: mit
  • Language: Shell
  • Default Branch: main
  • Size: 444 KB
Statistics
  • Stars: 3
  • Watchers: 1
  • Forks: 0
  • Open Issues: 0
  • Releases: 1
Created almost 2 years ago · Last pushed 10 months ago
Metadata Files
Readme License

README.md

Chemprop v2 benchmarking scripts and data

This repository contains benchmarking scripts and data for Chemprop v2, a message passing neural network for molecular property prediction, as described in the paper Chemprop v2: Modular, Fast, and User-Friendly. Please refer to the Chemprop repository for installation and usage instructions.

Data

All datasets used in the study can be downloaded from Zenodo. You can either download and extract the file data.tar.gz yourself, or run

    wget https://zenodo.org/records/10078142/files/data.tar.gz
    tar -xzvf data.tar.gz

The data folder should be placed within the chemprop_benchmark_v2 folder (i.e. where this README and the scripts folder are located).
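Before launching a benchmark, it can help to confirm that the extracted data folder ended up next to the scripts folder. The following is a minimal sketch of such a check; check_layout is a hypothetical helper written for illustration and is not part of the repository.

```shell
# check_layout is a hypothetical helper, not part of the repository.
# It checks that both the data and scripts folders exist under the given
# root (default: the current directory) and reports any that are missing.
check_layout() {
    root="${1:-.}"
    status=0
    for dir in data scripts; do
        if [ ! -d "$root/$dir" ]; then
            echo "missing folder: $root/$dir" >&2
            status=1
        fi
    done
    return "$status"
}
```

Run from inside the chemprop_benchmark_v2 folder, `check_layout . && echo "layout looks right"` prints the message only when both folders are present.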

Benchmarks

The paper reports a large number of benchmarks, each of which can be run individually by executing the corresponding shell script in the scripts folder. For example, to run the barriers_e2 reaction benchmark, activate your Chemprop environment as described in the Chemprop repository, adapt the path to your Chemprop folder, and then run:

    cd scripts
    ./barriers_e2.sh

This runs a hyperparameter search followed by a training run with the best hyperparameters found. The results are written to the results_barriers_e2 folder, including model checkpoints and test set predictions.
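The two commands above can be wrapped in a small function so that a benchmark is started by name and a typo in the name fails early. This is only a sketch: run_benchmark is a hypothetical helper, and it assumes that each benchmark is a script called <name>.sh inside the scripts folder and that the scripts can be run with bash.

```shell
# run_benchmark is a hypothetical helper, not part of the repository.
# Usage: run_benchmark <name> [scripts_dir]
# Assumes the benchmark script is <scripts_dir>/<name>.sh and runs under bash.
run_benchmark() {
    name="$1"
    scripts_dir="${2:-scripts}"
    script="$scripts_dir/$name.sh"
    if [ ! -f "$script" ]; then
        echo "no such benchmark script: $script" >&2
        return 1
    fi
    # Run inside a subshell so the caller's working directory is unchanged.
    ( cd "$scripts_dir" && bash "./$name.sh" )
}
```

With this defined, `run_benchmark barriers_e2` from the repository root is equivalent to changing into scripts and executing the script there.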

The following benchmarking systems were used in the paper:

  • hiv: HIV replication inhibition from MoleculeNet and OGB with scaffold splits
  • pcba_random: Biological activities from MoleculeNet with random splits
  • pcba_random_nans: Biological activities from MoleculeNet with missing targets NOT set to zero (to be comparable to the OGB version) with random splits
  • pcba_scaffold: Biological activities from MoleculeNet and OGB with scaffold splits
  • qm9_multitask: DFT calculated properties from MoleculeNet and OGB, trained as a multi-task model
  • qm9_u0: DFT calculated properties from MoleculeNet and OGB, trained as a single-task model on the target U0 only
  • qm9_gap: DFT calculated properties from MoleculeNet and OGB, trained as a single-task model on the target gap only
  • sampl: Octanol–water partition coefficients (SAMPL6 & 7) and toluene–water partition coefficients (SAMPL9)
  • barriers_e2: Reaction barrier heights of E2 reactions
  • barriers_sn2: Reaction barrier heights of SN2 reactions
  • barriers_cycloadd: Reaction barrier heights of cycloaddition reactions
  • barriers_rdb7: Reaction barrier heights in the RDB7 dataset
  • barriers_rgd1: Reaction barrier heights in the RGD1-CNHO dataset
  • multi_molecule: UV/Vis peak absorption wavelengths in different solvents
  • pcqm4mv2: HOMO-LUMO gaps of the PCQM4Mv2 dataset
  • timing: Timing benchmark using subsets of the QM9 gap
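To see which of these benchmarks have a script in a local checkout, the shell scripts can simply be listed by name. The sketch below assumes that each benchmark corresponds to one .sh file in the scripts folder; list_benchmarks is a hypothetical helper, and the folder may also contain scripts that are not benchmarks.

```shell
# list_benchmarks is a hypothetical helper, not part of the repository.
# Prints the name (without the .sh suffix) of every shell script in the
# given folder (default: scripts), one per line.
list_benchmarks() {
    scripts_dir="${1:-scripts}"
    for script in "$scripts_dir"/*.sh; do
        # Skip the unexpanded pattern when the folder has no .sh files.
        [ -e "$script" ] || continue
        basename "$script" .sh
    done
}
```

Calling `list_benchmarks scripts` from the repository root prints one name per line, which can then be compared against the list above.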

The benchmarks were performed using Chemprop v2.0.3. To reproduce the exact environment used in this study, you can create a conda environment using the provided environment.yml file.
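As a quick sanity check before building the environment, the version pinned for a package can be read directly from environment.yml. The sketch below assumes that pins appear as list items of the form "- name==version", which matches the dependency listing for this repository but has not been verified against the file itself; pinned_version is a hypothetical helper.

```shell
# To build the environment itself, the usual conda command is:
#   conda env create -f environment.yml
#
# pinned_version is a hypothetical helper, not part of the repository.
# Usage: pinned_version <package> [file]
# Assumes pins are written as list items of the form "- name==version".
pinned_version() {
    pkg="$1"
    file="${2:-environment.yml}"
    sed -n "s/^[[:space:]]*-[[:space:]]*${pkg}[[:space:]]*==[[:space:]]*\([^[:space:]]*\).*/\1/p" "$file"
}
```

For example, `pinned_version chemprop` prints the version string pinned for Chemprop, or nothing if no such pin is found.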

Owner

  • Name: chemprop
  • Login: chemprop
  • Kind: organization
  • Location: MIT

Home of the official chemprop project

GitHub Events

Total
  • Release event: 1
  • Watch event: 1
  • Public event: 1
  • Push event: 2
  • Create event: 1
Last Year
  • Release event: 1
  • Watch event: 1
  • Public event: 1
  • Push event: 2
  • Create event: 1

Committers

Last synced: 10 months ago

All Time
  • Total Commits: 66
  • Total Committers: 6
  • Avg Commits per committer: 11.0
  • Development Distribution Score (DDS): 0.621
Past Year
  • Commits: 28
  • Committers: 5
  • Avg Commits per committer: 5.6
  • Development Distribution Score (DDS): 0.393
Top Committers
Name Email Commits
joelnkn 6****n 25
Akshat Zalte 5****e 17
Nathan Morgan n****n@g****m 12
shihchengli s****i@m****u 6
Kevin Greenman k****g@m****u 4
Hao-Wei Pang 4****g 2
Committer Domains (Top 20 + Academic)
mit.edu: 2

Issues and Pull Requests

Last synced: 10 months ago

All Time
  • Total issues: 0
  • Total pull requests: 2
  • Average time to close issues: N/A
  • Average time to close pull requests: 8 days
  • Total issue authors: 0
  • Total pull request authors: 1
  • Average comments per issue: 0
  • Average comments per pull request: 0.5
  • Merged pull requests: 2
  • Bot issues: 0
  • Bot pull requests: 0
Past Year
  • Issues: 0
  • Pull requests: 1
  • Average time to close issues: N/A
  • Average time to close pull requests: less than a minute
  • Issue authors: 0
  • Pull request authors: 1
  • Average comments per issue: 0
  • Average comments per pull request: 0.0
  • Merged pull requests: 1
  • Bot issues: 0
  • Bot pull requests: 0
Top Authors
Issue Authors
Pull Request Authors
  • KnathanM (1)

Dependencies

environment.yml pypi
  • aimsim-core ==2.2.1
  • aiohttp ==3.9.5
  • aiosignal ==1.3.1
  • astartes ==1.2.2
  • attrs ==23.2.0
  • certifi ==2024.6.2
  • charset-normalizer ==3.3.2
  • chemprop ==2.0.3
  • click ==8.1.7
  • cloudpickle ==3.0.0
  • configargparse ==1.7
  • dill ==0.3.8
  • filelock ==3.15.4
  • frozenlist ==1.4.1
  • fsspec ==2024.6.1
  • future ==1.0.0
  • hyperopt ==0.2.7
  • idna ==3.7
  • jinja2 ==3.1.4
  • joblib ==1.4.2
  • jsonschema ==4.22.0
  • jsonschema-specifications ==2023.12.1
  • lightning ==2.3.1
  • lightning-utilities ==0.11.3.post0
  • markupsafe ==2.1.5
  • mhfp ==1.9.6
  • mordredcommunity ==2.0.5
  • mpmath ==1.3.0
  • msgpack ==1.0.8
  • multidict ==6.0.5
  • multiprocess ==0.70.16
  • networkx ==3.3
  • numpy ==1.26.4
  • nvidia-cublas-cu12 ==12.1.3.1
  • nvidia-cuda-cupti-cu12 ==12.1.105
  • nvidia-cuda-nvrtc-cu12 ==12.1.105
  • nvidia-cuda-runtime-cu12 ==12.1.105
  • nvidia-cudnn-cu12 ==8.9.2.26
  • nvidia-cufft-cu12 ==11.0.2.54
  • nvidia-curand-cu12 ==10.3.2.106
  • nvidia-cusolver-cu12 ==11.4.5.107
  • nvidia-cusparse-cu12 ==12.1.0.106
  • nvidia-nccl-cu12 ==2.20.5
  • nvidia-nvjitlink-cu12 ==12.5.40
  • nvidia-nvtx-cu12 ==12.1.105
  • packaging ==24.1
  • padelpy ==0.1.16
  • pandas ==2.2.2
  • pillow ==10.3.0
  • protobuf ==5.27.2
  • psutil ==6.0.0
  • py4j ==0.10.9.7
  • pyarrow ==16.1.0
  • python-dateutil ==2.9.0.post0
  • pytorch-lightning ==2.3.1
  • pytz ==2024.1
  • pyyaml ==6.0.1
  • ray ==2.31.0
  • rdkit ==2024.3.1
  • referencing ==0.35.1
  • requests ==2.32.3
  • rpds-py ==0.18.1
  • scikit-learn ==1.5.0
  • scipy ==1.14.0
  • six ==1.16.0
  • sympy ==1.12.1
  • tabulate ==0.9.0
  • tensorboardx ==2.6.2.2
  • threadpoolctl ==3.5.0
  • torch ==2.3.1
  • torchmetrics ==1.4.0.post0
  • tqdm ==4.66.4
  • triton ==2.3.1
  • typing-extensions ==4.12.2
  • tzdata ==2024.1
  • urllib3 ==2.2.2
  • yarl ==1.9.4