Science Score: 57.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ✓ CITATION.cff file: Found CITATION.cff file
- ✓ codemeta.json file: Found codemeta.json file
- ✓ .zenodo.json file: Found .zenodo.json file
- ✓ DOI references: Found 2 DOI reference(s) in README
- ○ Academic publication links
- ○ Academic email domains
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: Low similarity (15.3%) to scientific vocabulary
Repository
Basic Info
- Host: GitHub
- Owner: chimie-paristech-CTM
- License: other
- Language: Python
- Default Branch: main
- Size: 52.7 MB
Statistics
- Stars: 2
- Watchers: 1
- Forks: 1
- Open Issues: 0
- Releases: 0
Metadata Files
README.md
thermo_GNN
This is the repository containing the code associated with the paper "Graph-based deep learning models for thermodynamic property prediction: The interplay between target definition, data distribution, featurization, and model architecture". Code is provided "as-is". Minor edits may be required to tailor the scripts for different computational systems.
Features
- Atom-fingerprint:
- Mol-feature:
- Ringcount feature:
- MLP_Trigonometric:
- KAN_Trigonometric:
For more information on the meaning of the individual features, we refer to the associated manuscript.
Requirements
CPU-only use is supported on x86 and ARM platforms. To use GPUs, you will need:
* cuda >= 8.0
* cuDNN
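To check which of the two setups is active on a given machine, a minimal sketch along the following lines may help. It assumes PyTorch is the backend (it is listed among the dependencies); the function name `describe_compute_backend` is a hypothetical helper, not part of the repository.

```python
def describe_compute_backend() -> str:
    """Return a short description of the compute backend PyTorch would use."""
    try:
        import torch  # imported lazily so the check also works without PyTorch
    except ImportError:
        return "PyTorch is not installed"

    if torch.cuda.is_available():
        # torch.version.cuda is the CUDA version PyTorch was built against.
        cudnn_ok = torch.backends.cudnn.is_available()
        return f"GPU (CUDA {torch.version.cuda}, cuDNN available: {cudnn_ok})"
    return "CPU only"


if __name__ == "__main__":
    print(describe_compute_backend())
```

If the function reports "CPU only" on a machine with a GPU, the installed PyTorch build or the CUDA driver is the most likely place to look.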
Installation
To download the code
git clone https://github.com/chimie-paristech-CTM/thermo_GNN.git
cd thermo_GNN
To set up the thermo_GNN conda environment:
conda env create -f environment.yml
To install the thermoGNN package, activate the thermo_GNN environment and run the following command within the thermo_GNN directory:
conda activate thermo_GNN
pip install -e .
Updates
- ☑ add Atom fingerprint.
- ☑ add Mol-feature.
- ☑ add Ringcount feature.
- ☑ add MLP_Trigonometric.
- ☑ add KAN_Trigonometric.
Quick Start
Folder Structure
.
├── README.md
├── dataset/smalldataset
│ ├── data/
│ │ ├── *data*.csv
│ ...
└── chemprop/
Each *data*.csv file has a `smiles` column and a `target_label` column:
| smiles | target_label |
|---|---|
| smiles1 | target_label1 |
| smiles2 | target_label2 |
| smiles3 | target_label3 |
| ... | ... |
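As a minimal sketch of the two-column layout described above, the following snippet writes and re-reads such a CSV file using only the Python standard library. The file name `example_data.csv`, the helper names, and the SMILES strings and target values are hypothetical placeholders, not data from the repository.

```python
import csv
from pathlib import Path

# Column names follow the layout described above.
EXPECTED_COLUMNS = ["smiles", "target_label"]


def write_dataset(path, rows):
    """Write (smiles, target) pairs to a CSV file with the expected header."""
    with open(path, "w", newline="") as handle:
        writer = csv.writer(handle)
        writer.writerow(EXPECTED_COLUMNS)
        writer.writerows(rows)


def read_dataset(path):
    """Read the CSV back, check the header, and return (smiles, float) pairs."""
    with open(path, newline="") as handle:
        reader = csv.DictReader(handle)
        if reader.fieldnames != EXPECTED_COLUMNS:
            raise ValueError(f"unexpected columns: {reader.fieldnames}")
        return [(row["smiles"], float(row["target_label"])) for row in reader]


if __name__ == "__main__":
    # Hypothetical placeholder molecules and target values.
    demo_rows = [("CCO", -1.5), ("c1ccccc1", 2.0)]
    demo_path = Path("example_data.csv")
    write_dataset(demo_path, demo_rows)
    print(read_dataset(demo_path))
```

Checking the header before training catches a misnamed or missing column early, rather than partway through data loading.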
Train
To train a model, run:
python train.py --data_path <path> --dataset_type <type> --save_dir <dir> --epochs <epoch> --input_features_type <input_features_type> --aggregation <aggregation> --output_fingerprint <output_fingerprint> --model <model>
where:
1. `<path>` is the path to the CSV file, not to its directory.
2. `<aggregation>` is one of [sum, mean, norm] and controls the output type of the output head.
3. `<input_features_type>` is one of [chemprop, jpca, molecule_level_feature] and controls the type of input feature.
4. `<output_fingerprint>` is one of [atom, mol] and controls the type of output fingerprint.
For example:
python train.py --data_path ./dataset/singledata/lipo_train.csv --dataset_type regression --output_fingerprint atom --save_dir ./lipo/checkpoint --epochs 2 --input_features_type molecule_level_feature --aggregation norm
A full list of available command-line arguments can be found in chemprop/args.py.
If installed from source, python train.py can be replaced with chemprop_train.
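When launching many training runs, it can be convenient to assemble the command programmatically and reject invalid option values before anything starts. The sketch below is one way to do that; `build_train_command` is a hypothetical helper, and the allowed values are copied from the option list above, so `chemprop/args.py` remains the authoritative source.

```python
import shlex

# Allowed values copied from the README option list (assumption: the list is
# complete; consult chemprop/args.py for the authoritative choices).
AGGREGATIONS = {"sum", "mean", "norm"}
INPUT_FEATURE_TYPES = {"chemprop", "jpca", "molecule_level_feature"}
OUTPUT_FINGERPRINTS = {"atom", "mol"}


def build_train_command(data_path, dataset_type, save_dir, epochs,
                        input_features_type, aggregation, output_fingerprint):
    """Return the train.py invocation as an argument list (hypothetical helper)."""
    if aggregation not in AGGREGATIONS:
        raise ValueError(f"unknown aggregation: {aggregation}")
    if input_features_type not in INPUT_FEATURE_TYPES:
        raise ValueError(f"unknown input_features_type: {input_features_type}")
    if output_fingerprint not in OUTPUT_FINGERPRINTS:
        raise ValueError(f"unknown output_fingerprint: {output_fingerprint}")
    return [
        "python", "train.py",
        "--data_path", str(data_path),
        "--dataset_type", dataset_type,
        "--save_dir", str(save_dir),
        "--epochs", str(epochs),
        "--input_features_type", input_features_type,
        "--aggregation", aggregation,
        "--output_fingerprint", output_fingerprint,
    ]


if __name__ == "__main__":
    command = build_train_command(
        "./dataset/singledata/lipo_train.csv", "regression",
        "./lipo/checkpoint", 2, "molecule_level_feature", "norm", "atom",
    )
    # Print a shell-safe version; pass `command` to subprocess.run to execute.
    print(shlex.join(command))
```

Returning a list rather than a single string keeps paths with spaces intact when the command is passed to `subprocess.run`.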
Notes:
* The default metric for classification is AUC and the default metric for regression is RMSE. Other metrics may be specified with --metric <metric>.
* --save_dir may be left out if you don't want to save model checkpoints.
* --quiet can be added to reduce the amount of debugging information printed to the console. Both a quiet and verbose version of the logs are saved in the save_dir.
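For reference, the default regression metric mentioned in the notes above, RMSE, is the square root of the mean squared difference between targets and predictions. The following standalone sketch illustrates the definition; it is not the implementation used in the repository, and the function name `rmse` is chosen here for illustration only.

```python
import math


def rmse(targets, predictions):
    """Root-mean-square error between two equally long sequences of numbers."""
    if len(targets) != len(predictions):
        raise ValueError("targets and predictions must have the same length")
    if not targets:
        raise ValueError("at least one value is required")
    squared_errors = [(t - p) ** 2 for t, p in zip(targets, predictions)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


if __name__ == "__main__":
    # Every prediction is off by exactly 3, so the RMSE is 3.0.
    print(rmse([0.0, 0.0], [3.0, 3.0]))
```

Because errors are squared before averaging, a few large deviations raise the RMSE more than many small ones, which is worth keeping in mind when comparing models on datasets with outliers.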
Script
The folder 'dataset_preparation' contains all scripts used to process the original datasets.
Dataset
The datasets link contains the qm9, paton, qmugs, pc9, and qmugs1.1 datasets.
Citation
If (parts of) this work are used as part of a publication, please cite the paper:
@article{***,
title={Graph-based deep learning models for thermodynamic property prediction: The interplay between target definition, data distribution, featurization, and model architecture},
author={Bowen Deng and Thijs Stuyver},
journal={ChemRxiv},
year={2024}
}
Furthermore, since the work is based on chemprop, please also cite the paper in which this code was originally presented:
@article{***,
title={Chemprop: A Machine Learning Package for Chemical Property Prediction},
author={Esther Heid and Kevin P. Greenman and Yunsie Chung and Shih-Cheng Li and David E. Graff and Florence H. Vermeire and Haoyang Wu and William H. Green and Charles J. McGill},
journal={ChemRxiv},
year={2023}
}
Acknowledgement
- PyTorch implementation of chemprop: https://github.com/chemprop/chemprop
Owner
- Name: chimie-paristech-CTM
- Login: chimie-paristech-CTM
- Kind: organization
- Repositories: 1
- Profile: https://github.com/chimie-paristech-CTM
Citation (CITATIONS.bib)
# this was downloaded from ACS: https://pubs.acs.org/doi/10.1021/acs.jcim.9b00237
@article{chemprop_theory,
author = {Yang, Kevin and Swanson, Kyle and Jin, Wengong and Coley, Connor and Eiden, Philipp and Gao, Hua and Guzman-Perez, Angel and Hopper, Timothy and Kelley, Brian and Mathea, Miriam and Palmer, Andrew and Settels, Volker and Jaakkola, Tommi and Jensen, Klavs and Barzilay, Regina},
title = {Analyzing Learned Molecular Representations for Property Prediction},
journal = {Journal of Chemical Information and Modeling},
volume = {59},
number = {8},
pages = {3370-3388},
year = {2019},
doi = {10.1021/acs.jcim.9b00237},
note ={PMID: 31361484},
URL = {
https://doi.org/10.1021/acs.jcim.9b00237
},
eprint = {
https://doi.org/10.1021/acs.jcim.9b00237
}
}
# this was downloaded from ACS: https://pubs.acs.org/doi/10.1021/acs.jcim.3c01250
@article{chemprop_software,
author = {Heid, Esther and Greenman, Kevin P. and Chung, Yunsie and Li, Shih-Cheng and Graff, David E. and Vermeire, Florence H. and Wu, Haoyang and Green, William H. and McGill, Charles J.},
title = {Chemprop: A Machine Learning Package for Chemical Property Prediction},
journal = {Journal of Chemical Information and Modeling},
volume = {64},
number = {1},
pages = {9-17},
year = {2024},
doi = {10.1021/acs.jcim.3c01250},
note ={PMID: 38147829},
URL = {
https://doi.org/10.1021/acs.jcim.3c01250
},
eprint = {
https://doi.org/10.1021/acs.jcim.3c01250
}
}
GitHub Events
Total
- Watch event: 5
- Delete event: 3
- Push event: 9
- Public event: 1
- Fork event: 1
Last Year
- Watch event: 5
- Delete event: 3
- Push event: 9
- Public event: 1
- Fork event: 1
Dependencies
- mambaorg/micromamba 0.23.0 build
- Werkzeug <3
- descriptastorus >=2.6.1
- descriptastorus <2.6.1
- flask >=1.1.2,<=2.1.3
- hyperopt >=0.2.3
- matplotlib >=3.1.3
- numpy >=1.18.1
- pandas >=1.0.3
- pandas-flavor >=0.2.0
- rdkit >=2020.03.1.0
- scikit-learn >=0.22.2.post1
- scipy <1.11
- scipy >=1.9
- sphinx >=3.1.2
- sphinx-rtd-theme >=2.0.0
- tensorboardX >=2.0
- torch >=1.4.0
- tqdm >=4.45.0
- typed-argument-parser >=1.6.1