https://github.com/adicksonlab/agdiff
Implementation of AGDIFF: Attention-Enhanced Diffusion for Molecular Geometry Prediction
Science Score: 49.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ○ CITATION.cff file
- ✓ codemeta.json file: Found codemeta.json file
- ✓ .zenodo.json file: Found .zenodo.json file
- ✓ DOI references: Found 3 DOI reference(s) in README
- ✓ Academic publication links: Links to acs.org
- ○ Academic email domains
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: Low similarity (15.5%) to scientific vocabulary
Repository
Basic Info
- Host: GitHub
- Owner: ADicksonLab
- License: mit
- Language: Python
- Default Branch: main
- Homepage: https://pubs.acs.org/doi/10.1021/acs.jcim.4c01896
- Size: 20.1 MB
Statistics
- Stars: 14
- Watchers: 2
- Forks: 0
- Open Issues: 0
- Releases: 0
Metadata Files
README.md
AGDIFF: Attention-Enhanced Diffusion for Molecular Geometry Prediction
This repository contains the official implementation of the work "AGDIFF: Attention-Enhanced Diffusion for Molecular Geometry Prediction".
AGDIFF introduces a novel approach that enhances diffusion models with attention mechanisms and an improved SchNet architecture, achieving state-of-the-art performance in predicting molecular geometries.
Unique Features of AGDIFF
- Attention Mechanisms: Enhances the global and local encoders with attention mechanisms for better feature extraction and integration.
- Improved SchNet Architecture: Incorporates learnable activation functions, adaptive scaling modules, and dual pathway processing to increase model expressiveness.
- Batch Normalization: Stabilizes training and improves convergence for the local encoder.
- Feature Expansion: Extends the MLP Edge Encoder with feature expansion and processing, combining processed features and bond embeddings for more adaptable edge representations.
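The README does not say which attention variant the encoders use. As a generic illustration only, the sketch below implements plain scaled dot-product attention in pure Python; the function names `softmax` and `attention` are hypothetical and are not taken from the AGDIFF code base.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention(queries, keys, values):
    """Scaled dot-product attention over lists of equal-length vectors.

    Illustrative only: this is the textbook formulation, not necessarily
    the variant used inside AGDIFF's encoders.
    """
    scale = math.sqrt(len(keys[0]))
    dim = len(values[0])
    outputs = []
    for q in queries:
        # Similarity of this query to every key, scaled by sqrt(key dimension).
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / scale for k in keys]
        weights = softmax(scores)
        # Output is the attention-weighted average of the value vectors.
        outputs.append(
            [sum(w * v[d] for w, v in zip(weights, values)) for d in range(dim)]
        )
    return outputs
```

Each output row is a weighted average of the value vectors, with weights that depend on how strongly the query matches each key. This is the sense in which attention lets a node or edge feature draw on information from the rest of the molecule.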
https://github.com/user-attachments/assets/78feda75-3a20-422a-9b3f-f96fceea69cc
Environment Setup
Install dependencies via Conda/Mamba
```bash
conda env create -f agdiff.yml
conda activate agdiff
pip install torch_geometric
pip install torch-scatter -f https://data.pyg.org/whl/torch-2.4.0+cu121.html
pip install torch-sparse -f https://data.pyg.org/whl/torch-2.4.0+cu121.html
pip install torch-cluster -f https://data.pyg.org/whl/torch-2.4.0+cu121.html
```
Once you have installed all the dependencies, install the package locally in editable mode:
```bash
pip install -e .
```
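A quick way to confirm that the environment is complete is to check that each dependency can be located before starting a long training run. The sketch below is not part of the repository; it assumes the usual import names for the packages installed above (`torch`, `torch_geometric`, `torch_scatter`, `torch_sparse`, `torch_cluster`), and the helper name `missing_modules` is hypothetical.

```python
from importlib.util import find_spec

# Assumed import names for the packages installed above. The pip package
# names use hyphens, while the import names use underscores.
REQUIRED_MODULES = [
    "torch",
    "torch_geometric",
    "torch_scatter",
    "torch_sparse",
    "torch_cluster",
]


def missing_modules(names):
    """Return the module names that cannot be located in this environment."""
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(REQUIRED_MODULES)
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All required modules were found.")
```

`find_spec` only locates a module without importing it, so the check stays fast and does not initialize CUDA.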
Dataset
Official Dataset
The preprocessed datasets (GEOM) provided by GEODIFF can be found in this [Google Drive folder]. After downloading and unzipping the dataset, it should be placed in the folder path specified by the dataset variable in the configuration files located at ./configs/*.yml. You may also want to use the pretrained model provided in the same link.
The official raw GEOM dataset is also available [here].
Training
AGDIFF's training details and hyper-parameters are provided in the config files (./configs/*.yml). Feel free to tune these parameters as needed.
To train the model, use the following commands:
```bash
python scripts/train.py ./configs/qm9_default.yml
python scripts/train.py ./configs/drugs_default.yml
```
Model checkpoints, configuration YAML files, and training logs will be saved in a directory specified by --logdir in train.py.
Generation
To generate conformations for all or part of a test set, use:
```bash
python scripts/test.py ./logs/path/to/checkpoints/${iter}.pt ./configs/qm9_default.yml \
--start_idx 0 --end_idx 200
```
Here, start_idx and end_idx indicate the range of the test set to use. To reproduce the paper's results, use 0 and 200 for start_idx and end_idx, respectively. All hyper-parameters related to sampling can be set in test.py. For the QM9 model, you can add the argument --w_global 0.3, which empirically gives slightly better results.
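When generation is scripted (for example, to compare runs with and without --w_global), it can help to assemble the command in one place. The following is a minimal sketch rather than repository code: the helper name `build_test_command` is hypothetical, and only the flags named above (--start_idx, --end_idx, --w_global) are used.

```python
import shlex
import sys


def build_test_command(checkpoint, config, start_idx=0, end_idx=200, w_global=None):
    """Build the argument list for one scripts/test.py run.

    Only flags mentioned in the README are used. The defaults of 0 and 200
    match the range the README gives for reproducing the paper's results.
    """
    if start_idx < 0 or end_idx < start_idx:
        raise ValueError("expected 0 <= start_idx <= end_idx")
    command = [
        sys.executable, "scripts/test.py", checkpoint, config,
        "--start_idx", str(start_idx),
        "--end_idx", str(end_idx),
    ]
    if w_global is not None:
        command += ["--w_global", str(w_global)]
    return command


# The checkpoint path is a placeholder; substitute a real checkpoint file.
example = build_test_command(
    "./logs/path/to/checkpoints/<iter>.pt",
    "./configs/qm9_default.yml",
    w_global=0.3,
)
print(shlex.join(example))
```

The returned list can be passed directly to `subprocess.run(command, check=True)`, which avoids shell quoting issues with checkpoint paths.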
We also provide an example of conformation generation for a specific molecule (alanine dipeptide) in the examples folder. To generate conformations for alanine dipeptide, use:
```bash
python examples/test_alanine_dipeptide.py ./logs/path/to/checkpoints/${iter}.pt ./configs/qm9_default.yml ./examples/alanine_dipeptide.pdb
```
Evaluation
After generating conformations, evaluate the results of benchmark tasks using the following commands.
Task 1. Conformation Generation
Calculate COV and MAT scores on the GEOM datasets with:
```bash
python scripts/evaluation/eval_covmat.py path/to/samples/sample_all.pkl
```
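For readers unfamiliar with these metrics, the sketch below shows how COV (coverage) and MAT (matching) are commonly defined in the GEOM conformer-generation literature, computed from a precomputed RMSD matrix. It is an illustration under stated assumptions, not the logic of eval_covmat.py: the function name `cov_mat`, the toy RMSD values, and the 0.5 Å threshold are all chosen here for the example.

```python
def cov_mat(rmsd, threshold):
    """Compute recall-style COV and MAT from an RMSD matrix.

    rmsd[i][j] is the RMSD between reference conformer i and generated
    conformer j. COV is the fraction of reference conformers whose closest
    generated conformer lies within the threshold; MAT is the mean of those
    closest distances (lower is better).
    """
    if not rmsd or not rmsd[0]:
        raise ValueError("rmsd must be a non-empty matrix")
    closest = [min(row) for row in rmsd]
    cov = sum(d <= threshold for d in closest) / len(closest)
    mat = sum(closest) / len(closest)
    return cov, mat


# Toy example: 3 reference conformers, 2 generated conformers (values in Å).
rmsd_matrix = [
    [0.2, 0.9],
    [0.7, 0.4],
    [1.1, 0.8],
]
cov, mat = cov_mat(rmsd_matrix, threshold=0.5)
print(f"COV = {cov:.3f}, MAT = {mat:.3f} Å")  # prints "COV = 0.667, MAT = 0.467 Å"
```

In the toy example, two of the three reference conformers have a generated conformer within 0.5 Å, so COV is 2/3, while MAT averages the three closest distances (0.2, 0.4, and 0.8 Å).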
Acknowledgement
Our implementation is based on GEODIFF, PyTorch, PyG, and SchNet.
Citation
If you use our code or method in your work, please consider citing the following:
```bibtex
@misc{wyzykowskiAGDIFFAttentionEnhancedDiffusion2024,
  title = {{{AGDIFF}}: {{Attention-Enhanced Diffusion}} for {{Molecular Geometry Prediction}}},
  shorttitle = {{{AGDIFF}}},
  author = {Wyzykowski, Andr{\'e} Brasil Vieira and Fathi Niazi, Fatemeh and Dickson, Alex},
  year = {2024},
  month = oct,
  publisher = {ChemRxiv},
  doi = {10.26434/chemrxiv-2024-wrvr4},
  urldate = {2024-10-09},
  archiveprefix = {ChemRxiv},
  langid = {english},
  keywords = {attention,conformer,diffusion models,generative,GNN,graph neural network,machine learning,structure}
}
```
Please direct any questions to André Wyzykowski (abvwmc@gmail.com) and Alex Dickson (alexrd@msu.edu).
Owner
- Name: ADicksonLab
- Login: ADicksonLab
- Kind: organization
- Repositories: 25
- Profile: https://github.com/ADicksonLab
GitHub Events
Total
- Issues event: 10
- Watch event: 10
- Delete event: 1
- Issue comment event: 5
- Push event: 25
- Create event: 1
Last Year
- Issues event: 10
- Watch event: 10
- Delete event: 1
- Issue comment event: 5
- Push event: 25
- Create event: 1
Issues and Pull Requests
Last synced: 5 months ago
All Time
- Total issues: 3
- Total pull requests: 7
- Average time to close issues: 4 days
- Average time to close pull requests: 2 days
- Total issue authors: 3
- Total pull request authors: 2
- Average comments per issue: 1.67
- Average comments per pull request: 0.0
- Merged pull requests: 6
- Bot issues: 0
- Bot pull requests: 0
Past Year
- Issues: 3
- Pull requests: 7
- Average time to close issues: 4 days
- Average time to close pull requests: 2 days
- Issue authors: 3
- Pull request authors: 2
- Average comments per issue: 1.67
- Average comments per pull request: 0.0
- Merged pull requests: 6
- Bot issues: 0
- Bot pull requests: 0
Top Authors
Issue Authors
- dizhou-wu (1)
- LucianoPenaTejo (1)
- lfs119 (1)
Pull Request Authors
- FatemehFathiNiazi (5)
- alexrd (2)
Packages
- Total packages: 1
- Total downloads: unknown
- Total dependent packages: 0
- Total dependent repositories: 0
- Total versions: 1
- Total maintainers: 1
pypi.org: torch-agdiff
PyTorch Implementation of AGDIFF from "AGDIFF: Attention-Enhanced Diffusion for Molecular Geometry Prediction"
- Homepage: https://github.com/ADicksonLab/AGDIFF
- Documentation: https://torch-agdiff.readthedocs.io/
- License: MIT