https://github.com/adicksonlab/agdiff

Implementation of AGDIFF: Attention-Enhanced Diffusion for Molecular Geometry Prediction

Science Score: 49.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
    Found 3 DOI reference(s) in README
  • Academic publication links
    Links to: acs.org
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (15.5%) to scientific vocabulary

Keywords

attention conformer diffusion-models generative-ai gnn graph-neural-networks machine-learning structure
Last synced: 5 months ago

Repository

Implementation of AGDIFF: Attention-Enhanced Diffusion for Molecular Geometry Prediction

Basic Info
Statistics
  • Stars: 14
  • Watchers: 2
  • Forks: 0
  • Open Issues: 0
  • Releases: 0
Topics
attention conformer diffusion-models generative-ai gnn graph-neural-networks machine-learning structure
Created over 1 year ago · Last pushed 8 months ago
Metadata Files
Readme License

README.md

AGDIFF: Attention-Enhanced Diffusion for Molecular Geometry Prediction

License: MIT


This repository contains the official implementation of the work "AGDIFF: Attention-Enhanced Diffusion for Molecular Geometry Prediction".

AGDIFF introduces a novel approach that enhances diffusion models with attention mechanisms and an improved SchNet architecture, achieving state-of-the-art performance in predicting molecular geometries.
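As background, diffusion-based conformer generators perturb atomic coordinates with Gaussian noise over many timesteps and train a network to predict that noise. A generic DDPM-style forward-noising step is sketched below in NumPy; the linear beta schedule and shapes are illustrative assumptions, not AGDIFF's exact parameterization:

```python
import numpy as np

def forward_noise(pos, t, betas, rng):
    """Textbook DDPM-style forward noising of coordinates at step t.
    This is a generic sketch, not AGDIFF's exact schedule or scaling."""
    alphas = 1.0 - betas
    alpha_bar = np.prod(alphas[: t + 1])  # cumulative product up to step t
    eps = rng.standard_normal(pos.shape)  # the noise the model must predict
    noisy = np.sqrt(alpha_bar) * pos + np.sqrt(1.0 - alpha_bar) * eps
    return noisy, eps

rng = np.random.default_rng(0)
pos = rng.standard_normal((5, 3))      # 5 atoms in 3-D (toy example)
betas = np.linspace(1e-4, 0.02, 1000)  # a common linear variance schedule
noisy, eps = forward_noise(pos, 500, betas, rng)
print(noisy.shape)  # (5, 3)
```

The denoising network then learns to recover eps from noisy coordinates, which is where AGDIFF's attention-enhanced encoders come in.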

Unique Features of AGDIFF

  • Attention Mechanisms: Enhances the global and local encoders with attention mechanisms for better feature extraction and integration.
  • Improved SchNet Architecture: Incorporates learnable activation functions, adaptive scaling modules, and dual pathway processing to increase model expressiveness.
  • Batch Normalization: Stabilizes training and improves convergence for the local encoder.
  • Feature Expansion: Extends the MLP Edge Encoder with feature expansion and processing, combining processed features and bond embeddings for more adaptable edge representations.
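The feature-expansion idea in the last bullet can be sketched in PyTorch. The class name, dimensions, and the multiplicative combination below are illustrative assumptions, not the repository's actual module:

```python
import torch
import torch.nn as nn

class ExpandedEdgeEncoder(nn.Module):
    """Hypothetical sketch: expand raw edge distances into a wider feature
    space, process them with an MLP, and combine the result with learned
    bond-type embeddings. All dimensions here are assumptions."""

    def __init__(self, num_bond_types=4, hidden_dim=64):
        super().__init__()
        # Feature expansion: lift the scalar distance to hidden_dim features.
        self.expand = nn.Linear(1, hidden_dim)
        # Further processing of the expanded features.
        self.process = nn.Sequential(
            nn.Linear(hidden_dim, hidden_dim), nn.SiLU(),
            nn.Linear(hidden_dim, hidden_dim),
        )
        # Bond embeddings combined with the processed distance features.
        self.bond_emb = nn.Embedding(num_bond_types, hidden_dim)

    def forward(self, edge_length, bond_type):
        # edge_length: (E, 1) distances; bond_type: (E,) integer bond labels.
        h = self.process(self.expand(edge_length))
        return h * self.bond_emb(bond_type)  # adaptable edge representation

enc = ExpandedEdgeEncoder()
out = enc(torch.rand(10, 1), torch.randint(0, 4, (10,)))
print(out.shape)  # torch.Size([10, 64])
```

The elementwise product lets the bond-type embedding gate the distance features; concatenation followed by a linear layer would be an equally plausible design.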


https://github.com/user-attachments/assets/78feda75-3a20-422a-9b3f-f96fceea69cc

Content

  1. Environment Setup
  2. Dataset
  3. Training
  4. Generation
  5. Evaluation
  6. Acknowledgement
  7. Citation

Environment Setup

Install dependencies via Conda/Mamba

```bash
conda env create -f agdiff.yml
conda activate agdiff
pip install torch_geometric
pip install torch-scatter -f https://data.pyg.org/whl/torch-2.4.0+cu121.html
pip install torch-sparse -f https://data.pyg.org/whl/torch-2.4.0+cu121.html
pip install torch-cluster -f https://data.pyg.org/whl/torch-2.4.0+cu121.html
```

Once you have installed all the dependencies, install the package itself locally in editable mode:

```bash
pip install -e .
```

Dataset

Official Dataset

The preprocessed GEOM datasets provided by GEODIFF can be found in this [Google Drive folder]. After downloading and unzipping the dataset, place it in the folder path specified by the dataset variable in the configuration files located at ./configs/*.yml. You may also want to use the pretrained model provided at the same link.

The official raw GEOM dataset is also available [here].

Training

AGDIFF's training details and hyper-parameters are provided in the config files (./configs/*.yml). Feel free to tune these parameters as needed.

To train the model, use the following commands:

```bash
python scripts/train.py ./configs/qm9_default.yml
python scripts/train.py ./configs/drugs_default.yml
```

Model checkpoints, configuration YAML files, and training logs will be saved in the directory specified by --logdir in train.py.
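Because all hyper-parameters live in the YAML configs, they can be loaded and tweaked programmatically before a run. The key names in this sketch are assumptions about a typical diffusion-training config, not the repository's actual schema:

```python
import yaml

# Illustrative YAML in the style of a diffusion-training config; the key
# names here are assumptions, not the repository's actual schema.
example = """
model:
  hidden_dim: 128
  num_diffusion_timesteps: 5000
train:
  lr: 0.001
  batch_size: 64
"""

config = yaml.safe_load(example)
# Tune hyper-parameters as needed, e.g. lower the learning rate:
config["train"]["lr"] = 5.0e-4
print(config["train"]["lr"])  # 0.0005
```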

Generation

To generate conformations for all or part of the test set, use:

```bash
python scripts/test.py ./logs/path/to/checkpoints/${iter}.pt ./configs/qm9_default.yml \
    --start_idx 0 --end_idx 200
```

Here, start_idx and end_idx indicate the range of the test set to use. To reproduce the paper's results, use 0 and 200 for start_idx and end_idx, respectively. All sampling-related hyper-parameters can be set in test.py. In particular, when testing the qm9 model, you can add the extra argument --w_global 0.3, which empirically gives slightly better results.
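Conceptually, --start_idx and --end_idx just select a contiguous slice of the loaded test set (the actual indexing inside test.py may differ):

```python
# Stand-in list for the loaded test set; real entries would be molecules.
test_set = [f"mol_{i}" for i in range(300)]

start_idx, end_idx = 0, 200  # the paper-reproduction range from the README
subset = test_set[start_idx:end_idx]
print(len(subset))  # 200
```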

We also provide an example of conformation generation for a specific molecule (alanine dipeptide) in the examples folder. To generate conformations for alanine dipeptide, use:

```bash
python examples/test_alanine_dipeptide.py ./logs/path/to/checkpoints/${iter}.pt ./configs/qm9_default.yml ./examples/alanine_dipeptide.pdb
```

Evaluation

After generating conformations, evaluate the results on the benchmark tasks with the following command.

Task 1. Conformation Generation

Calculate COV and MAT scores on the GEOM datasets with:

```bash
python scripts/evaluation/eval_covmat.py path/to/samples/sample_all.pkl
```
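For reference, COV and MAT are the standard conformer-generation metrics: for each reference conformer, take the minimum RMSD to any generated conformer; COV is the fraction of references whose minimum falls below a threshold δ, and MAT is the mean of those minima. A minimal NumPy sketch (the δ value and matrix orientation are assumptions; eval_covmat.py computes the real metrics from sample_all.pkl):

```python
import numpy as np

def cov_mat(rmsd, delta=0.5):
    """Compute COV and MAT from a pairwise RMSD matrix.

    rmsd: (num_generated, num_reference) pairwise RMSD values.
    delta: coverage threshold in angstroms (0.5 is a common choice for
    GEOM-QM9; this is an assumption, not necessarily the repo's setting).
    """
    # Best (minimum) RMSD achieved for each reference conformer.
    min_per_ref = rmsd.min(axis=0)
    cov = (min_per_ref < delta).mean()  # fraction of references covered
    mat = min_per_ref.mean()            # mean of the best RMSDs
    return cov, mat

rmsd = np.array([[0.3, 1.2, 0.8],
                 [0.9, 0.4, 1.5]])
cov, mat = cov_mat(rmsd)
print(round(cov, 3), round(mat, 3))  # 0.667 0.5
```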

Acknowledgement

Our implementation is based on GEODIFF, PyTorch, PyG, and SchNet.

Citation

If you use our code or method in your work, please consider citing the following:

```bibtex
@misc{wyzykowskiAGDIFFAttentionEnhancedDiffusion2024,
  title = {{{AGDIFF}}: {{Attention-Enhanced Diffusion}} for {{Molecular Geometry Prediction}}},
  shorttitle = {{{AGDIFF}}},
  author = {Wyzykowski, Andr{\'e} Brasil Vieira and Fathi Niazi, Fatemeh and Dickson, Alex},
  year = {2024},
  month = oct,
  publisher = {ChemRxiv},
  doi = {10.26434/chemrxiv-2024-wrvr4},
  urldate = {2024-10-09},
  archiveprefix = {ChemRxiv},
  langid = {english},
  keywords = {attention,conformer,diffusion models,generative,GNN,graph neural network,machine learning,structure}
}
```

Please direct any questions to André Wyzykowski (abvwmc@gmail.com) and Alex Dickson (alexrd@msu.edu).

Owner

  • Name: ADicksonLab
  • Login: ADicksonLab
  • Kind: organization

GitHub Events

Total
  • Issues event: 10
  • Watch event: 10
  • Delete event: 1
  • Issue comment event: 5
  • Push event: 25
  • Create event: 1
Last Year
  • Issues event: 10
  • Watch event: 10
  • Delete event: 1
  • Issue comment event: 5
  • Push event: 25
  • Create event: 1

Issues and Pull Requests

Last synced: 5 months ago

All Time
  • Total issues: 3
  • Total pull requests: 7
  • Average time to close issues: 4 days
  • Average time to close pull requests: 2 days
  • Total issue authors: 3
  • Total pull request authors: 2
  • Average comments per issue: 1.67
  • Average comments per pull request: 0.0
  • Merged pull requests: 6
  • Bot issues: 0
  • Bot pull requests: 0
Past Year
  • Issues: 3
  • Pull requests: 7
  • Average time to close issues: 4 days
  • Average time to close pull requests: 2 days
  • Issue authors: 3
  • Pull request authors: 2
  • Average comments per issue: 1.67
  • Average comments per pull request: 0.0
  • Merged pull requests: 6
  • Bot issues: 0
  • Bot pull requests: 0
Top Authors
Issue Authors
  • dizhou-wu (1)
  • LucianoPenaTejo (1)
  • lfs119 (1)
Pull Request Authors
  • FatemehFathiNiazi (5)
  • alexrd (2)

Packages

  • Total packages: 1
  • Total downloads: unknown
  • Total dependent packages: 0
  • Total dependent repositories: 0
  • Total versions: 1
  • Total maintainers: 1
pypi.org: torch-agdiff

PyTorch Implementation of AGDIFF from "AGDIFF: Attention-Enhanced Diffusion for Molecular Geometry Prediction"

  • Versions: 1
  • Dependent Packages: 0
  • Dependent Repositories: 0
Rankings
Dependent packages count: 10.2%
Average: 33.8%
Dependent repos count: 57.4%
Maintainers (1)
Last synced: 8 months ago