https://github.com/calvinp0/cmpnn-revised
Science Score: 26.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ○ CITATION.cff file
- ✓ codemeta.json file: found
- ✓ .zenodo.json file: found
- ○ DOI references
- ○ Academic publication links
- ○ Academic email domains
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: low similarity (11.7%) to scientific vocabulary
Repository
Basic Info
- Host: GitHub
- Owner: calvinp0
- Language: Jupyter Notebook
- Default Branch: master
- Size: 6.39 MB
Statistics
- Stars: 0
- Watchers: 1
- Forks: 1
- Open Issues: 0
- Releases: 0
Metadata Files
README.md
CMPNN-Revised
A modular, extensible, and Lightning-powered revision of SY575/CMPNN — based on the IJCAI 2020 paper:
Communicative Representation Learning on Attributed Molecular Graphs
🚀 Overview
This repository revises and modernizes the original Communicative Message Passing Neural Network (CMPNN) implementation to improve:
- Extensibility – Modular design for easy plug-and-play experimentation.
- Readability – Cleaner abstractions and more maintainable code.
- Trainability – Integrated PyTorch Lightning for scalable training loops and device handling.
- Completeness – Implements options mentioned in the paper but missing from the original repo.
✅ What's New
🧠 Architecture
- Modular CMPNNEncoder with configurable:
  - comm_mode: 'add', 'mlp', 'gru', 'ip'
  - booster: 'sum', 'mean', 'sum_max', 'attention'
- Aggregation strategies extracted into their own classes, making it easy to swap in mean, sum, norm, etc.
- Shared encoder or separate encoders for multi-molecule models.
- Handles 1-atom molecules by injecting a zero-filled bond feature vector that carries a null flag at index 0 when no bonds are present, fixing an edge case the original CMPNN code missed.
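Two items in the list above can be illustrated without any framework: how a booster might pool the messages arriving at an atom, and how a placeholder bond vector could be built for a single-atom molecule. This is a minimal sketch under stated assumptions, not the repository's implementation. It assumes `sum_max` means the element-wise product of sum pooling and max pooling (as in the CMPNN paper's message booster) and that the null flag takes the value 1.0. The names `boost` and `bond_features_or_null` are hypothetical, and the `attention` booster is left out.

```python
from typing import List

Vector = List[float]


def boost(messages: List[Vector], mode: str = "sum_max") -> Vector:
    """Pool the messages arriving at one atom into a single vector."""
    if not messages:
        raise ValueError("at least one incoming message is required")
    columns = list(zip(*messages))  # one tuple per feature dimension
    summed = [sum(col) for col in columns]
    if mode == "sum":
        return summed
    if mode == "mean":
        return [s / len(messages) for s in summed]
    if mode == "sum_max":
        # Assumption: element-wise product of sum pooling and max pooling.
        return [s * max(col) for s, col in zip(summed, columns)]
    raise ValueError(f"unsupported booster: {mode!r}")


def bond_features_or_null(bond_feats: List[Vector], bond_dim: int) -> List[Vector]:
    """Return the bond features, or one null vector if the molecule has no bonds."""
    if bond_feats:
        return bond_feats
    null_vector = [0.0] * bond_dim
    null_vector[0] = 1.0  # assumed value of the null flag at index 0
    return [null_vector]
```

Returning a one-element list for a bondless molecule keeps downstream code from having to special-case an empty bond tensor.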
🔁 Multi-Molecule Learning
- Support for pairwise molecule encoding:
  - Use shared CMPNN encoders or independent ones.
  - Encode donor/acceptor, ligand/protein, or other pairwise molecular structures.
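The shared-versus-independent choice can be sketched in plain Python so that it runs without PyTorch. `ToyEncoder` and `PairEncoder` are hypothetical stand-ins, not classes from this repository; a real model would use CMPNN encoders operating on tensors.

```python
from typing import Callable, List, Sequence


class ToyEncoder:
    """Stand-in for a molecule encoder: scales the mean atom feature."""

    def __init__(self, scale: float) -> None:
        self.scale = scale

    def __call__(self, atom_features: Sequence[float]) -> List[float]:
        mean = sum(atom_features) / len(atom_features)
        return [self.scale * mean]


class PairEncoder:
    """Encode two molecules and concatenate their embeddings."""

    def __init__(self, make_encoder: Callable[[], ToyEncoder], shared: bool = True) -> None:
        self.encoder_a = make_encoder()
        # Shared mode reuses the same object, so both inputs see the same parameters.
        self.encoder_b = self.encoder_a if shared else make_encoder()

    def __call__(self, mol_a: Sequence[float], mol_b: Sequence[float]) -> List[float]:
        return self.encoder_a(mol_a) + self.encoder_b(mol_b)


shared_model = PairEncoder(lambda: ToyEncoder(scale=2.0), shared=True)
separate_model = PairEncoder(lambda: ToyEncoder(scale=2.0), shared=False)
pair_embedding = shared_model([1.0, 3.0], [4.0])
```

With a shared encoder, both molecules are embedded in the same space using one set of parameters. Independent encoders may be the better fit when the two roles differ, as with a donor and an acceptor.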
🔧 Optimization & Infrastructure
- Refactored with torch_geometric-compatible Data and Batch objects.
- Optional global features using:
  - Morgan fingerprints
  - RDKit 2D descriptors
  - Normalized RDKit features
  - Charge-based features
- Extended featurizers and batching logic with robust testing.
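The batching idea behind PyTorch Geometric's `Batch` can be illustrated without the library. Node features are concatenated, each graph's edge indices are shifted by the number of nodes that precede it, and a `batch` vector records which graph each node belongs to. The `collate` function and the dictionary layout below are illustrative assumptions, not this repository's data classes.

```python
from typing import Dict, List


def collate(graphs: List[Dict[str, list]]) -> Dict[str, list]:
    """Merge small graphs into one disconnected graph, PyG-style."""
    x: List[List[float]] = []
    edge_index: List[List[int]] = [[], []]  # row 0: sources, row 1: targets
    batch: List[int] = []
    offset = 0
    for graph_id, graph in enumerate(graphs):
        num_nodes = len(graph["x"])
        x.extend(graph["x"])
        sources, targets = graph["edge_index"]
        edge_index[0].extend(s + offset for s in sources)
        edge_index[1].extend(t + offset for t in targets)
        batch.extend([graph_id] * num_nodes)
        offset += num_nodes
    return {"x": x, "edge_index": edge_index, "batch": batch}


# A single atom with no bonds, then two atoms joined by one bond
# (the bond is stored in both directions).
single_atom = {"x": [[3.0]], "edge_index": [[], []]}
diatomic = {"x": [[1.0], [2.0]], "edge_index": [[0, 1], [1, 0]]}
merged = collate([single_atom, diatomic])
```

In this sketch the single-atom graph contributes a node but no edges, so the second graph's indices are simply shifted by one.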
📦 Structure
```bash
cmpnn_revised/
├── models/ # CMPNN core architecture
├── data/ # Data objects & batching logic
├── featurizer/ # Atom/Bond/Global featurizers
├── lightning/ # PyTorch Lightning modules
├── scripts/ # Training, evaluation, etc.
├── tests/ # Pytest unit tests
└── mol_data/ # Molecule data for benchmarking
```
🛠 Installation
Create a Conda environment and install dependencies:
```bash
conda env create -f environment.yml
conda activate cmpnn_env
bash setup_device_torch.sh
```
🔬 Example Usage
Coming soon. Check out scripts/ and tests/ for sample training workflows and test coverage.
📚 Reference
Communicative Representation Learning on Attributed Molecular Graphs. Ying Song, Shuangjia Zheng, Zhangming Niu, Zhang-Hua Fu, Yutong Lu, Yuedong Yang. IJCAI 2020.
💡 Acknowledgements
- Original inspiration and code: SY575/CMPNN
- Built using RDKit, PyTorch Geometric, and PyTorch Lightning
Owner
- Name: Calvin
- Login: calvinp0
- Kind: user
- Repositories: 1
- Profile: https://github.com/calvinp0
GitHub Events
Total
- Delete event: 1
- Push event: 16
- Pull request event: 3
- Fork event: 1
- Create event: 4
Last Year
- Delete event: 1
- Push event: 16
- Pull request event: 3
- Fork event: 1
- Create event: 4