https://github.com/calvinp0/cmpnn-revised

Science Score: 26.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
  • Academic publication links
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (11.7%) to scientific vocabulary
Last synced: 10 months ago

Repository

Basic Info
  • Host: GitHub
  • Owner: calvinp0
  • Language: Jupyter Notebook
  • Default Branch: master
  • Size: 6.39 MB
Statistics
  • Stars: 0
  • Watchers: 1
  • Forks: 1
  • Open Issues: 0
  • Releases: 0
Created about 1 year ago · Last pushed 10 months ago
Metadata Files
Readme

README.md

CMPNN-Revised

A modular, extensible, and Lightning-powered revision of SY575/CMPNN — based on the IJCAI 2020 paper:
Communicative Representation Learning on Attributed Molecular Graphs

🚀 Overview

This repository revises and modernizes the original Communicative Message Passing Neural Network (CMPNN) implementation to improve:

  • Extensibility – Modular design for easy plug-and-play experimentation.
  • Readability – Cleaner abstractions and more maintainable code.
  • Trainability – Integrated PyTorch Lightning for scalable training loops and device handling.
  • Completeness – Implements options mentioned in the paper but missing from the original repo.

✅ What's New

🧠 Architecture

  • Modular CMPNNEncoder with configurable:
    • comm_mode: 'add', 'mlp', 'gru', 'ip'
    • booster: 'sum', 'mean', 'sum_max', 'attention'
  • Aggregation strategies extracted into their own classes: easy to swap in mean, sum, norm, etc.
  • Shared encoder or separate encoders for multi-molecule models.
  • Handles single-atom molecules: when no bonds are present, a zero-filled bond feature vector with a null flag at index 0 is injected. This fixes an edge case that the original CMPNN code does not handle.
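The single-atom edge case above can be sketched as follows. This is a minimal illustration in plain Python, not the repository's code: the function name `bond_features_or_null`, the list-of-lists representation, and the flag value of 1.0 are assumptions made for the example.

```python
from typing import List


def bond_features_or_null(
    bond_features: List[List[float]], bond_dim: int
) -> List[List[float]]:
    """Return the bond feature rows, or one placeholder row if there are no bonds.

    Hypothetical helper: name, signature, and flag value are illustrative only.
    """
    if bond_features:
        return bond_features
    # Zero-filled vector with the null flag set at index 0 (assumed value: 1.0).
    null_row = [0.0] * bond_dim
    null_row[0] = 1.0
    return [null_row]


# A molecule with a single atom has no bonds, so the placeholder is returned.
print(bond_features_or_null([], bond_dim=4))  # [[1.0, 0.0, 0.0, 0.0]]
```

With a placeholder row in place, downstream message passing always receives at least one bond feature row, so it should not need a special case for empty bond tensors.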

🔁 Multi-Molecule Learning

  • Support for pairwise molecule encoding:
    • Use shared CMPNN encoders or independent ones.
    • Encode donor/acceptor, ligand/protein, or other pairwise molecular structures.
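The difference between shared and independent encoders can be sketched as below. This is a toy illustration under stated assumptions, not the repository's API: `PairEncoder` and `ToyEncoder` are hypothetical names, the encoder is a plain Python callable rather than a PyTorch module, and concatenating the two embeddings is only one possible way to combine a pair.

```python
import copy
from typing import List

Vector = List[float]


class ToyEncoder:
    """Stand-in for a molecule encoder: weighted mean of the atom feature rows."""

    def __init__(self, weight: float = 1.0):
        self.weight = weight

    def __call__(self, atom_features: List[Vector]) -> Vector:
        n = len(atom_features)
        dim = len(atom_features[0])
        return [
            self.weight * sum(row[i] for row in atom_features) / n
            for i in range(dim)
        ]


class PairEncoder:
    """Encode two molecules with one shared encoder or two independent copies."""

    def __init__(self, encoder: ToyEncoder, shared: bool = True):
        self.encoder_a = encoder
        # Shared: both molecules use the same object (and the same parameters).
        # Independent: the second molecule gets its own copy of the encoder.
        self.encoder_b = encoder if shared else copy.deepcopy(encoder)

    def __call__(self, mol_a: List[Vector], mol_b: List[Vector]) -> Vector:
        # Assumed combination rule: concatenate the two embeddings.
        return self.encoder_a(mol_a) + self.encoder_b(mol_b)


donor = [[1.0, 0.0], [3.0, 2.0]]
acceptor = [[2.0, 2.0]]

shared = PairEncoder(ToyEncoder(), shared=True)
separate = PairEncoder(ToyEncoder(), shared=False)

print(shared(donor, acceptor))                    # [2.0, 1.0, 2.0, 2.0]
print(shared.encoder_a is shared.encoder_b)       # True
print(separate.encoder_a is separate.encoder_b)   # False
```

Sharing halves the number of encoder parameters and suits pairs drawn from the same chemical space. Independent encoders let each side specialize, which may help when the two inputs differ in kind.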

🔧 Optimization & Infrastructure

  • Refactored with torch_geometric-compatible Data and Batch objects.
  • Optional global features using:
    • Morgan fingerprints
    • RDKit 2D descriptors
    • Normalized RDKit features
    • Charge-based features
  • Extended featurizers and batching logic with robust testing.

📦 Structure

    cmpnn_revised/
    ├── models/       # CMPNN core architecture
    ├── data/         # Data objects & batching logic
    ├── featurizer/   # Atom/Bond/Global featurizers
    ├── lightning/    # PyTorch Lightning modules
    ├── scripts/      # Training, evaluation, etc.
    ├── tests/        # Pytest unit tests
    ├── mol_data/     # Molecule data for benchmarking

🛠 Installation

Create a Conda environment and install dependencies:

    conda env create -f environment.yml
    conda activate cmpnn_env
    bash setup_device_torch.sh

🔬 Example Usage

Coming soon. Check out scripts/ and tests/ for sample training workflows and test coverage.

📚 Reference

Ying Song, Shuangjia Zheng, Zhangming Niu, Zhang-Hua Fu, Yutong Lu, Yuedong Yang. Communicative Representation Learning on Attributed Molecular Graphs. IJCAI 2020.

💡 Acknowledgements

  • Original inspiration and code: SY575/CMPNN
  • Built using RDKit, PyTorch Geometric, and PyTorch Lightning

Owner

  • Name: Calvin
  • Login: calvinp0
  • Kind: user

GitHub Events

Total
  • Delete event: 1
  • Push event: 16
  • Pull request event: 3
  • Fork event: 1
  • Create event: 4
Last Year
  • Delete event: 1
  • Push event: 16
  • Pull request event: 3
  • Fork event: 1
  • Create event: 4