https://github.com/dptech-corp/nag2g

Science Score: 49.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
    Found 2 DOI reference(s) in README
  • Academic publication links
    Links to: arxiv.org, acs.org
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (14.2%) to scientific vocabulary
Last synced: 4 months ago

Repository

Basic Info
  • Host: GitHub
  • Owner: dptech-corp
  • License: gpl-3.0
  • Language: Python
  • Default Branch: main
  • Size: 389 KB
Statistics
  • Stars: 14
  • Watchers: 2
  • Forks: 10
  • Open Issues: 5
  • Releases: 0
Created about 2 years ago · Last pushed 9 months ago

README.md

NAG2G: Node-Aligned Graph-to-Graph Model

Welcome to the NAG2G (Node-Aligned Graph-to-Graph) repository! NAG2G is a state-of-the-art neural network model for retrosynthesis prediction.

JACS Au Paper · arXiv Preprint · Uni-Retro Platform

🔥 Latest Updates

📝 This branch was completed in 2023 but not released until now; it was opened in response to community interest.


New in this branch:

  • 💊 Enhanced Stereochemistry Support

    • Direct prediction of stereochemical features (e.g., chirality) from model outputs
    • No post-processing required for stereochemical reconstruction
  • Unified Bidirectional Synthesis

    • Single model supports both retrosynthesis and forward synthesis
  • 🗓️ August 2024: 💻 Initial codebase released (main branch)
  • 🗓️ February 2024: 🧪 Paper published in JACS Au
  • 🗓️ September 2023: 📄 Preprint available on arXiv

Environment Setup

To begin working with NAG2G, you'll need to set up your environment. Below is a step-by-step guide to get you started:

```bash
# Install Uni-Core
git clone https://github.com/dptech-corp/Uni-Core
cd Uni-Core
pip install .
cd -

# Install Unimol plus
cd unimol_plus
pip install .
cd -

# Install additional dependencies
pip install rdchiral transformers tokenizers omegaconf rdkit
```
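After installation, a quick import check can confirm that the packages are visible to the interpreter you plan to use. The following is a minimal sketch, not part of the repository: the helper `missing_modules` is hypothetical, and the import names in `ASSUMED_MODULES` are assumptions (pip package names and import names can differ). The import name of the Unimol plus package is not stated in this README, so it is left out; add it once you know it.

```python
import importlib.util

# Assumed import names for the packages installed above; adjust as needed.
ASSUMED_MODULES = [
    "unicore",
    "rdchiral",
    "transformers",
    "tokenizers",
    "omegaconf",
    "rdkit",
]


def missing_modules(names):
    """Return the names from `names` that cannot be located for import."""
    missing = []
    for name in names:
        try:
            found = importlib.util.find_spec(name) is not None
        except (ImportError, ValueError):
            # Treat any lookup failure as "not importable".
            found = False
        if not found:
            missing.append(name)
    return missing


if __name__ == "__main__":
    missing = missing_modules(ASSUMED_MODULES)
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All listed modules were found.")
```

The check only locates modules and does not import them, so it runs quickly and does not trigger any GPU or model initialization.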

Datasets and Pretrained Weights

You can obtain the USPTO-50k dataset and the pretrained model weights for USPTO-50k from Google Drive:

Model Validation

To validate the NAG2G model with the provided weights, follow the instructions below:

When using a dataset that does not include reactants, modify the valid.sh script by adding the --no_reactant flag at line 95 of the script.

When using your own dataset, update data_path in the valid.sh script so that it points to your data.

```bash
# Execute the validation script with the specified checkpoint file
sh valid.sh path2weight/NAG2Gunimolplususpto50k20230513-222355/checkpoint_last.pt
```
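If you prefer to launch validation from Python, for example to loop over several checkpoints, a thin wrapper can check that the files exist before calling the script. This is a sketch under stated assumptions: `build_valid_command` and `run_validation` are hypothetical helpers, not part of the repository, and the only behavior taken from the README is that `valid.sh` receives the checkpoint path as its single argument.

```python
import subprocess
from pathlib import Path


def build_valid_command(checkpoint, script="valid.sh"):
    """Return the argument list equivalent to `sh valid.sh <checkpoint>`."""
    return ["sh", str(script), str(checkpoint)]


def run_validation(checkpoint, script="valid.sh"):
    """Check that the script and checkpoint exist, then run validation."""
    for path in (Path(script), Path(checkpoint)):
        if not path.is_file():
            raise FileNotFoundError(f"Required file not found: {path}")
    # check=True raises CalledProcessError if the script exits non-zero.
    return subprocess.run(build_valid_command(checkpoint, script), check=True)
```

Run it from the directory that contains `valid.sh`, passing the checkpoint path you would otherwise give on the command line: `run_validation(checkpoint_path)`.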

Data Preprocessing Instructions

If you need to regenerate the dataset, please refer to the code inside the data_preprocess directory.

```bash
cd data_preprocess
python lmdb_preprocess <input_csv> <output_lmdb>
```

Two sample CSV files are provided for reference:

  • sample.csv: This sample includes given reactants.
  • sample_without_reactants.csv: This sample does not include given reactants.
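The column layout of the sample files is not described in this README, so before preparing your own input it helps to look at their headers and first rows. The snippet below is a minimal sketch using only the standard library; `preview_csv` is a hypothetical helper, not part of the repository.

```python
import csv


def preview_csv(path, n_rows=3):
    """Return (header, first n_rows data rows) of a CSV file."""
    with open(path, newline="", encoding="utf-8") as handle:
        reader = csv.reader(handle)
        header = next(reader, [])
        # zip stops after n_rows without reading the rest of the file.
        rows = [row for _, row in zip(range(n_rows), reader)]
    return header, rows
```

Calling `preview_csv("sample.csv")` and `preview_csv("sample_without_reactants.csv")` from inside data_preprocess lets you compare the two layouts and mirror the one that matches your data.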


For any questions or issues, please open an issue on our GitHub repository.

Thank you for your interest in NAG2G!

Owner

  • Name: DP Technology
  • Login: dptech-corp
  • Kind: organization
  • Location: China

GitHub Events

Total
  • Issues event: 2
  • Watch event: 5
  • Issue comment event: 5
  • Push event: 2
  • Fork event: 6
  • Create event: 1
Last Year
  • Issues event: 2
  • Watch event: 5
  • Issue comment event: 5
  • Push event: 2
  • Fork event: 6
  • Create event: 1

Dependencies

unimol_plus/setup.py pypi
  • numpy *
  • pandas *