https://github.com/dptech-corp/nag2g
Science Score: 49.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ○ CITATION.cff file
- ✓ codemeta.json file: Found codemeta.json file
- ✓ .zenodo.json file: Found .zenodo.json file
- ✓ DOI references: Found 2 DOI reference(s) in README
- ✓ Academic publication links: Links to arxiv.org, acs.org
- ○ Academic email domains
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: Low similarity (14.2%) to scientific vocabulary
Repository
Basic Info
- Host: GitHub
- Owner: dptech-corp
- License: gpl-3.0
- Language: Python
- Default Branch: main
- Size: 389 KB
Statistics
- Stars: 14
- Watchers: 2
- Forks: 10
- Open Issues: 5
- Releases: 0
Metadata Files
README.md
NAG2G: Node-Aligned Graph-to-Graph Model
Welcome to the NAG2G (Node-Aligned Graph-to-Graph) repository! NAG2G is a state-of-the-art neural network model for retrosynthesis prediction.
🔥 Latest Updates
- 🗓️ May 2025: 🌿 The `with_stereoisomerism` branch is now publicly available.
  📝 This branch was completed in 2023 but was not released until now; it was opened in response to community interest.
New in this branch:
💊 Enhanced Stereochemistry Support
- Direct prediction of stereochemical features (e.g., chirality) from model outputs
- No post-processing required for stereochemical reconstruction
⇄ Unified Bidirectional Synthesis
Environment Setup
To begin working with NAG2G, you'll need to set up your environment. Below is a step-by-step guide to get you started:
```bash
# Install Uni-Core
git clone https://github.com/dptech-corp/Uni-Core
cd Uni-Core
pip install .
cd -

# Install Unimol plus
cd unimol_plus
pip install .
cd -

# Install additional dependencies
pip install rdchiral transformers tokenizers omegaconf rdkit
```
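After installation, a quick check that the packages are visible to the interpreter can save a failed run later. The snippet below is a minimal sketch, not part of the repository. The import names are assumptions: `unicore` for Uni-Core, and the pip package names for the rest. The Unimol plus module is left out because its import name is not given here.

```python
from importlib.util import find_spec

# Assumed import names: "unicore" for Uni-Core; the others are taken
# from the pip package names in the install step.
REQUIRED_MODULES = [
    "unicore",
    "rdchiral",
    "transformers",
    "tokenizers",
    "omegaconf",
    "rdkit",
]


def check_modules(names=REQUIRED_MODULES):
    """Return {module_name: bool} by locating each module without importing it."""
    return {name: find_spec(name) is not None for name in names}


if __name__ == "__main__":
    for name, found in check_modules().items():
        print(f"{name:<14}{'ok' if found else 'MISSING'}")
```

Any module reported as `MISSING` points at the install step to repeat.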
Datasets and Pretrained Weights
You can obtain the USPTO-50k dataset and the corresponding pretrained model weights from Google Drive:
Model Validation
To validate the NAG2G model with the provided weights, follow the instructions below:
When using a dataset that does not include reactants, modify the `valid.sh` script by adding the `--no_reactant` flag at line 95.
When using your own dataset, update `data_path` in the `valid.sh` script to point to it.
```bash
# Execute the validation script with the specified checkpoint file
sh valid.sh path2weight/NAG2Gunimolplususpto50k20230513-222355/checkpoint_last.pt
```
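If you launch validation from Python, for example to loop over several checkpoints, it can help to confirm the checkpoint file exists before calling the script. The following is a hedged sketch: `build_valid_command` and `run_validation` are hypothetical helper names, not part of NAG2G, and the only interface assumed is the `sh valid.sh <checkpoint>` call shown above.

```python
import subprocess
from pathlib import Path


def build_valid_command(checkpoint, script="valid.sh"):
    """Build the argv list for `sh valid.sh <checkpoint>`.

    Raises FileNotFoundError early if the checkpoint file is missing.
    """
    ckpt = Path(checkpoint)
    if not ckpt.is_file():
        raise FileNotFoundError(f"checkpoint not found: {ckpt}")
    return ["sh", script, str(ckpt)]


def run_validation(checkpoint):
    """Run the validation script from the repository root (hypothetical helper)."""
    # check=True raises CalledProcessError if valid.sh exits with a non-zero status.
    subprocess.run(build_valid_command(checkpoint), check=True)
```

Passing the command as a list rather than a single string avoids shell quoting problems when the checkpoint path contains spaces.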
Data Preprocessing Instructions
If you need to regenerate the dataset, please refer to the code inside the data_preprocess directory.
```bash
cd data_preprocess
python lmdb_preprocess <input_csv> <output_lmdb>
```
Two sample CSV files are provided for reference:
- sample.csv: This sample includes given reactants.
- sample_without_reactants.csv: This sample does not include given reactants.
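To decide which of the two cases a dataset falls into, you could inspect its entries before preprocessing. The sketch below is illustrative only and rests on an assumption that is not confirmed here: that each record is either a reaction SMILES of the form `reactants>agents>products` or a bare product SMILES. The function name `has_reactants` is hypothetical, and the actual column layout of the sample CSV files should be checked first.

```python
def has_reactants(entry):
    """Return True if a reaction SMILES has a non-empty reactant side.

    Assumed formats (not confirmed by the repository):
      "reactants>agents>products"  -> reaction SMILES
      "product"                    -> bare product SMILES, no reactants
    """
    entry = entry.strip()
    if ">" not in entry:
        return False  # bare product SMILES
    parts = entry.split(">")
    if len(parts) != 3:
        raise ValueError(f"not a valid reaction SMILES: {entry!r}")
    return parts[0] != ""


def dataset_lacks_reactants(entries):
    """True if no entry carries reactants."""
    return not any(has_reactants(e) for e in entries)
```

For example, `has_reactants("CC(=O)O.OCC>>CC(=O)OCC")` is true, while `has_reactants(">>CC(=O)OCC")` and `has_reactants("CC(=O)OCC")` are both false.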
For any questions or issues, please open an issue on our GitHub repository.
Thank you for your interest in NAG2G!
Owner
- Name: DP Technology
- Login: dptech-corp
- Kind: organization
- Location: China
- Website: https://www.dp.tech/en
- Repositories: 9
- Profile: https://github.com/dptech-corp
GitHub Events
Total
- Issues event: 2
- Watch event: 5
- Issue comment event: 5
- Push event: 2
- Fork event: 6
- Create event: 1
Last Year
- Issues event: 2
- Watch event: 5
- Issue comment event: 5
- Push event: 2
- Fork event: 6
- Create event: 1
Dependencies
- numpy *
- pandas *