Science Score: 36.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ○ CITATION.cff file
- ✓ codemeta.json file: found codemeta.json file
- ○ .zenodo.json file
- ✓ DOI references: found 1 DOI reference(s) in README
- ✓ Academic publication links: links to arxiv.org, nature.com, acs.org
- ○ Committers with academic emails
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: low similarity (9.2%) to scientific vocabulary
Keywords
Repository
Geom3D: Geometric Modeling on 3D Structures, NeurIPS 2023
Basic Info
- Host: GitHub
- Owner: chao1224
- License: mit
- Language: Python
- Default Branch: main
- Homepage: https://openreview.net/forum?id=ygXSNrIU1p
- Size: 870 KB
Statistics
- Stars: 123
- Watchers: 2
- Forks: 13
- Open Issues: 4
- Releases: 0
Topics
Metadata Files
README.md
Symmetry-Informed Geometric Representation for Molecules, Proteins, and Crystalline Materials
Authors: Shengchao Liu, Weitao Du, Yanjing Li, Zhuoxinran Li, Zhiling Zheng, Chenru Duan, Zhiming Ma, Omar Yaghi, Anima Anandkumar, Christian Borgs, Jennifer Chayes, Hongyu Guo, Jian Tang
[ArXiv]
This is Geom3D, a platform for geometric modeling on 3D structures.
Environment
Conda
Set up Anaconda:
```bash
wget https://repo.continuum.io/archive/Anaconda3-2019.10-Linux-x86_64.sh
bash Anaconda3-2019.10-Linux-x86_64.sh -b
export PATH=$PWD/anaconda3/bin:$PATH
```
Packages
Start with some basic packages.
```bash
conda create -n Geom3D python=3.7
conda activate Geom3D
conda install -y -c rdkit rdkit
conda install -y numpy networkx scikit-learn
conda install -y -c conda-forge -c pytorch pytorch=1.9.1
conda install -y -c pyg -c conda-forge pyg=2.0.2
pip install ogb==1.2.1

pip install sympy

pip install ase

pip install lie_learn # for TFN and SE3-Trans

pip install packaging # for SEGNN
pip3 install e3nn # for SEGNN

pip install transformers # for smiles
pip install selfies # for selfies

pip install atom3d # for Atom3D
pip install cffi # for Atom3D
pip install biopython # for Atom3D

pip install cython # for pyximport

conda install -y -c conda-forge py-xgboost-cpu # for XGB
```
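After installing, it can help to confirm that the packages are visible inside the `Geom3D` environment. The following is a minimal sketch, not part of the repository: the helper `find_missing` and the list `REQUIRED_MODULES` are hypothetical names, and the import names are assumed from each package's usual convention (for example `scikit-learn` imports as `sklearn`, `pyg` as `torch_geometric`, `biopython` as `Bio`).

```python
import importlib.util

# Import names assumed for the packages installed above. Several differ
# from the pip/conda package name (scikit-learn -> sklearn, pyg ->
# torch_geometric, biopython -> Bio, py-xgboost-cpu -> xgboost).
REQUIRED_MODULES = [
    "rdkit", "numpy", "networkx", "sklearn", "torch", "torch_geometric",
    "ogb", "sympy", "ase", "lie_learn", "packaging", "e3nn",
    "transformers", "selfies", "atom3d", "cffi", "Bio", "Cython", "xgboost",
]


def find_missing(modules):
    """Return the top-level module names that cannot be located."""
    # find_spec looks a module up without importing it, so the check is
    # fast and does not trigger heavy imports such as torch.
    return [name for name in modules if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = find_missing(REQUIRED_MODULES)
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All listed modules were found.")
```

Run it with the environment activated. A module reported as missing only means it could not be located; it says nothing about whether the installed version matches the pins above.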
Datasets
We cover three types of datasets:
- Small Molecules
  - QM9
  - MD17
  - rMD17
  - COLL
- Proteins
  - EC
  - FOLD
- Small Molecules and Proteins
  - LBA
  - LEP
- Materials
  - MatBench
  - QMOF
For dataset acquisition:
- We provide raw and processed datasets on HuggingFace. You can download the data by running `python download_data.py` under `./data`.
- Please refer to the data folder for more details.
Overview of Models
Representation Models
Geom3D includes the following representation models:
- SchNet, NeurIPS'18
- TFN, NeurIPS'18 Workshop
- DimeNet, ICLR'20
- SE(3)-Trans, NeurIPS'20
- EGNN, ICML'21
- PaiNN, ICML'21
- GemNet, NeurIPS'21
- SphereNet, ICLR'22
- SEGNN, ICLR'22
- NequIP, Nature Communications'22
- Allegro, Nature Communications'23
- Equiformer, ICLR'23
- GVP-GNN, ICLR'21
- IEConv, ICLR'21
- GearNet, ICLR'23
- ProNet, ICLR'23
- CDConv, ICLR'23
We also include the following 7 1D models and 11 2D models (specifically for small molecules):
- 1D Fingerprints: MLP, RF, XGB
- 1D SMILES: CNN, BERT
- 1D Selfies: CNN, BERT
- 2D topology:
  - GCN, NeurIPS'2015
  - ENN-S2S, ICML'17
  - GraphSAGE, NeurIPS'17
  - GAT, ICLR'2018
  - GIN, ICLR'2019
  - D-MPNN, ACS-JCIM'2019
  - N-Gram Graph, NeurIPS'2019
  - PNA, NeurIPS'2020
  - Graphormer, NeurIPS'21
  - AWARE, TMLR'2022
  - GraphGPS, NeurIPS'22
Note that no pretraining is considered at this stage. For geometric pretraining models, please see the following section.
Geometric Pretraining
We include the following 14 geometric pretraining methods:
- Pure 3D:
  - Supervised
  - Atom Type Prediction
  - Distance Prediction
  - Angle Prediction
  - 3D InfoGraph, from GeoSSL, ICLR'23
  - GeoSSL-RR, from GeoSSL, ICLR'23
  - GeoSSL-InfoNCE, from GeoSSL, ICLR'23
  - GeoSSL-EBM-NCE, from GeoSSL, ICLR'23
  - GeoSSL-DDM, ICLR'23
  - GeoSSL-DDM-1L, ICLR'23
  - 3D-EMGP, AAAI'23
- Joint 2D-3D:
Scripts
The Python scripts can be found in `examples_3D`. We list the bash scripts (and hyperparameters) in `scripts`. For example, the bash script for SchNet on QM9 is:
```bash
cd examples_3D
export model_3d=SchNet
export dataset=QM9
export task_list=(mu alpha homo lumo gap r2 zpve u0 u298 h298 g298 cv)
export lr_list=(5e-4)
export lr_scheduler_list=(CosineAnnealingLR)
export split=customized_01
export seed=42
export emb_dim_list=(128 300)
export batch_size_list=(128)
export epochs=1000
# Sweep every combination of task and hyperparameters.
for task in "${task_list[@]}"; do
for lr in "${lr_list[@]}"; do
for lr_scheduler in "${lr_scheduler_list[@]}"; do
for emb_dim in "${emb_dim_list[@]}"; do
for batch_size in "${batch_size_list[@]}"; do
    export output_model_dir=output/random/"$model_3d"/"$dataset"/"$task"_"$split"_"$seed"/"$lr"_"$lr_scheduler"_"$emb_dim"_"$batch_size"_"$epochs"
    export output_file="$output_model_dir"/result.out
    mkdir -p "$output_model_dir"
    python finetune_QM9.py \
    --model_3d="$model_3d" --dataset="$dataset" --epochs="$epochs" \
    --task="$task" \
    --split="$split" --seed="$seed" \
    --batch_size="$batch_size" \
    --emb_dim="$emb_dim" \
    --lr="$lr" --lr_scheduler="$lr_scheduler" --no_eval_train --print_every_epoch=1 --num_workers=8 \
    --output_model_dir="$output_model_dir" \
    > "$output_file"
done
done
done
done
done
```
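To preview what the sweep will launch before running it, the nested loops can be mirrored in Python. This is a minimal sketch rather than part of the repository: the function `iter_runs` is a hypothetical helper, and the variable names assume the underscore spelling used by the command-line flags (`--emb_dim`, `--batch_size`, and so on). It only enumerates the grid and builds each output directory path; it does not start any training.

```python
import itertools

# Values copied from the bash script; names follow the CLI flag spelling.
model_3d = "SchNet"
dataset = "QM9"
task_list = ["mu", "alpha", "homo", "lumo", "gap", "r2",
             "zpve", "u0", "u298", "h298", "g298", "cv"]
lr_list = ["5e-4"]  # kept as a string so the path matches the shell expansion
lr_scheduler_list = ["CosineAnnealingLR"]
split = "customized_01"
seed = 42
emb_dim_list = [128, 300]
batch_size_list = [128]
epochs = 1000


def iter_runs():
    """Yield (task, lr, lr_scheduler, emb_dim, batch_size, output_model_dir)."""
    grid = itertools.product(
        task_list, lr_list, lr_scheduler_list, emb_dim_list, batch_size_list
    )
    for task, lr, lr_scheduler, emb_dim, batch_size in grid:
        # Same layout as output_model_dir in the bash script.
        output_model_dir = "/".join([
            "output", "random", model_3d, dataset,
            f"{task}_{split}_{seed}",
            f"{lr}_{lr_scheduler}_{emb_dim}_{batch_size}_{epochs}",
        ])
        yield task, lr, lr_scheduler, emb_dim, batch_size, output_model_dir


if __name__ == "__main__":
    runs = list(iter_runs())
    print(f"{len(runs)} runs")
    for run in runs[:3]:
        print(run[-1])
```

With the values above, the grid has 12 tasks, 1 learning rate, 1 scheduler, 2 embedding sizes, and 1 batch size, so the sweep covers 24 runs executed one after another.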
Currently, only the bash scripts for QM9 are available. We will release the complete version soon, together with a notebook demo. Please stay tuned.
Checkpoints
Checkpoints for all the pretraining and downstream tasks will be released soon.
Cite us
Feel free to cite this work if you find it useful!
@article{liu2023symmetry,
title={Symmetry-Informed Geometric Representation for Molecules, Proteins, and Crystalline Materials},
author={Liu, Shengchao and Du, Weitao and Li, Yanjing and Li, Zhuoxinran and Zheng, Zhiling and Duan, Chenru and Ma, Zhiming and Yaghi, Omar and Anandkumar, Anima and Borgs, Christian and others},
journal={arXiv preprint arXiv:2306.09375},
year={2023}
}
Owner
- Name: Shengchao Liu
- Login: chao1224
- Kind: user
- Location: Montreal, QC, Canada
- Company: Mila-UdeM
- Website: chao1224.github.io
- Repositories: 7
- Profile: https://github.com/chao1224
Ph.D. candidate @ Mila-UdeM
GitHub Events
Total
- Watch event: 15
- Fork event: 5
Last Year
- Watch event: 15
- Fork event: 5
Committers
Last synced: over 1 year ago
Top Committers
| Name | Email | Commits |
|---|---|---|
| Shengchao Liu | s****r@g****m | 9 |
Issues and Pull Requests
Last synced: 6 months ago
All Time
- Total issues: 3
- Total pull requests: 4
- Average time to close issues: N/A
- Average time to close pull requests: 1 minute
- Total issue authors: 3
- Total pull request authors: 2
- Average comments per issue: 1.0
- Average comments per pull request: 0.25
- Merged pull requests: 3
- Bot issues: 0
- Bot pull requests: 0
Past Year
- Issues: 0
- Pull requests: 0
- Average time to close issues: N/A
- Average time to close pull requests: N/A
- Issue authors: 0
- Pull request authors: 0
- Average comments per issue: 0
- Average comments per pull request: 0
- Merged pull requests: 0
- Bot issues: 0
- Bot pull requests: 0
Top Authors
Issue Authors
- cheliu-computation (1)
- newalexander (1)
- BaruaBee (1)
Pull Request Authors
- chao1224 (3)
- YanjingLiLi (1)
Top Labels
Issue Labels
Pull Request Labels
Packages
- Total packages: 1
- Total downloads:
  - pypi: 27 last month
- Total dependent packages: 0
- Total dependent repositories: 0
- Total versions: 1
- Total maintainers: 1
pypi.org: geom3d
Geometric Modeling on 3D Data
- Homepage: https://github.com/chao1224/Geom3D
- Documentation: https://geom3d.readthedocs.io/
- License: MIT
- Latest release: 0.0 (published over 2 years ago)