hydragnn

Distributed PyTorch implementation of multi-headed graph convolutional neural networks

https://github.com/ornl/hydragnn

Science Score: 75.0%

This score indicates how likely this project is to be science-related based on various indicators:

✓ CITATION.cff file: Found CITATION.cff file
✓ codemeta.json file: Found codemeta.json file
✓ .zenodo.json file: Found .zenodo.json file
✓ DOI references: Found 2 DOI reference(s) in README
○ Academic publication links
✓ Committers with academic emails: 5 of 15 committers (33.3%) from academic institutions
✓ Institutional organization owner: Organization ornl has institutional domain (software.ornl.gov)
○ JOSS paper metadata
○ Scientific vocabulary similarity: Low similarity (16.3%) to scientific vocabulary

Keywords

machine-learning

Last synced: 6 months ago

Repository

Distributed PyTorch implementation of multi-headed graph convolutional neural networks

Basic Info

Host: GitHub
Owner: ORNL
License: bsd-3-clause
Language: Python
Default Branch: main
Homepage:
Size: 10.2 MB

Statistics

Stars: 85
Watchers: 8
Forks: 33
Open Issues: 18
Releases: 3

Topics

machine-learning

Created almost 5 years ago · Last pushed 6 months ago

Metadata Files

Readme Contributing License Citation

HydraGNN

Distributed PyTorch implementation of multi-headed graph convolutional neural networks

Capabilities

Multi-headed Prediction for graph and node-level properties
Distributed Data Parallelism at supercomputing level
Convolutional Layers as a hyperparameter
Geometric Equivariance in convolution and prediction
Global Attention

Dependencies

To install the required packages with only basic capability (torch, torch_geometric, and related packages) and to serialize and store the processed data for later sessions (pickle5):

    pip install -r requirements.txt
    pip install -r requirements-torch.txt
    pip install -r requirements-pyg.txt

If you plan to modify the code, also install the packages for formatting (black) and testing (pytest):

    pip install -r requirements-dev.txt

Detailed dependency installation instructions are available on the Wiki.

Installation

After checking out HydraGNN, we recommend installing it in developer mode so that you can use the files in their current location and update them as needed:

    python -m pip install -e .

Or, simply run the following in the HydraGNN directory:

    export PYTHONPATH=$PWD:$PYTHONPATH

Alternatively, if you have no plans to modify the code, you can install HydraGNN into your Python tree as a static package:

    python setup.py install
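To confirm that one of the installation routes above took effect, it can help to ask Python where it would import each package from. The following is a minimal sketch using only the standard library; the helper name `locate` is hypothetical, and the package names checked are the ones mentioned in the Dependencies notes.

```python
import importlib.util


def locate(names):
    """Map each top-level package name to its import location, or None if absent."""
    result = {}
    for name in names:
        spec = importlib.util.find_spec(name)
        if spec is None:
            result[name] = None
        else:
            # Namespace packages have no single origin file.
            result[name] = spec.origin or "(namespace package)"
    return result


if __name__ == "__main__":
    for name, origin in locate(["hydragnn", "torch", "torch_geometric"]).items():
        print(f"{name}: {origin or 'NOT FOUND'}")
```

With a developer-mode install or the PYTHONPATH approach, the path reported for `hydragnn` should point into your checkout rather than into `site-packages`.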

Quick Start

For detailed instructions, see the Comprehensive User Manual.

Below are the four main functionalities for running the code.

1. Training a model, including continuing from a previously trained model using configuration options:

       import hydragnn
       hydragnn.run_training("examples/configuration.json")

2. Saving a model state:

       import hydragnn
       # `model` and `optimizer` are assumed to exist already
       model_name = "model_checkpoint.pk"
       hydragnn.save_model(model, optimizer, model_name, path="./logs/")

3. Loading a model state:

       import hydragnn
       model_name = "model_checkpoint.pk"
       hydragnn.load_existing_model(model, model_name, path="./logs/")

4. Making predictions from a previously trained model:

       import hydragnn
       hydragnn.run_prediction("examples/configuration.json", model)

The run_training and run_prediction functions are convenience routines that encapsulate all the steps of the training process (data generation, data pre-processing, training of HydraGNN models, and use of trained HydraGNN models for inference) on toy problems, which are included in the CI test workflows. Both require a JSON input file for configurable options. The save_model and load_existing_model functions store and retrieve model checkpoints for continued training and subsequent inference. Ad hoc example scripts, where data pre-processing, training, and inference are done for specific datasets, are provided in the examples folder.
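Because both entry points take a JSON file, it can be worth checking that file before starting a (possibly distributed) job. The sketch below is one way to do that, not part of HydraGNN itself: `read_config` and `train` are hypothetical helper names, and the only HydraGNN call used is `hydragnn.run_training` with a path string, as shown above.

```python
import json
from pathlib import Path


def read_config(config_path):
    """Return the parsed JSON config, failing early on a missing or malformed file."""
    path = Path(config_path)
    if not path.is_file():
        raise FileNotFoundError(f"configuration file not found: {path}")
    with path.open() as f:
        return json.load(f)  # raises json.JSONDecodeError on malformed JSON


def train(config_path):
    """Check the config file, then hand its path to hydragnn.run_training."""
    read_config(config_path)
    # Imported lazily so read_config can be used without HydraGNN installed.
    import hydragnn

    hydragnn.run_training(str(config_path))
```

Calling `train("examples/configuration.json")` then behaves like the first Quick Start item, except that a wrong path or a JSON syntax error is reported before any HydraGNN code runs.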

Datasets

Built-in examples are provided for testing purposes only. One source of data for creating HydraGNN surrogate predictions is DFT output on the OLCF Constellation: https://doi.ccs.ornl.gov/

Detailed instructions are available on the Wiki.

Configurable settings

HydraGNN uses a JSON configuration file (examples in examples/).

There are many options for HydraGNN; the dataset and model type are particularly important:

- ["Verbosity"]["level"]: 0, 1, 2, 3, 4 (int)
- ["Dataset"]["name"]: CuAu_32atoms, FePt_32atoms, FeSi_1024atoms (str)

Additionally, many important arguments fall within the ["NeuralNetwork"] section:

["NeuralNetwork"]
- ["Architecture"]
  - ["mpnn_type"]
    Accepted types: CGCNN, DimeNet, EGNN, GAT, GIN, MACE, MFC, PAINN, PNAEq, PNAPlus, PNA, SAGE, SchNet (str)
  - ["num_conv_layers"]
    Examples: 1, 2, 3, 4 ... (int)
  - ["output_heads"]
    Task types: node, graph (str)
  - ["global_attn_engine"]
    Accepted types: GPS, None
  - ["global_attn_type"]
    Accepted types: multihead
  - ["pe_dim"]
    Dimension of positional encodings (int)
  - ["global_attn_heads"]
    Examples: 1, 2, 3, 4 ... (int)
  - ["hidden_dim"]
    Dimension of node embeddings during convolution (int); must be a multiple of "global_attn_heads" if "global_attn_engine" is not "None"
- ["Variables of Interest"]
  - ["input_node_features"]
    Indices from nodal data used as inputs (int)
  - ["output_index"]
    Indices from data used as targets (int)
  - ["type"]
    Either node or graph (string)
  - ["output_dim"]
    Dimensions of prediction tasks (list)
- ["Training"]
  - ["num_epoch"]
    Examples: 75, 100, 250 (int)
  - ["batch_size"]
    Examples: 16, 32, 64 (int)
  - ["Optimizer"]["learning_rate"]
    Examples: 2e-3, 0.005 (float)
  - ["compute_grad_energy"]
    Use the gradient of energy to predict forces (bool)
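The nesting above can be easier to follow as a concrete structure. The sketch below builds a partial configuration in Python using only the keys listed in this section, checks the `hidden_dim` rule, and serializes the result to JSON. It is a fragment under stated assumptions: the values are illustrative, `check_architecture` is a hypothetical helper, `output_heads` is omitted because its layout is not spelled out here, and a disabled attention engine is assumed to appear as either the string "None" or a JSON null. A real configuration needs more keys; see the files in examples/.

```python
import json

ACCEPTED_MPNN_TYPES = {
    "CGCNN", "DimeNet", "EGNN", "GAT", "GIN", "MACE", "MFC",
    "PAINN", "PNAEq", "PNAPlus", "PNA", "SAGE", "SchNet",
}


def check_architecture(arch):
    """Return a list of problems found in an ["Architecture"] section."""
    problems = []
    if arch.get("mpnn_type") not in ACCEPTED_MPNN_TYPES:
        problems.append(f"unknown mpnn_type: {arch.get('mpnn_type')!r}")
    # Assumption: "no global attention" is written as "None" or as JSON null.
    if arch.get("global_attn_engine") not in (None, "None"):
        heads = arch.get("global_attn_heads")
        hidden = arch.get("hidden_dim")
        valid = isinstance(heads, int) and isinstance(hidden, int) and heads > 0
        if not valid or hidden % heads != 0:
            problems.append("hidden_dim must be a multiple of global_attn_heads")
    return problems


# Illustrative values only; key names and nesting follow the list above.
config = {
    "Verbosity": {"level": 2},
    "NeuralNetwork": {
        "Architecture": {
            "mpnn_type": "PNA",
            "num_conv_layers": 2,
            "global_attn_engine": "GPS",
            "global_attn_type": "multihead",
            "pe_dim": 1,
            "global_attn_heads": 2,
            "hidden_dim": 8,
        },
        "Variables of Interest": {
            "input_node_features": [0],
            "output_index": [0],
            "type": ["graph"],
            "output_dim": [1],
        },
        "Training": {
            "num_epoch": 100,
            "batch_size": 32,
            "Optimizer": {"learning_rate": 2e-3},
            "compute_grad_energy": False,
        },
    },
}

problems = check_architecture(config["NeuralNetwork"]["Architecture"])
config_text = json.dumps(config, indent=2)
```

Here `problems` is empty because 8 is a multiple of 2; changing `hidden_dim` to 7 would produce one entry.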

Citations

"HydraGNN: Distributed PyTorch implementation of multi-headed graph convolutional neural networks", Copyright ID#: 81929619 https://doi.org/10.11578/dc.20211019.2

Contributing

We encourage you to contribute to HydraGNN! Please check the guidelines on how to do so.

Documentation

Quick Start: This README provides basic usage examples
Comprehensive User Manual: Detailed guide covering data pre-processing, model construction, scalable data management, and training
Wiki: Additional technical documentation and datasets

Owner

Name: Oak Ridge National Laboratory
Login: ORNL
Kind: organization
Email: software@ornl.gov
Location: Oak Ridge TN

Website: http://software.ornl.gov
Repositories: 99
Profile: https://github.com/ORNL

Software repositories from Oak Ridge National Laboratory

Citation (CITATION.cff)

cff-version: 1.2.0
message: "If you use this software, please cite it as below."
authors:
- family-names: "Lupo Pasini"
  given-names: "Massimiliano"
  orcid: "https://orcid.org/0000-0002-4980-6924"
- family-names: "Reeve"
  given-names: "Samuel Temple"
  orcid: "https://orcid.org/0000-0002-4250-9476"
- family-names: "Zhang"
  given-names: "Pei"
  orcid: "https://orcid.org/0000-0002-8351-0529"
- family-names: "Choi"
  given-names: "Jong Youl"
  orcid: "https://orcid.org/0000-0002-6459-6152"
title: "HydraGNN - Distributed PyTorch implementation of multi-headed graph convolutional neural networks"
version: 1.0.0
doi: 10.11578/dc.20211019.2
date-released: 2021-10-19
url: "https://github.com/ORNL/HydraGNN"

GitHub Events

Total

Create event: 14
Issues event: 9
Watch event: 20
Delete event: 8
Member event: 1
Issue comment event: 70
Push event: 53
Pull request review comment event: 104
Pull request review event: 166
Pull request event: 100
Fork event: 8

Last Year

Create event: 14
Issues event: 9
Watch event: 20
Delete event: 8
Member event: 1
Issue comment event: 70
Push event: 53
Pull request review comment event: 104
Pull request review event: 166
Pull request event: 100
Fork event: 8

Committers

Last synced: 10 months ago

All Time

Total Commits: 605
Total Committers: 15
Avg Commits per committer: 40.333
Development Distribution Score (DDS): 0.726

Past Year

Commits: 69
Committers: 11
Avg Commits per committer: 6.273
Development Distribution Score (DDS): 0.623

Top Committers

Name	Email	Commits
Massimiliano Lupo Pasini	m**i@g**m	166
Pei Zhang	z**1@o**v	144
Sam Reeve	6****e	117
Marko Burcul	b**o@g**m	63
Jong Choi	j****c	56
RylieWeaver	1****r	26
Justin Baker	b**r@m**u	11
Massimiliano Lupo Pasini	7**l@o**v	8
Zhifan Ye	y**n@m**n	5
Erdem Caliskan	1****n	2
Kshitij Mehta	k****a	2
Acer, Seher	a**s@o**v	2
Arindam Chowdhury	5****8	1
Chaojian Li	t**5@g**m	1
Saurav Maheshkar	s**r@g**m	1
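
The all-time summary figures can be reproduced from the commit counts in the table above. The sketch below assumes the usual definition of the Development Distribution Score, 1 minus the top committer's share of all commits. That definition is not stated on this page, so treat it as an assumption; it does match the reported values.

```python
# Commit counts copied from the Top Committers table, one entry per row.
commits = [166, 144, 117, 63, 56, 26, 11, 8, 5, 2, 2, 2, 1, 1, 1]

total = sum(commits)               # total commits
average = total / len(commits)     # average commits per committer
dds = 1 - max(commits) / total     # assumed DDS definition (see above)

print(total, round(average, 3), round(dds, 3))  # prints: 605 40.333 0.726
```

Note that the table lists Massimiliano Lupo Pasini twice under different email addresses; the calculation treats those rows as separate committers, as the page does.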

Committer Domains (Top 20 + Academic)

ornl.gov: 3
mail.ustc.edu.cn: 1
math.utah.edu: 1

Issues and Pull Requests

Last synced: 6 months ago

All Time

Total issues: 31
Total pull requests: 140
Average time to close issues: about 1 year
Average time to close pull requests: 19 days
Total issue authors: 7
Total pull request authors: 11
Average comments per issue: 0.84
Average comments per pull request: 1.3
Merged pull requests: 103
Bot issues: 0
Bot pull requests: 0

Past Year

Issues: 4
Pull requests: 64
Average time to close issues: 4 months
Average time to close pull requests: 13 days
Issue authors: 1
Pull request authors: 9
Average comments per issue: 0.0
Average comments per pull request: 1.14
Merged pull requests: 41
Bot issues: 0
Bot pull requests: 0

Top Authors

Issue Authors

allaffa (14)
streeve (10)
akhilpandey95 (4)
kshitij-v-mehta (4)
jinz2014 (2)
SauravMaheshkar (2)
lubbersnick (1)
jychoi-hpc (1)
seheracer (1)
pzhanggit (1)
Lance-Drane (1)

Pull Request Authors

allaffa (87)
jychoi-hpc (35)
RylieWeaver (27)
streeve (23)
pzhanggit (23)
ArCho48 (9)
LemonAndRabbit (5)
kshitij-v-mehta (5)
ashwinma (5)
zachfox (4)
erdemcaliskan (2)
JustinBakerMath (2)
licj15 (2)
frobnitzem (2)
seheracer (1)

Top Labels

Issue Labels

enhancement (5)
bug (2)
documentation (1)

Pull Request Labels

enhancement (75)
bug (41)
performance (5)
documentation (5)
help wanted (2)

Packages

Total packages: 1
Total downloads:
- pypi 17 last-month

Total dependent packages: 0
Total dependent repositories: 0
Total versions: 2
Total maintainers: 1

pypi.org: hydragnn

Distributed PyTorch implementation of multi-headed graph convolutional neural networks

Homepage: https://github.com/ORNL/HydraGNN
Documentation: https://hydragnn.readthedocs.io/
License: BSD-3
Latest release: 3.0
published over 2 years ago

Versions: 2
Dependent Packages: 0
Dependent Repositories: 0
Downloads: 17 Last month

Rankings

Dependent packages count: 9.7%

Average: 38.8%

Dependent repos count: 67.9%

Maintainers (1)

hydragnn

Last synced: 6 months ago

Dependencies

.github/workflows/CI.yml actions

actions/cache v2 composite
actions/checkout v2.2.0 composite
actions/setup-python v2 composite

examples/ising_model/requirements.txt pypi

sympy *

requirements-dev.txt pypi

black ==21.5b1 development
mendeleev * development
pre-commit * development
pytest * development
pytest-mpi * development

requirements-optional.txt pypi

deepspeed *

requirements.txt pypi

ase *
click ==8.0.0
matplotlib *
pickle5 *
psutil *
tensorboard *
torch ==1.10
tqdm *

setup.py pypi

matplotlib *
pickle5 *
tensorboard *
torch >=1.8
torch-geometric >=1.7.2
torch-scatter *
torch-sparse *
tqdm *

requirements-pyg.txt pypi

pyg_lib ==0.4.0
torch_cluster ==1.6.3
torch_geometric ==2.3.1
torch_scatter ==2.1.2
torch_sparse ==0.6.18
torch_spline_conv ==1.2.2

requirements-torch.txt pypi

torch ==2.0.1
torchaudio *
torchvision *
