GNS

GNS: A generalizable Graph Neural Network-based simulator for particulate and fluid modeling - Published in JOSS (2023)

https://github.com/geoelements/gns

Keywords

deep-learning graph-network-simulator machine-learning pytorch

Scientific Fields

Biology Life Sciences - 40% confidence

Last synced: 6 months ago

Repository

Graph Network Simulator

Basic Info

Host: GitHub
Owner: geoelements
License: other
Language: Python
Default Branch: main
Homepage: https://www.geoelements.org/gns/
Size: 19 MB

Statistics

Stars: 176
Watchers: 3
Forks: 43
Open Issues: 8
Releases: 4

Topics

deep-learning graph-network-simulator machine-learning pytorch

Created over 4 years ago · Last pushed 10 months ago

Metadata Files

Readme Contributing License Code of conduct Citation Authors

Graph Network Simulator (GNS) and MeshNet

Krishna Kumar, The University of Texas at Austin. Joseph Vantassel, Texas Advanced Computing Center, UT Austin. Yongjin Choi, The University of Texas at Austin.

Graph Network-based Simulator (GNS) is a generalizable, efficient, and accurate machine learning (ML)-based surrogate simulator for particulate and fluid systems using Graph Neural Networks (GNNs). GNS is a viable surrogate for numerical methods such as the Material Point Method, Smoothed Particle Hydrodynamics, and Computational Fluid Dynamics. GNS exploits distributed data parallelism to achieve fast multi-GPU training. The GNS code can handle complex boundary conditions and multi-material interactions.

MeshNet is a scalable surrogate simulator for mesh-based models such as Finite Element Analysis (FEA), Computational Fluid Dynamics (CFD), and Finite Difference Methods (FDM).

Run GNS/MeshNet

Training GNS/MeshNet on simulation data

```shell
# For particulate domain,
python3 -m gns.train --data_path="" --model_path="" --ntraining_steps=100

# For mesh-based domain,
python3 -m meshnet.train --data_path="" --model_path="" --ntraining_steps=100
```

Resume training

To resume training, specify `model_file` and `train_state_file`:

```shell
# For particulate domain,
python3 -m gns.train --data_path="" --model_path="" --model_file="model.pt" --train_state_file="train_state.pt" --ntraining_steps=100

# For mesh-based domain,
python3 -m meshnet.train --data_path="" --model_path="" --model_file="model.pt" --train_state_file="train_state.pt" --ntraining_steps=100
```

Rollout prediction

```shell
# For particulate domain,
python3 -m gns.train --mode="rollout" --data_path="" --model_path="" --output_path="" --model_file="model.pt" --train_state_file="train_state.pt"

# For mesh-based domain,
python3 -m meshnet.train --mode="rollout" --data_path="" --model_path="" --output_path="" --model_file="model.pt" --train_state_file="train_state.pt"
```

Render

```shell
# For particulate domain,
python3 -m gns.render_rollout --output_mode="gif" --rollout_dir="" --rollout_name=""

# For mesh-based domain,
python3 -m gns.render --rollout_dir="" --rollout_name=""
```

In the particulate domain, the renderer also writes .vtu files that can be visualized in ParaView.

Sand rollout

GNS prediction of sand rollout after training for 2 million steps.

In the mesh-based domain, the renderer writes a .gif animation.

Fluid flow rollout

MeshNet GNS prediction of cylinder flow after training for 1 million steps.

Command line arguments details

`train.py` in GNS (particulate domain)

* **mode (Enum)** Sets the operation mode for the script. It can take one of three values: 'train', 'valid', or 'rollout'.
* **batch_size (Integer)** Batch size for training.
* **noise_std (Float)** Standard deviation of the noise when training.
* **data_path (String)** Specifies the directory path where the dataset is located. The dataset is expected to be in a specific format (e.g., .npz files). It should contain `metadata.json`. If `--mode` is `train`, the directory should contain `train.npz`. If `--mode` is `rollout` (testing), the directory should contain `test.npz`. If `--mode` is `valid`, the directory should contain `valid.npz`.
* **model_path (String)** The directory path where the trained model checkpoints are saved during training or loaded from during validation/rollout.
* **output_path (String)** The directory where the outputs (e.g., rollout predictions) are saved when `--mode` is set to `rollout`.
* **output_filename (String)** Base filename to use when saving outputs during rollout. Default is "rollout", and the output will be saved as `rollout.pkl` in `output_path`. Do not include the file extension.
* **model_file (String)** The filename of the model checkpoint to load for validation or rollout (e.g., `model-10000.pt`). It supports the special value "latest" to automatically select the newest checkpoint file, which makes it easy to evaluate models at different stages of training.
* **train_state_file (String)** Similar to `model_file`, but for loading the training state (e.g., optimizer state), for example `training_state-10000.pt`. It supports the special value "latest" to automatically select the newest checkpoint file.
* **ntraining_steps (Integer)** The total number of training steps to execute before stopping.
* **nsave_steps (Integer)** Interval at which the model and training state are saved.
* **lr_init (Float)** Initial learning rate.
* **lr_decay (Float)** How much the learning rate should decay over time.
* **lr_decay_steps (Integer)** Steps at which the learning rate should decay.
* **cuda_device_number (Integer)** Base CUDA device (zero indexed). Default is None, so the default CUDA device will be used.
* **n_gpus (Integer)** Number of GPUs to use for training.
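The three learning-rate flags above suggest an exponential schedule. The following is a minimal sketch of one way they could combine, assuming the rate is scaled by `lr_decay` once per `lr_decay_steps` steps with smooth interpolation in between. The function name `decayed_learning_rate` and the numeric values are illustrative assumptions, not taken from the GNS source or its defaults.

```python
def decayed_learning_rate(step, lr_init, lr_decay, lr_decay_steps):
    """Exponentially decayed learning rate at a given training step.

    Assumed schedule: lr_init * lr_decay ** (step / lr_decay_steps).
    """
    return lr_init * lr_decay ** (step / lr_decay_steps)


# Illustrative values only (assumptions, not documented defaults).
for step in (0, 500_000, 1_000_000):
    lr = decayed_learning_rate(
        step, lr_init=1e-4, lr_decay=0.1, lr_decay_steps=1_000_000
    )
    print(f"step {step:>9}: lr = {lr:.2e}")
```

Under this assumption, a `lr_decay` of 0.1 means the learning rate has dropped by a factor of ten after `lr_decay_steps` steps.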

`train.py` in MeshNet (mesh-based domain)

* **mode (String)** Sets the operation mode for the script. It can take one of three values: 'train', 'valid', or 'rollout'.
* **batch_size (Integer)** Batch size for training.
* **data_path (String)** Specifies the directory path where the dataset is located. The dataset is expected to be in a specific format (e.g., .npz files). If `--mode` is `train`, the directory should contain `train.npz`. If `--mode` is `rollout` (testing), the directory should contain `test.npz`. If `--mode` is `valid`, the directory should contain `valid.npz`.
* **model_path (String)** The directory path where the trained model checkpoints are saved during training or loaded from during validation/rollout.
* **output_path (String)** The directory where the outputs (e.g., rollout predictions) are saved when `--mode` is set to `rollout`.
* **model_file (String)** The filename of the model checkpoint to load for validation or rollout (e.g., `model-10000.pt`). It supports the special value "latest" to automatically select the newest checkpoint file, which makes it easy to evaluate models at different stages of training.
* **train_state_file (String)** Similar to `model_file`, but for loading the training state (e.g., optimizer state), for example `training_state-10000.pt`. It supports the special value "latest" to automatically select the newest checkpoint file.
* **cuda_device_number (Integer)** Specifies a particular CUDA device for training or evaluation, enabling the use of specific GPUs in multi-GPU setups.
* **rollout_filename (String)** Base name for saving rollout files. The actual filenames append an index to this base name.
* **ntraining_steps (Integer)** The total number of training steps to execute before stopping.
* **nsave_steps (Integer)** Interval at which the model and training state are saved.
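The "latest" value for `model_file` and `train_state_file` implies some rule for picking the newest checkpoint. As a rough sketch, assuming checkpoints are named `<prefix>-<step>.pt` as in the examples above and that "newest" means the largest step number, the selection could look like this. The helper `find_latest` is hypothetical and is not the function used by GNS or MeshNet.

```python
import re
import tempfile
from pathlib import Path


def find_latest(directory, prefix, suffix=".pt"):
    """Return the filename `<prefix>-<step><suffix>` with the largest step.

    Returns None when no matching checkpoint exists in `directory`.
    """
    pattern = re.compile(rf"^{re.escape(prefix)}-(\d+){re.escape(suffix)}$")
    best_step, best_name = -1, None
    for path in Path(directory).iterdir():
        match = pattern.match(path.name)
        if match and int(match.group(1)) > best_step:
            best_step, best_name = int(match.group(1)), path.name
    return best_name


# Demonstration on empty placeholder files in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    for step in (1000, 10000, 2000):
        (Path(tmp) / f"model-{step}.pt").touch()
        (Path(tmp) / f"training_state-{step}.pt").touch()
    print(find_latest(tmp, "model"))
    print(find_latest(tmp, "training_state"))
```

Comparing the parsed step as an integer avoids the pitfall of a plain string sort, which would rank `model-2000.pt` above `model-10000.pt`.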

Datasets

Particulate domain:

We use the NumPy .npz format for storing positional data for GNS training. The .npz file holds a list of tuples of arbitrary length, where each tuple corresponds to a different training trajectory and is of the form `(position, particle_type)`. The data loader provides INPUT_SEQUENCE_LENGTH positions, set equal to six by default, to provide the GNS with the last INPUT_SEQUENCE_LENGTH minus one positions as input to predict the position at the next time step. `position` is a 3-D tensor of shape `(n_time_steps, n_particles, n_dimensions)` and `particle_type` is a 1-D tensor of shape `(n_particles,)`.
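To make the expected shapes concrete, here is a small sketch that checks one `(position, particle_type)` tuple for consistency. Nested Python lists stand in for arrays so the example needs no third-party packages; with NumPy one would compare `.shape` attributes instead. The helper name `trajectory_shape` and the toy values are assumptions for illustration only.

```python
def trajectory_shape(position, particle_type):
    """Validate one (position, particle_type) tuple.

    Returns (n_time_steps, n_particles, n_dimensions) or raises ValueError.
    """
    n_time_steps = len(position)
    n_particles = len(position[0])
    n_dimensions = len(position[0][0])
    for frame in position:
        if len(frame) != n_particles:
            raise ValueError("every time step must hold the same number of particles")
        if any(len(coords) != n_dimensions for coords in frame):
            raise ValueError("every particle must have the same number of coordinates")
    if len(particle_type) != n_particles:
        raise ValueError("particle_type needs exactly one entry per particle")
    return n_time_steps, n_particles, n_dimensions


# Toy trajectory: 8 time steps, 3 particles, 2-D; all particles share type 0.
position = [[[0.1 * p, 0.01 * t] for p in range(3)] for t in range(8)]
particle_type = [0, 0, 0]
print(trajectory_shape(position, particle_type))  # (8, 3, 2)
```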

The dataset contains:

* Metadata file with dataset information (sequence length, dimensionality, box bounds, default connectivity radius, statistics for normalization, ...):

      {
        "bounds": [[0.1, 0.9], [0.1, 0.9]],
        "sequence_length": 320,
        "default_connectivity_radius": 0.015,
        "dim": 2,
        "dt": 0.0025,
        "vel_mean": [5.123277536458455e-06, -0.0009965205918140803],
        "vel_std": [0.0021978993231675805, 0.0026653552458701774],
        "acc_mean": [5.237611158734309e-07, 2.3633027988858656e-07],
        "acc_std": [0.0002582944917306106, 0.00029554531667679154]
      }

* npz containing data for all trajectories (particle types, positions, global context, ...).
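The normalization statistics in the metadata file can be illustrated with a short sketch. It assumes that a particle's velocity is taken as the difference between two consecutive positions (without dividing by `dt`) and that it is standardized per dimension with `vel_mean` and `vel_std`. The function `normalized_velocity` and the sample positions are hypothetical, and this may differ from what GNS does internally.

```python
import json

# Metadata values copied from the example metadata file.
METADATA = json.loads("""
{
  "bounds": [[0.1, 0.9], [0.1, 0.9]],
  "sequence_length": 320,
  "default_connectivity_radius": 0.015,
  "dim": 2,
  "dt": 0.0025,
  "vel_mean": [5.123277536458455e-06, -0.0009965205918140803],
  "vel_std": [0.0021978993231675805, 0.0026653552458701774],
  "acc_mean": [5.237611158734309e-07, 2.3633027988858656e-07],
  "acc_std": [0.0002582944917306106, 0.00029554531667679154]
}
""")


def normalized_velocity(prev_position, next_position, metadata):
    """Finite-difference velocity of one particle, standardized per dimension.

    Assumption: velocity = next_position - prev_position (per time step).
    """
    return [
        ((nxt - prv) - mean) / std
        for prv, nxt, mean, std in zip(
            prev_position, next_position, metadata["vel_mean"], metadata["vel_std"]
        )
    ]


# Hypothetical positions of a single 2-D particle at two consecutive steps.
print(normalized_velocity([0.50, 0.50], [0.502, 0.498], METADATA))
```

Standardizing inputs this way keeps the values near unit scale regardless of the physical units of the underlying simulation.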

Training datasets for Sand, SandRamps, and WaterDropSample are available on DesignSafe Data Depot [@vantassel2022gnsdata].

We provide the following datasets:

* WaterDropSample (smallest dataset)
* Sand
* SandRamps

Download the datasets from DesignSafe DataDepot. If you use these datasets, please cite Vantassel and Kumar, 2022.

Mesh-based domain:

We also use the NumPy .npz format for storing data for training MeshNet.

The dataset contains:

* npz containing a Python dictionary describing mesh data and relevant dynamics at mesh nodes for all trajectories. The dictionary includes `{pos: (ntimestep, nnodes, ndims), node_type: (ntimestep, nnodes, ntypes), velocity: (ntimestep, nnodes, ndims), pressure: (ntimestep, nnodes, 1), cells: (ntimestep, ncells, 3)}`.

The dataset is shared on DesignSafe DataDepot. If you use this dataset, please cite Kumar and Choi, 2023.

Installation

GNS uses PyTorch Geometric and CUDA. These packages have specific requirements; please see the [PyG installation](https://pytorch-geometric.readthedocs.io/en/latest/notes/installation.html) guide for details.

CPU-only installation on Linux

    conda install -y pytorch torchvision torchaudio cpuonly -c pytorch
    conda install -y pyg -c pyg
    conda install -y pytorch-cluster -c pyg
    conda install -y absl-py -c anaconda
    conda install -y numpy dm-tree matplotlib-base pyevtk -c conda-forge

You can use the WaterDropSample dataset to check that your GNS code is working correctly.

To test the code you can run:

pytest test/

To test on the small WaterDropSample dataset:

```
git clone https://github.com/geoelements/gns-sample

TMP_DIR="./gns-sample"
DATASET_NAME="WaterDropSample"

mkdir -p ${TMP_DIR}/${DATASET_NAME}/models/
mkdir -p ${TMP_DIR}/${DATASET_NAME}/rollout/

DATA_PATH="${TMP_DIR}/${DATASET_NAME}/dataset/"
MODEL_PATH="${TMP_DIR}/${DATASET_NAME}/models/"
ROLLOUT_PATH="${TMP_DIR}/${DATASET_NAME}/rollout/"

python -m gns.train --data_path=${DATA_PATH} --model_path=${MODEL_PATH} --ntraining_steps=10
```

Building GNS environment on TACC (LS6 and Frontera)

To set up a virtualenv:

    sh ./build_venv.sh

Check that the tests run successfully, then start your environment:

    source start_venv.sh

Building GNS on MacOS

    pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cpu
    pip3 install torch_geometric
    pip3 install pyg_lib torch_scatter torch_sparse torch_cluster torch_spline_conv -f https://data.pyg.org/whl/torch-2.3.0+cpu.html
    pip3 install -r requirements.txt

GNS training in parallel

GNS can be trained in parallel on multiple nodes with multiple GPUs.

GNS Scaling results

RTXwaterdrop

GNS scaling results on TACC Frontera GPU nodes with RTX-5000 GPUs.

A100sand3d

GNS scaling results on TACC Lonestar6 GPU nodes with A100 GPUs.

Usage

Single Node, Multi-GPU

    python -m torch.distributed.launch --nnodes=1 --nproc_per_node=[GPU_PER_NODE] --node_rank=[LOCAL_RANK] --master_addr=[MAIN_RANK] gns/train_multinode.py [ARGS]

Multi-node, Multi-GPU

On each node, run:

    python -m torch.distributed.launch --nnodes=[NNODES] --nproc_per_node=[GPU_PER_NODE] --node_rank=[LOCAL_RANK] --master_addr=[MAIN_RANK] gns/train_multinode.py [ARGS]

Inspiration

The PyTorch versions of Graph Network Simulator and Mesh Graph Network Simulator are based on:

* https://arxiv.org/abs/2002.09405 and https://github.com/deepmind/deepmind-research/tree/master/learning_to_simulate
* https://arxiv.org/abs/2010.03409 and https://github.com/deepmind/deepmind-research/tree/master/meshgraphnets
* https://github.com/echowve/meshGraphNets_pytorch

Acknowledgement

This code is based upon work supported by the National Science Foundation under Grant OAC-2103937.

Citation

Repo

Kumar, K., & Vantassel, J. (2023). GNS: A generalizable Graph Neural Network-based simulator for particulate and fluid modeling. Journal of Open Source Software, 8(88), 5025. https://doi.org/10.21105/joss.05025

Dataset

Vantassel, Joseph; Kumar, Krishna (2022) “Graph Network Simulator Datasets.” DesignSafe-CI. https://doi.org/10.17603/ds2-0phb-dg64 v1
Kumar, K., Y. Choi. (2023) "Cylinder flow with graph neural network-based simulator." DesignSafe-CI. https://doi.org/10.17603/ds2-fzg7-1719

Owner

Name: Extreme-scale computational geomechanics research
Login: geoelements
Kind: organization

Repositories: 12
Profile: https://github.com/geoelements

JOSS Publication

GNS: A generalizable Graph Neural Network-based simulator for particulate and fluid modeling

Published

August 25, 2023

DOI

10.21105/joss.05025

Volume 8, Issue 88, Page 5025

Authors

Krishna Kumar

Assistant Professor, University of Texas at Austin, Texas, USA

Joseph Vantassel

Assistant Professor, Virginia Tech, Virginia, USA, Texas Advanced Computing Center, University of Texas at Austin, Texas, USA

Editor

Øystein Sørensen

Citation (CITATION.cff)

cff-version: "1.2.0"
authors:
- family-names: Kumar
  given-names: Krishna
  orcid: "https://orcid.org/0000-0003-2144-5562"
- family-names: Vantassel
  given-names: Joseph
  orcid: "https://orcid.org/0000-0002-1601-3354"
doi: 10.5281/zenodo.8249813
message: If you use this software, please cite our article in the
  Journal of Open Source Software.
preferred-citation:
  authors:
  - family-names: Kumar
    given-names: Krishna
    orcid: "https://orcid.org/0000-0003-2144-5562"
  - family-names: Vantassel
    given-names: Joseph
    orcid: "https://orcid.org/0000-0002-1601-3354"
  date-published: 2023-08-25
  doi: 10.21105/joss.05025
  issn: 2475-9066
  issue: 88
  journal: Journal of Open Source Software
  publisher:
    name: Open Journals
  start: 5025
  title: "GNS: A generalizable Graph Neural Network-based simulator for
    particulate and fluid modeling"
  type: article
  url: "https://joss.theoj.org/papers/10.21105/joss.05025"
  volume: 8
title: "GNS: A generalizable Graph Neural Network-based simulator for
  particulate and fluid modeling"

GitHub Events

Total

Issues event: 11
Watch event: 43
Issue comment event: 19
Push event: 10
Pull request review comment event: 2
Pull request event: 6
Pull request review event: 6
Fork event: 10

Last Year

Issues event: 11
Watch event: 43
Issue comment event: 19
Push event: 10
Pull request review comment event: 2
Pull request event: 6
Pull request review event: 6
Fork event: 10

Committers

Last synced: 7 months ago

All Time

Total Commits: 254
Total Committers: 8
Avg Commits per committer: 31.75
Development Distribution Score (DDS): 0.425

Past Year

Commits: 1
Committers: 1
Avg Commits per committer: 1.0
Development Distribution Score (DDS): 0.0

Top Committers

Name	Email	Commits
Krishna Kumar	k**k@u**u	146
Joseph Vantassel	j**l@u**u	64
baagee	y**2@g**m	33
Sikan Li	t**n@g**m	7
bumi001	g**d@g**m	1
Leila	1****n	1
Joseph Vantassel	3****l	1
Cheng-Hsi Hsiao	9****3	1

Committer Domains (Top 20 + Academic)

utexas.edu: 2

Issues and Pull Requests

Last synced: 6 months ago

All Time

Total issues: 32
Total pull requests: 59
Average time to close issues: 5 months
Average time to close pull requests: 11 days
Total issue authors: 15
Total pull request authors: 8
Average comments per issue: 1.66
Average comments per pull request: 0.42
Merged pull requests: 46
Bot issues: 0
Bot pull requests: 0

Past Year

Issues: 6
Pull requests: 5
Average time to close issues: 21 days
Average time to close pull requests: 11 days
Issue authors: 5
Pull request authors: 3
Average comments per issue: 2.83
Average comments per pull request: 0.2
Merged pull requests: 4
Bot issues: 0
Bot pull requests: 0


Top Authors

Issue Authors

kks32 (6)
WPettersson (5)
yjchoi1 (4)
osorensen (3)
freebob (2)
jpvantassel (2)
ZhangWenKang1 (1)
Berumott0 (1)
gortali (1)
qilinli (1)
ramc77 (1)
songjiazheng1 (1)
arjun-mani (1)
Yanghuoshan (1)
Kyle-RuidongLI (1)

Pull Request Authors

yjchoi1 (32)
kks32 (20)
jpvantassel (8)
chhsiao93 (4)
leilaroshan (3)
hassaniqbal209 (2)
Naveen-Raj-M (2)
skye-glitch (2)
bumi001 (1)

Top Labels

Issue Labels

Priority: High (3) Status: Help wanted (2) Status: Pending (2) Type: Refactor (2) Priority: Critical (2) Type: Bug (2) Priority: Low (1) Type: Maintenance (1) Status: In progress (1) Priority: Medium (1)

Pull Request Labels

Priority: Medium (8) Priority: Low (6) Priority: Critical (5) Priority: High (5) Type: Documentation (3) Status: Revision needed (3) Status: Accepted (2) Type: Core feature (2) Status: In progress (1) Type: Bug (1) Status: Completed (1) Type: Validation (1)

Science Score: 100.0%
