AmpTorch

AmpTorch: A Python package for scalable fingerprint-based neural network training on multi-element systems with integrated uncertainty quantification - Published in JOSS (2023)

https://github.com/ulissigroup/amptorch

Science Score: 98.0%

This score indicates how likely this project is to be science-related based on various indicators:

○
CITATION.cff file
✓
codemeta.json file
Found codemeta.json file
✓
.zenodo.json file
Found .zenodo.json file
✓
DOI references
Found 1 DOI reference(s) in JOSS metadata
✓
Academic publication links
Links to: zenodo.org
✓
Committers with academic emails
8 of 15 committers (53.3%) from academic institutions
✓
Institutional organization owner
Organization ulissigroup has institutional domain (ulissigroup.cheme.cmu.edu)
✓
JOSS paper metadata
Published in Journal of Open Source Software

Scientific Fields

Artificial Intelligence and Machine Learning Computer Science - 80% confidence

Engineering Computer Science - 40% confidence

Last synced: 6 months ago · JSON representation

Repository

AMPtorch: Atomistic Machine Learning Package (AMP) - PyTorch

Basic Info

Host: GitHub
Owner: ulissigroup
License: gpl-3.0
Language: C++
Default Branch: master
Homepage:
Size: 18.9 MB

Statistics

Stars: 60
Watchers: 8
Forks: 35
Open Issues: 7
Releases: 3

Created about 7 years ago · Last pushed over 2 years ago

Metadata Files

Readme Contributing License

AmpTorch: Atomistic Machine-learning Package - PyTorch

AmpTorch is a PyTorch implementation of the Atomistic Machine-learning Package (AMP) code that seeks to provide users with improved performance and flexibility as compared to the original code. The implementation does so by benefiting from state-of-the-art machine learning methods and techniques to be optimized in conjunction with high-throughput supercomputers. AmpTorch is built on top of PyTorch Geometric and Skorch.

Documentation

Go to AmpTorch Documentation for installation, usage and example lookup.

Installation

Install dependencies:

Ensure conda is up-to-date: conda update conda. Check the availability of gcc compiler on the machine.
Create conda environment
CPU machines: conda env create -f env_cpu.yml
GPU machines (CUDA 10.2): conda env create -f env_gpu.yml
Activate the conda environment conda activate amptorch.
Install the package with pip install -e .
(Optional) Install pre-commit hooks if you are creating PR: pre-commit install

Usage

Configs

To train a model using amptorch, configs = { "model": { "num_layers": int, "num_nodes": int, "get_forces": bool, "batchnorm": bool, "activation": object, **custom_args }, "optim": { "gpus": int, "force_coefficient": float, "lr": float, "batch_size": int, "epochs": int, "optimizer": object, "loss_fn": object, "loss": str, "metric": str, "cp_metric": str, "scheduler": dict, ## }, "dataset": { "raw_data": str or list, "lmdb_path": str, ## LMDB "val_split": float, "elements": list, "fp_scheme": str, "fp_params": dict, "cutoff_params": dict, "save_fps": bool, "scaling": dict, }, "cmd": { "debug": bool, "dtype": object, "run_dir": str, "seed": int, "identifier": str, "verbose": bool, "logger": False, }, } a set of configs must be specified to interact with the trainer. An exhaustive list of all possible flags and their descriptions is provided below: # No. of hidden layers # No. of nodes per layer # Compute per-atom forces (default: True) # Enable batch-normalization (default:False) # Activation function (default: nn.Tanh) # Any additional arguments used to customize existing/new models # No. of gpus to use, 0 for cpu (default: 0) # If force training, coefficient to weight the force component by (default: 0) # Initial learning rate (default: 1e-1) # Batch size (default: 32) # Max training epochs (default: 100) # Training optimizer (default: torch.optim.Adam) # Loss function to optimize (default: CustomLoss) # Control loss function criterion, "mse" or "mae" (default: "mse") # Metrics to be reported by, "mse" or "mae" (default: "mae") # Property based on which the model is saved. "energy" or "forces" (default: "energy") # Learning rate scheduler to use - {"policy": "StepLR", "params": {"step_size": 10, "gamma": 0.1}} # Path to ASE trajectory or database or list of Atoms objects # Path to LMDB database file for dataset too large to fit in memory ## Specify either "raw_data" or "lmdb_path" construction can be found in examples/3_lmdb/ # Proportion of training set to use for validation # List of unique elements in dataset, optional (default: computes unique elements) # Fingerprinting scheme to feature dataset, "gmpordernorm" or "gaussian" (default: "gmpordernorm") # Fingerprint parameters, see examples for correct layout for either GMP descriptors or SF descriptors # Cutoff function for SF descriptors - polynomial or cosine, ## Polynomial - {"cutoff_func": "Polynomial", "gamma": 2.0} ## Cosine - {"cutoff_func": "Cosine"} # Write calculated fingerprints to disk (default: True) # Feature scaling scheme, normalization or standardization ## normalization (scales features between "range") - {"type": "normalize", "range": (0, 1)} ## standardization (scales data to mean=0, stdev=1) - {"type": "standardize"} # Debug mode, does not write/save checkpoints/results (default: False) # Pytorch level of precision (default: torch.DoubleTensor) # Path to run trainer, where logs are to be saved (default: "./") # Random seed (default: 0) # Unique identifer to experiment, optional # Print training scores (default: True) # Log results to Weights and Biases (https://www.wandb.com/) ## wandb offers a very clean and flexible interface to monitor results online ## A free account is necessary to view and log results

Train model

``` from amptorch import AtomsTrainer

trainer = AtomsTrainer(configs) trainer.train() ```

Load checkpoints

Previously trained models may be loaded as follows: trainer = AtomsTrainer(configs) trainer.load_pretrained(path_to_checkpoint_dir)

Make predictions

predictions = trainer.predict(list_of_atoms_objects) energies = predictions["energy"] forces = predictions["forces"]

Construct AmpTorch-ASE calculator

To interface with ASE, an ASE calculator may be constructed as follows: ``` from amptorch import AmpTorch

calc = AmpTorch(trainer) slab.setcalculator(calc) energy = slab.getpotentialenergy() forces = slab.getforces() ```

Development notes

Reporting issues

Regarding bugs, issues or suggested feature improvements related to the software, please use the issue tracker of the GitHub project. Refere to Bug Reports for any questions related to what to expect when submitting a bug report.

Contributing

If you want to contribute to this project, please refer to our contribution guideline. Refere to Pull Requests for any questions related to what to expect when submitting a pull request.

Acknowledgements

This project is being developed at Carnegie Mellon University in the Department of Chemical Engineering, by Muhammed Shuaibi and Zachary Ulissi, in collaboration with Andrew Peterson, Franklin Goldsmith, Brenda Rubenstein, Andrew Medford, and Adam Willard as part of the Department of Energy's Bridging the time scale in exascale computing of chemical systems project.
Funded by the Department of Energy's Basic Enenergy Science, Computational Chemical Sciences Program Office. Award # DE-SC0019441
Engineering ideas have been heavily borrowed from our work on the Open Catalyst Project
Gaussian fingerprints have been adapted from SIMPLE-NN

License

This sofware is licensed under the GNU General Public License. See LICENSE.

Owner

Name: Ulissi Group
Login: ulissigroup
Kind: organization
Location: Pittsburgh, PA

Website: http://ulissigroup.cheme.cmu.edu/
Repositories: 22
Profile: https://github.com/ulissigroup

Research group of Zack Ulissi at CMU

JOSS Publication

AmpTorch: A Python package for scalable fingerprint-based neural network training on multi-element systems with integrated uncertainty quantification

Published

July 26, 2023

DOI

10.21105/joss.05035

Volume 8, Issue 87, Page 5035

Authors

Muhammed Shuaibi
Department of Chemical Engineering, Carnegie Mellon University, United States

Yuge Hu

Department of Chemical and Biomolecular Engineering, Georgia Institute of Technology, United States

Xiangyun Lei
Department of Chemical and Biomolecular Engineering, Georgia Institute of Technology, United States

Benjamin M. Comer
Department of Chemical and Biomolecular Engineering, Georgia Institute of Technology, United States

Matt Adams
Department of Chemical Engineering, Carnegie Mellon University, United States

Jacob Paras
School of Physics and School of Computer Science, Georgia Institute of Technology, United States

Rui Qi Chen
Department of Chemical and Biomolecular Engineering, Georgia Institute of Technology, United States

Eric Musa
Department of Chemical Engineering, University of Michigan, United States

Joseph Musielewicz
Department of Chemical Engineering, Carnegie Mellon University, United States

Andrew A. Peterson

School of Engineering, Brown University, United States

Andrew J. Medford

Department of Chemical and Biomolecular Engineering, Georgia Institute of Technology, United States

Zachary Ulissi

Department of Chemical Engineering, Carnegie Mellon University, United States

Editor

David Hagan

GitHub Events

Total

Watch event: 1

Last Year

Watch event: 1

Committers

Last synced: 7 months ago

All Time

Total Commits: 690
Total Committers: 15
Avg Commits per committer: 46.0
Development Distribution Score (DDS): 0.275

Past Year

Commits: 0
Committers: 0
Avg Commits per committer: 0.0
Development Distribution Score (DDS): 0.0

Top Committers

Name	Email	Commits
Muhammed	m**i@a**u	500
ray38	x**8@g**u	82
Ray-Lei	r**i@t**l	41
Nicole Hu	5****u	25
Ben Comer	b**r@g**u	19
Saurabh Sivakumar	s**a@c**u	6
Andrew Peterson	a**n@b**u	4
EricMusa	e**a@u**u	3
Rui Qi Chen	3****c	3
mattaadams1@gmail.com	m**1@g**m	2
Matt Adams	5****1	1
Matthew Evans	7****s	1
Nicole Hu	n**u@g**u	1
Xiangyun Lei	x**8@a**u	1
Xiangyun Lei	x**8@l**u	1

Committer Domains (Top 20 + Academic)

gatech.edu: 2 login-hive2.pace.gatech.edu: 1 atl1-1-01-016-32-l.pace.gatech.edu: 1 gmail.comu: 1 umich.edu: 1 brown.edu: 1 cmu.edu: 1 tri.global: 1 andrew.cmu.edu: 1

Issues and Pull Requests

Last synced: 6 months ago

All Time

Total issues: 28
Total pull requests: 75
Average time to close issues: 3 months
Average time to close pull requests: 23 days
Total issue authors: 19
Total pull request authors: 9
Average comments per issue: 2.57
Average comments per pull request: 0.48
Merged pull requests: 60
Bot issues: 1
Bot pull requests: 6

Past Year

Issues: 0
Pull requests: 0
Average time to close issues: N/A
Average time to close pull requests: N/A
Issue authors: 0
Pull request authors: 0
Average comments per issue: 0
Average comments per pull request: 0
Merged pull requests: 0
Bot issues: 0
Bot pull requests: 0

View more stats

Top Authors

Issue Authors

ianfhunter (4)
vsumaria (4)
ml-evs (3)
jparas-3 (2)
cchang373 (1)
ray38 (1)
zulissi (1)
IliasChair (1)
renovate[bot] (1)
hsulab (1)
EricMusa (1)
reudhos (1)
saurabhsivakumar (1)
haipronl (1)
mshuaibii (1)

Pull Request Authors

mshuaibii (30)
nicoleyghu (14)
ray38 (12)
renovate[bot] (6)
EricMusa (5)
ruiqic (3)
ml-evs (1)
mattaadams (1)
jparas-3 (1)

Top Labels

Issue Labels

enhancement (3)

AmpTorch

Science Score: 98.0%

Scientific Fields

Repository

Basic Info

Statistics

Metadata Files

README.md

AmpTorch: Atomistic Machine-learning Package - PyTorch

Documentation

Installation

Usage

Configs

Train model

Load checkpoints

Make predictions

Construct AmpTorch-ASE calculator

Development notes

Reporting issues

Contributing

Acknowledgements

License

Owner

JOSS Publication

AmpTorch: A Python package for scalable fingerprint-based neural network training on multi-element systems with integrated uncertainty quantification

Authors

Editor

Tags

GitHub Events

Total

Last Year

Committers

All Time

Past Year

Top Committers

Committer Domains (Top 20 + Academic)

Issues and Pull Requests

All Time

Past Year

Top Authors

Issue Authors

Pull Request Authors

Top Labels

Issue Labels

Pull Request Labels