mlip-arena

Fair and transparent benchmark of machine learning interatomic potentials (MLIPs), beyond basic error metrics https://openreview.net/forum?id=ysKfIavYQE

https://github.com/atomind-ai/mlip-arena

Science Score: 67.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
    Found CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
    Found 1 DOI reference(s) in README
  • Academic publication links
    Links to: zenodo.org
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (11.7%) to scientific vocabulary

Keywords

benchmark-framework interatomic-potentials machine-learning materials molecules quantum-chemistry
Last synced: 6 months ago

Repository

Fair and transparent benchmark of machine learning interatomic potentials (MLIPs), beyond basic error metrics https://openreview.net/forum?id=ysKfIavYQE

Basic Info
Statistics
  • Stars: 61
  • Watchers: 0
  • Forks: 5
  • Open Issues: 12
  • Releases: 6
Topics
benchmark-framework interatomic-potentials machine-learning materials molecules quantum-chemistry
Created almost 2 years ago · Last pushed 7 months ago
Metadata Files
Readme License Citation

.github/README.md

⚔️ MLIP Arena ⚔️


Foundation machine learning interatomic potentials (MLIPs), trained on extensive databases containing millions of density functional theory (DFT) calculations, have revolutionized molecular and materials modeling, but existing benchmarks suffer from data leakage, limited transferability, and an over-reliance on error-based metrics tied to specific DFT references.

We introduce MLIP Arena, a unified benchmark platform for evaluating foundation MLIP performance beyond conventional error metrics. It focuses on revealing the physical soundness learned by MLIPs and assessing their practical performance independent of the underlying model architecture and training dataset.

By moving beyond static DFT references and revealing the important failure modes of current foundation MLIPs in real-world settings, MLIP Arena provides a reproducible framework to guide the next-generation MLIP development toward improved predictive accuracy and runtime efficiency while maintaining physical consistency.

MLIP Arena leverages the modern Pythonic workflow orchestrator 💙 Prefect 💙 to enable advanced task/flow chaining and caching.


> [!NOTE]
> Contributions of new tasks through PRs are very welcome! If you're interested in joining the effort, please reach out to Yuan at cyrusyc@berkeley.edu. See the project page for some outstanding tasks, or propose new feature requests in Discussion.

Announcement

Installation

From PyPI (Prefect workflow only, without pretrained models)

```bash
pip install mlip-arena
```

From source (with integrated pretrained models, advanced)

> [!CAUTION]
> We strongly recommend a clean build in a new virtual environment due to compatibility issues between multiple popular MLIPs. We provide a single installation script using uv for minimal package conflicts and fast installation!

> [!CAUTION]
> To automatically download the fairchem OMat24 checkpoint, please make sure you have been granted download access to their HuggingFace model repository (not the dataset repository), and log in locally on your machine through `huggingface-cli login` (see HF hub authentication).

Linux

```bash
# (Optional) Install uv, way faster than pip, why not? :)
curl -LsSf https://astral.sh/uv/install.sh | sh
source $HOME/.local/bin/env

git clone https://github.com/atomind-ai/mlip-arena.git
cd mlip-arena

# One-script uv pip installation
bash scripts/install.sh
```

> [!TIP]
> Sometimes installing all compiled models takes all the available local storage. The optional pip flag `--no-cache` can be used; `uv cache clean` will be helpful too.

Mac

```bash
# (Optional) Install uv
curl -LsSf https://astral.sh/uv/install.sh | sh
source $HOME/.local/bin/env

# One-script uv pip installation
bash scripts/install-macosx.sh
```

Quickstart

The first example: Molecular Dynamics

Arena provides a unified interface to run all the compiled MLIPs. This can be achieved simply by looping through MLIPEnum:

```python
from mlip_arena.models import MLIPEnum
from mlip_arena.tasks import MD
from mlip_arena.tasks.utils import get_calculator

from ase import units
from ase.build import bulk

atoms = bulk("Cu", "fcc", a=3.6) * (5, 5, 5)

results = []

for model in MLIPEnum:
    result = MD(
        atoms=atoms,
        calculator=get_calculator(
            model,
            calculator_kwargs=dict(),  # passed into the calculator
            dispersion=True,
            dispersion_kwargs=dict(
                damping="bj", xc="pbe", cutoff=40.0 * units.Bohr
            ),  # passed into TorchDFTD3Calculator
        ),  # compatible with custom ASE Calculators
        ensemble="nve",  # nve, nvt, npt available
        dynamics="velocityverlet",  # compatible with any ASE Dynamics object or its class name
        total_time=1e3,  # 1 ps = 1e3 fs
        time_step=2,  # fs
    )
    results.append(result)
```

🚀 Parallelize Benchmarks at Scale

To run multiple benchmarks in parallel, call the task's .submit method and wrap all the tasks into a flow to dispatch them to workers for concurrent execution. See the Prefect docs on tasks and flows for more details.

```python
...
from prefect import flow

@flow
def run_all_tasks():
    futures = []
    for model in MLIPEnum:
        future = MD.submit(
            atoms=atoms,
            ...
        )
        futures.append(future)
    return [f.result(raise_on_failure=False) for f in futures]
```

For a more practical example using HPC resources, please refer to the MD stability benchmark.

List of implemented tasks

The implemented tasks are available under `mlip_arena.tasks.<module>.run`, or via `from mlip_arena.tasks import *` for convenient imports (the latter currently fails if phonopy is not installed).

  • OPT: Structure optimization
  • EOS: Equation of state (energy-volume scan)
  • MD: Molecular dynamics with flexible dynamics (NVE, NVT, NPT) and temperature/pressure scheduling (annealing, shearing, etc)
  • PHONON: Phonon calculation driven by phonopy
  • NEB: Nudged elastic band
  • NEB_FROM_ENDPOINTS: Nudged elastic band with convenient image interpolation (linear or IDPP)
  • ELASTICITY: Elastic tensor calculation
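Each of these tasks runs against every registered model through the same enum-based interface shown in the Quickstart. A stdlib-only sketch of that dispatch pattern (all names here are illustrative, not the package's actual API):

```python
from enum import Enum

class ModelEnum(Enum):
    # illustrative members; the real MLIPEnum is populated from the model registry
    MACE = "mace-mp"
    CHGNET = "chgnet"

def run_opt(model: ModelEnum) -> str:
    # stand-in for a task like OPT: look up the model and run the task with it
    return f"OPT finished with {model.value}"

# iterating the enum runs the same task against every registered model
results = [run_opt(m) for m in ModelEnum]
```

This is what makes the benchmarks architecture-agnostic: tasks only see a calculator, never model internals.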

Contribute and Development

PRs are welcome. Please clone the repo and submit PRs with changes.

To make changes to the Hugging Face space, fetch large files from Git LFS first and run Streamlit:

```bash
git lfs fetch --all
git lfs pull
streamlit run serve/app.py
```

Add new benchmark tasks (WIP)

> [!NOTE]
> Please reuse, extend, or chain the general tasks defined above.

Add new MLIP models

If you have pretrained MLIP models that you would like to contribute to the MLIP Arena and showcase in real-time benchmarks, there are two ways:

External ASE Calculator (easy)

  1. Implement a new ASE Calculator class in mlip_arena/models/externals.
  2. Name the class after your awesome model and add the same name to the registry with metadata.

> [!CAUTION]
> Remove unnecessary outputs under the results class attribute to avoid errors in MD simulations. Please refer to CHGNet as an example.

Hugging Face Model (recommended, difficult)

  1. Inherit the Hugging Face ModelHubMixin class in your awesome model class definition. We recommend PyTorchModelHubMixin.
  2. Create a new Hugging Face Model repository and upload the model file using the push_to_hub function.
  3. Follow the template to code the I/O interface for your model here.
  4. Update the model registry with metadata.
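A minimal sketch of step 1 using PyTorchModelHubMixin from huggingface_hub (the model class and its single layer are purely illustrative, not a working MLIP):

```python
import torch
from huggingface_hub import PyTorchModelHubMixin

class MyMLIP(torch.nn.Module, PyTorchModelHubMixin):  # illustrative toy model
    def __init__(self, hidden_dim: int = 16):
        super().__init__()
        # toy layer mapping Cartesian coordinates to per-atom features
        self.layer = torch.nn.Linear(3, hidden_dim)

    def forward(self, positions: torch.Tensor) -> torch.Tensor:
        # positions: (n_atoms, 3)
        return self.layer(positions)

# Step 2, after `huggingface-cli login`, against a model repo you can write to:
# MyMLIP().push_to_hub("your-org/my-mlip")
# Reload anywhere:
# model = MyMLIP.from_pretrained("your-org/my-mlip")
```

The mixin adds `push_to_hub` / `from_pretrained` to the module, so model weights and config round-trip through the Hub without custom serialization code.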

Citation

If you find the work useful, please consider citing the following:

```bibtex
@inproceedings{chiang2025mlip,
  title={{MLIP} Arena: Advancing Fairness and Transparency in Machine Learning Interatomic Potentials through an Open and Accessible Benchmark Platform},
  author={Yuan Chiang and Tobias Kreiman and Elizabeth Weaver and Ishan Amin and Matthew Kuner and Christine Zhang and Aaron Kaplan and Daryl Chrzan and Samuel M Blau and Aditi S. Krishnapriyan and Mark Asta},
  booktitle={AI for Accelerated Materials Design - ICLR 2025},
  year={2025},
  url={https://openreview.net/forum?id=ysKfIavYQE}
}
```

Owner

  • Name: Atomind
  • Login: atomind-ai
  • Kind: organization

Citation (CITATION.cff)

# This CITATION.cff file was generated with cffinit.
# Visit https://bit.ly/cffinit to generate yours today!

cff-version: 1.2.0
title: MLIP Arena
message: >-
  If you use this software, please cite it using the
  metadata from this file.
type: software
authors:
  - given-names: Yuan
    family-names: Chiang
    email: cyrusyc@lbl.gov
    affiliation: Lawrence Berkeley National Laboratory
    orcid: 'https://orcid.org/0000-0002-4017-7084'
repository-code: 'https://github.com/atomind-ai/mlip-arena'
keywords:
  - Quantum Chemistry
  - Foundation Model
  - Interatomic Potentials
  - Machine Learning
  - Force Fields
license: Apache-2.0

GitHub Events

Total
  • Create event: 43
  • Issues event: 12
  • Release event: 5
  • Watch event: 31
  • Delete event: 35
  • Member event: 1
  • Issue comment event: 14
  • Push event: 365
  • Pull request review comment event: 3
  • Pull request review event: 5
  • Pull request event: 88
  • Fork event: 3
Last Year
  • Create event: 43
  • Issues event: 12
  • Release event: 5
  • Watch event: 31
  • Delete event: 35
  • Member event: 1
  • Issue comment event: 14
  • Push event: 365
  • Pull request review comment event: 3
  • Pull request review event: 5
  • Pull request event: 88
  • Fork event: 3

Issues and Pull Requests

Last synced: 6 months ago

All Time
  • Total issues: 7
  • Total pull requests: 28
  • Average time to close issues: 4 months
  • Average time to close pull requests: 19 days
  • Total issue authors: 1
  • Total pull request authors: 3
  • Average comments per issue: 0.14
  • Average comments per pull request: 0.0
  • Merged pull requests: 18
  • Bot issues: 0
  • Bot pull requests: 0
Past Year
  • Issues: 7
  • Pull requests: 28
  • Average time to close issues: 4 months
  • Average time to close pull requests: 19 days
  • Issue authors: 1
  • Pull request authors: 3
  • Average comments per issue: 0.14
  • Average comments per pull request: 0.0
  • Merged pull requests: 18
  • Bot issues: 0
  • Bot pull requests: 0
Top Authors
Issue Authors
  • chiang-yuan (13)
Pull Request Authors
  • chiang-yuan (48)
  • lizweaver (1)
  • christine-zhang1 (1)
  • anyangml (1)
Top Labels
Issue Labels
enhancement (2)
Pull Request Labels
enhancement (6) good first issue (1) help wanted (1)

Packages

  • Total packages: 2
  • Total downloads:
    • pypi 93 last-month
  • Total dependent packages: 0
    (may contain duplicates)
  • Total dependent repositories: 0
    (may contain duplicates)
  • Total versions: 6
  • Total maintainers: 1
pypi.org: mlip-arena

Fair and transparent benchmark of machine learning interatomic potentials (MLIPs), beyond error-based regression metrics

  • Versions: 5
  • Dependent Packages: 0
  • Dependent Repositories: 0
  • Downloads: 49 Last month
Rankings
Dependent packages count: 9.8%
Downloads: 23.3%
Average: 29.4%
Dependent repos count: 55.0%
Maintainers (1)
Last synced: 7 months ago
pypi.org: atomind-mlip
  • Versions: 1
  • Dependent Packages: 0
  • Dependent Repositories: 0
  • Downloads: 44 Last month
Rankings
Dependent packages count: 9.7%
Average: 36.7%
Dependent repos count: 63.7%
Maintainers (1)
Last synced: about 1 year ago

Dependencies

.github/workflows/release.yaml actions
  • actions/checkout v3 composite
  • actions/setup-python v4 composite
  • softprops/action-gh-release v1 composite
.github/workflows/sync-hf.yaml actions
  • actions/checkout v3 composite
.github/workflows/test.yaml actions
  • actions/checkout v4 composite
  • actions/setup-python v5 composite
pyproject.toml pypi
  • ase *
  • datasets *
  • huggingface_hub *
  • pymatgen *
  • safetensors *
  • torch *
  • torch-geometric *
  • torch_dftd >=0.4.0
requirements.txt pypi
  • ase ==3.23.0
  • bokeh *
  • bokeh_sampledata *
  • numpy *
  • plotly *
  • pymatgen ==2024.4.13
  • scipy *
  • statsmodels ==0.14.2
  • streamlit >=1.36.0
  • torch ==2.2.1