https://github.com/materialsvirtuallab/matpes

A foundational potential energy dataset for materials

Keywords

ai materials-design materials-informatics materials-science ml pes
Last synced: 6 months ago · JSON representation

Repository

Basic Info
  • Host: GitHub
  • Owner: materialsvirtuallab
  • License: bsd-3-clause
  • Language: Jupyter Notebook
  • Default Branch: main
  • Homepage: http://matpes.ai
  • Size: 91.8 MB
Statistics
  • Stars: 39
  • Watchers: 1
  • Forks: 4
  • Open Issues: 2
  • Releases: 2
Created over 1 year ago · Last pushed 6 months ago
Metadata Files
Readme License

README.md

Aims

Potential energy surface (PES) datasets with near-complete coverage of the periodic table are used to train foundation potentials (FPs), i.e., machine learning interatomic potentials (MLIPs) that themselves cover nearly the entire periodic table. MatPES is an initiative by the Materials Virtual Lab and the Materials Project to address critical deficiencies in such PES datasets for materials.

  1. Accuracy. MatPES is computed using static DFT calculations with stringent convergence criteria. Please refer to the MatPESStaticSet in pymatgen for details.
  2. Comprehensiveness. MatPES structures are sampled using a 2-stage version of DImensionality-Reduced Encoded Clusters with sTratified (DIRECT) sampling from a greatly expanded configuration space of MD structures.
  3. Quality. MatPES includes computed data from the PBE functional, as well as the high-fidelity r2SCAN meta-GGA functional, which gives an improved description across diverse bonding and chemistries.
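
To make the "stratified" part of DIRECT sampling concrete, here is a minimal sketch of only the final selection step: given feature vectors that have already been dimensionality-reduced and clustered, keep the k points closest to each cluster's centroid. This is an illustration under stated assumptions, not the MatPES or maml implementation. The function name `select_k_per_cluster`, the toy 2D features, and the precomputed labels are all hypothetical, and the encoding, dimensionality reduction, and clustering stages are omitted.

```python
from collections import defaultdict
from math import dist


def select_k_per_cluster(features, labels, k=1):
    """Return indices of the k points closest to each cluster's centroid.

    Hypothetical helper for illustration; not part of matpes or maml.
    """
    # Group point indices by their cluster label.
    clusters = defaultdict(list)
    for index, label in enumerate(labels):
        clusters[label].append(index)

    selected = []
    for label in sorted(clusters):
        members = clusters[label]
        n_dims = len(features[members[0]])
        # Centroid = per-dimension mean of the cluster's feature vectors.
        centroid = [
            sum(features[i][d] for i in members) / len(members)
            for d in range(n_dims)
        ]
        # Keep the k members nearest the centroid (fewer if the cluster is small).
        nearest = sorted(members, key=lambda i: dist(features[i], centroid))
        selected.extend(nearest[:k])
    return selected


# Toy 2D features standing in for dimensionality-reduced structure encodings.
features = [
    (0.0, 0.0), (0.1, 0.0), (1.0, 0.0),  # cluster 0
    (5.0, 5.0), (5.2, 5.0), (6.0, 5.0),  # cluster 1
    (9.0, 9.0),                          # cluster 2
]
labels = [0, 0, 0, 1, 1, 1, 2]
picked = select_k_per_cluster(features, labels, k=1)
print(picked)
```

Selecting from every cluster, rather than sampling uniformly at random, is what keeps sparsely populated regions of the feature space represented in the final set.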

The initial v2025.1 release comprises ~400,000 structures from 300 K MD simulations. Although this dataset is much smaller than other PES datasets in the literature, FPs trained on it achieve comparable or, in some cases, improved performance and reliability.

MatPES is part of the MatML ecosystem, which includes the MatGL and maml packages, the MatPES dataset, and MatCalc.

Getting the Dataset

Hugging Face

The MatPES dataset is available on Hugging Face. You can use the datasets package to download it.

```python
from datasets import load_dataset

# Load the PBE dataset
load_dataset("mavrl/matpes", "pbe")

# Load the r2SCAN dataset
load_dataset("mavrl/matpes", "r2scan")
```

MatPES Package

The matpes python package, which provides tools for working with the MatPES datasets, can be installed via pip:

```shell
pip install matpes
```

Some command line usage examples:

```shell
# Download the PBE dataset to the current directory
matpes download pbe

# You should see a MatPES-PBE-20240214.json.gz file in your directory.

# Extract all entries in the Fe-O chemical system
matpes data -i MatPES-PBE-20240214.json.gz --chemsys Fe-O -o Fe-O.json.gz
```
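
For readers who prefer to filter in Python, the following is a rough sketch of the same idea using only the standard library. It is not the matpes implementation. It assumes the data is a gzip-compressed JSON list of records and uses a placeholder `elements` key, which may not match the real MatPES schema. It also assumes a chemical system matches only when the element sets are identical, which the CLI documentation above does not confirm. The helpers `normalize_chemsys` and `filter_by_chemsys` are hypothetical.

```python
import gzip
import json
import tempfile
from pathlib import Path


def normalize_chemsys(chemsys):
    """Return a canonical chemical-system string: unique elements, sorted, hyphen-joined."""
    return "-".join(sorted(set(chemsys.split("-"))))


def filter_by_chemsys(records, chemsys, key="elements"):
    """Keep records whose element set equals the requested chemical system.

    `key` is a placeholder field name, not the real MatPES schema.
    """
    target = normalize_chemsys(chemsys)
    return [r for r in records if normalize_chemsys("-".join(r[key])) == target]


# Toy records standing in for dataset entries (hypothetical schema).
records = [
    {"id": "a", "elements": ["O", "Fe"]},
    {"id": "b", "elements": ["Fe"]},
    {"id": "c", "elements": ["Fe", "O", "Li"]},
]

with tempfile.TemporaryDirectory() as tmp:
    src = Path(tmp) / "toy.json.gz"
    dst = Path(tmp) / "Fe-O.json.gz"
    # Write the toy data as gzip-compressed JSON, like the .json.gz files above.
    with gzip.open(src, "wt", encoding="utf-8") as f:
        json.dump(records, f)
    # Read it back, filter, and write the subset to a new .json.gz file.
    with gzip.open(src, "rt", encoding="utf-8") as f:
        loaded = json.load(f)
    subset = filter_by_chemsys(loaded, "O-Fe")  # element order does not matter
    with gzip.open(dst, "wt", encoding="utf-8") as f:
        json.dump(subset, f)
    with gzip.open(dst, "rt", encoding="utf-8") as f:
        roundtrip = json.load(f)

print([r["id"] for r in roundtrip])
```

Normalizing the chemical-system string before comparing means "O-Fe" and "Fe-O" are treated as the same system.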

The matpes.db module provides functionality to create your own MongoDB database from the downloaded MatPES data. This is especially useful if you will be working with the data frequently (e.g., querying or adding entries).

MatPES-trained Models

We have released a set of MatPES-trained foundation potentials (FPs) in the M3GNet, CHGNet, and TensorNet architectures in the MatGL package. For example, you can load the TensorNet FP trained on MatPES PBE 2025.1 as follows:

```python
import matgl

potential = matgl.load_model("TensorNet-MatPES-PBE-v2025.1-PES")
```

Model names follow the format <architecture>-<dataset>-<dataset-version>-PES.
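
As a small illustration of this naming format, the sketch below splits a model name into its parts. It is not part of MatGL. The `ModelName` tuple and `parse_model_name` helper are hypothetical, and the sketch assumes that neither the architecture nor the version contains a hyphen, while the dataset name may (as in MatPES-PBE).

```python
from typing import NamedTuple


class ModelName(NamedTuple):
    """Parts of a model name (hypothetical container, not a MatGL type)."""

    architecture: str
    dataset: str
    version: str


def parse_model_name(name):
    """Split '<architecture>-<dataset>-<dataset-version>-PES' into its parts."""
    parts = name.split("-")
    # Need at least architecture, one dataset token, version, and the PES suffix.
    if len(parts) < 4 or parts[-1] != "PES":
        raise ValueError(f"unexpected model name: {name!r}")
    # The dataset may itself contain hyphens, so take everything in between.
    return ModelName(parts[0], "-".join(parts[1:-2]), parts[-2])


parsed = parse_model_name("TensorNet-MatPES-PBE-v2025.1-PES")
print(parsed)
```

Parsing from both ends keeps the hyphenated dataset name intact instead of splitting it into separate fields.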

These FPs can be used easily with the MatCalc package to rapidly compute properties. For example:

```python
from matcalc.elasticity import ElasticityCalc
from matgl.ext.ase import PESCalculator

# `potential` is the model loaded above; `structure` is the structure to evaluate.
asecalc = PESCalculator(potential)
calculator = ElasticityCalc(asecalc)
calculator.calc(structure)
```

Tutorials

We have provided Jupyter notebooks demonstrating how to load the MatPES dataset, train a model and perform fine-tuning.

Citing

If you use the MatPES dataset, please cite the following work:

```txt
Kaplan, A. D.; Liu, R.; Qi, J.; Ko, T. W.; Deng, B.; Riebesell, J.; Ceder, G.; Persson, K. A.; Ong, S. P. A Foundational Potential Energy Surface Dataset for Materials. arXiv 2025. DOI: 10.48550/arXiv.2503.04070.
```

In addition, if you use any of the pre-trained FPs or architectures, please cite the references provided for the architecture used, as well as MatGL.

Owner

  • Name: Materials Virtual Lab
  • Login: materialsvirtuallab
  • Kind: organization
  • Email: ongsp@ucsd.edu
  • Location: La Jolla, CA

The Materials Virtual Lab is dedicated to the application of first principles calculations and informatics to accelerate materials design.

GitHub Events

Total
  • Create event: 5
  • Release event: 3
  • Issues event: 8
  • Watch event: 36
  • Delete event: 1
  • Issue comment event: 15
  • Public event: 1
  • Push event: 134
  • Pull request event: 8
  • Fork event: 2
Last Year
  • Create event: 5
  • Release event: 3
  • Issues event: 8
  • Watch event: 36
  • Delete event: 1
  • Issue comment event: 15
  • Public event: 1
  • Push event: 134
  • Pull request event: 8
  • Fork event: 2

Issues and Pull Requests

All Time
  • Total issues: 6
  • Total pull requests: 13
  • Average time to close issues: 3 days
  • Average time to close pull requests: about 12 hours
  • Total issue authors: 4
  • Total pull request authors: 4
  • Average comments per issue: 1.0
  • Average comments per pull request: 0.62
  • Merged pull requests: 6
  • Bot issues: 0
  • Bot pull requests: 8
Past Year
  • Issues: 6
  • Pull requests: 13
  • Average time to close issues: 3 days
  • Average time to close pull requests: about 12 hours
  • Issue authors: 4
  • Pull request authors: 4
  • Average comments per issue: 1.0
  • Average comments per pull request: 0.62
  • Merged pull requests: 6
  • Bot issues: 0
  • Bot pull requests: 8
Top Authors
Issue Authors
  • JonathanSchmidt1 (2)
  • QuantumMisaka (2)
  • CheukHinHoJerry (1)
  • shyuep (1)
Pull Request Authors
  • pre-commit-ci[bot] (6)
  • rul048 (3)
  • kenko911 (2)
  • dependabot[bot] (2)
Top Labels
Issue Labels
bug (1)
Pull Request Labels
dependencies (2)

Packages

  • Total packages: 1
  • Total downloads:
    • pypi 71 last-month
  • Total dependent packages: 0
  • Total dependent repositories: 0
  • Total versions: 3
  • Total maintainers: 1
pypi.org: matpes

Tools for working with MatPES.

  • Versions: 3
  • Dependent Packages: 0
  • Dependent Repositories: 0
  • Downloads: 71 Last month
Rankings
Dependent packages count: 9.6%
Average: 31.7%
Dependent repos count: 53.8%
Maintainers (1)

Dependencies

.github/workflows/linting.yml actions
  • actions/checkout v3 composite
  • actions/setup-python v4 composite
pyproject.toml pypi
  • dash *
  • dash_bootstrap_components *
  • pandas >=2
  • plotly >=4.5.0
  • pymatgen *
  • pymatviz *
  • pymongo *
requirements.txt pypi
  • dash *