https://github.com/torchmd/torchmd-net

Training neural network potentials

Science Score: 59.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
    Found 4 DOI reference(s) in README
  • Academic publication links
    Links to: arxiv.org
  • Committers with academic emails
    3 of 15 committers (20.0%) from academic institutions
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (12.6%) to scientific vocabulary

Keywords

energy-functions equivariant-representations molecular-dynamics neural-networks transformer

Keywords from Contributors

molecular-modeling
Last synced: 6 months ago

Repository

Training neural network potentials

Basic Info
  • Host: GitHub
  • Owner: torchmd
  • License: MIT
  • Language: Python
  • Default Branch: main
  • Homepage:
  • Size: 179 MB
Statistics
  • Stars: 428
  • Watchers: 6
  • Forks: 90
  • Open Issues: 43
  • Releases: 35
Topics
energy-functions equivariant-representations molecular-dynamics neural-networks transformer
Created almost 5 years ago · Last pushed 6 months ago
Metadata Files
Readme License

README.md


TorchMD-NET

TorchMD-NET provides state-of-the-art neural network potentials (NNPs) and a mechanism to train them. It offers efficient and fast implementations of several NNPs and is integrated with GPU-accelerated molecular dynamics codes such as ACEMD, OpenMM and TorchMD. TorchMD-NET exposes its NNPs as PyTorch modules.

Documentation

Documentation is available at https://torchmd-net.readthedocs.io

Available architectures

  • Equivariant Transformer (ET)
  • TensorNet
  • Graph Network

See the Cite section below for the papers describing each architecture.

Installation

TorchMD-Net is available as a pip-installable wheel as well as on conda-forge.

TorchMD-Net provides builds for CPU-only, CUDA 11.8 and CUDA 12.4. CPU versions are provided only as a reference, since their performance is extremely limited. Depending on which variant you wish to install, use one of the following commands:

```sh
# The following will install the CUDA 12.4 version by default
pip install torchmd-net

# The following will install the CUDA 11.8 version
pip install torchmd-net --extra-index-url https://download.pytorch.org/whl/cu118 --extra-index-url https://us-central1-python.pkg.dev/pypi-packages-455608/cu118/simple

# The following will install the CUDA 12.4 version
pip install torchmd-net --extra-index-url https://download.pytorch.org/whl/cu124 --extra-index-url https://us-central1-python.pkg.dev/pypi-packages-455608/cu124/simple

# The following will install the CPU-only version (not recommended)
pip install torchmd-net --extra-index-url https://download.pytorch.org/whl/cpu --extra-index-url https://us-central1-python.pkg.dev/pypi-packages-455608/cpu/simple
```

Alternatively, it can be installed with conda or mamba using one of the following commands. We recommend using Miniforge instead of Anaconda.

```shell
mamba install torchmd-net cuda-version=11.8
mamba install torchmd-net cuda-version=12.4
```

Install from source

TorchMD-Net is installed using pip, but you will need to install some dependencies first. Check this documentation page.

Usage

Training arguments can be specified either via a YAML configuration file or directly on the command line. Several examples of architectural and training specifications for some models and datasets can be found in examples/. Note that if a parameter is present both in the YAML file and on the command line, the command-line version takes precedence. GPUs can be selected by setting the CUDA_VISIBLE_DEVICES environment variable. Otherwise, the argument --ngpus can be used to select the number of GPUs to train on (-1, the default, uses all available GPUs or the ones specified in CUDA_VISIBLE_DEVICES). Keep in mind that the GPU ID reported by nvidia-smi might not be the same as the one CUDA_VISIBLE_DEVICES uses.
For example, to train the Equivariant Transformer on the QM9 dataset with the architectural and training hyperparameters described in the paper, one can run:

```shell
mkdir output
CUDA_VISIBLE_DEVICES=0 torchmd-train --conf torchmd-net/examples/ET-QM9.yaml --log-dir output/
```

Run torchmd-train --help to see all available options and their descriptions.
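Command-line flags map onto keys of the YAML configuration file. As an illustrative sketch of what such a file might contain (the keys and values below are hypothetical; consult torchmd-train --help and the files in examples/ for the authoritative option names):

```yaml
# Hypothetical minimal training configuration; values are illustrative only.
model: equivariant-transformer
dataset: QM9
dataset_arg: energy_U0
batch_size: 128
lr: 0.0004
num_epochs: 300
ngpus: -1        # use all visible GPUs
log_dir: output/
```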

Pretrained models

See here for instructions on how to load pretrained models.

Creating a new dataset

If you want to train on custom data, first have a look at torchmdnet.datasets.Custom, which provides functionality for loading a NumPy dataset consisting of atom types and coordinates, as well as energies, forces or both as the labels. Alternatively, you can implement a custom class following the torch-geometric way of implementing a dataset. That is, derive from the Dataset or InMemoryDataset class and implement the necessary functions (more info here). The dataset must return torch-geometric Data objects containing at least the keys z (atom types) and pos (atomic coordinates), as well as y (label), neg_dy (negative derivative of the label w.r.t. atom coordinates) or both.
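To illustrate the kind of NumPy data such a custom dataset consumes, the snippet below writes per-molecule arrays of atom types, coordinates, energies and forces to .npy files. The file names and exact array layout here are illustrative only, not the precise convention expected by torchmdnet.datasets.Custom; check its docstring for the real glob patterns.

```python
import numpy as np

n_frames, n_atoms = 5, 3  # e.g. 5 conformations of a water molecule

# Atom types (atomic numbers), one entry per atom: O, H, H
types = np.array([8, 1, 1], dtype=np.int64)

# Coordinates: one (n_atoms, 3) snapshot per frame
coords = np.random.default_rng(0).normal(size=(n_frames, n_atoms, 3))

# Labels: a scalar energy per frame and per-atom forces (negative gradients)
energies = np.random.default_rng(1).normal(size=(n_frames, 1))
forces = np.random.default_rng(2).normal(size=(n_frames, n_atoms, 3))

np.save("sample_types.npy", types)
np.save("sample_coords.npy", coords)
np.save("sample_energies.npy", energies)
np.save("sample_forces.npy", forces)
```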

Custom prior models

In addition to implementing a custom dataset class, it is also possible to add a custom prior model to the model. This can be done by implementing a new prior model class in torchmdnet.priors and adding the argument --prior-model <PriorModelName>. As an example, have a look at torchmdnet.priors.Atomref.
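Conceptually, a prior adds a physically motivated term to the network output; torchmdnet.priors.Atomref, for example, adds per-element reference energies. A minimal NumPy sketch of that idea (the values, function name and call convention below are purely illustrative, not the actual prior-model interface):

```python
import numpy as np

# Hypothetical per-element reference energies indexed by atomic number.
# Real Atomref values come from the dataset; these numbers are made up.
atomref = np.zeros(10)
atomref[1] = -13.6    # H (illustrative)
atomref[8] = -2041.0  # O (illustrative)

def apply_prior(per_atom_energy, z):
    """Add a per-atom reference energy before summing to the molecular energy."""
    return per_atom_energy + atomref[z]

z = np.array([8, 1, 1])                # a water molecule: O, H, H
per_atom = np.array([0.1, 0.2, 0.2])   # dummy per-atom network output
total = apply_prior(per_atom, z).sum() # molecular energy with the prior applied
```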

Multi-Node Training

In order to train models on multiple nodes, some environment variables have to be set to provide PyTorch Lightning with all necessary information. Below we provide an example bash script to start training on two machines with two GPUs each. The script has to be started once on each node. Once torchmd-train is running on all nodes, a network connection between the nodes is established using NCCL.

In addition to the environment variables, the argument --num-nodes has to be set to the number of nodes involved in training.

```shell
export NODE_RANK=0
export MASTER_ADDR=hostname1
export MASTER_PORT=12910

mkdir -p output
CUDA_VISIBLE_DEVICES=0,1 torchmd-train --conf torchmd-net/examples/ET-QM9.yaml --num-nodes 2 --log-dir output/
```

  • NODE_RANK : Integer indicating the node index. Must be 0 for the main node and incremented by one for each additional node.
  • MASTER_ADDR : Hostname or IP address of the main node. The same for all involved nodes.
  • MASTER_PORT : A free network port for communication between nodes. PyTorch Lightning suggests port 12910 as a default.
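Since the script must be launched once per node, only NODE_RANK differs between machines. For the second node in the two-machine example above, the environment would look like this (hostname1 is the main node's hostname, as before):

```shell
# Environment for the second node; NODE_RANK is the only value that changes.
# MASTER_ADDR and MASTER_PORT must be identical on every node.
export NODE_RANK=1
export MASTER_ADDR=hostname1
export MASTER_PORT=12910
```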

Known Limitations

  • Due to the way PyTorch Lightning calculates the number of required DDP processes, all nodes must use the same number of GPUs. Otherwise training will not start or crash.
  • We observe a 50x decrease in performance when mixing nodes with different GPU architectures (tested with RTX 2080 Ti and RTX 3090).
  • Some CUDA systems might hang during multi-GPU parallel training. Try export NCCL_P2P_DISABLE=1, which disables direct peer-to-peer GPU communication.

Cite

If you use TorchMD-NET in your research, please cite the following papers:

Main reference

```
@misc{pelaez2024torchmdnet,
      title={TorchMD-Net 2.0: Fast Neural Network Potentials for Molecular Simulations},
      author={Raul P. Pelaez and Guillem Simeon and Raimondas Galvelis and Antonio Mirarchi and Peter Eastman and Stefan Doerr and Philipp Thölke and Thomas E. Markland and Gianni De Fabritiis},
      year={2024},
      eprint={2402.17660},
      archivePrefix={arXiv},
      primaryClass={cs.LG}
}
```

TensorNet

```
@inproceedings{simeon2023tensornet,
      title={TensorNet: Cartesian Tensor Representations for Efficient Learning of Molecular Potentials},
      author={Guillem Simeon and Gianni De Fabritiis},
      booktitle={Thirty-seventh Conference on Neural Information Processing Systems},
      year={2023},
      url={https://openreview.net/forum?id=BEHlPdBZ2e}
}
```

Equivariant Transformer

```
@inproceedings{tholke2021equivariant,
      title={Equivariant Transformers for Neural Network based Molecular Potentials},
      author={Philipp Th{\"o}lke and Gianni De Fabritiis},
      booktitle={International Conference on Learning Representations},
      year={2022},
      url={https://openreview.net/forum?id=zNHzqZ9wrRB}
}
```

Graph Network

```
@article{Majewski2023,
      title = {Machine learning coarse-grained potentials of protein thermodynamics},
      volume = {14},
      ISSN = {2041-1723},
      url = {http://dx.doi.org/10.1038/s41467-023-41343-1},
      DOI = {10.1038/s41467-023-41343-1},
      number = {1},
      journal = {Nature Communications},
      publisher = {Springer Science and Business Media LLC},
      author = {Majewski, Maciej and Pérez, Adrià and Th\"{o}lke, Philipp and Doerr, Stefan and Charron, Nicholas E. and Giorgino, Toni and Husic, Brooke E. and Clementi, Cecilia and Noé, Frank and De Fabritiis, Gianni},
      year = {2023},
      month = sep
}
```

Developer guide

Implementing a new architecture

To implement a new architecture, you need to follow these steps:
1. Create a new class in torchmdnet.models that inherits from torch.nn.Module. Follow TorchMD_ET as a template. This is a minimal implementation of a model:

```python
class MyModule(nn.Module):
    def __init__(self, parameter1, parameter2):
        super(MyModule, self).__init__()
        # Define your model here
        self.layer1 = nn.Linear(10, 10)
        ...
        # Initialize your model parameters here
        self.reset_parameters()

    def reset_parameters(self):
        # Initialize your model parameters here
        nn.init.xavier_uniform_(self.layer1.weight)
        ...

    def forward(self,
                z: Tensor,                   # Atomic numbers, shape (natoms, 1)
                pos: Tensor,                 # Atomic positions, shape (natoms, 3)
                batch: Tensor,               # Batch vector, shape (natoms, 1). All atoms in the same molecule have the same value and are contiguous.
                q: Optional[Tensor] = None,  # Atomic charges, shape (natoms, 1)
                s: Optional[Tensor] = None,  # Atomic spins, shape (natoms, 1)
                ) -> Tuple[Tensor, Tensor, Tensor, Tensor, Tensor]:
        # Define your forward pass here
        scalar_features = ...
        vector_features = ...
        # Return the scalar and vector features, as well as the atomic numbers, positions and batch vector
        return scalar_features, vector_features, z, pos, batch
```

2. Add the model to the __all__ list in torchmdnet.models.__init__.py. This will make the tests pick your model up.
3. Tell models.model.create_model how to initialize your module by adding a new entry, for instance:

```python
elif args["model"] == "mymodule":
    from torchmdnet.models.torchmd_mymodule import MyModule
    is_equivariant = False  # Set to True if your model is equivariant
    representation_model = MyModule(
        parameter1=args["parameter1"],
        parameter2=args["parameter2"],
        **shared_args,  # Arguments typically shared by all models
    )
```

4. Add any new parameters required to initialize your module to scripts.train.get_args. For instance:

```python
parser.add_argument('--parameter1', type=int, default=32, help='Parameter1 required by MyModule')
...
```

5. Add an example configuration file to torchmd-net/examples that uses your model.

6. Make tests use your configuration file by adding a case to tests.utils.load_example_args. For instance:

```python
if model_name == "mymodule":
    config_file = join(dirname(dirname(__file__)), "examples", "MyModule-QM9.yaml")
```

At this point, if your module is missing some feature the tests will let you know, and you can add it. If you add a new feature to the package, please add a test for it.

Code style

We use black. Please run black on your modified files before committing.

Testing

To run the tests, install the package and run pytest in the root directory of the repository. Tests are a good source of knowledge on how to use the different components of the package.

Owner

  • Name: TorchMD
  • Login: torchmd
  • Kind: organization
  • Email: gianni.defabritiis@upf.edu

TorchMD: A deep learning framework for molecular simulations

GitHub Events

Total
  • Create event: 18
  • Release event: 12
  • Issues event: 11
  • Watch event: 102
  • Delete event: 4
  • Issue comment event: 22
  • Push event: 176
  • Pull request review event: 4
  • Pull request event: 22
  • Fork event: 17
Last Year
  • Create event: 18
  • Release event: 12
  • Issues event: 11
  • Watch event: 102
  • Delete event: 4
  • Issue comment event: 22
  • Push event: 176
  • Pull request review event: 4
  • Pull request event: 22
  • Fork event: 17

Committers

Last synced: about 1 year ago

All Time
  • Total Commits: 1,123
  • Total Committers: 15
  • Avg Commits per committer: 74.867
  • Development Distribution Score (DDS): 0.598
Past Year
  • Commits: 261
  • Committers: 7
  • Avg Commits per committer: 37.286
  • Development Distribution Score (DDS): 0.513
Top Committers
Name Email Commits
RaulPPealez r****z@g****m 452
Philipp Thölke p****e@g****e 243
Raimondas Galvelis r****s@a****m 165
Stefan Doerr s****r@g****m 109
Antonio Mirarchi a****i@u****u 58
Guillem Simeon 5****n 37
Gianni De Fabritiis g****s@g****m 23
Peter Eastman p****n@s****u 21
Ruunyox c****e@g****m 4
Sebastian Dick 8****k 3
nec4 n****4@r****u 3
Bas Veeling b****g@g****m 2
Stephen Farr s****r@a****m 1
Jorge Fabila 4****a 1
Brian Bargh b****h@g****m 1
Committer Domains (Top 20 + Academic)

Issues and Pull Requests

Last synced: 6 months ago

All Time
  • Total issues: 58
  • Total pull requests: 108
  • Average time to close issues: 28 days
  • Average time to close pull requests: about 2 months
  • Total issue authors: 24
  • Total pull request authors: 16
  • Average comments per issue: 5.98
  • Average comments per pull request: 1.47
  • Merged pull requests: 81
  • Bot issues: 0
  • Bot pull requests: 0
Past Year
  • Issues: 11
  • Pull requests: 24
  • Average time to close issues: 13 days
  • Average time to close pull requests: 13 days
  • Issue authors: 6
  • Pull request authors: 3
  • Average comments per issue: 0.82
  • Average comments per pull request: 0.79
  • Merged pull requests: 20
  • Bot issues: 0
  • Bot pull requests: 0
Top Authors
Issue Authors
  • peastman (18)
  • raimis (9)
  • SoumyaCYZ (5)
  • nec4 (5)
  • RaulPPelaez (5)
  • sef43 (4)
  • FranklinHu1 (3)
  • PhilippThoelke (3)
  • Awesomium10 (2)
  • XJTUNR (2)
  • Honza-R (1)
  • AlexDuvalinho (1)
  • AndChenCM (1)
  • rgujr001 (1)
  • eva-not (1)
Pull Request Authors
  • RaulPPelaez (50)
  • stefdoerr (35)
  • raimis (31)
  • PhilippThoelke (19)
  • AntonioMirarchi (14)
  • peastman (6)
  • nec4 (6)
  • guillemsimeon (4)
  • sef43 (3)
  • brian8128 (2)
  • mixarcid (2)
  • shenoynikhil (2)
  • szaman19 (2)
  • lsnty5190 (1)
  • sebastianmdick (1)
Top Labels
Issue Labels
enhancement (4) help wanted (3) bug (2) question (1)
Pull Request Labels
enhancement (2) bug (1)

Packages

  • Total packages: 2
  • Total downloads:
    • pypi 240 last-month
  • Total docker downloads: 35
  • Total dependent packages: 0
    (may contain duplicates)
  • Total dependent repositories: 0
    (may contain duplicates)
  • Total versions: 5
  • Total maintainers: 1
proxy.golang.org: github.com/torchmd/torchmd-net
  • Versions: 3
  • Dependent Packages: 0
  • Dependent Repositories: 0
Rankings
Dependent packages count: 5.4%
Average: 5.6%
Dependent repos count: 5.8%
Last synced: 6 months ago
pypi.org: torchmd-net

TorchMD-NET provides state-of-the-art neural networks potentials for biomolecular systems

  • Versions: 2
  • Dependent Packages: 0
  • Dependent Repositories: 0
  • Downloads: 240 Last month
  • Docker Downloads: 35
Rankings
Dependent packages count: 9.4%
Average: 31.0%
Dependent repos count: 52.7%
Maintainers (1)
Last synced: 6 months ago

Dependencies

.github/workflows/CI.yml actions
  • actions/cache v1 composite
  • actions/checkout v2 composite
  • conda-incubator/setup-miniconda v2 composite
setup.py pypi
environment.yml conda
  • flake8
  • gxx
  • h5py
  • lightning 2.0.8
  • matplotlib-base
  • ninja
  • nnpops 0.5
  • pip
  • psutil
  • pydantic <2
  • pytest
  • pytorch 2.0.*
  • pytorch_cluster 1.6.1
  • pytorch_geometric 2.3.1
  • pytorch_scatter 2.1.1
  • pytorch_sparse 0.6.17
  • torchmetrics 0.11.4
  • tqdm