https://github.com/choderalab/integrator-benchmark
Code for enumerating and evaluating numerical methods for Langevin dynamics using near-equilibrium estimates of the KL-divergence. Accompanies https://doi.org/10.3390/e20050318
Science Score: 10.0%
This score indicates how likely this project is to be science-related based on various indicators:
-
○CITATION.cff file
-
○codemeta.json file
-
○.zenodo.json file
-
○DOI references
-
✓Academic publication links
Links to: springer.com -
○Academic email domains
-
○Institutional organization owner
-
○JOSS paper metadata
-
○Scientific vocabulary similarity
Low similarity (9.5%) to scientific vocabulary
Keywords
kl-divergence
langevin-dynamics
molecular-dynamics
openmm
Last synced: 9 months ago
·
JSON representation
Repository
Code for enumerating and evaluating numerical methods for Langevin dynamics using near-equilibrium estimates of the KL-divergence. Accompanies https://doi.org/10.3390/e20050318
Basic Info
Statistics
- Stars: 11
- Watchers: 7
- Forks: 3
- Open Issues: 12
- Releases: 0
Topics
kl-divergence
langevin-dynamics
molecular-dynamics
openmm
Created over 9 years ago
· Last pushed about 8 years ago
https://github.com/choderalab/integrator-benchmark/blob/master/
[](https://travis-ci.org/choderalab/integrator-benchmark?branch=master)
# Not all Langevin integrators are equal
Enumerating and evaluating numerical schemes for Langevin dynamics
## Molecular dynamics requires methods for simulating continuous equations in discrete steps
A widely-used approach to investigating equilibrium properties is to simulate Langevin dynamics.
Langevin dynamics is a system of stochastic differential equations defined on a state space of configurations $\mathbf{x}$ and velocities $\mathbf{v}$.
To simulate those equations on a computer, we need to provide explicit instructions for advancing the state of the system $(\mathbf{x},\mathbf{v})$ by very small time increments.
Here, we will consider the family of methods that can be derived by splitting the Langevin system into a sum of three simpler systems, labeled `O`, `R`, and `V`. We then define a numerical method by approximately propagating each of those simpler systems for small increments of time in a specified order.
(TODO: Add LaTeX-rendered Langevin system with underbraces around `O`, `R`, `V` components, using `readme2tex`)
We will refer to a numerical scheme by its encoding string. For example, `OVRVO` means: simulate the `O` component for a time increment of $\Delta t/2$, then the `V` component for $\Delta t/2$, then the `R` component for $\Delta t$, then the `V` component for $\Delta t/2$, and finally the `O` component for $\Delta t/2$. This approximately propagates the entire system for a total time increment of $\Delta t$.
## This introduces error that can be sensitive to details
Using subtly different numerical schemes for the same continuous equations can lead to drastically different behavior at finite timesteps.
As a prototypical example, consider the difference between the behavior of schemes `VRORV` and `OVRVO` on a 1D quartic system:

($\rho$ is the density sampled by the numerical scheme, and $\pi$ is the exact target density.
Each column illustrates an increasing finite timestep $\Delta t$, below the "stability threshold."
Rows 2 and 4 illustrate error in the sampled joint distribution, and rows 1 and 3 illustrate error in the sampled $\mathbf{x}$-marginal distribution.)
The two schemes have the same computational cost per iteration, and introduce nearly identical levels of error into the sampled joint $(\mathbf{x}, \mathbf{v})$ distribution -- but one of these methods introduces about 100x more error in the $\mathbf{x}$ marginal than the other at large timesteps!
### Toy implementation
To illustrate the scheme, here is a toy Python implementation for each of the explicit updates:
```python
# Defined elsewhere: friction coefficient `gamma`, mass `m`
def propagate_R(x, v, h):
"""Linear "drift" -- deterministic position update
using current velocities"""
return (x + (h * v), v)
def propagate_V(x, v, h):
"""Linear "kick" -- deterministic velocity update
using current forces"""
return (x, v + (h * force(x) / m))
def propagate_O(x, v, h):
"""Ornstein-Uhlenbeck -- stochastic velocity update
using a ficticious "heat-bath""""
a, b = exp(-gamma * h), sqrt(1 - exp(-2 * gamma * h))
return (x, (a * v) + b * draw_maxwell_boltzmann_velocities())
propagate = {"O": propagate_O, "R": propagate_R, "V": propagate_V}
```
where `draw_maxwell_boltzmann_velocities()` draws an independent sample from the velocity distribution given by the masses and temperature.
Using the functions we just defined, here's how to implement the inner-loop of the scheme denoted `OVRVO`:
```python
x, v = propagate_O(x, v, dt / 2)
x, v = propagate_V(x, v, dt / 2)
x, v = propagate_R(x, v, dt)
x, v = propagate_V(x, v, dt / 2)
x, v = propagate_O(x, v, dt / 2)
```
And here's the `VRORV` inner-loop:
```python
x, v = propagate_V(x, v, dt / 2)
x, v = propagate_R(x, v, dt / 2)
x, v = propagate_O(x, v, dt)
x, v = propagate_R(x, v, dt / 2)
x, v = propagate_V(x, v, dt / 2)
```
As suggested from these examples, the generic recipe for turning a splitting string into a Langevin integrator is:
```python
def get_n(substep="O", splitting="OVRVO"):
return sum([s == substep for s in splitting])
def simulate_timestep(x, v, dt, splitting="OVRVO"):
for substep in splitting:
x, v = propagate[substep](x, v, dt / get_n(substep, splitting))
return x, v
```
# Systematically enumerating numerical schemes and measuring their error
In this repository, we enumerate numerical schemes for Langevin dynamics by associating strings over the alphabet `{O, R, V}` with explicit numerical methods using [OpenMM `CustomIntegrator`](http://docs.openmm.org/7.1.0/userguide/theory.html#customintegrator)s. We provide schemes for approximating the error introduced by that method in the sampled distribution over $(\mathbf{x},\mathbf{v}) jointly or $\mathbf{x}$ alone using nonequilibrium work theorems.
We further investigate the effects of modifying the mass matrix (aka "hydrogen mass repartitioning") and/or evaluating subsets of the forces per substep (aka "multi-timestep" methods, obtained by expanding the alphabet to `{O, R, V0, V1, ..., V32}`).
(TODO: Details on nonequilibrium error measurement scheme.)
## Relation to prior work
We did not introduce the concept of splitting the Langevin system into these three "building blocks" -- this decomposition is developed lucidly in Chapter 7 of [[Leimkuhler and Matthews, 2015]](http://www.springer.com/us/book/9783319163741). We also did not discover the `VRORV` integrator -- Leimkuhler and Matthews have studied the favorable properties of particular integrator `VRORV` ("`BAOAB`" in their notation) in great detail.
Here, we have:
1. provided a method to translate these strings into efficient `CustomIntegrators` in OpenMM,
2. provided a uniform scheme for measuring the sampling error introduced by any member of this family of methods on any target density (approximating the KL divergence directly, rather than monitoring error in a system-specific choice of low-dimensional observable),
3. considered an expanded alphabet, encompassing many widely-used variants as special cases.
# References
(TODO: Add references from paper!)
Owner
- Name: Chodera lab // Memorial Sloan Kettering Cancer Center
- Login: choderalab
- Kind: organization
- Email: john.chodera@choderalab.org
- Location: Memorial Sloan-Kettering Cancer Center, Manhattan, NY
- Website: http://choderalab.org
- Repositories: 269
- Profile: https://github.com/choderalab
GitHub Events
Total
- Watch event: 1
Last Year
- Watch event: 1