LaMa

Fast and convenient maximum likelihood estimation for latent Markov models like HMMs, HSMMs, SSMs and point processes

https://github.com/janoleko/lama

Science Score: 49.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
    Found 1 DOI reference(s) in README
  • Academic publication links
    Links to: arxiv.org
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (18.1%) to scientific vocabulary
Last synced: 9 months ago · JSON representation

Repository

Fast and convenient maximum likelihood estimation for latent Markov models like HMMs, HSMMs, SSMs and point processes

Basic Info
Statistics
  • Stars: 9
  • Watchers: 1
  • Forks: 1
  • Open Issues: 0
  • Releases: 0
Created almost 3 years ago · Last pushed 10 months ago
Metadata Files
Readme

README.Rmd

---
output: github_document
bibliography: vignettes/refs.bib
---

```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  fig.path = "man/figures/README-",
  out.width = "75%",
  fig.align = "center"
)
```

# LaMa - Latent Markov model toolbox 🛠️ 



[![CRAN status](https://www.r-pkg.org/badges/version/LaMa)](https://CRAN.R-project.org/package=LaMa)
[![metacran downloads](https://cranlogs.r-pkg.org/badges/last-month/LaMa)](https://cran.r-project.org/package=LaMa)
[![total downloads](https://cranlogs.r-pkg.org:443/badges/grand-total/LaMa)]( https://cranlogs.r-pkg.org:443/badges/grand-total/LaMa)
[![R-CMD-check](https://github.com/janoleko/LaMa/actions/workflows/R-CMD-check.yaml/badge.svg)](https://github.com/janoleko/LaMa/actions/workflows/R-CMD-check.yaml)


A variety of **latent Markov models** [@mews2024build], including **hidden Markov models** (HMMs), **hidden semi-Markov models** (HSMMs), **state-space models** (SSMs) 
and **continuous-time** variants can be formulated and estimated within the same framework via directly maximising the likelihood function using the so-called **forward algorithm** [@zucchini].
Applied researchers often need custom models that standard software does not easily support. Writing tailored `R` code offers flexibility but suffers from slow estimation speeds. This `R` package solves these issues by providing easy-to-use functions (written in C++ for speed) for common tasks like the forward algorithm. These functions can be combined into custom models in a Lego-type approach, offering up to 10-20 times faster estimation via standard numerical optimisers. In its most recent iteration, `LaMa` allows for automatic differentiation with the `RTMB` package which drastically increases speed and accuracy even more.

The most important families of functions are

* the `forward` family that calculates the log-likelihood for various different models,

* the `tpm` family for calculating transition probability matrices,
  
  
  
  
  

* the `stationary` family to compute stationary and periodically stationary distributions

* as well as the `stateprobs` and `viterbi` families for local and global decoding.

## Installation

You can install the released package version from [CRAN](https://CRAN.R-project.org) with:

```{r installation_cran, eval = FALSE}
install.packages("LaMa")
```

or the development version from Github:
```{r installation_github, eval = FALSE, warning=FALSE, message = FALSE}
remotes::install_github("janoleko/LaMa")
```


## Package documentation

To aid in building fully custom likelihood functions, this package contains several vignettes that demonstrate how to simulate data from and estimate a wide range of models using the functions included in this package.

HMMs, from simple to complex:

* [Introduction to LaMa](https://janoleko.github.io/LaMa/articles/Intro_to_LaMa.html)
* [Inhomogeneous HMMs with covariate effects](https://janoleko.github.io/LaMa/articles/Inhomogeneous_HMMs.html)
* [Longitudinal data](https://janoleko.github.io/LaMa/articles/Longitudinal_data.html)
* [Periodic HMMs](https://janoleko.github.io/LaMa/articles/Periodic_HMM.html)
* [LaMa and RTMB](https://janoleko.github.io/LaMa/articles/LaMa_and_RTMB.html)
* [Penalised splines](https://janoleko.github.io/LaMa/articles/Penalised_splines.html)

Other latent Markov model classes:

* [State-space models](https://janoleko.github.io/LaMa/articles/State_space_models.html)
* [Continuous-time HMMs](https://janoleko.github.io/LaMa/articles/Continuous_time_HMMs.html)
* [Hidden semi-Markov models](https://janoleko.github.io/LaMa/articles/HSMMs.html)
* [Markov-modulated (marked) Poisson processes](https://janoleko.github.io/LaMa/articles/MMMPPs.html)










## Introductory example: Homogeneous HMM

We analyse the `trex` data set contained in the package. It contains hourly step lengths of a Tyrannosaurus rex, living 66 million years ago. To these data, we fit a simple 2-state HMM with state-dependent gamma distributions for the step lengths.

```{r package}
library(LaMa)

head(trex, 3)
```
We start by defining the negative log-likelihood function. This is made really convenient by the functions `tpm()` which computes the transition probability matrix via the multinomial logit link, `stationary()` which computes the stationary distribution of the Markov chain and `forward()` which calculates the log-likelihood via the forward algorithm.

```{r nll}
nll = function(par, step){
  # parameter transformations for unconstrained optimisation
  Gamma = tpm(par[1:2]) # rowwise softmax
  delta = stationary(Gamma) # stationary distribution
  mu = exp(par[3:4]) # state-dependent means
  sigma = exp(par[5:6]) # state-dependent sds
  # calculating all state-dependent probabilities
  allprobs = matrix(1, length(step), 2)
  ind = which(!is.na(step))
  for(j in 1:2) allprobs[ind,j] = dgamma2(step[ind], mu[j], sigma[j])
  # simple forward algorithm to calculate log-likelihood
  -forward(delta, Gamma, allprobs)
}
```

To fit the model, we define the intial parameter vector and numerically optimise the above function using `nlm()`:
```{r model, warning=FALSE}
par = c(-2,-2,             # initial tpm params (logit-scale)
        log(c(0.3, 2.5)),  # initial means for step length (log-transformed)
        log(c(0.2, 1.5)))  # initial sds for step length (log-transformed)

system.time(
  mod <- nlm(nll, par, step = trex$step)
)
```
Really fast for 10.000 data points!

After tranforming the working (unconstrained) parameters to natural parameters using `tpm()` and `stationary()`, we can visualise the results:

```{r visualization}
# transform parameters to working
(Gamma = tpm(mod$estimate[1:2]))
(delta = stationary(Gamma)) # stationary HMM
(mu = exp(mod$estimate[3:4]))
(sigma = exp(mod$estimate[5:6]))

hist(trex$step, prob = TRUE, bor = "white", breaks = 40, main = "", xlab = "step length")
curve(delta[1] * dgamma2(x, mu[1], sigma[1]), add = TRUE, lwd = 2, col = "orange", n=500)
curve(delta[2] * dgamma2(x, mu[2], sigma[2]), add = TRUE, lwd = 2, col = "deepskyblue", n=500)
legend("topright", col = c("orange", "deepskyblue"), lwd = 2, bty = "n", legend = c("state 1", "state 2"))
```

Owner

  • Name: Jan-Ole Koslik
  • Login: janoleko
  • Kind: user
  • Location: Bielefeld
  • Company: Bielefeld University

Statistical Science Student @Bielefeld University

GitHub Events

Total
  • Watch event: 4
  • Push event: 502
  • Fork event: 1
  • Create event: 1
Last Year
  • Watch event: 4
  • Push event: 502
  • Fork event: 1
  • Create event: 1

Packages

  • Total packages: 1
  • Total downloads:
    • cran 269 last-month
  • Total dependent packages: 0
  • Total dependent repositories: 0
  • Total versions: 7
  • Total maintainers: 1
cran.r-project.org: LaMa

Fast Numerical Maximum Likelihood Estimation for Latent Markov Models

  • Versions: 7
  • Dependent Packages: 0
  • Dependent Repositories: 0
  • Downloads: 269 Last month
Rankings
Dependent packages count: 28.8%
Dependent repos count: 35.4%
Average: 49.9%
Downloads: 85.6%
Last synced: 10 months ago

Dependencies

DESCRIPTION cran
  • Rcpp >= 1.0.11 imports
  • iHSMM * imports
  • stats * imports