fmcmc

fmcmc: A friendly MCMC framework - Published in JOSS (2019)

https://github.com/uscbiostats/fmcmc

Science Score: 96.0%

This score indicates how likely this project is to be science-related based on various indicators:

○
CITATION.cff file
✓
codemeta.json file
Found codemeta.json file
✓
.zenodo.json file
Found .zenodo.json file
✓
DOI references
Found 4 DOI reference(s) in README and JOSS metadata
✓
Academic publication links
Links to: joss.theoj.org
○
Committers with academic emails
✓
Institutional organization owner
Organization uscbiostats has institutional domain (biostatsepi.usc.edu)
✓
JOSS paper metadata
Published in Journal of Open Source Software

Keywords

adaptive bayesian-inference markov-chain-monte-carlo mcmc metropolis-hastings parallel-computing r r-package

Scientific Fields

Economics Social Sciences - 40% confidence

Engineering Computer Science - 40% confidence

Last synced: 6 months ago · JSON representation

Repository

A friendly MCMC framework

Basic Info

Host: GitHub
Owner: USCbiostats
License: other
Language: R
Default Branch: master
Homepage: https://uscbiostats.github.io/fmcmc/
Size: 16.1 MB

Statistics

Stars: 16
Watchers: 3
Forks: 7
Open Issues: 9
Releases: 4

Topics

adaptive bayesian-inference markov-chain-monte-carlo mcmc metropolis-hastings parallel-computing r r-package

Created almost 9 years ago · Last pushed 9 months ago

Metadata Files

Readme Changelog License Code of conduct

README.Rmd

---
output:
  github_document:
    html_preview: false
---

```{r include=FALSE}
knitr::opts_chunk$set(fig.path = "man/figures/", warning = FALSE)
```


# fmcmc: A friendly MCMC framework 

[![DOI](http://joss.theoj.org/papers/10.21105/joss.01427/status.svg)](https://doi.org/10.21105/joss.01427)
[![R CI](https://github.com/USCbiostats/fmcmc/actions/workflows/ci.yml/badge.svg)](https://github.com/USCbiostats/fmcmc/actions/workflows/ci.yml)
[![Build
status](https://ci.appveyor.com/api/projects/status/lirawn11ssw9cq07/branch/master?svg=true)](https://ci.appveyor.com/project/gvegayon/fmcmc/branch/master)
[![Coverage
Status](https://img.shields.io/codecov/c/github/USCbiostats/fmcmc/master.svg)](https://codecov.io/github/USCbiostats/fmcmc?branch=master)
[![Lifecycle:
stable](https://img.shields.io/badge/lifecycle-stable-brightgreen.svg)](https://www.tidyverse.org/lifecycle/#stable)
[![CRAN
status](https://www.r-pkg.org/badges/version/fmcmc)](https://cran.r-project.org/package=fmcmc)
[![CRAN
downloads](http://cranlogs.r-pkg.org/badges/grand-total/fmcmc)](https://cran.r-project.org/package=fmcmc)
[![status](https://tinyverse.netlify.com/badge/fmcmc)](https://CRAN.R-project.org/package=fmcmc)
[![Integrative Methods of Analysis for Genetic Epidemiology](https://raw.githubusercontent.com/USCbiostats/badges/master/tommy-image-badge.svg)](https://image.usc.edu)
 
## What

The `fmcmc` R package provides a lightweight general framework for implementing
Markov Chain Monte Carlo methods based on the Metropolis-Hastings algorithm. This
implementation's primary purpose lies in the fact that the user can incorporate the
following flexibly:

1.  **Automatic convergence checker**: The algorithm splits the MCMC runs according
    to the frequency with which it needs to check convergence. Users can
    use either one of the included functions (`convergence_gelman,`
    `convergence_geweke,` etc.), or provide their own.
    
2.  **Run multiple chains in parallel fashion**: Using either a `PSOCK` cluster
    (default), or providing a personalized cluster object like the ones in
    the `parallel` R package.

3.  **User defined transition kernels**: Besides of canonical Gaussian Kernel,
    users can specify their own or use one of the included in the package, for
    example: `kernel_adapt`, `kernel_ram`, `kernel_normal_reflective`, `kernel_unif`,
    `kernel_mirror`, or `kernel_unif_reflective`.
    
All the above without requiring compiled code. For the latest about `fmcmc,` checkout
the [NEWS.md](NEWS.md) section.

## Who is this for?

While a lot of users rely on MCMC tools such as Stan (via the [rstan](https://cran.r-project.org/package=rstan) package) or
WinBUGS (via [rstan](https://cran.r-project.org/package=R2WinBUGS)), in several settings either these tools are not enough
or provide too much for things that do not need that much. So, this tool is for
you if:

*  You have a simple model to estimate with Metropolis-Hastings.

*  You want to run multiple chains of your model using out-of-the-box parallel
   computing.
   
*  You don't want (or cannot) rely on external tools (so you need good-old
   base R only for your models).
   
*  You want to implement a model in which your model parameters are either bounded
   (like a standard error, for example), or are not, say, continuous (e.g., a
   size variable in a Binomial distribution), so you need a personalized transition
   kernel.
   
In any other case, you may want to take a look at the previously mentioned R 
packages, or check out the [mcmc](https://cran.r-project.org/package=mcmc) R
package, which also implements the Metropolis-Hastings algorithm (although with
not all the features that this R package has), the [adaptMCMC](https://cran.r-project.org/package=adaptMCMC)
R package, or the [MCMCpack](https://cran.r-project.org/package=MCMCpack) R package.

# Installing

If you want to get the latest bleeding-edge version from Github, you can use [devtools](https://cran.r-project.org/package=devtools):

```r
devtools::install_github("USCbiostats/fmcmc")
```

The latest (stable) release is also available on CRAN:

```r
install.packages("fmcmc")
```

# Citation

```{r citation, echo=FALSE, comment=""}
citation("fmcmc")
```


# Example: Linear regression model

## First run

In the following we show how to use the package for estimating parameters in
a linear regression model. First, let's simulate some data to use:

```{r random-data1}
set.seed(78845)
n <- 1000
X <- rnorm(n)
y <- 3.0 + 2.0*X + rnorm(n, sd = 4)
```

In this case, we have three parameters to estimate, the constant
(2.0), the $\beta$ coefficient (2.0), and the standard deviation parameter of the
error (1.5).

To estimate this model, we can either maximize the log-likelihood function--which
is what is usually done--or we could do it using MCMC. In either case, we need
to specify the log(unnormalized some times)-likelihood function:


```{r ll1}
ll <- function(p, X., y.) {
  
  joint_ll <- dnorm(y. - (p[1] + X.*p[2]), sd = p[3], log = TRUE)
  joint_ll <- sum(joint_ll)
  
  # If is undefined, then we explicitly return infinte (instead of NaN, for
  # example)
  if (!is.finite(joint_ll))
    return(-Inf)
  
  joint_ll
}
```

Notice that the function has more than one argument, in this case, `p,` 
the vector of parameters, `X.` and `y.,` the data of our model.

Let's do a first run of the MCMC algorithm using the function of the same name
(first, load the package, of course):

```{r mcmc1}
library(fmcmc)

# Running the MCMC (we set the seed first)
set.seed(1215)
ans <- MCMC(
  ll,
  initial = c(0, 0, sd(y)),
  nsteps  = 5000,
  X.      = X,
  y.      = y
  )
```

As the output object is an object of class `mcmc` from the `coda` R package, we
can use all the functions from it on our output:

```{r first-diagnostics}
library(coda)
plot(ans)
summary(ans)
```

While the summary statistics look very good (we got very close to the original
parameters), the trace of the parameters looks very bad (poor mixing). We can
re-run the algorithm changing the scale parameter in the `kernel_normal` function.
To do so, we can pass `ans` as the `initial` argument so that
the function starts from the last point of that chain:


```{r summary-and-plot1}
ans <- MCMC(
  ll,
  initial = ans,
  nsteps  = 5000,
  X.      = X,
  y.      = y,
  kernel  = kernel_normal(scale = .05) # We can set the scale parameter like this
  )
plot(ans)
```

Much better! Now, what if we use Vihola (2012) Robust Adaptive Metropolis (which
is also implemented in the R package [adaptMCMC](https://cran.r-project.org/package=adaptMCMC))


```{r summary-and-plot-ram}
ans_RAM <- MCMC(
  ll,
  initial = ans,
  nsteps  = 5000,
  X.      = X,
  y.      = y,
  kernel  = kernel_ram() 
  )
plot(ans_RAM)
1 - rejectionRate(ans_RAM)
```

We can also try using Haario et al. (2001) Adaptive Metropolis

```{r summary-and-plot-adapt}
ans_AM <- MCMC(
  ll,
  initial = ans,
  nsteps  = 5000,
  X.      = X,
  y.      = y,
  kernel  = kernel_adapt() 
  )
plot(ans_AM)
1 - rejectionRate(ans_AM)
```

Finally, if needed, we can also access information about the last run using `MCMC_OUTPUT`.
For example, if we wanted to look at the trace of the logposterior function, we could
use the `get_logpost()` function:

```{r get_}
plot(get_logpost(), type = "l")
```

The set of proposed values is also available using the `get_draws()` function:

```{r get_draws}
boxplot(get_draws(), type = "l")
```

If the previous run featured multiple chains, then `get_logpost()` would return a list
instead of length `get_nchains()`. 

## Automatic stop

Now, suppose that the algorithm actually takes a lot of time to actually reach
stationary state, then it would be nice to actually sample from the posterior
distribution once convergence has been reached. In the following example we 
use multiple chains and the Gelman-Rubin diagnostic to check for convergence of the
chain:


```{r mcmc2}
set.seed(1215) # Same seed as before
ans <- MCMC(
  ll,
  initial = c(0, 0, sd(y)),
  nsteps  = 5000,
  X.      = X,
  y.      = y,
  kernel  = kernel_normal(scale = .05),
  nchains = 2,                           # Multiple chains
  conv_checker = convergence_gelman(200) # Checking for conv. every 200 steps
  )
```

As a difference from the previous case, now we didn't have to wait until the 5,000
completed, but the algorithm stopped for us, allowing us to start generating
the desired sample much quicker.

## Kernels: Making sure that we get positive values

For this final example, we will use the `kernel` argument and provide what
corresponds to a transition kernel which makes proposals within certain boundaries,
in particular, we want the algorithm to propose only positive values for the `sd`
parameter, which we now must be positive.

Moreover, since we know that we will only get positive values, we
can go further and modify `ll` skipping the check for finite values:

```{r ll2}
ll <- function(p, X., y.) {
  
  sum(dnorm(y. - (p[1] + X.*p[2]), sd = p[3], log = TRUE))

}
```

Much simpler function! Let's do the call of the MCMC function specifying the 
right transition kernel to increase the acceptance rate. In this example, we will
set the max of all parameters to be 5.0, and the min to be -5.0 for the constant
and 0 for the beta coefficient and the variance parameter, all this using the
`kernel_normal_reflective` (which implements a normal kernel with boundaries) function:

```{r mcmc3}
set.seed(1215) # Same seed as before
ans <- MCMC(
  ll,
  initial = c(0, 0, sd(y)),
  nsteps  = 5000,
  X.      = X,
  y.      = y,
  kernel  = kernel_normal_reflective(
    ub    = 5.0,               # All parameters have the same upper bound
    lb    = c(-5.0, 0.0, 0.0), # But lower bound is specified per parameter
    scale = 0.05               # This is the same scale as before
    ),
  nchains = 2,                           
  conv_checker = convergence_gelman(200)
  )
```

Again, as the proposal kernel has lower and upper bounds, then we are guaranteed
that all proposed states are within the support of the parameter space.

# Other tools

`fmcmc` is just one way to work with Markov Chain Monte Carlo. Besides `stan` and
`WinBUGS`, there are other ways to do MCMC in R: [mcmc](https://cran.r-project.org/package=mcmc),
[HybridMC](https://cran.r-project.org/package=HybridMC), [adaptMCMC](https://cran.r-project.org/package=adaptMCMC), and
[elhmc](https://cran.r-project.org/package=elhmc) among others (take a look at
the [CRAN Task View on Bayesian Inference](https://CRAN.R-project.org/view=Bayesian).)

# Contributing to `fmcmc`

We welcome contributions to `fmcmc`. Whether reporting a bug, starting a discussion by asking a question, or proposing/requesting a new feature, please go by creating a new issue [here](https://github.com/USCbiostats/fmcmc/issues) so that we can talk about it.

Please note that the 'fmcmc' project is released with a [Contributor Code of Conduct](CODE_OF_CONDUCT.md). By contributing to this project, you agree to abide by its terms.


# Funding

Supported by National Cancer Institute Grant #1P01CA196596.

Owner

Name: USC Division of Biostatistics
Login: USCbiostats
Kind: organization
Location: Los Angeles, CA

Website: https://biostatsepi.usc.edu/
Repositories: 34
Profile: https://github.com/USCbiostats

JOSS Publication

fmcmc: A friendly MCMC framework

Published

July 09, 2019

DOI

10.21105/joss.01427

Volume 4, Issue 39, Page 1427

Authors

George G Vega Yon

Department of Preventive Medicine, University of Southern California

Paul Marjoram

Department of Preventive Medicine, University of Southern California

Editor

Ariel Rokem

GitHub Events

Total

Push event: 1
Create event: 1

Last Year

Push event: 1
Create event: 1

Committers

Last synced: 7 months ago

All Time

Total Commits: 144
Total Committers: 2
Avg Commits per committer: 72.0
Development Distribution Score (DDS): 0.007

Past Year

Commits: 0
Committers: 0
Avg Commits per committer: 0.0
Development Distribution Score (DDS): 0.0

Top Committers

Name	Email	Commits
George G. Vega Yon	g**n@g**m	143
Paul Marjoram	p**a@g**m	1

Issues and Pull Requests

Last synced: 6 months ago

All Time

Total issues: 14
Total pull requests: 10
Average time to close issues: 2 months
Average time to close pull requests: 3 months
Total issue authors: 4
Total pull request authors: 3
Average comments per issue: 1.14
Average comments per pull request: 0.0
Merged pull requests: 8
Bot issues: 0
Bot pull requests: 0

Past Year

Issues: 0
Pull requests: 0
Average time to close issues: N/A
Average time to close pull requests: N/A
Issue authors: 0
Pull request authors: 0
Average comments per issue: 0
Average comments per pull request: 0
Merged pull requests: 0
Bot issues: 0
Bot pull requests: 0

View more stats

Top Authors

Issue Authors

gvegayon (6)
fabian-s (4)
dmi3kno (3)
mgzjys (1)

Pull Request Authors

gvegayon (8)
dmi3kno (1)
arokem (1)

Top Labels

Issue Labels

enhancement (5) help wanted (1) question (1) bug (1)

Pull Request Labels

Packages

Total packages: 1
Total downloads:
- cran 363 last-month
Total docker downloads: 21,613

Total dependent packages: 2
Total dependent repositories: 3
Total versions: 6
Total maintainers: 1

cran.r-project.org: fmcmc

A friendly MCMC framework

Homepage: https://github.com/USCbiostats/fmcmc
Documentation: http://cran.r-project.org/web/packages/fmcmc/fmcmc.pdf
License: MIT + file LICENSE
Latest release: 0.5-2
published over 2 years ago

Versions: 6
Dependent Packages: 2
Dependent Repositories: 3
Downloads: 363 Last month
Docker Downloads: 21,613

Rankings

Forks count: 9.7%

Dependent packages count: 13.2%

Stargazers count: 14.7%

Average: 15.1%

Dependent repos count: 16.8%

Downloads: 21.1%

Maintainers (1)

g.vegayon@gmail.com

Last synced: 6 months ago

Dependencies

DESCRIPTION cran

R >= 3.3.0 depends
MASS * imports
Matrix * imports
coda * imports
methods * imports
parallel * imports
stats * imports
covr * suggests
knitr * suggests
mcmc * suggests
mvtnorm * suggests
rmarkdown * suggests
tinytest * suggests

.github/workflows/ci.yml actions

actions/checkout v2 composite

.github/workflows/website.yml actions

actions/checkout v2 composite
peaceiris/actions-gh-pages v3 composite

docker/Dockerfile docker

rocker/drd latest build

fmcmc

Science Score: 96.0%

Keywords

Scientific Fields

Repository

Basic Info

Statistics

Topics

Metadata Files

README.Rmd

Owner

JOSS Publication

fmcmc: A friendly MCMC framework

Authors

Editor

Tags

GitHub Events

Total

Last Year

Committers

All Time

Past Year

Top Committers

Issues and Pull Requests

All Time

Past Year

Top Authors

Issue Authors

Pull Request Authors

Top Labels

Issue Labels

Pull Request Labels

Packages

cran.r-project.org: fmcmc

Rankings

Maintainers (1)

Dependencies