Bernadette
Bernadette: Bayesian Inference and Model Selection for Stochastic Epidemics in R - Published in JOSS (2023)
Science Score: 93.0%
This score indicates how likely this project is to be science-related based on various indicators:
-
○CITATION.cff file
-
✓codemeta.json file
Found codemeta.json file -
✓.zenodo.json file
Found .zenodo.json file -
✓DOI references
Found 3 DOI reference(s) in README and JOSS metadata -
✓Academic publication links
Links to: joss.theoj.org, zenodo.org -
○Committers with academic emails
-
○Institutional organization owner
-
✓JOSS paper metadata
Published in Journal of Open Source Software
Scientific Fields
Physics
Physical Sciences -
40% confidence
Artificial Intelligence and Machine Learning
Computer Science -
40% confidence
Last synced: 4 months ago
·
JSON representation
Repository
Bayesian analysis of infectious disease transmission dynamics, with a particular focus on Coronavirus Disease 2019 (COVID-19)
Basic Info
Statistics
- Stars: 2
- Watchers: 1
- Forks: 4
- Open Issues: 1
- Releases: 3
Created over 4 years ago
· Last pushed 10 months ago
Metadata Files
Readme
Changelog
Contributing
License
Code of conduct
README.Rmd
---
output:
github_document:
pandoc_args: "--webtex"
---
# Bernadette
[](https://github.com/bernadette-eu/Bernadette/actions)
[](https://cran.r-project.org/package=Bernadette)
[](https://Bernadette.r-universe.dev/ui/#package:Bernadette)
[](https://github.com/bernadette-eu/Bernadette/commits/main)
[](https://lifecycle.r-lib.org/articles/stages.html#stable)
[](https://shinyus.ipub.com/cranview/)
[](https://shinyus.ipub.com/cranview/)
[](https://joss.theoj.org/papers/8ee5ffed37ccbd0a8de519553da6ceea)
[](https://zenodo.org/badge/latestdoi/377866248)
## Contribution
This R package is provided for use under the [GPL-3.0](https://www.gnu.org/licenses/gpl-3.0.en.html) License by the author.
Please use the [issue tracker](https://github.com/bernadette-eu/Bernadette/issues) for bug reports and feature requests.
All contributions are welcome, in particular those that improve the approach or the robustness of the code base. We also welcome additions and extensions to the underlying model either in the form of options or improvements.
## Code of Conduct
Please note that this project is released with a [Contributor Code of Conduct](https://github.com/bernadette-eu/Bernadette/blob/master/CODE_OF_CONDUCT.md). By participating in this project you agree to abide by its terms.
## Overview
The **Bernadette** ("Bayesian inference and model selection for stochastic epidemics") R package provides a framework for Bayesian analysis of infectious disease transmission dynamics via diffusion driven multi-type epidemic models with time-varying coefficients, with a particular focus on Coronavirus Disease 2019 (COVID-19).
For models fit using Markov chain Monte Carlo, it allows for computation of approximate leave-one-out cross-validation (LOO, LOOIC) or the Widely Applicable Information Criterion (WAIC) using the **loo** package for model checking and comparison.
**Bernadette** complements packages for the modeling of disease transmission dynamics, see the R Epidemics Consortium website (https://www.repidemicsconsortium.org/), by implementing the Bayesian hierarchical modeling approach described in [1].
It links the observed age-stratified mortality counts for the population of a given country over time to latent infections via an over-dispersed count model.
The change in infections over time is governed by a deterministic multi-type compartmental model which is driven by potentially non-scalar diffusion processes to adequately capture the temporal evolution of the age-stratified transmission rates.
**Bernadette** relaxes the assumption of a homogeneous population, incorporates age structure and accounts for the presence of social structures via publicly available contact matrices.
This allows for further evidence synthesis utilizing information from contact surveys and for sharing statistical strength across age groups and time.
Further, the Bayesian evidence synthesis approach implemented in the **Bernadette** package enables learning the age-stratified virus transmission rates from one group to another, with a particular focus on the transmission rate between and onto vulnerable groups which could support public health authorities in devising interventions targeting these groups.
The effective reproduction number is estimated from the daily age-stratified mortality counts, accounting for variations in transmissibility that are not obvious from reported infection counts.
While estimation of the uncertainty over the effective reproduction number is itself a challenging problem, it is propagated naturally via Markov chain Monte Carlo.
Even though **Bernadette** is motivated by the analysis of healthcare surveillance data related to COVID-19, it provides a template for implementation to a broader range of infectious disease epidemics and outbreaks.
## Website
The website for the BERNADETTE Marie Sklodowska-Curie Action (MSCA) is available at: https://bernadette-eu.github.io/.
## Installation
Installation of the package requires you to have a working C++ toolchain.
To ensure that this is working, please first install **rstan** by following these [installation instructions](https://github.com/stan-dev/rstan/wiki/RStan-Getting-Started) from the Rstan wiki.
The following must return ```TRUE``` before continuing:
```
if (!require("pkgbuild")) {
install.packages("pkgbuild")
}
pkgbuild::has_build_tools(debug = TRUE)
```
Having installed **rstan**, you can then install the latest development version of **Bernadette** (requires the [devtools](https://github.com/r-lib/devtools) package) from [GitHub](https://github.com/) with:
```{r, eval=F, echo=T}
if (!require("devtools")) {
install.packages("devtools")
}
devtools::install_github("bernadette-eu/Bernadette", dependencies = TRUE, build_vignettes = FALSE)
```
or the latest version on CRAN
```{r, eval=F, echo=T}
install.packages("Bernadette")
```
## Load any required libraries
```{r, eval = T, results='hide', message=FALSE, warning=FALSE}
lib <- c("stats",
"ggplot2",
"rstan",
"bayesplot",
"Bernadette",
"loo")
lapply(lib, require, character.only = TRUE)
```
## Example - COVID-19 in Greece
The time series of new daily age-specific mortality and incidence counts are ordered by epidemiological date. The time horizon in the example datasets for Greece spans from 2020-08-31 to 2021-03-28 (210 days).
```{r, results='hide', message=FALSE, warning=FALSE}
Sys.setlocale("LC_ALL", "English")
data(age_specific_mortality_counts)
data(age_specific_cusum_infection_counts)
```
```{r mortality_time_series, echo = FALSE, fig.height=5, fig.width=7}
sub <- age_specific_mortality_counts[c(3, 6:ncol(age_specific_mortality_counts))]
mortality_df_long <- stats::reshape(sub,
direction = "long",
varying = list(names(sub)[2:ncol(sub)]),
v.names = "New_Deaths",
idvar = c("Date"),
timevar = "Group",
times = colnames(sub)[-1])
ggplot(mortality_df_long,
aes(Date = Date,
New_Deaths = New_Deaths)) +
geom_point(aes(x = Date,
y = New_Deaths,
fill= "Reported deaths")) +
facet_wrap(. ~ Group,
scales = "free_y",
ncol = 2,
strip.position = "top")+
scale_fill_manual(values = c('Reported deaths' = 'black'), guide = "none") +
labs(x = "Epidemiological Date",
y = "Number of new deaths") +
scale_x_date(date_breaks = "1 month", date_labels = "%b %Y") +
theme_bw() +
theme(panel.spacing = unit(0.1,"cm"),
axis.text.x = element_text(angle = 45, hjust = 1),
legend.position = "bottom",
legend.title = element_blank() )
```
Let's import the rest of the datasets required for model fitting. First, import and plot the age distribution for Greece in 2020:
```{r}
age_distr <- age_distribution(country = "Greece", year = 2020)
plot_age_distribution(age_distr)
```
Information is available for sixteen 5-year age groups, with the last one being "75+". We
shall consider three age groups, \{0-39, 40-64, 65+\}, by providing the following lookup table:
```{r}
lookup_table <- data.frame(Initial = age_distr$AgeGrp,
Mapping = c(rep("0-39", 8),
rep("40-64", 5),
rep("65+" , 3)))
aggr_age <- aggregate_age_distribution(age_distr, lookup_table)
plot_age_distribution(aggr_age)
```
Next, import the projected contact matrix for Greece:
```{r, fig.height=5, fig.width=12}
conmat <- contact_matrix(country = "GRC")
plot_contact_matrix(conmat)
```
Aggregate the contact matrix to the age groups \{0-39, 40-64, 65+\}:
```{r}
aggr_cm <- aggregate_contact_matrix(conmat, lookup_table, aggr_age)
plot_contact_matrix(aggr_cm)
```
Obtain estimates of the age-specific infection-fatality-ratio:
```{r}
ifr_mapping <- c(rep("0-39", 8), rep("40-64", 5), rep("65+", 3))
aggr_age_ifr <- aggregate_ifr_react(age_distr, ifr_mapping, age_specific_infection_counts)
```
Provide the infection-to-death distribution with a mean of 24.19 days:
```{r}
ditd <- itd_distribution(ts_length = nrow(age_specific_mortality_counts),
gamma_mean = 24.19231,
gamma_cv = 0.3987261)
```
## Modeling framework
The aforementioned data streams and expert knowledge are integrated into a coherent modeling framework via a Bayesian evidence synthesis approach. Τhe modeling process is separated into a latent epidemic process and an observation process in an effort to reduce sensitivity to observation noise and to allow for more flexibility in modeling different forms of data.
### Diffusion-driven multi-type transmission process
The transmission of COVID-19 is modeled through an age-stratified deterministic Susceptible-Exposed-Infectious-Removed (SEIR) compartmental model with Erlang-distributed latent and infectious periods. In particular, we introduce Erlang-distributed stage durations in the Exposed and Infected compartments and relax the mathematically convenient but unrealistic assumption of exponential stage durations by assuming that each of the Exposed and Infected compartments are defined by two stages, with the same rate of loss of latency ($\tau$) and infectiousness ($\gamma$) in both stages.
Removed individuals are assumed to be immune to reinfection for at least the duration of the study period.
The population is stratified into $\alpha \in \{1,\ldots,A\}$ age groups and the total size of the age group is denoted by $N_{\alpha} = S_t^{\alpha} + E_{1,t}^{\alpha} + E_{2,t}^{\alpha} + I_{1,t}^{\alpha} + I_{2,t}^{\alpha} + R_t^{\alpha}$, where $S_t^{\alpha}$ represents the number of susceptible, $E_t^{\alpha} = \sum_{j=1}^{2}E_{j,t}^{\alpha}$ represents the number of exposed but not yet infectious, $I_t^{\alpha} = \sum_{j=1}^{2}I_{j,t}^{\alpha}$ is the number of infected and $R_t^{\alpha}$ is the number of removed individuals at time $t$ at age group $\alpha$.
The number of individuals in each compartment is scaled by the total population $N = \sum_{\alpha = 1}^{A}N_{\alpha}$, so that the sum of all compartments equals to one.
The latent epidemic process is expressed by the following non-linear system of ordinary differential equations (ODEs)
$$
\begin{cases}
\frac{dS^{\alpha}_{t}}{dt} & = -\lambda_{\alpha}(t) S^{\alpha}_{t}
\\
\frac{dE^{\alpha}_{1,t}}{dt} & = \lambda_{\alpha}(t) S^{\alpha}_{t} - \tau E^{\alpha}_{1,t}
\\
\frac{dE^{\alpha}_{2,t}}{dt} & = \tau \left( E^{\alpha}_{1,t} - E^{\alpha}_{2,t}\right)
\\
\frac{dI^{\alpha}_{1,t}}{dt} & = \tau E^{\alpha}_{2,t} - \gamma I^{\alpha}_{1,t}
\\
\frac{dI^{\alpha}_{2,t}}{dt} & = \gamma \left( I^{\alpha}_{1,t} - I^{\alpha}_{2,t}\right)
\\
\frac{dR^{\alpha}_{t}}{dt} & = \gamma I^{\alpha}_{2,t},
\end{cases}
$$
where the mean latent and infectious periods are $d_E = \frac{2}{\tau}$, $d_I = \frac{2}{\gamma}$, respectively. The number of new infections in age group $\alpha$ at day $t$ is
$$
\Delta^{\text{infec}}_{t, \alpha} = \int_{t-1}^{t} \tau E^{\alpha}_{2,s} ds.
$$
The time-dependent force of infection $\lambda_{\alpha}(t)$ for age group $\alpha \in \{1,\ldots,A\}$ is expressed as
$$
\lambda_{\alpha}(t) = \sum_{\alpha'=1}^{A}\left[ m_{\alpha,\alpha'}(t) \frac{\left(I^{\alpha'}_{1,t} + I^{\alpha'}_{2,t}\right)}{\mathbb{N}_{\alpha'}}\right],
$$
which is a function of the proportion of infectious individuals in each age group $\alpha' \in \{1,\ldots,A\}$, via the compartments $I^{\alpha'}_{1,t}$, $I^{\alpha'}_{2,t}$ divided by the total size of the age group $\mathbb{N}_{\alpha'}$, and the time-varying person-to-person transmission rate from group $\alpha$ to group $\alpha'$, $m_{\alpha,\alpha'}(t)$.
We parameterize the transmission rate between different age groups $(\alpha,\alpha') \in \{1,\ldots,A\}^2$ by
\(
m_{\alpha,\alpha'}(t) =
\beta^{\alpha\alpha'}_{t}
\cdot
C_{\alpha,\alpha'},
\)
breaking down the transmission rate matrix into its biological and social components: the social component is represented by the average number of contacts between individuals of age group $\alpha$ and age group $\alpha'$ via the contact matrix element $C_{\alpha,\alpha'}$;
$\beta^{\alpha\alpha'}_{t}$ is the time-varying transmissibility of the virus, the probability that a contact between an infectious person in age group $\alpha$ and a susceptible person in age group $\alpha'$ leads to transmission at time $t$.
The formulation below may be viewed as a stochastic extension to the deterministic multi-type SEIR model, using diffusion processes for the coefficients $\beta^{\alpha\alpha'}_{t}$, driven by independent Brownian motions
$$
\begin{cases}
\beta^{\alpha\alpha'}_{t} & = \exp(x^{\alpha\alpha'}_{t})
\\
x^{\alpha\alpha'}_{t} \mid x^{\alpha\alpha'}_{t - 1}, \sigma^{\alpha\alpha'}_x & \sim \operatorname{N}(x^{\alpha\alpha'}_{t - 1}, (\sigma^{\alpha\alpha'})^2_x)
\\
dx^{\alpha\alpha'}_{t} & = \sigma^{\alpha\alpha'}_{x}dW^{\alpha\alpha'}_{t}
\\
dW^{\alpha\alpha'}_{t} & \sim \operatorname{N}(0,d_{t}),
\end{cases}
$$
with volatilities $\sigma^{\alpha\alpha'}_x$, corresponding to the case of little information on the shape of $\beta^{\alpha\alpha'}_{t}$.
The volatility $\sigma^{\alpha\alpha'}_x$ plays the role of the regularizing factor: higher values of $\sigma^{\alpha\alpha'}_x$ lead to greater changes in $\beta^{\alpha\alpha'}_{t}$.
The exponential transformation avoids negative values which have no biological meaning.
A major advantage of considering a diffusion process for modeling $\beta^{\alpha\alpha'}_{t}$ is its ability to capture and quantify the randomness of the underlying transmission dynamics, which is particularly useful when the dynamics are not completely understood.
The diffusion process accounts for fluctuations in transmission that are influenced by non-modeled phenomena, such as new variants, mask-wearing propensity, etc.
The diffusion process also allows for capturing the effect of unknown extrinsic factors on the age-stratified force of infection,
for monitoring of the temporal evolution of the age-specific transmission rate without the implicit inclusion of external variables and for tackling non-stationarity in the data.
We propose a diffusion-driven multi-type latent transmission model which assigns independent Brownian motions to $\log(\beta^{11}_{t}), \log(\beta^{22}_{t}), \ldots, \log(\beta^{AA}_{t})$ with respective age-stratified volatility parameters $\sigma_{x,\alpha}, \alpha \in \{1,\ldots,A\}$, for reasons of parsimony and interpretability.
The contact matrix is scaled by the age-stratified transmissibility in order to obtain the transmission rate matrix process
$$
m_{\alpha,\alpha'}(t) = \beta^{\alpha\alpha'}_{t}
\cdot
C_{\alpha,\alpha'}
\equiv
\beta^{\alpha\alpha}_{t}
\cdot
C_{\alpha,\alpha'}
$$
under the assumption $\beta^{\alpha\alpha'}_t \equiv \beta^{\alpha\alpha}_t, \alpha \neq \alpha'$.
### Observation process
Denote the number of observed deaths on day $t = 1, \ldots, T$ in age group $\alpha \in \{1,\ldots,A\}$ by $y_{t,\alpha}$.
A given infection may lead to observation events (i.e deaths) in the future.
A link between $y_{t,\alpha}$ and the expected number of new age-stratified infections is established via the function
$$
d_{t,\alpha} = \mathbb{E}[y_{t,\alpha}] = \widehat{\text{IFR}}_{\alpha} \times \sum_{s = 1}^{t-1}h_{t-s} \Delta^{\text{infec}}_{s, \alpha}
$$
on the new expected age-stratified mortality counts, $d_{t,\alpha}$, the estimated age-stratified infection fatality rate, $\widehat{\text{IFR}}_{\alpha}$, and the infection-to-death distribution $h$, where $h_s$ gives the probability the death occurs on the $s^{th}$ day after infection, is assumed to be gamma distributed with mean 24.2 days and coefficient of variation 0.39, that is
$$
h \sim \operatorname{Gamma}(6.29, 0.26).
$$
We allow for over-dispersion in the observation processes to account for noise in the underlying data streams, for example due to day-of-week effects on data collection, and link $d_{t,\alpha}$ to $y_{t,\alpha}$ through an over-dispersed count model
$$
y_{t,\alpha}\mid \theta \sim \operatorname{NegBin}\left(d_{t,\alpha}, \xi_{t,\alpha}\right),
$$
where $\xi_{t,\alpha} = \frac{d_{t,\alpha}}{\phi}$, such that $\mathbb{V}[y_{t,\alpha}] = d_{t,\alpha}(1+\phi)$.
The log-likelihood of the observed deaths is given by
$$
\ell^{Deaths}(y\mid \theta) = \sum_{t=1}^{T}\sum_{\alpha=1}^{A}\text{logNegBin}\left(y_{t,\alpha}\mid d_{t,\alpha}, \xi_{t,\alpha}\right),
$$
where $y \in \mathbb{R}^{T \times A}_{0,+}$ are the surveillance data on deaths for all time-points.
### Prior specification
The unknown quantities that need to be inferred are $\theta= (x^{\alpha,\alpha}_{0}, x^{\alpha,\alpha}_{1}, C_{\alpha,\alpha'}, \sigma_{x,\alpha}, \rho, \phi)$ for $(\alpha,\alpha') \in \{1,\ldots,A\}^2$.
$\rho$ expresses the initial proportion of exposed (at time $t_0$).
$L^{synth}$ is the Cholesky factor of the observed contact matrix and $L$ is the Cholesky factor of the random contact matrix $C$, see [1].
The prior distributions for the model parameters that are considered in this example are:
$$
\begin{aligned}
\rho& \sim \operatorname{Beta}(\mathbb{E}[\rho] = 0.1, \mathbb{V}[\rho] = 0.05\cdot\mathbb{E}[\rho])
\\
\tilde{L}_{i,j} & \sim \operatorname{Normal}(0,1),~(i,j) \in \{1, \ldots, A\}^2, i \leq j
\\
L_{i,j} & = L^{synth}_{i,j} + (0.05\cdot L^{synth}_{i,j})\cdot \tilde{L}_{i,j},~(i,j) \in \{1, \ldots, A\}^2, i \leq j
\\
x^{\alpha,\alpha}_{0} & \sim \operatorname{N}(0,1)
\\
x^{\alpha,\alpha}_{1} & \sim \operatorname{N}(0,1)
\\
\sigma_{x,\alpha} & \sim \operatorname{N}_+(0,1)
\\
\phi & \sim \operatorname{Exp}(0.2).
\end{aligned}
$$
## Model fitting
Find the number of cores in your machine and indicate how many will be used for parallel processing:
```{r, eval=F, echo=T}
parallel::detectCores()
rstan_options(auto_write = TRUE)
# Here we sample from six Markov chains in parallel:
chains <- 6
options(mc.cores = chains)
```
The current functionality of the package allows the end-user to define the number of changes of the effective contact rate during the course of 7 days.
Modeling the time evolution of the effective contact rate at a more granular time-scale (i.e. at the time-points
of the observations) would require setting ```ecr_changes = 1``` in the main function ```stan_igbm```. In an effort to reduce the cost of the computational effort, we allow for changes every 7 days, setting ```ecr_changes = 7```.
The following is an optional step. We perform maximization of the joint posterior distribution. The resulting point estimates are, then, used to define the initial points that will be fed to the MCMC sampler.
```{r, eval=F, echo=T}
igbm_fit_init <- stan_igbm(y_data = age_specific_mortality_counts,
contact_matrix = aggr_cm,
age_distribution_population = aggr_age,
age_specific_ifr = aggr_age_ifr[[3]],
itd_distr = ditd,
incubation_period = 3,
infectious_period = 4,
likelihood_variance_type = "linear",
ecr_changes = 7,
prior_scale_x0 = 1,
prior_scale_x1 = 1,
prior_scale_contactmatrix = 0.05,
pi_perc = 0.1,
prior_volatility = normal(location = 0, scale = 1),
prior_nb_dispersion = exponential(rate = 1/5),
algorithm_inference = "optimizing",
seed = 1,
hessian = TRUE)
```
```{r, eval=F, echo=T}
sampler_init <- function(){
list(x0 = igbm_fit_init$par[names(igbm_fit_init$par) %in% "x0"],
x_init = igbm_fit_init$par[grepl("x_init", names(igbm_fit_init$par), fixed = TRUE)],
x_noise = igbm_fit_init$par[grepl("x_noise[", names(igbm_fit_init$par), fixed = TRUE)],
L_raw = igbm_fit_init$par[grepl("L_raw[", names(igbm_fit_init$par), fixed = TRUE)],
pi = igbm_fit_init$par[names(igbm_fit_init$par) %in% "pi"],
volatilities= igbm_fit_init$par[grepl("volatilities", names(igbm_fit_init$par), fixed = TRUE)],
phiD = igbm_fit_init$par[names(igbm_fit_init$par) %in% "phiD"]
)
}
```
We sample from the posterior distribution of the parameters using the Hamiltonian Monte Carlo algorithm implemented in Stan - Note that a longer time horizon and/or a greater number of age groups in ```y_data``` will lead to increased CPU time:
```{r, eval=F, echo=T}
igbm_fit <- stan_igbm(y_data = age_specific_mortality_counts,
contact_matrix = aggr_cm,
age_distribution_population = aggr_age,
age_specific_ifr = aggr_age_ifr[[3]],
itd_distr = ditd,
incubation_period = 3,
infectious_period = 4,
likelihood_variance_type = "linear",
ecr_changes = 7,
prior_scale_x0 = 1,
prior_scale_x1 = 1,
prior_scale_contactmatrix = 0.05,
pi_perc = 0.1,
prior_volatility = normal(location = 0, scale = 1),
prior_nb_dispersion = exponential(rate = 1/5),
algorithm_inference = "sampling",
nBurn = 500,
nPost = 500,
nThin = 1,
chains = chains,
adapt_delta = 0.80,
max_treedepth = 19,
seed = 1,
init = sampler_init)
```
```{r, eval=T, echo=T}
# Import the saved object:
giturl <- "https://github.com/bernadette-eu/Bernadette/blob/master/Example_files/Experiment_v112.RData?raw=true"
load(url(giturl))
```
The implementation of MCMC under this setting took about 5.4 hours:
```{r, eval=T, echo=T}
time_run <- get_elapsed_time(igbm_fit)
apply(time_run, 1, sum)
```
## Summarise the distributions of estimated parameters and derived quantities using the posterior draws.
```{r, eval=T, echo=T}
print_summary <- summary(object = igbm_fit,
y_data = age_specific_mortality_counts)
round(print_summary$summary, 3)
```
Standard functions from the **rstan** package can be used for further assessment of the model:
```{r, eval=T, echo=T}
# Check for divergent transitions:
rstan::check_divergences(igbm_fit)
```
```{r, eval=T, echo=T}
# Prints the number (and percentage) of iterations that saturated the max treedepth:
rstan::check_treedepth(igbm_fit)
```
## MCMC diagnostic plots
Example - Pairs plots between some parameters:
```{r, eval=T, echo=T}
volatilities_names <- paste0("volatilities","[", 1:ncol(age_specific_mortality_counts[,-c(1:5)]),"]")
posterior_1 <- as.array(igbm_fit)
```
```{r, eval=T, echo=T}
bayesplot::mcmc_pairs(posterior_1,
pars = volatilities_names,
off_diag_args = list(size = 1.5))
```
## Plot the posterior estimates of epidemiological quantities
**Visualize the posterior distribution of the random contact matrix:**
```{r, eval=T, echo=T}
plot_posterior_cm(igbm_fit, y_data = age_specific_mortality_counts)
```
**Visualize the posterior distribution of the age-specific transmission rate:**
```{r, eval=T, echo=T}
post_transmrate_summary <- posterior_transmrate(object = igbm_fit,
y_data = age_specific_mortality_counts)
plot_posterior_transmrate(post_transmrate_summary)
```
The piecewise-constant behavior in the posterior estimates of the age-specific transmission rate is caused by the assumption that the effective contact rate is allowed to change every 7 days. The respective posterior trajectories will be smoothed when more frequent changes are allowed.
**Visualize the posterior distribution of the effective reproduction number:**
```{r, eval=T, echo=T}
post_rt_summary <- posterior_rt(object = igbm_fit,
y_data = age_specific_mortality_counts,
age_distribution_population = aggr_age,
infectious_period = 4)
plot_posterior_rt(post_rt_summary)
```
**Visualize the posterior distribution (age-specific or aggregated daily estimates) of the infection counts:**
```{r, eval=T, echo=T}
post_inf_summary <- posterior_infections(object = igbm_fit,
y_data = age_specific_mortality_counts)
```
```{r, eval=T, echo=T}
plot_posterior_infections(post_inf_summary, type = "age-specific")
```
```{r, eval=T, echo=T}
plot_posterior_infections(post_inf_summary, type = "aggregated")
```
**Visualize the posterior distribution (age-specific or aggregated daily estimates) of the mortality counts:**
```{r, eval=T, echo=T}
post_mortality_summary <- posterior_mortality(object = igbm_fit,
y_data = age_specific_mortality_counts)
```
```{r, eval=T, echo=T}
plot_posterior_mortality(post_mortality_summary, type = "age-specific")
```
```{r, eval=T, echo=T}
plot_posterior_mortality(post_mortality_summary, type = "aggregated")
```
## Model checking and comparison with loo
For more information about the **loo** package visit the [dedicated website](https://mc-stan.org/loo/).
```{r, eval=T, echo=T}
# (for bigger models use as many cores as possible)
log_lik_1 <- loo::extract_log_lik(igbm_fit, merge_chains = FALSE)
r_eff_1 <- loo::relative_eff(exp(log_lik_1), cores = 4)
```
The estimated LOO Information Criterion and the respective estimated effective number of parameters are given by
```{r, eval=T, echo=T}
loo_1 <- loo::loo(log_lik_1, r_eff = r_eff_1, cores = 4)
print(loo_1)
```
The estimated Widely Applicable Information Criterion and the respective estimated effective number of parameters are given by
```{r, eval=T, echo=T}
waic_1 <- loo::waic(log_lik_1)
```
```{r, eval=T, echo=T}
print(waic_1)
```
## References
[1] Bouranis, L., Demiris, N., Kalogeropoulos, K., & Ntzoufras, I. (2022). _Bayesian analysis of diffusion-driven multi-type epidemic models with application to COVID-19_. arXiv.
## Original Computing Environment
```{r, eval=F, echo=T}
writeLines(readLines(file.path(Sys.getenv("HOME"), ".R/Makevars")))
```
```{r, eval=F, echo=T}
CXX14FLAGS=-O3 -Wno-unused-variable -Wno-unused-function
CXX14 = $(BINPREF)g++ -m$(WIN) -std=c++1y
CXX11FLAGS=-O3 -Wno-unused-variable -Wno-unused-function
```
```{r, eval=F, echo=T}
sessionInfo()
```
```{r, eval=F, echo=T}
R version 4.1.0 (2021-05-18)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 19045)
Matrix products: default
locale:
[1] LC_COLLATE=English_United States.1252 LC_CTYPE=English_United States.1252 LC_MONETARY=English_United States.1252
[4] LC_NUMERIC=C LC_TIME=English_United States.1252
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] loo_2.4.1 Bernadette_1.1.4 bayesplot_1.8.1 rstan_2.21.3 StanHeaders_2.21.0-7
[6] ggplot2_3.3.5
loaded via a namespace (and not attached):
[1] tidyselect_1.1.1 purrr_0.3.4 reshape2_1.4.4 colorspace_2.0-2 vctrs_0.3.8 generics_0.1.0
[7] stats4_4.1.0 utf8_1.2.1 rlang_1.1.1 pkgbuild_1.2.0 pillar_1.6.3 glue_1.4.2
[13] withr_2.4.2 DBI_1.1.1 matrixStats_0.59.0 lifecycle_1.0.1 plyr_1.8.6 stringr_1.4.0
[19] munsell_0.5.0 gtable_0.3.0 codetools_0.2-18 labeling_0.4.2 inline_0.3.19 callr_3.7.0
[25] ps_1.6.0 parallel_4.1.0 fansi_0.5.0 rstantools_2.1.1 Rcpp_1.0.10 scales_1.1.1
[31] backports_1.2.1 checkmate_2.0.0 RcppParallel_5.1.4 farver_2.1.0 gridExtra_2.3 digest_0.6.28
[37] stringi_1.6.2 processx_3.5.2 dplyr_1.0.7 grid_4.1.0 cli_3.6.1 tools_4.1.0
[43] magrittr_2.0.1 tibble_3.1.2 crayon_1.4.1 pkgconfig_2.0.3 ellipsis_0.3.2 prettyunits_1.1.1
[49] ggridges_0.5.3 assertthat_0.2.1 rstudioapi_0.13 R6_2.5.1 compiler_4.1.0
```
Owner
- Name: bernadette-eu
- Login: bernadette-eu
- Kind: user
- Location: Athens, Greece
- Company: Athens University of Economics and Business
- Website: https://bernadette-eu.github.io/
- Twitter: BernadetteMsca
- Repositories: 4
- Profile: https://github.com/bernadette-eu
JOSS Publication
Bernadette: Bayesian Inference and Model Selection for Stochastic Epidemics in R
Published
September 26, 2023
Volume 8, Issue 89, Page 5612
Authors
Tags
Bayesian EpidemicsGitHub Events
Total
- Watch event: 1
- Push event: 3
Last Year
- Watch event: 1
- Push event: 3
Committers
Last synced: 5 months ago
Top Committers
| Name | Commits | |
|---|---|---|
| bernadette-eu | b****b@g****m | 208 |
| Jeffrey Pullin | j****n@g****m | 3 |
| Hugo Gruson | B****o | 2 |
| Lampros Bouranis | l****s@u****e | 2 |
| Arfon Smith | a****n | 1 |
| Andrew Johnson | a****n@a****m | 1 |
Committer Domains (Top 20 + Academic)
arjohnsonau.com: 1
ucdconnect.ie: 1
Issues and Pull Requests
Last synced: 4 months ago
All Time
- Total issues: 9
- Total pull requests: 5
- Average time to close issues: 1 day
- Average time to close pull requests: 24 days
- Total issue authors: 3
- Total pull request authors: 5
- Average comments per issue: 1.0
- Average comments per pull request: 1.2
- Merged pull requests: 5
- Bot issues: 0
- Bot pull requests: 0
Past Year
- Issues: 0
- Pull requests: 0
- Average time to close issues: N/A
- Average time to close pull requests: N/A
- Issue authors: 0
- Pull request authors: 0
- Average comments per issue: 0
- Average comments per pull request: 0
- Merged pull requests: 0
- Bot issues: 0
- Bot pull requests: 0
Top Authors
Issue Authors
- rowlandseymour (5)
- bernadette-eu (2)
- strengejacke (2)
Pull Request Authors
- Bisaloo (2)
- jeffreypullin (1)
- arfon (1)
- bernadette-eu (1)
- andrjohns (1)
Top Labels
Issue Labels
enhancement (1)
bug (1)
Pull Request Labels
Dependencies
DESCRIPTION
cran
- R >= 4.1 depends
- Rcpp >= 0.12.0 imports
- RcppParallel >= 5.0.1 imports
- dplyr >= 1.0.7 imports
- ggplot2 >= 3.3.5 imports
- lubridate >= 1.7.0 imports
- methods * imports
- readr >= 1.4.0 imports
- rstan >= 2.18.1 imports
- rstantools >= 2.1.1 imports
- scales >= 1.1.1 imports
- scoringRules >= 1.0.0 imports
- tibble >= 3.1.2 imports
- tidyr >= 1.1.3 imports
- utils >= 4.1.0 imports
- covr * suggests
- knitr * suggests
- markdown * suggests
- rmarkdown * suggests
- spelling * suggests
- testthat >= 3.0.0 suggests
.github/workflows/R-CMD-check.yaml
actions
- actions/checkout v2 composite
- r-lib/actions/setup-pandoc v2 composite
- r-lib/actions/setup-r v2 composite
- r-lib/actions/setup-tinytex v2 composite
.github/workflows/draft-pdf.yml
actions
- actions/checkout v3 composite
- actions/upload-artifact v1 composite
- openjournals/openjournals-draft-action master composite
