This notebook contains several examples of how to use Stan in R with brms. It assumes basic knowledge of Bayesian inference and MCMC. The examples are related to the Bayesian data analysis course.
Load packages
library(tidyr)
library(dplyr)
library(tibble)
library(pillar)
library(stringr)
library(brms)
options(brms.backend = "cmdstanr", mc.cores = 2)
library(posterior)
options(posterior.num_args = list(digits = 2))
options(pillar.negative = FALSE)
library(loo)
library(priorsense)
#options(priorsense.use_plot_theme=FALSE)
library(ggplot2)
library(bayesplot)
theme_set(bayesplot::theme_default(base_family = "sans", base_size = 14))
library(tidybayes)
library(ggdist)
library(patchwork)
library(RColorBrewer)
library(tinytable)
options(tinytable_format_num_fmt = "significant_cell",
        tinytable_format_digits = 2,
        tinytable_tt_digits = 2)
SEED <- 48927 # set random seed for reproducibility
2 Bernoulli model
Toy data with a sequence of failures (0) and successes (1). We would like to learn about the unknown probability of success. brms wants the data in data frame format (a tibble, which is a variant of a data frame, can be used, too).
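The construction of data_bern is not shown in this excerpt; a minimal sketch consistent with the binomial summary later in the notebook (10 trials, 7 successes) would be:

# Hypothetical reconstruction: any ordering with 7 successes out of 10 works
data_bern <- data.frame(y = c(1, 1, 1, 0, 1, 1, 1, 0, 1, 0))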
As usual in the case of generalized linear models (GLMs), brms defines the priors on the latent model parameters. With the Bernoulli family the default link function is logit, and thus the prior is set on logit(theta). As there are no covariates, logit(theta) = Intercept. The brms default prior for the Intercept is student_t(3, 0, 2.5), but we use student_t(7, 0, 1.5), which is close to the logistic distribution and thus makes the prior near-uniform for theta. We can simulate from these priors to check the implied prior on theta, and we then compare the result to using a normal(0, 1.5) prior on the logit probability. We visualize the implied priors by sampling from the priors.
plogis is the cumulative distribution function of the logistic distribution, which is equal to the inverse-logit transformation. Base R's Student's t-distribution functions do not have mu and sigma arguments, and thus we use a function from the ggdist package, as sketched below.
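The prior-simulation code itself does not survive in this excerpt; a sketch using ggdist::rstudent_t() (which does accept mu and sigma), covering both the brms default and our chosen prior, would be:

# Prior implied on theta by the brms default student_t(3, 0, 2.5)
data.frame(theta = plogis(ggdist::rstudent_t(n=20000, df=3, mu=0, sigma=2.5))) |>
  mcmc_hist() +
  xlim(c(0,1)) +
  labs(title='student_t(3, 0, 2.5) for Intercept')

# Prior implied on theta by our student_t(7, 0, 1.5)
data.frame(theta = plogis(ggdist::rstudent_t(n=20000, df=7, mu=0, sigma=1.5))) |>
  mcmc_hist() +
  xlim(c(0,1)) +
  labs(title='student_t(7, 0, 1.5) for Intercept')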
An almost uniform prior on theta could also be obtained with normal(0, 1.5):
data.frame(theta = plogis(rnorm(n=20000, mean=0, sd=1.5))) |>
  mcmc_hist() +
  xlim(c(0,1)) +
  labs(title='normal(0, 1.5) for Intercept')
Formula y ~ 1 corresponds to the model \mathrm{logit}(\theta) = \alpha \times 1 = \alpha. brms denotes \alpha as Intercept.
fit_bern <- brm(y ~ 1, family = bernoulli(), data = data_bern,
                prior = prior(student_t(7, 0, 1.5), class='Intercept'),
                seed = SEED, refresh = 0)
Check the summary of the posterior and inference diagnostics.
fit_bern
Family: bernoulli
Links: mu = logit
Formula: y ~ 1
Data: data_bern (Number of observations: 10)
Draws: 4 chains, each with iter = 2000; warmup = 1000; thin = 1;
total post-warmup draws = 4000
Regression Coefficients:
Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
Intercept 0.76 0.64 -0.45 2.05 1.00 1602 1574
Draws were sampled using sample(hmc). For each parameter, Bulk_ESS
and Tail_ESS are effective sample size measures, and Rhat is the potential
scale reduction factor on split chains (at convergence, Rhat = 1).
Extract the posterior draws in data frame (df) format
draws <- as_draws_df(fit_bern)
We can select a subset of the stored variables with subset_draws() and get summary information using summarise_draws().
draws |> subset_draws(variable='b_Intercept') |> summarise_draws()
We get a prettier table by piping the summary to tt() from the tinytable package; the options set at the beginning of the notebook make the table show only 2 significant digits for each value, which is sufficient accuracy, and having fewer digits makes the table easier to read.
draws |> subset_draws(variable='b_Intercept') |> summarise_draws() |> tt()
variable      mean  median    sd   mad     q5  q95  rhat  ess_bulk  ess_tail
b_Intercept   0.76    0.73  0.64  0.62  -0.27  1.9     1      1602      1574
We can compute the probability of success by using plogis, which is equal to the inverse-logit function.
draws <- draws |> mutate_variables(theta = plogis(b_Intercept))
Summary of theta
draws |> subset_draws(variable='theta') |> summarise_draws() |> tt()
The prior and likelihood sensitivity plot shows the posterior density estimate as a function of the amount of power-scaling. Overlapping lines indicate low sensitivity, and wider gaps between lines indicate greater sensitivity.
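The plotting code is not included in this excerpt; a minimal sketch with priorsense (assuming powerscale_plot_dens() with its default power-scaling settings) would be:

# Power-scale the prior and likelihood and plot the resulting
# posterior density estimates for the latent intercept
powerscale_plot_dens(fit_bern, variable = 'b_Intercept')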
Instead of a sequence of 0's and 1's, we can summarize the data with the number of trials and the number of successes, and use a binomial model. The prior is still specified in the 'latent space'; the actual probability of success is theta = plogis(alpha), where plogis is the inverse of the logit function.
Binomial model with the same data and prior
data_bin <- data.frame(N = c(10), y = c(7))
Formula y | trials(N) ~ 1 corresponds to the model \mathrm{logit}(\theta) = \alpha, and the number of trials for each observation is provided by | trials(N).
fit_bin <- brm(y | trials(N) ~ 1, family = binomial(), data = data_bin,
               prior = prior(student_t(7, 0, 1.5), class='Intercept'),
               seed = SEED, refresh = 0)
Check the summary of the posterior and inference diagnostics.
fit_bin
Family: binomial
Links: mu = logit
Formula: y | trials(N) ~ 1
Data: data_bin (Number of observations: 1)
Draws: 4 chains, each with iter = 2000; warmup = 1000; thin = 1;
total post-warmup draws = 4000
Regression Coefficients:
Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
Intercept 0.75 0.63 -0.44 2.05 1.00 1330 1429
Draws were sampled using sample(hmc). For each parameter, Bulk_ESS
and Tail_ESS are effective sample size measures, and Rhat is the potential
scale reduction factor on split chains (at convergence, Rhat = 1).
Extract the posterior draws in data frame format
draws <- as_draws_df(fit_bin)
Summary of latent intercept
draws |> subset_draws(variable='b_Intercept') |> summarise_draws() |> tt()
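As with the Bernoulli model, the implied probability of success can then be computed from the latent intercept, mirroring the earlier mutate_variables() step:

# Transform the latent intercept to the probability scale
draws <- draws |> mutate_variables(theta = plogis(b_Intercept))
draws |> subset_draws(variable='theta') |> summarise_draws() |> tt()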