1 Overview

This is the online appendix of the paper

  • Aki Vehtari, Andrew Gelman, Daniel Simpson, Bob Carpenter, Paul-Christian Bürkner (2019): Rank-normalization, folding, and localization: An improved \(\widehat{R}\) for assessing convergence of MCMC. arXiv preprint arXiv:1903.08008.

The code for the paper is available on Github (https://github.com/avehtari/rhat_ess). Here, we introduce all the code related to the examples presented in the paper and more numerical experiments not discussed in the paper itself.

To help you finding your way through all the examples presented in this online appendix, below please find a list of links to the examples discussed in the paper:

2 Examples

In this section, we will go through some examples to demonstrate the usefulness of our proposed methods as well as the associated workflow in determining convergence. Appendices B-E contain more detailed analysis of different algorithm variants and further examples.

First, we load all the necessary R packages and additional functions.

library(tidyverse)
library(gridExtra)
library(rstan)
options(mc.cores = parallel::detectCores())
rstan_options(auto_write = TRUE)
library(bayesplot)
theme_set(bayesplot::theme_default(base_family = "sans"))
library(rjags)
library(abind)
source('monitornew.R')
source('monitorplot.R')

2.1 Cauchy: A distribution with infinite mean and variance

This section relates to the examples presented in Section 5.1 of the paper.

Traditional \(\widehat{R}\) is based on calculating within and between chain variances. If the marginal distribution of a chain is such that the variance is infinite, this approach is not well justified, as we demonstrate here with a Cauchy-distributed example. The following Cauchy models are from Michael Betancourt’s case study Fitting The Cauchy Distribution. Appendix C contains more detailed analysis of different algorithm variants and further Cauchy examples.

2.1.1 Nominal parameterization of Cauchy

The nominal Cauchy model with direct parameterization is as follows:

writeLines(readLines("cauchy_nom.stan"))
parameters {
  vector[50] x;
}

model {
  x ~ cauchy(0, 1);
}

generated quantities {
  real I = fabs(x[1]) < 1 ? 1 : 0;
}

2.1.1.1 Default Stan options

Run the nominal model:

fit_nom <- stan(file = 'cauchy_nom.stan', seed = 7878, refresh = 0)
Warning: There were 1421 transitions after warmup that exceeded the maximum treedepth. Increase max_treedepth above 10. See
http://mc-stan.org/misc/warnings.html#maximum-treedepth-exceeded
Warning: Examine the pairs() plot to diagnose sampling problems

We get HMC specific diagnostic (Betancourt, 2017) warnings about a very large number of transitions after warmup that exceed the maximum treedepth and low estimated Bayesian fraction of missing information, indicating slow mixing likely due to very long tails of the Cauchy distribution.

mon <- monitor(fit_nom)
print(mon)
Inference for the input samples (4 chains: each with iter = 2000; warmup = 1000):

          Q5    Q50    Q95   Mean    SD  Rhat Bulk_ESS Tail_ESS
x[1]   -5.87   0.02   6.78   0.92 13.64  1.01     1972      356
x[2]   -5.76  -0.03   5.11  -0.27  7.53  1.00     2411      797
x[3]   -6.12   0.00   7.73   0.77 10.28  1.01      440      112
x[4]   -5.30  -0.01   5.41  -0.06  7.19  1.01     2263      691
x[5]  -13.14  -0.04   5.83  -2.78 17.58  1.03      167       44
x[6]   -8.10  -0.06   6.39  -0.92 15.59  1.01     1603      398
x[7]   -7.25  -0.08   7.55  -0.29  9.68  1.01     1220      457
x[8]   -4.79   0.02   6.23   0.57  8.65  1.02     1559      428
x[9]   -4.69   0.01   4.82   0.13  5.43  1.00     2603      778
x[10]  -8.43  -0.04   6.50  -1.92 32.96  1.01     1599      257
x[11]  -6.21   0.02   7.09   0.43 16.21  1.01     2458      469
x[12]  -5.55   0.02   6.62   0.17  6.37  1.00     2830      398
x[13]  -6.16   0.00   6.13   0.44 12.78  1.01     2277      561
x[14]  -5.45   0.02   6.79   0.29  8.44  1.01     3118      642
x[15]  -8.17   0.05   9.55   0.26 30.61  1.01     2121      517
x[16]  -5.83   0.00   6.23   0.31  8.60  1.01     2171      583
x[17] -18.10  -0.07   7.39  -6.89 47.38  1.03      207       65
x[18]  -5.89   0.05   5.90   0.10  5.42  1.01     3409      660
x[19]  -7.04   0.03   6.23  -0.23 12.26  1.01     1422      416
x[20]  -6.58   0.03   7.53  -0.19  7.81  1.02      947      441
x[21]  -8.11   0.01   6.57  -2.16 28.47  1.00     1785      391
x[22]  -5.47   0.04   5.24   0.25  8.30  1.00     2836      706
x[23] -10.04  -0.05   6.55 -11.16 87.97  1.04      314       68
x[24] -13.18  -0.07   5.72  -4.58 30.89  1.01      431       95
x[25]  -6.89  -0.02   5.60  -0.44  8.76  1.02     1888      500
x[26]  -5.33  -0.02   4.69  -0.30  6.55  1.01     3095      727
x[27]  -5.19  -0.04   5.17  -0.26  8.50  1.00     3113      868
x[28]  -6.81  -0.02   6.81   0.15  7.71  1.00     4186      919
x[29]  -9.96  -0.02   6.99  -1.66 18.87  1.01      926      219
x[30]  -6.20   0.02   6.62   0.44 14.44  1.01     3500      552
x[31] -18.84  -0.04   5.77  -4.22 29.48  1.02      208       48
x[32]  -6.33  -0.01   6.21  -0.02  6.73  1.02     3044      743
x[33]  -4.81   0.03   4.93   0.22  7.89  1.01     3504     1003
x[34]  -5.75   0.02   5.75  -0.12  9.23  1.00     1889      656
x[35]  -5.98   0.03   7.16   0.06  8.68  1.01     2701      634
x[36]  -7.37   0.00   9.19   0.08 10.82  1.01     1404      394
x[37]  -5.03   0.02   7.82   2.52 29.21  1.01     2020      353
x[38]  -5.40  -0.01   5.59  -0.06  6.64  1.01     3644      738
x[39]  -4.78  -0.02   5.36   0.27  6.40  1.00     2849      752
x[40]  -6.16   0.00   4.58  -0.91 10.45  1.01     1843      452
x[41]  -5.46   0.07   6.55   0.05  6.06  1.01     2322      616
x[42]  -7.63  -0.05   6.38  -0.75 10.10  1.01     1786      510
x[43]  -7.47  -0.04   6.39  -0.59  8.32  1.01     1871      555
x[44]  -5.22   0.00   5.17   0.11  6.15  1.01     1613      569
x[45]  -4.44   0.04  13.98   3.19 17.67  1.01      295       70
x[46]  -5.04   0.01   5.60   0.22  9.06  1.02     3151      432
x[47]  -5.87   0.03   5.46   0.19 11.73  1.02     2247      495
x[48]  -4.99  -0.03   4.91   0.05 10.22  1.01     2914      718
x[49]  -6.42   0.01  10.78  10.69 83.66  1.02      223       51
x[50]  -6.24   0.02   5.71   0.41 19.28  1.00     2610      783
I       0.00   1.00   1.00   0.51  0.50  1.00      683     4000
lp__  -88.54 -68.47 -50.77 -69.10 11.75  1.01      249      449

For each parameter, Bulk_ESS and Tail_ESS are crude measures of 
effective sample size for bulk and tail quantities respectively (good values is 
ESS > 400), and Rhat is the potential scale reduction factor on rank normalized
split chains (at convergence, Rhat = 1).
which_min_ess <- which.min(mon[1:50, 'Tail_ESS'])

Several values of Rhat greater than 1.01 and some ESS less than 400 also indicate convergence propblems. The Appendix C contains more results with longer chains.

We can further analyze potential problems using local efficiency and rank plots. We specifically investigate x[5], which, in this specific run, had the smallest tail-ESS of 44.

We examine the sampling efficiency in different parts of the posterior by computing the efficiency of small interval probability estimates (see Section Efficiency estimate for small interval probability estimates). Each interval contains \(1/k\) of the draws (e.g., \(5\%\) each, if \(k=20\)). The small interval efficiency measures mixing of an function which indicates when the values are inside or outside the specific small interval. As detailed above, this gives us a local efficiency measure which does not depend on the shape of the distribution.

plot_local_ess(fit = fit_nom, par = which_min_ess, nalpha = 20)

The efficiency of sampling is low in the tails, which is clearly caused by slow mixing in long tails of the Cauchy distribution.
Orange ticks show iterations that exceeded the maximum treedepth.

An alternative way to examine the efficiency in different parts of the posterior is to compute efficiency estimates for quantiles (see Section Efficiency for quantiles). Each interval has a specified proportion of draws, and the efficiency measures mixing of a function which indicates when the values are smaller than or equal to the corresponding quantile.

plot_quantile_ess(fit = fit_nom, par = which_min_ess, nalpha = 40)

Similar as above, we see that the efficiency of sampling is low in the tails. Orange ticks show iterations that exceeded the maximum treedepth.

We may also investigate how the estimated effective sample sizes change when we use more and more draws; Brooks and Gelman (1998) proposed to use similar graph for \(\widehat{R}\). If the effective sample size is highly unstable, does not increase proportionally with more draws, or even decreases, this indicates that simply running longer chains will likely not solve the convergence issues. In the plot below, we see how unstable both bulk-ESS and tail-ESS are for this example.

plot_change_ess(fit = fit_nom, par = which_min_ess)