A workflow for deciding how many digits to report when summarizing the posterior distribution, and how to check how many independent Monte Carlo (MC) draws or dependent Markov chain Monte Carlo (MCMC) draws are needed.

Introduction

When running any iterative algorithm we need to decide when to stop or how many iterations and how many sequences to run. In the context of posterior inference, the answer can be divided in two parts: (1) how many chains and iterations do we need to assess that the algorithm is sampling from something close to the target distribution, and (2) how many draws do we need so that the reported posterior summaries would not change in important ways if the inference would be repeated? This notebook discusses the second part.

Before we can answer how many chains and iterations we need to run, we need to know how many significant digits we want to report. Too often, we see tables filled with numbers like \(1.7705\). It’s very unlikely that all these digits are accurately estimated, and also very unlikely that the accuracy in magnitude of one in ten thousand would be needed for any practical purpose. Reporting too many digits makes it more difficult to read summary tables. Thus, before considering how many iterations we need it’s useful to consider how many digits it is sensible to report.

Without any additional information, we may assume that relative errors smaller than 1% are typically negligible and thus most of the time reporting at most 2 significant digits would be sufficient. Accuracy to just one significant digit can be sufficient in the early stages of analysis and can be sensible and convenient even in final reporting, as discussed in the example below.

MCMC and other Monte Carlo methods are stochastic and if the inference would be repeated with a different random seed (random number generators in computers are usually producing deterministic pseudo-random sequences), the estimates would vary. The amount of variation reduces when more iterations are used. For example, if the posterior would be close to normal with standard deviation 1, \(\mathrm{normal}(\mu, 1)\), then 2000 independent draws from the posterior would provide enough accuracy that the second significant digit of posterior mean would only sometimes vary to one smaller or larger value. On the other hand, for 1 significant digit accuracy, 100 independent draws would be often sufficient, but reliable convergence diagnostics may need more iterations than 100 (see, e.g., Vehtari et al. (2021)). The above thumb-rules are useful for the necessary number of independent draws for a posterior mean of a distribution which is relatively close to normal and in other cases and, for exmaple, for posterior quantiles more draws may be needed.

MCMC in general doesn’t produce independent draws and the effect of dependency affects how many draws are needed to estimate different expectations. As in general, we don’t know beforhand how MCMC will perform for a new posterior, and we don’t know what is the scale of that posterior beforehand, we need to start with some initial guess of number of iterations to run.

How many iterations to run and how many digits to report

Summary of workflow for how many digits to report

  1. Run inference with some default number of iterations
  2. Check convergence diagnostics for all parameters
  3. Check that ESS is big enough for reliable convergence diagnostics for all quantities of interest
  4. Look at the posterior for quantities of interest and decide how many significant digits is reasonable taking into account the posterior uncertainty (using SD, MAD, or tail quantiles)
  5. Check that MCSE is small enough for the desired accuracy of reporting the posterior summaries for the quantities of interest.
  • If the accuracy is not sufficient, report less digits or run more iterations.
  • Halving MCSE requires quadrupling the number of iterations (if CLT holds).
  • Different quantities of interest have different MCSE and may require different number of iterations for the desired accuracy.
  • Some quantities of interest may have posterior distribution with infinite variance, and then the ESS and MCSE are not defined for the expectation. In such cases use, for example, median instead of mean and mean absolute deviation (MAD) instead of standard deviation. ESS and MCSE for (non-extreme) quantiles can be derived from the (non-extreme) cumulative probabilities that always have finite mean and variance.

Example

As an example, we analyse the trend in summer months average temperature 1952–2013 at Kilpisjärvi in northwestern Finnish Lapland. Summer months are June, July, and August, and we analyse the average temperature over these months in each year.

Load packages

library("rprojroot")
root<-has_file(".Workflow-Examples-root")$make_fix_file()
library(tidyr) 
library(dplyr) 
library(cmdstanr)
library(posterior)
options(pillar.negative = FALSE)
library(lemon)
library(ggplot2)
library(bayesplot)
theme_set(bayesplot::theme_default(base_family = "sans"))
SEED <- 48927 # set random seed for reproducibility

Kilpisjärvi data and model

Data

Load Kilpisjärvi summer month average temperatures 1952-2013:

data_kilpis <- read.delim(root("Digits/data","kilpisjarvi-summer-temp.csv"), sep = ";")
data_lin <-list(N = nrow(data_kilpis),
             x = data_kilpis$year,
             xpred = 2016,
             y = data_kilpis[,5])

Plot the data

ggplot() +
  geom_point(aes(x, y), data = data.frame(data_lin), size = 1) +
  labs(y = 'Summer temperature\n at Kilpisjärvi', x= "Year")