Load packages
library(tidyr)
library(rstan)
rstan_options(auto_write = TRUE)
options(mc.cores = 1)
library(loo)
library(ggplot2)
library(gridExtra)
library(bayesplot)
theme_set(bayesplot::theme_default(base_family = "sans"))
library(shinystan)
library(rprojroot)
root<-has_file(".BDA_R_demos_root")$make_fix_file()
SEED <- 48927 # set random seed for reproducability
This notebook contains several examples of how to use Stan in R with rstan. This notebook assumes basic knowledge of Bayesian inference and MCMC. The Stan models are stored in separate .stan-files. The examples are related to Bayesian data analysis course.
Note that you can easily analyse Stan fit objects returned by
stan()
with a ShinyStan package by calling
launch_shinystan(fit)
.
Toy data with sequence of failures (0) and successes (1). We would like to learn about the unknown probability of success.
data_bern <- list(N = 10, y = c(1, 1, 1, 0, 1, 1, 1, 0, 1, 0))
Bernoulli model with a proper Beta(1,1) (uniform) prior
code_bern <- root("demos_rstan", "bern.stan")
writeLines(readLines(code_bern))
// Bernoulli model
data {
int<lower=0> N; // number of observations
array[N] int<lower=0, upper=1> y; // vector of binary observations
}
parameters {
real<lower=0, upper=1> theta; // probability of success
}
model {
// model block creates the log density to be sampled
theta ~ beta(1, 1); // prior
y ~ bernoulli(theta); // observation model / likelihood
// the notation using ~ is syntactic sugar for
// target += beta_lpdf(theta | 1, 1); // lpdf for continuous theta
// target += bernoulli_lpmf(y | theta); // lpmf for discrete y
// target is the log density to be sampled
//
// y is an array of integers and
// y ~ bernoulli(theta);
// is equivalent to
// for (i in 1:N) {
// y[i] ~ bernoulli(theta);
// }
// which is equivalent to
// for (i in 1:N) {
// target += bernoulli_lpmf(y[i] | theta);
// }
}
Sample form the posterior and show the summary
fit_bern <- stan(file = code_bern, data = data_bern, seed = SEED)
monitor(fit_bern)
Plot the histogram of the posterior draws
draws <- as.data.frame(fit_bern)
mcmc_hist(draws, pars='theta')
# or with base R
# hist(draws[,'theta'])
Instead of sequence of 0’s and 1’s, we can summarize the data with the number of experiments and the number successes:
data_bin <- list(N = 10, y = 7)
And then we use Binomial model with Beta(1,1) prior for the probability of success.
code_binom <- root("demos_rstan","binom.stan")
writeLines(readLines(code_binom))
// Binomial model with beta(1,1) prior
data {
int<lower=0> N; // number of experiments
int<lower=0> y; // number of successes
}
parameters {
real<lower=0, upper=1> theta; // probability of success in range (0,1)
}
model {
// model block creates the log density to be sampled
theta ~ beta(1, 1); // prior
y ~ binomial(N, theta); // observation model / likelihood
// the notation using ~ is syntactic sugar for
// target += beta_lpdf(theta | 1, 1); // lpdf for continuous theta
// target += binomial_lpmf(y | N, theta); // lpmf for discrete y
// target is the log density to be sampled
}
Sample from the posterior and plot the posterior. The histogram should look similar as in the Bernoulli case.
fit_bin <- stan(file = code_binom, data = data_bin, seed = SEED)
monitor(fit_bin)
draws <- as.data.frame(fit_bin)
mcmc_hist(draws, pars = 'theta')
Re-run the model with a new data. The compiled Stan program is re-used making the re-use faster.
data_bin <- list(N = 100, y = 70)
fit_bin <- stan(file = code_binom, data = data_bin, seed = SEED)
monitor(fit_bin)
draws <- as.data.frame(fit_bin)
mcmc_hist(draws, pars = 'theta')
In the above examples the probability of success \(\theta\) was declared as
real<lower=0,upper=1> theta;
Stan makes automatic transformation of the variable to the unconstrained space using logit transofrmation for interval constrained and log transformation for half constraints.
The following example shows how we can also make an explicit transformation and use binomial_logit function which takes the unconstrained parameter as an argument and uses logit transformation internally. This form can be useful for better numerical stability.
code_binomb <- root("demos_rstan", "binomb.stan")
writeLines(readLines(code_binomb))
// Binomial model with a roughly uniform prior for
// the probability of success (theta)
data {
int<lower=0> N; // number of experiments
int<lower=0> y; // number of successes
}
parameters {
// sampling is done for the parameters
real alpha; // logit of probability of success in rang (-Inf,Inf)
}
transformed parameters {
// transformed parameters are deterministic transformations of parameters (and data)
real theta = inv_logit(alpha); // probability of success in range (0,1)
}
model {
// model block creates the log density to be sampled
alpha ~ normal(0, 1.5); // roughly uniform prior for theta
y ~ binomial_logit(N, alpha); // model parameterized with logit of probability
// the notation using ~ is syntactic sugar for
// target += normal_lpdf(alpha | 0, 1.5); // lpdf for continuous theta
// target += binomial_logit_lpmf(y | N, alpha); // lpmf for discrete y
// target is the log density to be sampled
}
Here we have used Gaussian prior in the unconstrained space, which produces close to uniform prior for theta.
Sample from the posterior and plot the posterior. The histogram should look similar as with the previous models.
data_bin <- list(N = 100, y = 70)
fit_bin <- stan(file = code_binomb, data = data_bin, seed = SEED)
monitor(fit_bin)
draws <- as.data.frame(fit_bin)
mcmc_hist(draws, pars = 'theta')
An experiment was performed to estimate the effect of beta-blockers on mortality of cardiac patients. A group of patients were randomly assigned to treatment and control groups:
Data:
data_bin2 <- list(N1 = 674, y1 = 39, N2 = 680, y2 = 22)
To analyse whether the treatment is useful, we can use Binomial model for both groups and compute odds-ratio:
code_binom2 <- root("demos_rstan", "binom2.stan")
writeLines(readLines(code_binom2))
// Comparison of two groups with Binomial
data {
int<lower=0> N1; // number of experiments in group 1
int<lower=0> y1; // number of deaths in group 1
int<lower=0> N2; // number of experiments in group 2
int<lower=0> y2; // number of deaths in group 2
}
parameters {
real<lower=0, upper=1> theta1; // probability of death in group 1
real<lower=0, upper=1> theta2; // probability of death in group 2
}
model {
// model block creates the log density to be sampled
theta1 ~ beta(1, 1); // prior
theta2 ~ beta(1, 1); // prior
y1 ~ binomial(N1, theta1); // observation model / likelihood
y2 ~ binomial(N2, theta2); // observation model / likelihood
// the notation using ~ is syntactic sugar for
// target += beta_lpdf(theta1 | 1, 1); // lpdf for continuous theta1
// target += beta_lpdf(theta2 | 1, 1); // lpdf for continuous theta2
// target += binomial_lpmf(y1 | N1, theta1); // lpmf for discrete y1
// target += binomial_lpmf(y2 | N2, theta2); // lpmf for discrete y2
// target is the log density to be sampled
}
generated quantities {
// generated quantities are computed after sampling
real oddsratio = (theta2 / (1 - theta2)) / (theta1 / (1 - theta1));
}
Sample from the posterior and plot the posterior
fit_bin2 <- stan(file = code_binom2, data = data_bin2, seed = SEED)
monitor(fit_bin2)
draws <- as.data.frame(fit_bin2)
mcmc_hist(draws, pars = 'oddsratio') +
geom_vline(xintercept = 1) +
scale_x_continuous(breaks = c(seq(0.25,1.5,by=0.25)))
The following file has Kilpisjärvi summer month temperatures 1952-2013:
data_kilpis <- read.delim(root("demos_rstan","kilpisjarvi-summer-temp.csv"), sep = ";")
data_lin <-list(N = nrow(data_kilpis),
x = data_kilpis$year,
xpred = 2016,
y = data_kilpis[,5])
Plot the data
ggplot() +
geom_point(aes(x, y), data = data.frame(data_lin), size = 1) +
labs(y = 'Summer temp. @Kilpisjärvi', x= "Year") +
guides(linetype = "none")
To analyse whether the average summer month temperature is rising, we use a linear model with Gaussian model for the unexplained variation.
The folloing Stan code allows also setting hyperparameter values as data allowing easier way to use different priors in different analyses:
code_lin <- root("demos_rstan", "lin.stan")
writeLines(readLines(code_lin))
// Gaussian linear model with adjustable priors
data {
int<lower=0> N; // number of data points
vector[N] x; // covariate / predictor
vector[N] y; // target
real xpred; // new covariate value to make predictions
real pmualpha; // prior mean for alpha
real psalpha; // prior std for alpha
real pmubeta; // prior mean for beta
real psbeta; // prior std for beta
real pssigma; // prior std for half-normal prior for sigma
}
parameters {
real alpha; // intercept
real beta; // slope
real<lower=0> sigma; // standard deviation is constrained to be positive
}
transformed parameters {
// deterministic transformation of parameters and data
vector[N] mu = alpha + beta * x; // linear model
}
model {
alpha ~ normal(pmualpha, psalpha); // prior
beta ~ normal(pmubeta, psbeta); // prior
sigma ~ normal(0, pssigma); // as sigma is constrained to be positive,
// this is same as half-normal prior
y ~ normal(mu, sigma); // observation model / likelihood
// the notation using ~ is syntactic sugar for
// target += normal_lpdf(alpha | pmualpha, psalpha);
// target += normal_lpdf(beta | pmubeta, psbeta);
// target += normal_lpdf(y | mu, sigma);
// target is the log density to be sampled
}
generated quantities {
// sample from the predictive distribution
real ypred = normal_rng(alpha + beta * xpred, sigma);
// compute log predictive densities to be used for LOO-CV
vector[N] log_lik;
for (i in 1 : N) {
log_lik[i] = normal_lpdf(y[i] | mu[i], sigma);
}
}
Create another list with data and priors
data_lin_priors <- c(list(
pmualpha = mean(unlist(data_kilpis[,5])), # centered
psalpha = 100, # weakly informative
pmubeta = 0, # a priori incr. and decr. as likely
psbeta = (.1--.1)/6, # avg temp prob does does not incr. more than a degree per 10 years
pssigma = 1), # total variation in summer average temperatures is less +-3 degrees
data_lin)
Run Stan
fit_lin <- stan(file = code_lin, data = data_lin_priors, seed = SEED)
Warning: There were 106 transitions after warmup that exceeded the maximum treedepth. Increase max_treedepth above 10. See
https://mc-stan.org/misc/warnings.html#maximum-treedepth-exceeded
Warning: Examine the pairs() plot to diagnose sampling problems
Stan gives a warning: There were {{get_num_max_treedepth(fit_lin)}}
transitions after warmup that exceeded the maximum treedepth. You can
use ShinyStan (launch_shinystan(fit_lin)
) to look at the
treedepth info and joint posterior of alpha and beta, to get a hint for
the reason. ShinyStan helps also checking divergences, energy
diagnostic, ESS’s and Rhats.
Instead of interactive ShinyStan, we can also check the diagnostics as follows
monitor(fit_lin)
Inference for the input samples (4 chains: each with iter = 2000; warmup = 0):
Q5 Q50 Q95 Mean SD Rhat Bulk_ESS Tail_ESS
alpha -53.0 -28.7 -4.1 -28.6 15.2 1.00 921 1221
beta 0.0 0.0 0.0 0.0 0.0 1.00 920 1213
sigma 1.0 1.1 1.3 1.1 0.1 1.00 1382 1316
mu[1] 8.3 8.7 9.2 8.7 0.3 1.00 1201 1815
mu[2] 8.3 8.8 9.2 8.8 0.3 1.00 1219 1802
mu[3] 8.4 8.8 9.2 8.8 0.3 1.00 1238 1755
mu[4] 8.4 8.8 9.2 8.8 0.3 1.00 1259 1829
mu[5] 8.4 8.8 9.2 8.8 0.2 1.00 1282 1838
mu[6] 8.4 8.8 9.2 8.8 0.2 1.00 1309 1874
mu[7] 8.5 8.8 9.2 8.8 0.2 1.00 1339 1951
mu[8] 8.5 8.9 9.2 8.9 0.2 1.00 1373 1996
mu[9] 8.5 8.9 9.3 8.9 0.2 1.00 1416 2014
mu[10] 8.5 8.9 9.3 8.9 0.2 1.00 1465 2042
mu[11] 8.6 8.9 9.3 8.9 0.2 1.00 1514 2064
mu[12] 8.6 8.9 9.3 8.9 0.2 1.00 1561 2075
mu[13] 8.6 9.0 9.3 9.0 0.2 1.00 1613 2086
mu[14] 8.7 9.0 9.3 9.0 0.2 1.00 1674 2086
mu[15] 8.7 9.0 9.3 9.0 0.2 1.00 1742 2134
mu[16] 8.7 9.0 9.3 9.0 0.2 1.00 1819 2234
mu[17] 8.7 9.0 9.3 9.0 0.2 1.00 1898 2255
mu[18] 8.8 9.1 9.4 9.1 0.2 1.00 1995 2245
mu[19] 8.8 9.1 9.4 9.1 0.2 1.00 2107 2376
mu[20] 8.8 9.1 9.4 9.1 0.2 1.00 2237 2344
mu[21] 8.8 9.1 9.4 9.1 0.2 1.00 2384 2340
mu[22] 8.9 9.1 9.4 9.1 0.2 1.00 2536 2314
mu[23] 8.9 9.1 9.4 9.2 0.2 1.00 2711 2344
mu[24] 8.9 9.2 9.4 9.2 0.2 1.00 2904 2383
mu[25] 8.9 9.2 9.4 9.2 0.2 1.00 3113 2467
mu[26] 9.0 9.2 9.5 9.2 0.2 1.00 3342 2450
mu[27] 9.0 9.2 9.5 9.2 0.1 1.00 3542 2413
mu[28] 9.0 9.2 9.5 9.2 0.1 1.00 3713 2402
mu[29] 9.0 9.3 9.5 9.3 0.1 1.00 3864 2461
mu[30] 9.1 9.3 9.5 9.3 0.1 1.00 3981 2473
mu[31] 9.1 9.3 9.5 9.3 0.1 1.00 4052 2366
mu[32] 9.1 9.3 9.6 9.3 0.1 1.00 4076 2290
mu[33] 9.1 9.3 9.6 9.3 0.1 1.00 4052 2244
mu[34] 9.1 9.4 9.6 9.4 0.1 1.00 3980 2327
mu[35] 9.1 9.4 9.6 9.4 0.1 1.00 3870 2547
mu[36] 9.2 9.4 9.7 9.4 0.2 1.00 3726 2570
mu[37] 9.2 9.4 9.7 9.4 0.2 1.00 3567 2776
mu[38] 9.2 9.4 9.7 9.4 0.2 1.00 3394 2730
mu[39] 9.2 9.5 9.7 9.5 0.2 1.00 3226 2666
mu[40] 9.2 9.5 9.7 9.5 0.2 1.00 3037 2708
mu[41] 9.2 9.5 9.8 9.5 0.2 1.00 2853 2687
mu[42] 9.2 9.5 9.8 9.5 0.2 1.00 2679 2540
mu[43] 9.3 9.5 9.8 9.5 0.2 1.00 2506 2507
mu[44] 9.3 9.6 9.8 9.6 0.2 1.00 2324 2296
mu[45] 9.3 9.6 9.9 9.6 0.2 1.00 2170 2350
mu[46] 9.3 9.6 9.9 9.6 0.2 1.00 2033 2238
mu[47] 9.3 9.6 9.9 9.6 0.2 1.00 1920 2286
mu[48] 9.3 9.6 9.9 9.6 0.2 1.00 1818 2240
mu[49] 9.3 9.6 10.0 9.6 0.2 1.00 1731 2183
mu[50] 9.3 9.7 10.0 9.7 0.2 1.00 1657 2242
mu[51] 9.3 9.7 10.0 9.7 0.2 1.00 1593 2191
mu[52] 9.3 9.7 10.1 9.7 0.2 1.00 1537 2153
mu[53] 9.4 9.7 10.1 9.7 0.2 1.00 1487 2093
mu[54] 9.4 9.7 10.1 9.7 0.2 1.00 1441 2101
mu[55] 9.4 9.8 10.1 9.8 0.2 1.00 1401 2059
mu[56] 9.4 9.8 10.2 9.8 0.2 1.00 1366 2012
mu[57] 9.4 9.8 10.2 9.8 0.2 1.00 1333 1970
mu[58] 9.4 9.8 10.2 9.8 0.3 1.00 1305 1950
mu[59] 9.4 9.8 10.3 9.8 0.3 1.00 1279 1950
mu[60] 9.4 9.9 10.3 9.9 0.3 1.00 1255 1842
mu[61] 9.4 9.9 10.3 9.9 0.3 1.00 1234 1810
mu[62] 9.4 9.9 10.3 9.9 0.3 1.00 1215 1799
ypred 8.0 10.0 11.8 9.9 1.2 1.00 3720 3806
log_lik[1] -1.4 -1.1 -0.9 -1.1 0.1 1.00 1390 1525
log_lik[2] -3.9 -2.9 -2.1 -3.0 0.5 1.01 1273 1667
log_lik[3] -1.5 -1.2 -1.0 -1.2 0.1 1.00 1393 1869
log_lik[4] -1.5 -1.2 -1.0 -1.2 0.2 1.00 1401 1676
log_lik[5] -1.5 -1.2 -1.0 -1.3 0.2 1.00 1415 1789
log_lik[6] -2.0 -1.5 -1.3 -1.6 0.2 1.00 1382 2039
log_lik[7] -1.3 -1.1 -0.9 -1.1 0.1 1.00 1435 1387
log_lik[8] -1.2 -1.1 -0.9 -1.1 0.1 1.00 1453 1717
log_lik[9] -3.7 -2.8 -2.2 -2.9 0.5 1.00 1414 1782
log_lik[10] -2.0 -1.6 -1.3 -1.6 0.2 1.00 1483 2174
log_lik[11] -2.2 -1.7 -1.4 -1.8 0.2 1.00 1563 2295
log_lik[12] -1.2 -1.0 -0.9 -1.1 0.1 1.00 1435 1305
log_lik[13] -1.4 -1.2 -1.0 -1.2 0.1 1.00 1616 1872
log_lik[14] -2.9 -2.3 -1.9 -2.3 0.3 1.00 1634 2294
log_lik[15] -1.2 -1.1 -0.9 -1.1 0.1 1.00 1506 1687
log_lik[16] -1.2 -1.1 -0.9 -1.1 0.1 1.00 1458 1506
log_lik[17] -2.3 -1.9 -1.6 -1.9 0.2 1.00 1903 2437
log_lik[18] -2.3 -1.9 -1.6 -1.9 0.2 1.00 1946 2311
log_lik[19] -3.2 -2.5 -2.1 -2.6 0.3 1.00 1814 1905
log_lik[20] -1.2 -1.1 -0.9 -1.1 0.1 1.00 1452 1534
log_lik[21] -3.7 -3.0 -2.4 -3.0 0.4 1.00 1805 1984
log_lik[22] -1.6 -1.3 -1.2 -1.3 0.1 1.00 2444 2533
log_lik[23] -1.6 -1.4 -1.2 -1.4 0.1 1.00 2650 2729
log_lik[24] -5.2 -4.2 -3.3 -4.2 0.6 1.00 1713 2202
log_lik[25] -1.7 -1.4 -1.2 -1.4 0.1 1.00 3024 2478
log_lik[26] -1.5 -1.3 -1.1 -1.3 0.1 1.00 2610 2410
log_lik[27] -1.2 -1.1 -0.9 -1.1 0.1 1.00 1518 1761
log_lik[28] -1.4 -1.2 -1.1 -1.2 0.1 1.00 2384 2435
log_lik[29] -2.1 -1.8 -1.5 -1.8 0.2 1.00 3294 2446
log_lik[30] -2.6 -2.2 -1.9 -2.2 0.2 1.00 2884 2702
log_lik[31] -2.4 -2.1 -1.8 -2.1 0.2 1.00 3125 2847
log_lik[32] -1.9 -1.6 -1.4 -1.6 0.1 1.00 4147 2788
log_lik[33] -1.6 -1.4 -1.2 -1.4 0.1 1.00 3508 2279
log_lik[34] -1.2 -1.1 -0.9 -1.1 0.1 1.00 1596 1780
log_lik[35] -1.2 -1.0 -0.9 -1.0 0.1 1.00 1407 1535
log_lik[36] -3.4 -2.8 -2.3 -2.8 0.3 1.00 2162 2311
log_lik[37] -1.6 -1.3 -1.2 -1.4 0.1 1.00 3120 2765
log_lik[38] -1.2 -1.0 -0.9 -1.0 0.1 1.00 1424 1555
log_lik[39] -1.5 -1.3 -1.2 -1.3 0.1 1.00 2770 2705
log_lik[40] -1.2 -1.1 -0.9 -1.1 0.1 1.00 1559 1687
log_lik[41] -1.3 -1.1 -1.0 -1.1 0.1 1.00 1723 2246
log_lik[42] -1.3 -1.1 -1.0 -1.1 0.1 1.00 1587 1913
log_lik[43] -1.2 -1.0 -0.9 -1.0 0.1 1.00 1375 1454
log_lik[44] -1.6 -1.3 -1.1 -1.3 0.1 1.00 2234 2687
log_lik[45] -1.3 -1.1 -0.9 -1.1 0.1 1.00 1534 1666
log_lik[46] -1.6 -1.4 -1.2 -1.4 0.1 1.00 2007 2344
log_lik[47] -1.2 -1.1 -0.9 -1.1 0.1 1.00 1410 1595
log_lik[48] -1.4 -1.2 -1.0 -1.2 0.1 1.00 1665 2109
log_lik[49] -1.4 -1.2 -1.0 -1.2 0.1 1.00 1641 2230
log_lik[50] -1.2 -1.0 -0.9 -1.0 0.1 1.00 1370 1539
log_lik[51] -2.8 -2.2 -1.8 -2.2 0.3 1.00 1574 2199
log_lik[52] -1.8 -1.4 -1.2 -1.4 0.2 1.00 1582 2252
log_lik[53] -1.3 -1.1 -0.9 -1.1 0.1 1.00 1496 1588
log_lik[54] -1.9 -1.5 -1.2 -1.5 0.2 1.00 1484 2171
log_lik[55] -1.5 -1.2 -1.0 -1.2 0.1 1.00 1426 1764
log_lik[56] -1.4 -1.1 -1.0 -1.2 0.1 1.00 1408 1672
log_lik[57] -1.8 -1.4 -1.2 -1.5 0.2 1.00 1410 2079
log_lik[58] -1.2 -1.0 -0.9 -1.1 0.1 1.00 1346 1463
log_lik[59] -1.9 -1.5 -1.2 -1.5 0.2 1.00 1353 2033
log_lik[60] -1.8 -1.4 -1.1 -1.4 0.2 1.00 1317 1810
log_lik[61] -2.2 -1.7 -1.3 -1.7 0.3 1.00 1261 1854
log_lik[62] -2.1 -1.6 -1.3 -1.6 0.3 1.00 1256 1828
lp__ -41.2 -38.3 -37.3 -38.7 1.3 1.00 1191 1536
For each parameter, Bulk_ESS and Tail_ESS are crude measures of
effective sample size for bulk and tail quantities respectively (an ESS > 100
per chain is considered good), and Rhat is the potential scale reduction
factor on rank normalized split chains (at convergence, Rhat <= 1.05).
The following diagnostics are explained in CmdStan User’s Guide.
check_hmc_diagnostics(fit_lin)
Divergences:
0 of 4000 iterations ended with a divergence.
Tree depth:
106 of 4000 iterations saturated the maximum tree depth of 10 (2.65%).
Try increasing 'max_treedepth' to avoid saturation.
Energy:
E-BFMI indicated no pathological behavior.
Compute the probability that the summer temperature is increasing.
draws_lin <- rstan::extract(fit_lin, permuted = T)
mean(draws_lin$beta>0) # probability that beta > 0
[1] 0.99025
Plot the data, the model fit and prediction for year 2016.
mu <- apply(draws_lin$mu, 2, quantile, c(0.05, 0.5, 0.95)) %>%
t() %>% data.frame(x = data_lin$x, .) %>% gather(pct, y, -x)
pfit <- ggplot() +
geom_point(aes(x, y), data = data.frame(data_lin), size = 1) +
geom_line(aes(x, y, linetype = pct), data = mu, color = 'red') +
scale_linetype_manual(values = c(2,1,2)) +
labs(y = 'Summer temp. @Kilpisjärvi', x= "Year") +
guides(linetype = "none")
pars <- intersect(names(draws_lin), c('beta','sigma','ypred'))
draws <- as.data.frame(fit_lin)
phist <- mcmc_hist(draws, pars = pars)
grid.arrange(pfit, phist, nrow = 2)
In the above we used the unnormalized data and as x values are far away from zero, this will cause very strong posterior dependency between alpha and beta (did you use ShinyStan for the above model?). The strong posterior dependency can be removed by normalizing the data to have zero mean. The following Stan code makes it in Stan. In generated quantities we do correspnding transformation back to the original scale.
code_lin_std <- root("demos_rstan", "lin_std.stan")
writeLines(readLines(code_lin_std))
// Gaussian linear model with standardized data
data {
int<lower=0> N; // number of data points
vector[N] x; // covariate / predictor
vector[N] y; // target
real xpred; // new covariate value to make predictions
}
transformed data {
// deterministic transformations of data
vector[N] x_std = (x - mean(x)) / sd(x);
vector[N] y_std = (y - mean(y)) / sd(y);
real xpred_std = (xpred - mean(x)) / sd(x);
}
parameters {
real alpha; // intercept
real beta; // slope
real<lower=0> sigma_std; // standard deviation is constrained to be positive
}
transformed parameters {
// deterministic transformation of parameters and data
vector[N] mu_std = alpha + beta * x_std; // linear model
}
model {
alpha ~ normal(0, 1); // weakly informative prior (given standardized data)
beta ~ normal(0, 1); // weakly informative prior (given standardized data)
sigma_std ~ normal(0, 1); // weakly informative prior (given standardized data)
y_std ~ normal(mu_std, sigma_std); // observation model / likelihood
}
generated quantities {
// transform to the original data scale
vector[N] mu = mu_std * sd(y) + mean(y);
real<lower=0> sigma = sigma_std * sd(y);
// sample from the predictive distribution
real ypred = normal_rng((alpha + beta * xpred_std) * sd(y) + mean(y),
sigma_std * sd(y));
// compute log predictive densities to be used for LOO-CV
// to make appropriate comparison to other models, this log density is computed
// using the original data scale (y, mu, sigma)
vector[N] log_lik;
for (i in 1 : N) {
log_lik[i] = normal_lpdf(y[i] | mu[i], sigma);
}
}
fit_lin_std <- stan(file = code_lin_std, data = data_lin, seed = SEED)
Now there were no warnings. You can use ShinyStan
(launch_shinystan(fit_lin)
) to look at the posterior and
diagnostics and compare to the previous model results. We can also check
diagnostics with the following commands.
monitor(fit_lin_std)
Inference for the input samples (4 chains: each with iter = 2000; warmup = 0):
Q5 Q50 Q95 Mean SD Rhat Bulk_ESS Tail_ESS
alpha -0.2 0.0 0.2 0.0 0.1 1 3502 3085
beta 0.1 0.3 0.5 0.3 0.1 1 3660 2609
sigma_std 0.8 1.0 1.1 1.0 0.1 1 3685 2790
mu_std[1] -0.9 -0.5 -0.1 -0.5 0.2 1 3778 3026
mu_std[2] -0.9 -0.5 -0.1 -0.5 0.2 1 3779 3026
mu_std[3] -0.9 -0.5 -0.1 -0.5 0.2 1 3780 3074
mu_std[4] -0.9 -0.5 -0.1 -0.5 0.2 1 3779 3074
mu_std[5] -0.8 -0.5 -0.1 -0.5 0.2 1 3780 3100
mu_std[6] -0.8 -0.4 -0.1 -0.4 0.2 1 3780 3074
mu_std[7] -0.8 -0.4 -0.1 -0.4 0.2 1 3776 3099
mu_std[8] -0.7 -0.4 -0.1 -0.4 0.2 1 3772 3015
mu_std[9] -0.7 -0.4 -0.1 -0.4 0.2 1 3769 2968
mu_std[10] -0.7 -0.4 -0.1 -0.4 0.2 1 3766 2968
mu_std[11] -0.7 -0.4 0.0 -0.4 0.2 1 3764 3083
mu_std[12] -0.6 -0.3 0.0 -0.3 0.2 1 3757 3060
mu_std[13] -0.6 -0.3 0.0 -0.3 0.2 1 3753 3106
mu_std[14] -0.6 -0.3 0.0 -0.3 0.2 1 3750 3145
mu_std[15] -0.6 -0.3 0.0 -0.3 0.2 1 3744 3111
mu_std[16] -0.5 -0.3 0.0 -0.3 0.2 1 3735 3113
mu_std[17] -0.5 -0.3 0.0 -0.3 0.2 1 3729 3088
mu_std[18] -0.5 -0.2 0.0 -0.2 0.2 1 3720 3097
mu_std[19] -0.5 -0.2 0.0 -0.2 0.2 1 3711 3096
mu_std[20] -0.4 -0.2 0.0 -0.2 0.1 1 3699 3173
mu_std[21] -0.4 -0.2 0.1 -0.2 0.1 1 3684 3165
mu_std[22] -0.4 -0.2 0.1 -0.2 0.1 1 3669 3150
mu_std[23] -0.4 -0.2 0.1 -0.1 0.1 1 3656 3126
mu_std[24] -0.3 -0.1 0.1 -0.1 0.1 1 3640 3164
mu_std[25] -0.3 -0.1 0.1 -0.1 0.1 1 3624 3105
mu_std[26] -0.3 -0.1 0.1 -0.1 0.1 1 3607 3043
mu_std[27] -0.3 -0.1 0.1 -0.1 0.1 1 3585 2964
mu_std[28] -0.3 -0.1 0.1 -0.1 0.1 1 3565 3033
mu_std[29] -0.3 0.0 0.2 0.0 0.1 1 3544 3066
mu_std[30] -0.2 0.0 0.2 0.0 0.1 1 3526 3087
mu_std[31] -0.2 0.0 0.2 0.0 0.1 1 3510 3110
mu_std[32] -0.2 0.0 0.2 0.0 0.1 1 3493 3073
mu_std[33] -0.2 0.0 0.2 0.0 0.1 1 3477 3060
mu_std[34] -0.2 0.0 0.2 0.0 0.1 1 3462 3052
mu_std[35] -0.1 0.1 0.3 0.1 0.1 1 3448 2981
mu_std[36] -0.1 0.1 0.3 0.1 0.1 1 3435 2919
mu_std[37] -0.1 0.1 0.3 0.1 0.1 1 3426 2941
mu_std[38] -0.1 0.1 0.3 0.1 0.1 1 3419 2961
mu_std[39] -0.1 0.1 0.3 0.1 0.1 1 3411 2861
mu_std[40] -0.1 0.1 0.4 0.1 0.1 1 3405 2915
mu_std[41] -0.1 0.2 0.4 0.2 0.1 1 3403 2831
mu_std[42] 0.0 0.2 0.4 0.2 0.1 1 3400 2770
mu_std[43] 0.0 0.2 0.4 0.2 0.1 1 3401 2730
mu_std[44] 0.0 0.2 0.5 0.2 0.1 1 3401 2793
mu_std[45] 0.0 0.2 0.5 0.2 0.2 1 3402 2840
mu_std[46] 0.0 0.3 0.5 0.3 0.2 1 3404 2779
mu_std[47] 0.0 0.3 0.5 0.3 0.2 1 3405 2731
mu_std[48] 0.0 0.3 0.6 0.3 0.2 1 3406 2741
mu_std[49] 0.0 0.3 0.6 0.3 0.2 1 3410 2703
mu_std[50] 0.0 0.3 0.6 0.3 0.2 1 3413 2703
mu_std[51] 0.0 0.3 0.6 0.3 0.2 1 3418 2722
mu_std[52] 0.0 0.4 0.7 0.4 0.2 1 3420 2720
mu_std[53] 0.1 0.4 0.7 0.4 0.2 1 3423 2721
mu_std[54] 0.1 0.4 0.7 0.4 0.2 1 3427 2701
mu_std[55] 0.1 0.4 0.7 0.4 0.2 1 3430 2700
mu_std[56] 0.1 0.4 0.8 0.4 0.2 1 3435 2782
mu_std[57] 0.1 0.4 0.8 0.4 0.2 1 3440 2783
mu_std[58] 0.1 0.5 0.8 0.5 0.2 1 3446 2782
mu_std[59] 0.1 0.5 0.8 0.5 0.2 1 3449 2610
mu_std[60] 0.1 0.5 0.9 0.5 0.2 1 3455 2609
mu_std[61] 0.1 0.5 0.9 0.5 0.2 1 3460 2608
mu_std[62] 0.1 0.5 0.9 0.5 0.2 1 3464 2574
mu[1] 8.2 8.7 9.2 8.7 0.3 1 3778 3026
mu[2] 8.3 8.7 9.2 8.7 0.3 1 3779 3026
mu[3] 8.3 8.7 9.2 8.7 0.3 1 3780 3074
mu[4] 8.3 8.8 9.2 8.8 0.3 1 3779 3074
mu[5] 8.4 8.8 9.2 8.8 0.3 1 3780 3100
mu[6] 8.4 8.8 9.2 8.8 0.2 1 3780 3074
mu[7] 8.4 8.8 9.2 8.8 0.2 1 3776 3099
mu[8] 8.5 8.8 9.2 8.8 0.2 1 3772 3015
mu[9] 8.5 8.9 9.2 8.9 0.2 1 3769 2968
mu[10] 8.5 8.9 9.2 8.9 0.2 1 3766 2968
mu[11] 8.5 8.9 9.3 8.9 0.2 1 3764 3083
mu[12] 8.6 8.9 9.3 8.9 0.2 1 3757 3060
mu[13] 8.6 8.9 9.3 8.9 0.2 1 3753 3106
mu[14] 8.6 9.0 9.3 9.0 0.2 1 3750 3145
mu[15] 8.7 9.0 9.3 9.0 0.2 1 3744 3111
mu[16] 8.7 9.0 9.3 9.0 0.2 1 3735 3113
mu[17] 8.7 9.0 9.3 9.0 0.2 1 3729 3088
mu[18] 8.7 9.0 9.3 9.0 0.2 1 3720 3097
mu[19] 8.8 9.1 9.3 9.1 0.2 1 3711 3096
mu[20] 8.8 9.1 9.4 9.1 0.2 1 3699 3173
mu[21] 8.8 9.1 9.4 9.1 0.2 1 3684 3165
mu[22] 8.9 9.1 9.4 9.1 0.2 1 3669 3150
mu[23] 8.9 9.1 9.4 9.1 0.2 1 3656 3126
mu[24] 8.9 9.2 9.4 9.2 0.2 1 3640 3164
mu[25] 8.9 9.2 9.4 9.2 0.2 1 3624 3105
mu[26] 9.0 9.2 9.4 9.2 0.1 1 3607 3043
mu[27] 9.0 9.2 9.5 9.2 0.1 1 3585 2964
mu[28] 9.0 9.2 9.5 9.2 0.1 1 3565 3033
mu[29] 9.0 9.3 9.5 9.3 0.1 1 3544 3066
mu[30] 9.0 9.3 9.5 9.3 0.1 1 3526 3087
mu[31] 9.1 9.3 9.5 9.3 0.1 1 3510 3110
mu[32] 9.1 9.3 9.6 9.3 0.1 1 3493 3073
mu[33] 9.1 9.3 9.6 9.3 0.1 1 3477 3060
mu[34] 9.1 9.4 9.6 9.4 0.1 1 3462 3052
mu[35] 9.1 9.4 9.6 9.4 0.1 1 3448 2981
mu[36] 9.2 9.4 9.6 9.4 0.1 1 3435 2919
mu[37] 9.2 9.4 9.7 9.4 0.1 1 3426 2941
mu[38] 9.2 9.4 9.7 9.4 0.2 1 3419 2961
mu[39] 9.2 9.5 9.7 9.5 0.2 1 3411 2861
mu[40] 9.2 9.5 9.7 9.5 0.2 1 3405 2915
mu[41] 9.2 9.5 9.8 9.5 0.2 1 3403 2831
mu[42] 9.3 9.5 9.8 9.5 0.2 1 3400 2770
mu[43] 9.3 9.5 9.8 9.5 0.2 1 3401 2730
mu[44] 9.3 9.6 9.8 9.6 0.2 1 3401 2793
mu[45] 9.3 9.6 9.9 9.6 0.2 1 3402 2840
mu[46] 9.3 9.6 9.9 9.6 0.2 1 3404 2779
mu[47] 9.3 9.6 9.9 9.6 0.2 1 3405 2731
mu[48] 9.3 9.6 10.0 9.6 0.2 1 3406 2741
mu[49] 9.3 9.7 10.0 9.7 0.2 1 3410 2703
mu[50] 9.4 9.7 10.0 9.7 0.2 1 3413 2703
mu[51] 9.4 9.7 10.0 9.7 0.2 1 3418 2722
mu[52] 9.4 9.7 10.1 9.7 0.2 1 3420 2720
mu[53] 9.4 9.7 10.1 9.7 0.2 1 3423 2721
mu[54] 9.4 9.8 10.1 9.8 0.2 1 3427 2701
mu[55] 9.4 9.8 10.2 9.8 0.2 1 3430 2700
mu[56] 9.4 9.8 10.2 9.8 0.2 1 3435 2782
mu[57] 9.4 9.8 10.2 9.8 0.2 1 3440 2783
mu[58] 9.4 9.8 10.3 9.8 0.3 1 3446 2782
mu[59] 9.4 9.9 10.3 9.9 0.3 1 3449 2610
mu[60] 9.4 9.9 10.3 9.9 0.3 1 3455 2609
mu[61] 9.5 9.9 10.4 9.9 0.3 1 3460 2608
mu[62] 9.5 9.9 10.4 9.9 0.3 1 3464 2574
sigma 1.0 1.1 1.3 1.1 0.1 1 3685 2790
ypred 8.1 10.0 11.9 10.0 1.2 1 4131 3978
log_lik[1] -1.4 -1.1 -0.9 -1.1 0.1 1 3293 2756
log_lik[2] -3.9 -2.9 -2.2 -3.0 0.5 1 3744 2675
log_lik[3] -1.5 -1.2 -1.0 -1.2 0.2 1 4088 3151
log_lik[4] -1.5 -1.2 -1.0 -1.2 0.2 1 3903 3039
log_lik[5] -1.5 -1.2 -1.0 -1.2 0.2 1 3929 3016
log_lik[6] -1.9 -1.5 -1.2 -1.5 0.2 1 4008 2970
log_lik[7] -1.3 -1.1 -0.9 -1.1 0.1 1 2864 2672
log_lik[8] -1.3 -1.1 -0.9 -1.1 0.1 1 3424 2897
log_lik[9] -3.7 -2.9 -2.2 -2.9 0.5 1 3660 2675
log_lik[10] -2.0 -1.6 -1.3 -1.7 0.2 1 3980 3228
log_lik[11] -2.1 -1.7 -1.4 -1.7 0.2 1 4053 3253
log_lik[12] -1.2 -1.1 -0.9 -1.1 0.1 1 2939 2723
log_lik[13] -1.4 -1.2 -1.0 -1.2 0.1 1 3699 2813
log_lik[14] -2.8 -2.3 -1.8 -2.3 0.3 1 4106 2968
log_lik[15] -1.3 -1.1 -0.9 -1.1 0.1 1 3714 3011
log_lik[16] -1.2 -1.1 -0.9 -1.1 0.1 1 3063 2745
log_lik[17] -2.2 -1.8 -1.5 -1.9 0.2 1 4067 3181
log_lik[18] -2.3 -1.9 -1.6 -1.9 0.2 1 3800 3046
log_lik[19] -3.1 -2.5 -2.1 -2.6 0.3 1 3555 2932
log_lik[20] -1.2 -1.1 -0.9 -1.1 0.1 1 3180 2668
log_lik[21] -3.7 -3.0 -2.4 -3.0 0.4 1 3410 2878
log_lik[22] -1.6 -1.3 -1.2 -1.4 0.1 1 4082 3191
log_lik[23] -1.6 -1.4 -1.2 -1.4 0.1 1 4027 3275
log_lik[24] -5.1 -4.1 -3.3 -4.1 0.6 1 4166 2825
log_lik[25] -1.6 -1.4 -1.2 -1.4 0.1 1 3762 3210
log_lik[26] -1.5 -1.3 -1.1 -1.3 0.1 1 3612 3029
log_lik[27] -1.2 -1.1 -0.9 -1.1 0.1 1 3777 3009
log_lik[28] -1.4 -1.2 -1.1 -1.2 0.1 1 4086 2918
log_lik[29] -2.0 -1.8 -1.5 -1.8 0.2 1 3641 3245
log_lik[30] -2.6 -2.2 -1.9 -2.2 0.2 1 3976 2849
log_lik[31] -2.4 -2.1 -1.8 -2.1 0.2 1 3930 2916
log_lik[32] -1.9 -1.6 -1.4 -1.6 0.1 1 3745 3294
log_lik[33] -1.6 -1.4 -1.2 -1.4 0.1 1 3561 3127
log_lik[34] -1.2 -1.1 -0.9 -1.1 0.1 1 3829 2903
log_lik[35] -1.2 -1.0 -0.9 -1.0 0.1 1 3619 2832
log_lik[36] -3.4 -2.8 -2.3 -2.8 0.3 1 3995 2774
log_lik[37] -1.5 -1.3 -1.2 -1.4 0.1 1 3802 3115
log_lik[38] -1.2 -1.0 -0.9 -1.1 0.1 1 3592 2965
log_lik[39] -1.5 -1.3 -1.2 -1.3 0.1 1 3794 3079
log_lik[40] -1.2 -1.1 -0.9 -1.1 0.1 1 3711 2634
log_lik[41] -1.3 -1.1 -1.0 -1.1 0.1 1 3286 2707
log_lik[42] -1.3 -1.1 -1.0 -1.1 0.1 1 3278 2650
log_lik[43] -1.2 -1.0 -0.9 -1.0 0.1 1 3327 2790
log_lik[44] -1.6 -1.3 -1.2 -1.3 0.1 1 3469 2702
log_lik[45] -1.2 -1.1 -0.9 -1.1 0.1 1 3540 2641
log_lik[46] -1.6 -1.4 -1.2 -1.4 0.1 1 3699 2653
log_lik[47] -1.2 -1.1 -0.9 -1.1 0.1 1 3257 2791
log_lik[48] -1.4 -1.2 -1.0 -1.2 0.1 1 3454 2780
log_lik[49] -1.4 -1.2 -1.0 -1.2 0.1 1 3462 2760
log_lik[50] -1.2 -1.0 -0.9 -1.0 0.1 1 3073 2880
log_lik[51] -2.8 -2.2 -1.8 -2.2 0.3 1 3491 2970
log_lik[52] -1.7 -1.4 -1.2 -1.4 0.2 1 3665 2854
log_lik[53] -1.3 -1.1 -0.9 -1.1 0.1 1 3373 2649
log_lik[54] -1.8 -1.5 -1.2 -1.5 0.2 1 3654 2867
log_lik[55] -1.5 -1.2 -1.0 -1.2 0.1 1 3727 2858
log_lik[56] -1.4 -1.1 -1.0 -1.2 0.1 1 3616 2831
log_lik[57] -1.9 -1.5 -1.2 -1.5 0.2 1 3593 2976
log_lik[58] -1.2 -1.1 -0.9 -1.1 0.1 1 2748 2730
log_lik[59] -1.9 -1.5 -1.2 -1.5 0.2 1 3604 2956
log_lik[60] -1.8 -1.4 -1.1 -1.4 0.2 1 3702 2702
log_lik[61] -2.3 -1.7 -1.3 -1.8 0.3 1 3638 3020
log_lik[62] -2.1 -1.6 -1.2 -1.6 0.3 1 3661 3008
lp__ -31.7 -28.9 -27.9 -29.2 1.3 1 1754 2805
For each parameter, Bulk_ESS and Tail_ESS are crude measures of
effective sample size for bulk and tail quantities respectively (an ESS > 100
per chain is considered good), and Rhat is the potential scale reduction
factor on rank normalized split chains (at convergence, Rhat <= 1.05).
check_hmc_diagnostics(fit_lin_std)
Divergences:
0 of 4000 iterations ended with a divergence.
Tree depth:
0 of 4000 iterations saturated the maximum tree depth of 10.
Energy:
E-BFMI indicated no pathological behavior.
We see that there are no warnings by diagnostics and ESS’s are higher than with the previous case with non-standardized data.
Next we check that we get similar probability for beta>0.
draws_lin_std <- rstan::extract(fit_lin_std, permuted = T)
mean(draws_lin_std$beta>0) # probability that beta > 0
[1] 0.993
The temperatures used in the above analyses are averages over three months, which makes it more likely that they are normally distributed, but there can be extreme events in the feather and we can check whether more robust Student’s \(t\) observation model woul give different results.
code_lin_std_t <- root("demos_rstan", "lin_std_t.stan")
writeLines(readLines(code_lin_std_t))
// Linear student-t model
data {
int<lower=0> N; // number of data points
vector[N] x; // covariate / predictor
vector[N] y; // target
real xpred; // new covariate value to make predictions
}
transformed data {
// deterministic transformations of data
vector[N] x_std = (x - mean(x)) / sd(x);
vector[N] y_std = (y - mean(y)) / sd(y);
real xpred_std = (xpred - mean(x)) / sd(x);
}
parameters {
real alpha; // intercept
real beta; // slope
real<lower=0> sigma_std; // standard deviation is constrained to be positive
real<lower=1> nu; // degrees of freedom is constrained >1
}
transformed parameters {
// deterministic transformation of parameters and data
vector[N] mu_std = alpha + beta * x_std; // linear model
}
model {
alpha ~ normal(0, 1); // weakly informative prior (given standardized data)
beta ~ normal(0, 1); // weakly informative prior (given standardized data)
sigma_std ~ normal(0, 1); // weakly informative prior (given standardized data)
nu ~ gamma(2, 0.1); // Juárez and Steel(2010)
y_std ~ student_t(nu, mu_std, sigma_std); // observation model / likelihood
}
generated quantities {
// transform to the original data scale
vector[N] mu = mu_std * sd(y) + mean(y);
real<lower=0> sigma = sigma_std * sd(y);
// sample from the predictive distribution
real ypred = student_t_rng(nu,
(alpha + beta * xpred_std) * sd(y) + mean(y),
sigma_std * sd(y));
// compute log predictive densities to be used for LOO-CV
// to make appropriate comparison to other models, this log density is computed
// using the original data scale (y, mu, sigma)
vector[N] log_lik;
for (i in 1 : N) {
log_lik[i] = student_t_lpdf(y[i] | nu, mu[i], sigma);
}
}
fit_lin_std_t <- stan(file = code_lin_std_t, data = data_lin, seed = SEED)
We get some warnings, but these specific warnings are not critical if counts are small as here.
Let’s examine further diagnostics.
monitor(fit_lin_std_t)
Inference for the input samples (4 chains: each with iter = 2000; warmup = 0):
Q5 Q50 Q95 Mean SD Rhat Bulk_ESS Tail_ESS
alpha -0.2 0.0 0.2 0.0 0.1 1 4115 2979
beta 0.1 0.3 0.5 0.3 0.1 1 3791 2655
sigma_std 0.8 0.9 1.1 0.9 0.1 1 3852 2987
nu 7.1 21.0 51.4 24.3 14.4 1 3482 2394
mu_std[1] -1.0 -0.6 -0.1 -0.6 0.3 1 3913 2774
mu_std[2] -0.9 -0.5 -0.1 -0.5 0.2 1 3921 2773
mu_std[3] -0.9 -0.5 -0.1 -0.5 0.2 1 3925 2714
mu_std[4] -0.9 -0.5 -0.1 -0.5 0.2 1 3926 2734
mu_std[5] -0.9 -0.5 -0.1 -0.5 0.2 1 3931 2724
mu_std[6] -0.8 -0.5 -0.1 -0.5 0.2 1 3934 2764
mu_std[7] -0.8 -0.4 -0.1 -0.4 0.2 1 3940 2743
mu_std[8] -0.8 -0.4 -0.1 -0.4 0.2 1 3946 2763
mu_std[9] -0.7 -0.4 -0.1 -0.4 0.2 1 3949 2868
mu_std[10] -0.7 -0.4 -0.1 -0.4 0.2 1 3955 2890
mu_std[11] -0.7 -0.4 -0.1 -0.4 0.2 1 3964 2880
mu_std[12] -0.7 -0.4 0.0 -0.4 0.2 1 3970 2869
mu_std[13] -0.6 -0.3 0.0 -0.3 0.2 1 3975 2890
mu_std[14] -0.6 -0.3 0.0 -0.3 0.2 1 3984 2913
mu_std[15] -0.6 -0.3 0.0 -0.3 0.2 1 3990 2869
mu_std[16] -0.6 -0.3 0.0 -0.3 0.2 1 4001 2901
mu_std[17] -0.5 -0.3 0.0 -0.3 0.2 1 4011 2924
mu_std[18] -0.5 -0.2 0.0 -0.2 0.2 1 4021 2957
mu_std[19] -0.5 -0.2 0.0 -0.2 0.2 1 4032 3037
mu_std[20] -0.5 -0.2 0.0 -0.2 0.1 1 4046 3037
mu_std[21] -0.4 -0.2 0.0 -0.2 0.1 1 4053 3099
mu_std[22] -0.4 -0.2 0.1 -0.2 0.1 1 4064 3108
mu_std[23] -0.4 -0.2 0.1 -0.2 0.1 1 4074 3090
mu_std[24] -0.4 -0.1 0.1 -0.1 0.1 1 4083 3026
mu_std[25] -0.3 -0.1 0.1 -0.1 0.1 1 4090 3240
mu_std[26] -0.3 -0.1 0.1 -0.1 0.1 1 4104 3212
mu_std[27] -0.3 -0.1 0.1 -0.1 0.1 1 4114 3268
mu_std[28] -0.3 -0.1 0.1 -0.1 0.1 1 4118 3162
mu_std[29] -0.3 0.0 0.2 0.0 0.1 1 4115 3160
mu_std[30] -0.2 0.0 0.2 0.0 0.1 1 4117 3162
mu_std[31] -0.2 0.0 0.2 0.0 0.1 1 4117 3036
mu_std[32] -0.2 0.0 0.2 0.0 0.1 1 4117 2956
mu_std[33] -0.2 0.0 0.2 0.0 0.1 1 4107 2911
mu_std[34] -0.2 0.0 0.2 0.0 0.1 1 4100 2857
mu_std[35] -0.1 0.1 0.3 0.1 0.1 1 4090 2892
mu_std[36] -0.1 0.1 0.3 0.1 0.1 1 4085 2829
mu_std[37] -0.1 0.1 0.3 0.1 0.1 1 4067 2900
mu_std[38] -0.1 0.1 0.3 0.1 0.1 1 4055 2880
mu_std[39] -0.1 0.1 0.3 0.1 0.1 1 4037 2895
mu_std[40] -0.1 0.2 0.4 0.2 0.1 1 4030 2862
mu_std[41] 0.0 0.2 0.4 0.2 0.1 1 4018 2890
mu_std[42] 0.0 0.2 0.4 0.2 0.1 1 4002 2833
mu_std[43] 0.0 0.2 0.4 0.2 0.1 1 3985 2681
mu_std[44] 0.0 0.2 0.5 0.2 0.1 1 3971 2723
mu_std[45] 0.0 0.2 0.5 0.2 0.1 1 3958 2723
mu_std[46] 0.0 0.3 0.5 0.3 0.2 1 3939 2781
mu_std[47] 0.0 0.3 0.5 0.3 0.2 1 3927 2700
mu_std[48] 0.0 0.3 0.6 0.3 0.2 1 3913 2800
mu_std[49] 0.0 0.3 0.6 0.3 0.2 1 3899 2823
mu_std[50] 0.1 0.3 0.6 0.3 0.2 1 3888 2843
mu_std[51] 0.1 0.4 0.6 0.4 0.2 1 3877 2768
mu_std[52] 0.1 0.4 0.7 0.4 0.2 1 3870 2844
mu_std[53] 0.1 0.4 0.7 0.4 0.2 1 3863 2931
mu_std[54] 0.1 0.4 0.7 0.4 0.2 1 3859 3050
mu_std[55] 0.1 0.4 0.7 0.4 0.2 1 3852 3057
mu_std[56] 0.1 0.4 0.8 0.4 0.2 1 3850 3032
mu_std[57] 0.1 0.5 0.8 0.5 0.2 1 3846 2984
mu_std[58] 0.1 0.5 0.8 0.5 0.2 1 3837 2894
mu_std[59] 0.1 0.5 0.9 0.5 0.2 1 3832 2940
mu_std[60] 0.1 0.5 0.9 0.5 0.2 1 3824 2874
mu_std[61] 0.2 0.5 0.9 0.5 0.2 1 3824 2893
mu_std[62] 0.2 0.6 0.9 0.6 0.2 1 3822 2850
mu[1] 8.2 8.7 9.1 8.7 0.3 1 3913 2774
mu[2] 8.2 8.7 9.2 8.7 0.3 1 3921 2773
mu[3] 8.3 8.7 9.2 8.7 0.3 1 3925 2714
mu[4] 8.3 8.7 9.2 8.7 0.3 1 3926 2734
mu[5] 8.3 8.8 9.2 8.8 0.3 1 3931 2724
mu[6] 8.4 8.8 9.2 8.8 0.3 1 3934 2764
mu[7] 8.4 8.8 9.2 8.8 0.2 1 3940 2743
mu[8] 8.4 8.8 9.2 8.8 0.2 1 3946 2763
mu[9] 8.4 8.8 9.2 8.8 0.2 1 3949 2868
mu[10] 8.5 8.9 9.2 8.9 0.2 1 3955 2890
mu[11] 8.5 8.9 9.2 8.9 0.2 1 3964 2880
mu[12] 8.5 8.9 9.3 8.9 0.2 1 3970 2869
mu[13] 8.6 8.9 9.3 8.9 0.2 1 3975 2890
mu[14] 8.6 8.9 9.3 8.9 0.2 1 3984 2913
mu[15] 8.6 9.0 9.3 9.0 0.2 1 3990 2869
mu[16] 8.7 9.0 9.3 9.0 0.2 1 4001 2901
mu[17] 8.7 9.0 9.3 9.0 0.2 1 4011 2924
mu[18] 8.7 9.0 9.3 9.0 0.2 1 4021 2957
mu[19] 8.8 9.0 9.3 9.0 0.2 1 4032 3037
mu[20] 8.8 9.1 9.4 9.1 0.2 1 4046 3037
mu[21] 8.8 9.1 9.4 9.1 0.2 1 4053 3099
mu[22] 8.8 9.1 9.4 9.1 0.2 1 4064 3108
mu[23] 8.9 9.1 9.4 9.1 0.2 1 4074 3090
mu[24] 8.9 9.2 9.4 9.2 0.2 1 4083 3026
mu[25] 8.9 9.2 9.4 9.2 0.2 1 4090 3240
mu[26] 8.9 9.2 9.4 9.2 0.2 1 4104 3212
mu[27] 9.0 9.2 9.5 9.2 0.1 1 4114 3268
mu[28] 9.0 9.2 9.5 9.2 0.1 1 4118 3162
mu[29] 9.0 9.3 9.5 9.3 0.1 1 4115 3160
mu[30] 9.0 9.3 9.5 9.3 0.1 1 4117 3162
mu[31] 9.1 9.3 9.5 9.3 0.1 1 4117 3036
mu[32] 9.1 9.3 9.6 9.3 0.1 1 4117 2956
mu[33] 9.1 9.3 9.6 9.3 0.1 1 4107 2911
mu[34] 9.1 9.4 9.6 9.4 0.1 1 4100 2857
mu[35] 9.1 9.4 9.6 9.4 0.1 1 4090 2892
mu[36] 9.2 9.4 9.6 9.4 0.1 1 4085 2829
mu[37] 9.2 9.4 9.7 9.4 0.1 1 4067 2900
mu[38] 9.2 9.5 9.7 9.4 0.1 1 4055 2880
mu[39] 9.2 9.5 9.7 9.5 0.2 1 4037 2895
mu[40] 9.2 9.5 9.7 9.5 0.2 1 4030 2862
mu[41] 9.3 9.5 9.8 9.5 0.2 1 4018 2890
mu[42] 9.3 9.5 9.8 9.5 0.2 1 4002 2833
mu[43] 9.3 9.6 9.8 9.6 0.2 1 3985 2681
mu[44] 9.3 9.6 9.9 9.6 0.2 1 3971 2723
mu[45] 9.3 9.6 9.9 9.6 0.2 1 3958 2723
mu[46] 9.3 9.6 9.9 9.6 0.2 1 3939 2781
mu[47] 9.3 9.6 9.9 9.6 0.2 1 3927 2700
mu[48] 9.4 9.7 10.0 9.7 0.2 1 3913 2800
mu[49] 9.4 9.7 10.0 9.7 0.2 1 3899 2823
mu[50] 9.4 9.7 10.0 9.7 0.2 1 3888 2843
mu[51] 9.4 9.7 10.1 9.7 0.2 1 3877 2768
mu[52] 9.4 9.7 10.1 9.7 0.2 1 3870 2844
mu[53] 9.4 9.8 10.1 9.8 0.2 1 3863 2931
mu[54] 9.4 9.8 10.1 9.8 0.2 1 3859 3050
mu[55] 9.4 9.8 10.2 9.8 0.2 1 3852 3057
mu[56] 9.4 9.8 10.2 9.8 0.2 1 3850 3032
mu[57] 9.5 9.9 10.2 9.9 0.2 1 3846 2984
mu[58] 9.5 9.9 10.3 9.9 0.2 1 3837 2894
mu[59] 9.5 9.9 10.3 9.9 0.3 1 3832 2940
mu[60] 9.5 9.9 10.3 9.9 0.3 1 3824 2874
mu[61] 9.5 9.9 10.4 9.9 0.3 1 3824 2893
mu[62] 9.5 10.0 10.4 10.0 0.3 1 3822 2850
sigma 0.9 1.1 1.3 1.1 0.1 1 3852 2987
ypred 8.0 10.0 11.9 10.0 1.2 1 3857 3646
log_lik[1] -1.4 -1.1 -0.9 -1.1 0.1 1 3357 3025
log_lik[2] -4.0 -3.0 -2.2 -3.1 0.5 1 4158 3227
log_lik[3] -1.6 -1.2 -1.0 -1.3 0.2 1 4048 2754
log_lik[4] -1.5 -1.2 -1.0 -1.2 0.2 1 3923 2873
log_lik[5] -1.5 -1.2 -1.0 -1.2 0.2 1 3954 3007
log_lik[6] -2.0 -1.5 -1.2 -1.6 0.2 1 4128 3018
log_lik[7] -1.2 -1.0 -0.9 -1.0 0.1 1 3327 3068
log_lik[8] -1.3 -1.1 -0.9 -1.1 0.1 1 3639 2642
log_lik[9] -3.8 -2.9 -2.3 -3.0 0.4 1 4240 2998
log_lik[10] -2.2 -1.7 -1.4 -1.7 0.3 1 4015 3049
log_lik[11] -2.2 -1.7 -1.4 -1.8 0.2 1 4123 3099
log_lik[12] -1.2 -1.0 -0.9 -1.0 0.1 1 3525 3192
log_lik[13] -1.4 -1.2 -1.0 -1.2 0.1 1 4005 2921
log_lik[14] -2.9 -2.3 -1.8 -2.3 0.3 1 4055 2806
log_lik[15] -1.3 -1.1 -0.9 -1.1 0.1 1 3834 2860
log_lik[16] -1.2 -1.0 -0.9 -1.0 0.1 1 3695 3135
log_lik[17] -2.3 -1.9 -1.5 -1.9 0.2 1 4099 2907
log_lik[18] -2.4 -2.0 -1.6 -2.0 0.2 1 3958 2974
log_lik[19] -3.2 -2.6 -2.1 -2.6 0.3 1 4179 2949
log_lik[20] -1.2 -1.0 -0.9 -1.0 0.1 1 3797 3059
log_lik[21] -3.7 -3.0 -2.5 -3.0 0.4 1 4337 3069
log_lik[22] -1.6 -1.4 -1.2 -1.4 0.1 1 4049 3254
log_lik[23] -1.7 -1.4 -1.2 -1.4 0.1 1 4031 3211
log_lik[24] -4.9 -4.0 -3.2 -4.0 0.5 1 4258 2794
log_lik[25] -1.7 -1.4 -1.2 -1.4 0.1 1 4273 3138
log_lik[26] -1.5 -1.3 -1.1 -1.3 0.1 1 4240 3177
log_lik[27] -1.2 -1.0 -0.9 -1.0 0.1 1 3915 2796
log_lik[28] -1.4 -1.2 -1.0 -1.2 0.1 1 4027 3166
log_lik[29] -2.1 -1.8 -1.6 -1.8 0.2 1 4014 2999
log_lik[30] -2.7 -2.2 -1.9 -2.2 0.2 1 4027 2814
log_lik[31] -2.5 -2.1 -1.8 -2.1 0.2 1 4010 2849
log_lik[32] -2.0 -1.7 -1.4 -1.7 0.2 1 4095 3187
log_lik[33] -1.6 -1.4 -1.2 -1.4 0.1 1 4274 3048
log_lik[34] -1.2 -1.1 -0.9 -1.1 0.1 1 3901 2838
log_lik[35] -1.2 -1.0 -0.8 -1.0 0.1 1 3848 3093
log_lik[36] -3.5 -2.9 -2.4 -2.9 0.3 1 4114 2864
log_lik[37] -1.6 -1.4 -1.2 -1.4 0.1 1 4037 2723
log_lik[38] -1.2 -1.0 -0.9 -1.0 0.1 1 3814 3118
log_lik[39] -1.5 -1.3 -1.1 -1.3 0.1 1 4029 2772
log_lik[40] -1.2 -1.1 -0.9 -1.1 0.1 1 3816 2881
log_lik[41] -1.3 -1.1 -1.0 -1.1 0.1 1 4176 3161
log_lik[42] -1.3 -1.1 -0.9 -1.1 0.1 1 4090 3189
log_lik[43] -1.2 -1.0 -0.9 -1.0 0.1 1 3765 3064
log_lik[44] -1.6 -1.4 -1.2 -1.4 0.1 1 4190 3214
log_lik[45] -1.2 -1.1 -0.9 -1.1 0.1 1 3750 2897
log_lik[46] -1.6 -1.4 -1.2 -1.4 0.1 1 4056 2809
log_lik[47] -1.2 -1.0 -0.9 -1.0 0.1 1 3800 3015
log_lik[48] -1.4 -1.2 -1.0 -1.2 0.1 1 4140 3270
log_lik[49] -1.5 -1.2 -1.0 -1.2 0.1 1 4136 3327
log_lik[50] -1.2 -1.0 -0.9 -1.0 0.1 1 3553 2838
log_lik[51] -2.8 -2.2 -1.8 -2.2 0.3 1 4025 2852
log_lik[52] -1.8 -1.4 -1.2 -1.4 0.2 1 4068 2837
log_lik[53] -1.3 -1.1 -0.9 -1.1 0.1 1 3640 2971
log_lik[54] -1.8 -1.5 -1.2 -1.5 0.2 1 4065 2961
log_lik[55] -1.4 -1.2 -1.0 -1.2 0.1 1 3906 2970
log_lik[56] -1.4 -1.1 -0.9 -1.1 0.1 1 3773 2971
log_lik[57] -1.9 -1.5 -1.2 -1.5 0.2 1 4032 3086
log_lik[58] -1.2 -1.0 -0.9 -1.0 0.1 1 3271 2770
log_lik[59] -2.0 -1.5 -1.2 -1.6 0.2 1 4020 3060
log_lik[60] -1.8 -1.4 -1.1 -1.4 0.2 1 4057 3008
log_lik[61] -2.4 -1.8 -1.4 -1.8 0.3 1 3969 2963
log_lik[62] -2.1 -1.6 -1.2 -1.6 0.3 1 4055 2991
lp__ -50.4 -47.3 -46.0 -47.6 1.5 1 2109 2567
For each parameter, Bulk_ESS and Tail_ESS are crude measures of
effective sample size for bulk and tail quantities respectively (an ESS > 100
per chain is considered good), and Rhat is the potential scale reduction
factor on rank normalized split chains (at convergence, Rhat <= 1.05).
check_hmc_diagnostics(fit_lin_std_t)
Divergences:
0 of 4000 iterations ended with a divergence.
Tree depth:
0 of 4000 iterations saturated the maximum tree depth of 10.
Energy:
E-BFMI indicated no pathological behavior.
We get similar diagnostics as for the linear Gaussian model with non-standardised data.
Compute the probability that the summer temperature is increasing.
draws_lin_std_t <- extract(fit_lin_std_t, permuted = T)
mean(draws_lin_std_t$beta>0) # probability that beta > 0
[1] 0.9955
We get similar probability as with Gaussian obervation model.
Plot data and the model fit
mu <- apply(draws_lin_std_t$mu, 2, quantile, c(0.05, 0.5, 0.95)) %>%
t() %>% data.frame(x = data_lin$x, .) %>% gather(pct, y, -x)
pfit <- ggplot() +
geom_point(aes(x, y), data = data.frame(data_lin), size = 1) +
geom_line(aes(x, y, linetype = pct), data = mu, color = 'red') +
scale_linetype_manual(values = c(2,1,2)) +
labs(y = 'Summer temp. @Kilpisjärvi', x= "Year") +
guides(linetype = "none")
pars <- intersect(names(draws_lin_std_t), c('beta','sigma','nu','ypred'))
draws <- as.data.frame(fit_lin_std_t)
phist <- mcmc_hist(draws, pars = pars)
grid.arrange(pfit, phist, nrow = 2)
We see also that the marginal posterior of nu is wide with lot of mass for values producing distrbution really close to Gaussian.
We can use leave-one-out cross-validation to compare the expected predictive performance. For the following lines to work, the log-likelihood needs to be evaluated in the stan code. For an example, see lin.stan and Computing approximate leave-one-out cross-validation usig PSIS-LOO.
loo_lin_std <- loo(fit_lin_std)
loo_lin_std_t <- loo(fit_lin_std_t)
loo_compare(loo_lin_std, loo_lin_std_t)
elpd_diff se_diff
model1 0.0 0.0
model2 -0.5 0.3
There is no practical difference between Gaussian and Student’s \(t\) observation model for this data.
Let’s compare the temperatures in three summer months.
data_kilpis <- read.delim(root("demos_rstan","kilpisjarvi-summer-temp.csv"), sep = ";")
data_grp <-list(N = 3*nrow(data_kilpis),
K = 3,
x = rep(1:3, nrow(data_kilpis)),
y = c(t(data_kilpis[,2:4])))
code_grp_aov <- root("demos_rstan", "grp_aov.stan")
writeLines(readLines(code_grp_aov))
// Comparison of k groups with common variance (ANOVA)
data {
int<lower=0> N; // number of observations
int<lower=0> K; // number of groups
array[N] int<lower=1, upper=K> x; // discrete group indicators
vector[N] y; // real valued observations
}
parameters {
vector[K] mu; // group means
real<lower=0> sigma; // common standard deviation constrained to be positive
}
model {
mu ~ normal(0, 100); // weakly informative prior
sigma ~ normal(0, 1); // weakly informative prior
y ~ normal(mu[x], sigma); // observation model / likelihood
}
Fit the model
fit_grp <- stan(file = code_grp_aov, data = data_grp, seed = SEED)
monitor(fit_grp)
check_hmc_diagnostics(fit_grp)
Results do not differ much from the previous, because there is only few groups and quite much data per group, but this works as an example of a hierarchical model.
code_grp_prior_mean <- root("demos_rstan", "grp_prior_mean.stan")
writeLines(readLines(code_grp_prior_mean))
// Comparison of k groups with common variance and
// hierarchical prior for the mean
data {
int<lower=0> N; // number of observations
int<lower=0> K; // number of groups
array[N] int<lower=1, upper=K> x; // discrete group indicators
vector[N] y; // real valued observations
}
parameters {
real mu0; // prior mean
real<lower=0> sigma0; // prior std constrained to be positive
vector[K] mu; // group means
real<lower=0> sigma; // common std constrained to be positive
}
model {
mu0 ~ normal(10, 10); // weakly informative prior
sigma0 ~ normal(0, 10); // weakly informative prior
mu ~ normal(mu0, sigma0); // population prior with unknown parameters
// log-normal prior sets normal prior on logarithm of the paremeter,
// which is useful for positive parameters that shouldn't be very
// close to 0. BDA3 Chapter 5 uses scaled inverse Chi^2 prior, but
// as these don't need to be (semi-)conjugate, thinking in terms of
// log-normal can be easier.
sigma ~ lognormal(0, .5); // weakly informative prior
y ~ normal(mu[x], sigma); // observation model / likelihood
}
Fit the model
fit_grp <- stan(file = code_grp_prior_mean, data = data_grp, seed = SEED)
Warning: There were 8 divergent transitions after warmup. See
https://mc-stan.org/misc/warnings.html#divergent-transitions-after-warmup
to find out why this is a problem and how to eliminate them.
Warning: Examine the pairs() plot to diagnose sampling problems
monitor(fit_grp)
check_hmc_diagnostics(fit_grp)
We got a small number of divergences, so we increase
adapt_delta=0.95
. Note that thisis useful only in case of
small number of divergences, and increasing
adapt_delta>0.99
doesn’t usually make sense, and in such
cases there is need to investigate the identifiability or parameter
transformations.
fit_grp <- stan(file = code_grp_prior_mean, data = data_grp, seed = SEED, control=list(adapt_delta=0.95));
monitor(fit_grp)
check_hmc_diagnostics(fit_grp)
code_grp_prior_mean_var <- root("demos_rstan", "grp_prior_mean_var.stan")
writeLines(readLines(code_grp_prior_mean_var))
// Comparison of k groups with unequal variance and
// hierarchical priors for the mean and the variance
data {
int<lower=0> N; // number of data points
int<lower=0> K; // number of groups
array[N] int<lower=1, upper=K> x; // group indicator
vector[N] y; //
}
parameters {
real mu0; // prior mean
real<lower=0> musigma0; // prior std
vector[K] mu; // group means
real lsigma0; // prior mean
real<lower=0> lsigma0s; // prior std
vector<lower=0>[K] sigma; // group stds
}
model {
mu0 ~ normal(10, 10); // weakly informative prior
musigma0 ~ normal(0, 10); // weakly informative prior
mu ~ normal(mu0, musigma0); // population prior with unknown parameters
// lsigma0 is prior mean in log scale
lsigma0 ~ normal(0, 1); // weakly informative prior
// log-normal prior sets normal prior on logarithm of the paremeter,
// which is useful for positive parameters that shouldn't be very
// close to 0. BDA3 Chapter 5 uses scaled inverse Chi^2 prior, but
// as these don't need to be (semi-)conjugate thinking in terms of
// log-normal can be easier.
lsigma0s ~ lognormal(log(0.1), .5); // weakly informative prior
sigma ~ lognormal(lsigma0, lsigma0s); // population prior with unknown parameters
y ~ normal(mu[x], sigma[x]); // observation model / likelihood
}
Fit the model
fit_grp <- stan(file = code_grp_prior_mean_var, data = data_grp, seed = SEED)
Warning: There were 2 divergent transitions after warmup. See
https://mc-stan.org/misc/warnings.html#divergent-transitions-after-warmup
to find out why this is a problem and how to eliminate them.
Warning: Examine the pairs() plot to diagnose sampling problems
monitor(fit_grp)
check_hmc_diagnostics(fit_grp)
We got a small number of divergences, so we increase
adapt_delta=0.95
. Note that thisis useful only in case of
small number of divergences, and increasing
adapt_delta>0.99
doesn’t usually make sense, and in such
cases there is need to investigate the identifiability or parameter
transformations.
fit_grp <- stan(file = code_grp_prior_mean_var, data = data_grp, seed = SEED, control=list(adapt_delta=0.95));
monitor(fit_grp)
check_hmc_diagnostics(fit_grp)
Plot the results
draws_grp <- extract(fit_grp, permuted = T)
temps <- data.frame(draws_grp$mu) %>%
setNames(c('June','July','August'))
mcmc_areas(temps) + xlab('Temperature')
Probabilities that June is hotter than July, June is hotter than August and July is hotter than August:
paste('p(TempJun > TempJul) = ', mean(temps$June > temps$July))
[1] "p(TempJun > TempJul) = 0"
paste('p(TempJun > TempAug) = ', mean(temps$June > temps$August))
[1] "p(TempJun > TempAug) = 0"
paste('p(TempJul > TempAug) = ', mean(temps$July > temps$August))
[1] "p(TempJul > TempAug) = 1"