This notebook demonstrates Bayesian posterior distributions of model-based R-squared and LOO-R2 measures for linear and logistic regression. It is a supplement to the paper: Andrew Gelman, Ben Goodrich, Jonah Gabry, and Aki Vehtari (2019). R-squared for Bayesian regression models. The American Statistician, 73(3):307-309.
We specifically define R-squared as \[ R^2 = \frac{{\mathrm Var}_{\mu}}{{\mathrm Var}_{\mu}+{\mathrm Var}_{\mathrm res}}, \] where \({\mathrm Var}_{\mu}\) is the variance of the modelled predictive means and \({\mathrm Var}_{\mathrm res}\) is the modelled residual variance. Both of these are computed using only posterior quantities from the fitted model.
The residual-based R2 uses draws from the residual distribution and is defined via \[ {\mathrm Var}_{\mu}^s = V_{n=1}^N \hat{y}_n^s\\ {\mathrm Var}_{\mathrm res}^s = V_{n=1}^N \hat{e}_n^s, \] where \(\hat{e}_n^s=y_n-\hat{y}_n^s\) and \(V_{n=1}^N\) denotes the sample variance over the \(N\) observations.
The model-based R2 uses draws from the modelled residual variances. For linear regression we define \[ {\mathrm Var}_{\mathrm res}^s = (\sigma^2)^s, \] and for logistic regression, following Tjur (2009), we define \[ {\mathrm Var}_{\mathrm res}^s = \frac{1}{N}\sum_{n=1}^N \pi_n^s(1-\pi_n^s), \] where \(\pi_n^s\) are the predicted probabilities.
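As a small worked example of Tjur's measure (the numbers are illustrative): if one posterior draw has predicted probabilities 0.1, 0.5, and 0.9, the modelled residual variance for that draw is the mean of 0.09, 0.25, and 0.09, about 0.14:
pi_s <- c(0.1, 0.5, 0.9)  # predicted probabilities from one posterior draw
mean(pi_s * (1 - pi_s))   # Tjur's modelled residual variance, about 0.14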
The LOO-R2 uses LOO residuals and is defined as \(1-{\mathrm Var}_{\mathrm loo-res}/{\mathrm Var}_{y}\), where \[ {\mathrm Var}_{y} = V_{n=1}^N {y}_n\\ {\mathrm Var}_{\mathrm loo-res} = V_{n=1}^N \hat{e}_{{\mathrm loo},n}, \] and \(\hat{e}_{{\mathrm loo},n}=y_n-\hat{y}_{{\mathrm loo},n}\). The uncertainty in LOO-R2 comes from not knowing the future data distribution (Vehtari and Ojanen, 2012). We can approximate draws from the corresponding distribution using the Bayesian bootstrap (Vehtari and Lampinen, 2002).
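Concretely, each Bayesian bootstrap draw replaces the observation weights 1/n with a draw \(w\) from a uniform Dirichlet distribution and computes the weighted variance \(\sum_n w_n y_n^2 - (\sum_n w_n y_n)^2\). A minimal sketch with illustrative data (rudirichlet comes from the bayesboot package; the loo_R2 function below additionally applies the finite-sample factor n/(n-1)):
library("bayesboot")
y <- rnorm(20)                     # illustrative data
w <- rudirichlet(4000, length(y))  # 4000 uniform Dirichlet weight vectors
var_bb <- rowSums(sweep(w, 2, y^2, FUN = "*")) -
  rowSums(sweep(w, 2, y, FUN = "*"))^2  # one weighted variance per draw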
Load packages
The data come from the Regression and Other Stories examples included in the rosdata R package, which can be installed with remotes::install_github("avehtari/ROS-Examples", subdir = "rpackage").
library("rosdata")
library("rstanarm")
options(mc.cores = parallel::detectCores())
library("loo")
library("ggplot2")
library("bayesplot")
theme_set(bayesplot::theme_default(base_family = "sans"))
library("latex2exp")
library("foreign")
library("bayesboot")
SEED <- 1800
set.seed(SEED)
A more elaborate version of this function, compatible with stan_glmer models and with specifying newdata, will be available in the rstanarm package.
Bayes-R2 function using residuals
# @param fit A fitted model object returned by stan_glm.
# @return A vector of R-squared values with length equal to the number of posterior draws.
bayes_R2_res <- function(fit) {
  y <- rstanarm::get_y(fit)
  ypred <- rstanarm::posterior_epred(fit)
  if (family(fit)$family == "binomial" && NCOL(y) == 2) {
    # binomial model with a two-column response: convert to success counts
    trials <- rowSums(y)
    y <- y[, 1]
    ypred <- ypred %*% diag(trials)
  }
  e <- -1 * sweep(ypred, 2, y)       # residual draws y - ypred
  var_ypred <- apply(ypred, 1, var)  # Var_mu^s, one value per draw
  var_e <- apply(e, 1, var)          # Var_res^s, one value per draw
  var_ypred / (var_ypred + var_e)
}
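The function returns one R2 value per posterior draw, so the result can be summarized or plotted as a full posterior distribution; for example (a sketch, assuming fit is a fitted stan_glm object):
r2_draws <- bayes_R2_res(fit)  # one R2 value per posterior draw
median(r2_draws)
hist(r2_draws)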
Bayes-R2 function using modelled (approximate) residual variance
bayes_R2 <- function(fit) {
  y <- rstanarm::get_y(fit)
  mupred <- rstanarm::posterior_epred(fit)
  var_mupred <- apply(mupred, 1, var)  # Var_mu^s, one value per draw
  if (family(fit)$family == "binomial" && NCOL(y) == 1) {
    # Tjur (2009): modelled residual variance for binary regression
    sigma2 <- apply(mupred * (1 - mupred), 1, mean)
  } else {
    sigma2 <- as.matrix(fit, pars = c("sigma"))^2
  }
  var_mupred / (var_mupred + sigma2)
}
LOO-R2 function using LOO-residuals and Bayesian bootstrap
loo_R2 <- function(fit) {
  y <- rstanarm::get_y(fit)
  ypred <- rstanarm::posterior_epred(fit)
  ll <- log_lik(fit)
  M <- length(fit$stanfit@sim$n_save)  # number of chains
  N <- fit$stanfit@sim$n_save[[1]]     # draws per chain
  r_eff <- relative_eff(exp(ll), chain_id = rep(1:M, each = N))
  psis_object <- psis(log_ratios = -ll, r_eff = r_eff)
  # PSIS-LOO predictive means and LOO residuals
  ypredloo <- E_loo(ypred, psis_object, log_ratios = -ll)$value
  eloo <- ypredloo - y
  # Bayesian bootstrap draws of the variances using uniform Dirichlet weights
  n <- length(y)
  rd <- rudirichlet(4000, n)
  vary <- (rowSums(sweep(rd, 2, y^2, FUN = "*")) -
           rowSums(sweep(rd, 2, y, FUN = "*"))^2) * (n / (n - 1))
  vareloo <- (rowSums(sweep(rd, 2, eloo^2, FUN = "*")) -
              rowSums(sweep(rd, 2, eloo, FUN = "*"))^2) * (n / (n - 1))
  looR2 <- 1 - vareloo / vary
  # clip to the valid range [-1, 1]
  looR2[looR2 < -1] <- -1
  looR2[looR2 > 1] <- 1
  looR2
}
Toy data with n=5
x <- 1:5 - 3
y <- c(1.7, 2.6, 2.5, 4.4, 3.8) - 3
xy <- data.frame(x,y)
Least squares fit
fit <- lm(y ~ x, data = xy)
ols_coef <- coef(fit)
yhat <- ols_coef[1] + ols_coef[2] * x
r <- y - yhat
rsq_1 <- var(yhat)/(var(y))
rsq_2 <- var(yhat)/(var(yhat) + var(r))
round(c(rsq_1, rsq_2), 3)
[1] 0.766 0.766
Bayes fit
fit_bayes <- stan_glm(y ~ x, data = xy,
prior_intercept = normal(0, 0.2, autoscale = FALSE),
prior = normal(1, 0.2, autoscale = FALSE),
prior_aux = NULL,
seed = SEED,
refresh = 0
)
posterior <- as.matrix(fit_bayes, pars = c("(Intercept)", "x"))
post_means <- colMeans(posterior)
Median Bayesian R2
round(median(bayesR2res<-bayes_R2_res(fit_bayes)), 2)
[1] 0.8
round(median(bayesR2<-bayes_R2(fit_bayes)), 2)
[1] 0.74
LOO-R2
looR2<-loo_R2(fit_bayes)
Warning: Some Pareto k diagnostic values are too high. See help('pareto-k-diagnostic') for details.
round(median(looR2), 2)
[1] 0.56
With small n, the LOO-R2 is much lower than the median of the Bayesian R2.
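The Pareto k warning above comes from the psis() call inside loo_R2; with n=5, importance sampling is fragile. The diagnostics can be inspected directly with the loo package, for example (a sketch):
ll <- log_lik(fit_bayes)
psis_object <- psis(log_ratios = -ll)
pareto_k_table(psis_object)  # counts of observations by Pareto k range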
Figures
The first section of code below creates plots using base R graphics. Below that there is code to produce the plots using ggplot2.
# take a sample of 20 posterior draws
keep <- sample(nrow(posterior), 20)
samp_20_draws <- posterior[keep, ]
Base graphics version
par(mar=c(3,3,1,1), mgp=c(1.7,.5,0), tck=-.01)
plot(
x, y,
ylim = range(x),
xlab = "x",
ylab = "y",
main = "Least squares and Bayes fits",
bty = "l",
pch = 20
)
abline(coef(fit)[1], coef(fit)[2], col = "black")
text(-1.6,-.7, "Least-squares\nfit", cex = .9)
abline(0, 1, col = "blue", lty = 2)
text(-1, -1.8, "(Prior regression line)", col = "blue", cex = .9)
abline(coef(fit_bayes)[1], coef(fit_bayes)[2], col = "blue")
text(1.4, 1.2, "Posterior mean fit", col = "blue", cex = .9)
points(
x,
coef(fit_bayes)[1] + coef(fit_bayes)[2] * x,
pch = 20,
col = "blue"
)
par(mar=c(3,3,1,1), mgp=c(1.7,.5,0), tck=-.01)
plot(
x, y,
ylim = range(x),
xlab = "x",
ylab = "y",
bty = "l",
pch = 20,
main = "Bayes posterior simulations"
)
for (s in 1:nrow(samp_20_draws)) {
abline(samp_20_draws[s, 1], samp_20_draws[s, 2], col = "#9497eb")
}
abline(
coef(fit_bayes)[1],
coef(fit_bayes)[2],
col = "#1c35c4",
lwd = 2
)
points(x, y, pch = 20, col = "black")
ggplot version
theme_update(
plot.title = element_text(face = "bold", hjust = 0.5),
axis.text = element_text(size = rel(1.1))
)
fig_1a <-
ggplot(xy, aes(x, y)) +
geom_point() +
geom_abline( # ols regression line
intercept = ols_coef[1],
slope = ols_coef[2],
size = 0.4
) +
geom_abline( # prior regression line
intercept = 0,
slope = 1,
color = "blue",
linetype = 2,
size = 0.3
) +
geom_abline( # posterior mean regression line
intercept = post_means[1],
slope = post_means[2],
color = "blue",
size = 0.4
) +
geom_point(
aes(y = post_means[1] + post_means[2] * x),
color = "blue"
) +
annotate(
geom = "text",
x = c(-1.6, -1, 1.4),
y = c(-0.7, -1.8, 1.2),
label = c(
"Least-squares\nfit",
"(Prior regression line)",
"Posterior mean fit"
),
color = c("black", "blue", "blue"),
size = 3.8
) +
ylim(range(x)) +
ggtitle("Least squares and Bayes fits")
plot(fig_1a)
fig_1b <-
ggplot(xy, aes(x, y)) +
geom_abline( # 20 posterior draws of the regression line
intercept = samp_20_draws[, 1],
slope = samp_20_draws[, 2],
color = "#9497eb",
size = 0.25
) +
geom_abline( # posterior mean regression line
intercept = post_means[1],
slope = post_means[2],
color = "#1c35c4",
size = 1
) +
geom_point() +
ylim(range(x)) +
ggtitle("Bayes posterior simulations")
plot(fig_1b)
Bayesian R^2 posterior and median
mcmc_hist(data.frame(bayesR2), binwidth=0.02)+xlim(c(0,1))+
xlab(TeX('$R^2$'))+
geom_vline(xintercept=median(bayesR2))+
ggtitle('Bayesian R squared posterior and median')
Compare draws from the residual sd and draws from sigma
fit<-fit_bayes
y <- rstanarm::get_y(fit)
ypred <- rstanarm::posterior_epred(fit)
e <- -1 * sweep(ypred, 2, y)
var_e <- apply(e, 1, var)
sigma <- as.matrix(fit, pars = c("sigma"))
qplot(sqrt(var_e),sigma)+geom_abline()+
ggtitle('Toy data with n=5')+
xlab('Draws from residual sd')+
ylab('Draws from sigma')
Draws from the residual sd are truncated and do not properly use the model information, which is reflected in the R2 distribution below.
qplot(bayes_R2_res(fit_bayes),bayes_R2(fit_bayes))+geom_abline()+
ggtitle('Toy data with n=5')+
xlab('Bayesian R2 with draws from residual variance')+
ylab('Bayesian R2 with draws from sigma')
Compare residual-based and model-based Bayesian R2, and Bayesian LOO-R2
pxl<-xlim(-0.5,1)
p1<-mcmc_hist(data.frame(bayesR2res), binwidth=0.02)+pxl+
ggtitle('Toy data with n=5')+
xlab('Bayesian R2 with draws from residuals')+
geom_vline(xintercept=median(bayesR2res))
p2<-mcmc_hist(data.frame(bayesR2), binwidth=0.02)+pxl+
xlab('Bayesian R2 with draws from sigma')+
geom_vline(xintercept=median(bayesR2))
p3<-mcmc_hist(data.frame(looR2), binwidth=0.02)+pxl+
xlab('Bayesian LOO-R2')+
geom_vline(xintercept=median(looR2))
bayesplot_grid(p1,p2,p3)
With small n, draws from the residual variance are not a good representation of the uncertainty.
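The difference in spread can also be checked numerically (a quick sketch using the objects computed above):
round(c(sd(bayesR2res), sd(bayesR2), sd(looR2)), 2)  # posterior sds of the three R2 variants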
Toy logistic data with n=20
set.seed(20)
y<-rbinom(n=20,size=1,prob=(1:20-0.5)/20)
data <- data.frame(rvote=y, income=1:20)
fit_logit <- stan_glm(rvote ~ income, family=binomial(link="logit"), data=data,
refresh=0)
Median Bayesian R2
round(median(bayesR2res<-bayes_R2_res(fit_logit)), 2)
[1] 0.36
round(median(bayesR2<-bayes_R2(fit_logit)), 2)
[1] 0.38
LOO-R2
looR2<-loo_R2(fit_logit)
Warning: Some Pareto k diagnostic values are slightly high. See help('pareto-k-diagnostic') for details.
round(median(looR2), 2)
[1] 0.3
With small n, the LOO-R2 is much lower than the median of the Bayesian R2.
Compare residual-based and model-based Bayesian R2, and Bayesian LOO-R2
pxl<-xlim(-.5,1)
p1<-mcmc_hist(data.frame(bayesR2res), binwidth=0.05)+pxl+
ggtitle('Toy logistic data with n=20')+
xlab('Bayesian R2 with draws from residuals')+
geom_vline(xintercept=median(bayesR2res))
p2<-mcmc_hist(data.frame(bayesR2), binwidth=0.05)+pxl+
xlab('Bayesian R2 with draws from sigma')+
geom_vline(xintercept=median(bayesR2))
p3<-mcmc_hist(data.frame(looR2), binwidth=0.05)+pxl+
xlab('Bayesian LOO-R2')+
geom_vline(xintercept=median(looR2))
bayesplot_grid(p1,p2,p3)
With small n, draws from the residual variance are not a good representation of the uncertainty.
Predicting the yields of mesquite bushes, n=46
Prepare data
head(mesquite)
obs group diam1 diam2 total_height canopy_height density weight
1 1 MCD 1.8 1.15 1.30 1.00 1 401.3
2 2 MCD 1.7 1.35 1.35 1.33 1 513.7
3 3 MCD 2.8 2.55 2.16 0.60 1 1179.2
4 4 MCD 1.3 0.85 1.80 1.20 1 308.0
5 5 MCD 3.3 1.90 1.55 1.05 1 855.2
6 6 MCD 1.4 1.40 1.20 1.00 1 268.7
mesquite$canopy_volume <- mesquite$diam1 * mesquite$diam2 * mesquite$canopy_height
mesquite$canopy_area <- mesquite$diam1 * mesquite$diam2
mesquite$canopy_shape <- mesquite$diam1 / mesquite$diam2
(n <- nrow(mesquite))
[1] 46
Predict log weight using log canopy volume, log canopy shape, and group
fit_5 <- stan_glm(log(weight) ~ log(canopy_volume) + log(canopy_shape) + group,
data=mesquite, refresh=0)
Median Bayesian R2
round(median(bayesR2res<-bayes_R2_res(fit_5)), 2)
[1] 0.87
round(median(bayesR2<-bayes_R2(fit_5)), 2)
[1] 0.87
LOO-R2
looR2<-loo_R2(fit_5)
Warning: Some Pareto k diagnostic values are slightly high. See help('pareto-k-diagnostic') for details.
round(median(looR2), 2)
[1] 0.86
The LOO-R2 is slightly lower than the median of the Bayesian R2.
Compare residual-based and model-based Bayesian R2
pxl<-xlim(0.6, 0.95)
p1<-mcmc_hist(data.frame(bayesR2res), binwidth=0.01)+pxl+
ggtitle('Mesquite data with n=46')+
xlab('Bayesian R2 with draws from residuals')+
geom_vline(xintercept=median(bayesR2res))
p2<-mcmc_hist(data.frame(bayesR2), binwidth=0.01)+pxl+
xlab('Bayesian R2 with draws from sigma')+
geom_vline(xintercept=median(bayesR2))
p3<-mcmc_hist(data.frame(looR2), binwidth=0.01)+pxl+
xlab('Bayesian LOO-R2')+
geom_vline(xintercept=median(looR2))
bayesplot_grid(p1,p2,p3)
With small n, draws from the residual variance are not a good representation of the uncertainty, and the R2 distribution is substantially narrower (by about 50%) than when using draws from the model sigma.
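This can be verified by comparing the posterior standard deviations of the two R2 variants (a quick sketch using the objects computed above):
round(sd(bayesR2res) / sd(bayesR2), 2)  # ratio of residual-draw to sigma-draw R2 sd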
Predict low birth weight, n=189, from Hosmer and Lemeshow (2000). These data were also used by Tjur (2009).
Prepare data
head(lowbwt)
ID low age lwt race smoke ptl ht ui ftv bwt
1 85 0 19 182 2 0 0 0 1 0 2523
2 86 0 33 155 3 0 0 0 0 3 2551
3 87 0 20 105 1 1 0 0 0 1 2557
4 88 0 21 108 1 1 0 0 1 2 2594
5 89 0 18 107 1 1 0 0 1 0 2600
6 91 0 21 124 3 0 0 0 0 0 2622
lowbwt$race <- factor(lowbwt$race)
(n <- nrow(lowbwt))
[1] 189
Predict low birth weight
fit <- stan_glm(low ~ age + lwt + race + smoke,
family=binomial(link="logit"), data=lowbwt, refresh=0)
Median Bayesian R2
round(median(bayesR2res<-bayes_R2_res(fit)), 2)
[1] 0.12
round(median(bayesR2<-bayes_R2(fit)), 2)
[1] 0.12
LOO-R2
looR2<-loo_R2(fit)
round(median(looR2), 2)
[1] 0.03
The LOO-R2 is much lower than the median of the Bayesian R2.
Compare residual-based and model-based Bayesian R2
pxl<-xlim(-0.15, 0.3)
p1<-mcmc_hist(data.frame(bayesR2res), binwidth=0.01)+pxl+
ggtitle('Low birth weight data with n=189')+
xlab('Bayesian R2 with draws from residuals')+
geom_vline(xintercept=median(bayesR2res))
p2<-mcmc_hist(data.frame(bayesR2), binwidth=0.01)+pxl+
xlab('Bayesian R2 with draws from sigma')+
geom_vline(xintercept=median(bayesR2))
p3<-mcmc_hist(data.frame(looR2), binwidth=0.01)+pxl+
xlab('Bayesian LOO-R2')+
geom_vline(xintercept=median(looR2))
bayesplot_grid(p1,p2,p3)
With moderate n, the distribution obtained using draws from the residual variances is still narrower than the distribution obtained using draws from the model-based variance.
Predict birth weight, n=189, from Hosmer and Lemeshow (2000). Tjur (2009) used logistic regression for dichotomized birth weight; below we use the continuous-valued birth weight.
Predict birth weight
fit <- stan_glm(bwt ~ age + lwt + race + smoke, data=lowbwt, refresh=0)
Median Bayesian R2
round(median(bayesR2res<-bayes_R2_res(fit)), 2)
[1] 0.16
round(median(bayesR2<-bayes_R2(fit)), 2)
[1] 0.16
LOO-R2
looR2<-loo_R2(fit)
round(median(looR2), 2)
[1] 0.09
The LOO-R2 is much lower than the median of the Bayesian R2.
Compare residual-based and model-based Bayesian R2
pxl<-xlim(-0.12, 0.36)
p1<-mcmc_hist(data.frame(bayesR2res), binwidth=0.01)+pxl+
ggtitle('Birth weight data with n=189')+
xlab('Bayesian R2 with draws from residuals')+
geom_vline(xintercept=median(bayesR2res))
p2<-mcmc_hist(data.frame(bayesR2), binwidth=0.01)+pxl+
xlab('Bayesian R2 with draws from sigma')+
geom_vline(xintercept=median(bayesR2))
p3<-mcmc_hist(data.frame(looR2), binwidth=0.01)+pxl+
xlab('Bayesian LOO-R2')+
geom_vline(xintercept=median(looR2))
bayesplot_grid(p1,p2,p3)
With moderate n, the distribution obtained using draws from the residual variances is still narrower than the distribution obtained using draws from the model-based variance.
Children’s test scores data, n=434
Prepare data
head(kidiq)
kid_score mom_hs mom_iq mom_work mom_age
1 65 1 121.11753 4 27
2 98 1 89.36188 4 25
3 85 1 115.44316 4 27
4 83 1 99.44964 3 25
5 115 1 92.74571 4 27
6 98 0 107.90184 1 18
(n <- nrow(kidiq))
[1] 434
Predict test score
fit_3 <- stan_glm(kid_score ~ mom_hs + mom_iq, data=kidiq,
seed=1507, refresh=0)
Median Bayesian R2
round(median(bayesR2res<-bayes_R2_res(fit_3)), 2)
[1] 0.22
round(median(bayesR2<-bayes_R2(fit_3)), 2)
[1] 0.21
LOO-R2
looR2<-loo_R2(fit_3)
round(median(looR2), 2)
[1] 0.2
The LOO-R2 is slightly lower than the median of the Bayesian R2.
Compare residual-based and model-based Bayesian R2
pxl<-xlim(0.05, 0.35)
p1<-mcmc_hist(data.frame(bayesR2res), binwidth=0.01)+pxl+
ggtitle('KidIQ data with n=434')+
xlab('Bayesian R2 with draws from residuals')+
geom_vline(xintercept=median(bayesR2res))
p2<-mcmc_hist(data.frame(bayesR2), binwidth=0.01)+pxl+
xlab('Bayesian R2 with draws from sigma')+
geom_vline(xintercept=median(bayesR2))
p3<-mcmc_hist(data.frame(looR2), binwidth=0.01)+pxl+
xlab('Bayesian LOO-R2')+
geom_vline(xintercept=median(looR2))
bayesplot_grid(p1,p2,p3)
Predict respondents’ yearly earnings using survey data from 1990. Logistic regression n=1816, linear regression n=1629.
Prepare data
head(earnings)
height weight male earn earnk ethnicity education mother_education father_education walk
1 74 210 1 50000 50 White 16 16 16 3
2 66 125 0 60000 60 White 16 16 16 6
3 64 126 0 30000 30 White 16 16 16 8
4 65 200 0 25000 25 White 17 17 NA 8
5 63 110 0 50000 50 Other 16 16 16 5
6 68 165 0 62000 62 Black 18 18 18 1
exercise smokenow tense angry age
1 3 2 0 0 45
2 5 1 0 0 58
3 1 2 1 1 29
4 1 2 0 0 57
5 6 2 0 0 91
6 1 2 2 2 54
earnings_all <- earnings
earnings_all$positive <- earnings_all$earn > 0
(n_all <- nrow(earnings_all))
[1] 1816
# only non-zero earnings
earnings <- earnings_all[earnings_all$positive, ]
(n <- nrow(earnings))
[1] 1629
earnings$log_earn <- log(earnings$earn)
earnings_all$positive <- as.numeric(earnings_all$positive)
Bayesian logistic regression on non-zero earnings
Null model
fit_0 <- stan_glm(positive ~ 1, family = binomial(link = "logit"),
data = earnings_all, refresh=0)
loo0<-loo(fit_0)
Predict using height and sex
fit_1a <- stan_glm(positive ~ height + male, family = binomial(link = "logit"),
data = earnings_all, refresh=0)
loo1a<-loo(fit_1a)
loo_compare(loo0, loo1a)
elpd_diff se_diff
fit_1a 0.0 0.0
fit_0 -47.2 8.6
There is a clear difference in predictive performance: the elpd difference is more than five standard errors.
Median Bayesian R2
round(median(bayesR2res<-bayes_R2_res(fit_1a)), 3)
[1] 0.046
round(median(bayesR2<-bayes_R2(fit_1a)), 3)
[1] 0.046
LOO-R2
looR2<-loo_R2(fit_1a)
round(median(looR2), 2)
[1] 0.04
The LOO-R2 is slightly lower than the median of the Bayesian R2.
Compare residual-based and model-based Bayesian R2
pxl<-xlim(0.02, 0.11)
p1<-mcmc_hist(data.frame(bayesR2res), binwidth=0.002)+pxl+
ggtitle('Earnings data with n=1816')+
xlab('Bayesian R2 with draws from residuals')+
geom_vline(xintercept=median(bayesR2res))
p2<-mcmc_hist(data.frame(bayesR2), binwidth=0.002)+pxl+
xlab('Bayesian R2 with draws from sigma')+
geom_vline(xintercept=median(bayesR2))
p3<-mcmc_hist(data.frame(looR2), binwidth=0.002)+pxl+
xlab('Bayesian LOO-R2')+
geom_vline(xintercept=median(looR2))
bayesplot_grid(p1,p2,p3)
With plenty of data, there is not much difference between using draws from the residuals and draws from sigma.
Bayesian probit regression on non-zero earnings
fit_1p <- stan_glm(positive ~ height + male,
family = binomial(link = "probit"),
data = earnings_all, refresh=0)
loo1p<-loo(fit_1p)
loo_compare(loo1a, loo1p)
elpd_diff se_diff
fit_1p 0.0 0.0
fit_1a 0.0 0.2
There is no practical difference in predictive performance between logit and probit.
Median Bayesian R2
round(median(bayesR2), 3)
[1] 0.046
round(median(bayesR2p<-bayes_R2(fit_1p)), 3)
[1] 0.046
LOO-R2
looR2p<-loo_R2(fit_1p)
round(median(looR2), 2)
[1] 0.04
round(median(looR2p), 2)
[1] 0.04
The LOO-R2 is slightly lower than the median of the Bayesian R2.
Compare logistic and probit models using the new Bayesian R2
pxl<-xlim(0.02, 0.11)
p1<-mcmc_hist(data.frame(bayesR2), binwidth=0.002)+pxl+
ggtitle('Earnings data with n=1816')+
xlab('Bayesian R2 for logistic model')+
geom_vline(xintercept=median(bayesR2))
p2<-mcmc_hist(data.frame(bayesR2p), binwidth=0.002)+pxl+
xlab('Bayesian R2 for probit model')+
geom_vline(xintercept=median(bayesR2p))
p3<-mcmc_hist(data.frame(looR2), binwidth=0.002)+pxl+
xlab('Bayesian LOO-R2 for logistic model')+
geom_vline(xintercept=median(looR2))
p4<-mcmc_hist(data.frame(looR2p), binwidth=0.002)+pxl+
xlab('Bayesian LOO-R2 for probit model')+
geom_vline(xintercept=median(looR2p))
bayesplot_grid(p1,p3,p2,p4)
Also in terms of R2, there is no practical difference between the logit and probit models.
Bayesian linear regression on positive earnings on the log scale
fit_1b <- stan_glm(log_earn ~ height + male, data = earnings, refresh=0)
Median Bayesian R2
round(median(bayesR2res<-bayes_R2_res(fit_1b)), 3)
[1] 0.081
round(median(bayesR2<-bayes_R2(fit_1b)), 3)
[1] 0.08
LOO-R2
looR2<-loo_R2(fit_1b)
round(median(looR2), 2)
[1] 0.08
The LOO-R2 is slightly lower than the median of the Bayesian R2.
Compare residual-based and model-based Bayesian R2
pxl<-xlim(0.02, 0.15)
p1<-mcmc_hist(data.frame(bayesR2res), binwidth=0.002)+pxl+
ggtitle('Positive earnings data with n=1629')+
xlab('Bayesian R2 with draws from residuals')+
geom_vline(xintercept=median(bayesR2res))
p2<-mcmc_hist(data.frame(bayesR2), binwidth=0.002)+pxl+
xlab('Bayesian R2 with draws from sigma')+
geom_vline(xintercept=median(bayesR2))
p3<-mcmc_hist(data.frame(looR2), binwidth=0.002)+pxl+
xlab('Bayesian LOO-R2')+
geom_vline(xintercept=median(looR2))
bayesplot_grid(p1,p2,p3)
With plenty of data, there is not much difference between using draws from the residuals and draws from sigma.