**Load packages**

```
library("rprojroot")
root<-has_dirname("modelselection_tutorial")$make_fix_file()
library("loo")
library("rstanarm")
options(mc.cores = parallel::detectCores())
library("ggplot2")
library("bayesplot")
theme_set(bayesplot::theme_default(base_family = "sans"))
```

We demonstrate different cross-validation variants for hierarchical models. Related video discusses the examples used here starting at 22:53. For time series specific cross-validation, see BÃ¼rkner, Gabry and Vehtari (2019).

Here we use a simple grouped data. The example data is taken from section 6 of Gelfand *et al.* (1990), and concerns 30 young rats whose weights were measured weekly for five week.

**Load data**

```
sourceToList = function(file){
source(file, local = TRUE)
d = mget(ls())
d$file = NULL
d
}
#
rats = sourceToList(root("rats.data.R"))
rats = with(rats, list(
N = N,
Npts = length(y),
rat = rep(1:nrow(y), ncol(y)),
x = rep(x, each = nrow(y)),
y = as.numeric(y),
xbar = xbar
))
```

**Create dataframe**

```
dfrats <- with(rats, data.frame(age=x, age_c=x-22, weight=y, rat=rat))
N <- dim(dfrats)[1]
```

**Plot data**

```
pr <- ggplot(data=dfrats, aes(x=age, y=weight)) +
geom_line(aes(group=rat), color="black", size=0.1) +
geom_point(color="black", size=2) +
labs(x="Age (days)", y="Weight (g)", title="Rats data")
pr
```

Just looking at the data it seems that if the rat growth would be modelled with linear model (up to age 36 days), individual intercept are likely and possibly also individual slopes.

We compare three models. One with population effect only, another with additional hierarchical intercept term, and third one with hierarchical intercept and slope terms.

**Simple linear model**

`fit_1 <- stan_glm(weight ~ age, data=dfrats, refresh=0)`

**Linear model with hierarchical intercept**

`fit_2 <- stan_glmer(weight ~ age + (1 | rat), data=dfrats, refresh=0)`

**Linear model with hierarchical intercept and slope**

`fit_3 <- stan_glmer(weight ~ age + (age | rat), data=dfrats, refresh=0)`

In leave-one-out cross-validation, one observation is left out at time and predicted given all the other observations

```
pr1 <- pr +
geom_point(data=dfrats[69,], color="red", size=5, shape=1) +
ggtitle('Leave-one-out')
pr1
```