Assignment 8

LOO-CV model comparison

Author

anonymous

1 General information

This is the template for assignment 8. You can download the qmd-file or copy the code from this rendered document after clicking on </> Code in the top right corner.

Please replace the instructions in this template by your own text, explaining what you are doing in each exercise.

Setup

This block will only be visible in your HTML output, but will be hidden when rendering to PDF with quarto for the submission. Make sure that this does not get displayed in the PDF!

The following loads several needed packages:

library(bayesplot)
This is bayesplot version 1.10.0
- Online documentation and vignettes at mc-stan.org/bayesplot
- bayesplot theme set to bayesplot::theme_default()
   * Does _not_ affect other ggplot2 plots
   * See ?bayesplot_theme_set for details on theme setting
library(cmdstanr)
This is cmdstanr version 0.5.3
- CmdStanR documentation and vignettes: mc-stan.org/cmdstanr
- CmdStan path: /root/.cmdstan/cmdstan-2.31.0
- CmdStan version: 2.31.0

A newer version of CmdStan is available. See ?install_cmdstan() to install it.
To disable this check set option or environment variable CMDSTANR_NO_VER_CHECK=TRUE.
library(dplyr)

Attaching package: 'dplyr'
The following objects are masked from 'package:stats':

    filter, lag
The following objects are masked from 'package:base':

    intersect, setdiff, setequal, union
library(ggplot2)
library(ggdist) # for stat_dotsinterval
library(posterior)
This is posterior version 1.4.0

Attaching package: 'posterior'
The following object is masked from 'package:bayesplot':

    rhat
The following objects are masked from 'package:stats':

    mad, sd, var
The following objects are masked from 'package:base':

    %in%, match
library(brms)
Loading required package: Rcpp
Loading 'brms' package (version 2.19.0). Useful instructions
can be found by typing help('brms'). A more detailed introduction
to the package is available through vignette('brms_overview').

Attaching package: 'brms'
The following objects are masked from 'package:ggdist':

    dstudent_t, pstudent_t, qstudent_t, rstudent_t
The following object is masked from 'package:bayesplot':

    rhat
The following object is masked from 'package:stats':

    ar
# Globally specfiy cmdstan backend for brms
options(brms.backend="cmdstanr")
# Tell brms to cache results if possible
options(brms.file_refit="on_change")

# Set more readable themes with bigger font for plotting packages
ggplot2::theme_set(theme_minimal(base_size = 14))
bayesplot::bayesplot_theme_set(theme_minimal(base_size = 14))

2 A hierarchical model for chicken weight time series

2.1 Exploratory data analysis

2.2 (a)

# Useful functions: ggplot, aes(x=...), geom_histogram

2.3 (b)

# Useful functions: ggplot, aes(x=...,y=...,group=...,color=...), geom_line

2.4 Linear regression

2.5 (c)

In brms, a regression can be specified as below, see also below (#m) or the last template. Fill in the appropriate variables, data, and likelihood family. Specify the priors, then run the model (by removing #| eval: false below).

priors <- c(
  prior(normal(0, <value>), coef = "Time"),
  prior(normal(0, <value>), coef = "Diet2"),
  prior(normal(0, <value>), coef = "Diet3"),
  prior(normal(0, <value>), coef = "Diet4")
)

f1 <- brms::brm(
  # This specifies the formula
  <OUTCOME> ~ 1 + <PREDICTOR> + <PREDICTOR>,
  # This specifies the dataset
  data = <data>,
  # This specifies the observation model family
  family = <observation_family>,
  # This passes the priors specified above to brms
  prior = priors,
  # This causes brms to cache the results
  file = "additional_files/assignment8/f1"
)

2.6 (d)

# Useful functions: brms::pp_check

2.7 (e)

# Useful functions: brms::pp_check(..., type = ..., group=...)

2.8 Log-normal linear regression

2.9 (f)

log_priors <- c(
  prior(normal(0, log(3)), coef = "Time"),
  prior(normal(0, log(5)), coef = "Diet2"),
  prior(normal(0, log(5)), coef = "Diet3"),
  prior(normal(0, log(5)), coef = "Diet4")
)

2.10 Hierarchical log-normal linear regression

2.11 (g)

2.12 (h)

2.13 Model comparison using the ELPD

2.14 (i)

# Useful functions: loo, loo_compare

2.15 (j)

# Useful functions: plot(loo(...), label_points = TRUE)

2.16 (k)

Creating a dummy example plot

This block will only be visible in your HTML output, but will be hidden when rendering to PDF with quarto for the submission. Make sure that this does not get displayed in the PDF!

Creating a dummy fit just to be able to generate an example plot below. Generate a similar plot for your hierarchical model.

# The brms-formula (weights ~ ...) below is not one that you should be using in your models!
dummy_fit <- brms::brm(
  weight ~ 1 + Time + Chick,
  data = ChickWeight,
  file="additional_files/assignment8/dummy_fit"
)
# Adjust the chicken_idxs variable to select appropriate chickens
chicken_idxs = c(1,3,11,43)
# Create this plot for your hierarchical model for selected chickens
brms::pp_check(
  dummy_fit, type = "intervals_grouped", group = "Chick", 
  newdata=ChickWeight |> filter(Chick %in% chicken_idxs)
)

2.17 Model comparison using the RMSE

2.18 (l)

rmse function implementation

This block will only be visible in your HTML output, but will be hidden when rendering to PDF with quarto for the submission. Make sure that this does not get displayed in the PDF!

The below function takes a brms fit object and computes either the root-mean-square error (RMSE) or the PSIS-LOO-RMSE, i.e. the RMSE using LOO-CV estimated using PSIS-LOO.

# Compute RMSE or LOO-RMSE
rmse <- function(fit, use_loo=FALSE){
  mean_y_pred <- if(use_loo){
    brms::loo_predict(fit)
  }else{
    colMeans(brms::posterior_predict(fit)) 
  }
  sqrt(mean(
    (mean_y_pred - brms::get_y(fit))^2
  ))
}