Posterior predictive checking demo

Light speed example with poorly chosen test statistic

ggplot2 is used for plotting, tidyr for manipulating data frames

library(ggplot2)
theme_set(theme_minimal())
library(tidyr)
library(latex2exp)
library(rprojroot)
root<-has_file(".BDA_R_demos_root")$make_fix_file()

Data

y <- read.table(root("demos_ch6","light.txt"))$V1

Sufficient statistics

n <- length(y)
s <- sd(y)
my <- mean(y)

Replicated data

sampt <- replicate(1000, rt(n, n-1)*sqrt(1+1/n)*s+my) %>%
  as.data.frame()

Test statistic here is variance, which is not a good choice as the model has variance parameter and the posterior of that parameter has been updated by the same data used now in checking.

sampt_vars <- data.frame(x = sapply(sampt, var))

Plot test statistics for the data and replicates. Vertical line corresponds to the original data, and the histogram to the replicate data.

title1 <- 'Light speed example with poorly chosen test statistic
Pr(T(yrep,theta) <= T(y,theta)|y)=0.42'
ggplot(data = sampt_vars) +
  geom_histogram(aes(x = x), fill = 'steelblue',
                 color = 'black', binwidth = 6) +
  geom_vline(aes(xintercept = x), data = data.frame(x = s^2),
             color = 'red') +
  labs(x = TeX('Variance of \\mathit{y} and \\mathit{y}^{\\mathrm{rep}}'),
       y = '', title = title1) +
  scale_y_continuous(breaks=NULL)

LS0tCnRpdGxlOiAiQmF5ZXNpYW4gZGF0YSBhbmFseXNpcyBkZW1vIDYuMyIKYXV0aG9yOiAiQWtpIFZlaHRhcmksIE1hcmt1cyBQYWFzaW5pZW1pIgpkYXRlOiAiYHIgZm9ybWF0KFN5cy5EYXRlKCkpYCIKb3V0cHV0OgogIGh0bWxfZG9jdW1lbnQ6CiAgICB0aGVtZTogcmVhZGFibGUKICAgIGNvZGVfZG93bmxvYWQ6IHRydWUKLS0tCiMjIFBvc3RlcmlvciBwcmVkaWN0aXZlIGNoZWNraW5nIGRlbW8KCkxpZ2h0IHNwZWVkIGV4YW1wbGUgd2l0aCBwb29ybHkgY2hvc2VuIHRlc3Qgc3RhdGlzdGljCgpnZ3Bsb3QyIGlzIHVzZWQgZm9yIHBsb3R0aW5nLCB0aWR5ciBmb3IgbWFuaXB1bGF0aW5nIGRhdGEgZnJhbWVzCgpgYGB7ciBzZXR1cCwgbWVzc2FnZT1GQUxTRSwgZXJyb3I9RkFMU0UsIHdhcm5pbmc9RkFMU0V9CmxpYnJhcnkoZ2dwbG90MikKdGhlbWVfc2V0KHRoZW1lX21pbmltYWwoKSkKbGlicmFyeSh0aWR5cikKbGlicmFyeShsYXRleDJleHApCmxpYnJhcnkocnByb2pyb290KQpyb290PC1oYXNfZmlsZSgiLkJEQV9SX2RlbW9zX3Jvb3QiKSRtYWtlX2ZpeF9maWxlKCkKYGBgCgpEYXRhCgpgYGB7ciB9CnkgPC0gcmVhZC50YWJsZShyb290KCJkZW1vc19jaDYiLCJsaWdodC50eHQiKSkkVjEKYGBgCgpTdWZmaWNpZW50IHN0YXRpc3RpY3MKCmBgYHtyIH0KbiA8LSBsZW5ndGgoeSkKcyA8LSBzZCh5KQpteSA8LSBtZWFuKHkpCmBgYAoKUmVwbGljYXRlZCBkYXRhCgpgYGB7ciB9CnNhbXB0IDwtIHJlcGxpY2F0ZSgxMDAwLCBydChuLCBuLTEpKnNxcnQoMSsxL24pKnMrbXkpICU+JQogIGFzLmRhdGEuZnJhbWUoKQpgYGAKClRlc3Qgc3RhdGlzdGljIGhlcmUgaXMgdmFyaWFuY2UsIHdoaWNoIGlzIG5vdCBhIGdvb2QgY2hvaWNlIGFzIHRoZQptb2RlbCBoYXMgdmFyaWFuY2UgcGFyYW1ldGVyIGFuZCB0aGUgcG9zdGVyaW9yIG9mIHRoYXQgcGFyYW1ldGVyCmhhcyBiZWVuIHVwZGF0ZWQgYnkgdGhlIHNhbWUgZGF0YSB1c2VkIG5vdyBpbiBjaGVja2luZy4KCmBgYHtyIH0Kc2FtcHRfdmFycyA8LSBkYXRhLmZyYW1lKHggPSBzYXBwbHkoc2FtcHQsIHZhcikpCmBgYAoKUGxvdCB0ZXN0IHN0YXRpc3RpY3MgZm9yIHRoZSBkYXRhIGFuZCByZXBsaWNhdGVzLgpWZXJ0aWNhbCBsaW5lIGNvcnJlc3BvbmRzIHRvIHRoZSBvcmlnaW5hbCBkYXRhLCBhbmQKdGhlIGhpc3RvZ3JhbSB0byB0aGUgcmVwbGljYXRlIGRhdGEuCgpgYGB7ciB9CnRpdGxlMSA8LSAnTGlnaHQgc3BlZWQgZXhhbXBsZSB3aXRoIHBvb3JseSBjaG9zZW4gdGVzdCBzdGF0aXN0aWMKUHIoVCh5cmVwLHRoZXRhKSA8PSBUKHksdGhldGEpfHkpPTAuNDInCmdncGxvdChkYXRhID0gc2FtcHRfdmFycykgKwogIGdlb21faGlzdG9ncmFtKGFlcyh4ID0geCksIGZpbGwgPSAnc3RlZWxibHVlJywKICAgICAgICAgICAgICAgICBjb2xvciA9ICdibGFjaycsIGJpbndpZHRoID0gNikgKwogIGdlb21fdmxpbmUoYWVzKHhpbnRlcmNlcHQgPSB4KSwgZGF0YSA9IGRhdGEuZnJhbWUoeCA9IHNeMiksCiAgICAgICAgICAgICBjb2xvciA9ICdyZWQnKSArCiAgbGFicyh4ID0gVGVYKCdWYXJpYW5jZSBvZiBcXG1hdGhpdHt5fSBhbmQgXFxtYXRoaXR7eX1ee1xcbWF0aHJte3JlcH19JyksCiAgICAgICB5ID0gJycsIHRpdGxlID0gdGl0bGUxKSArCiAgc2NhbGVfeV9jb250aW51b3VzKGJyZWFrcz1OVUxMKQpgYGAKCg==