Bayesian Data Analysis course - Aalto 2023

Published

December 4, 2024

The Aalto 2023 course can be taken online, except for the final project presentation. The lectures will be given on campus and recorded, and each recording will be made available online after the lecture. If you are unable to register for the course in Sisu at the moment, there is no need to email the lecturer: you can start taking the course and register before the end of the course. Sisu shows rooms on campus for the computer exercises, and you can come to ask questions on campus, but you can also ask in Zulip during the same times. You can choose which TA session to join each week separately, without needing to register for those sessions.

All the course material is available in a git repo (and these pages are for easier navigation). All the material can be used in other courses. Text and videos licensed under CC-BY-NC 4.0. Code licensed under BSD-3.

The material will be updated during the course. Exercise instructions and slides will be updated at the latest on Monday of the corresponding week. The updated material will appear on the web pages, or you can clone the repo and pull before checking new material. If you don’t want to learn git, you can download the latest zip file.

Book: BDA3

The electronic version of the course book Bayesian Data Analysis, 3rd ed, by Andrew Gelman, John Carlin, Hal Stern, David Dunson, Aki Vehtari, and Donald Rubin is available for non-commercial purposes. Hard copies are available from the publisher and many book stores. The Aalto library also has copies. See also the home page for the book, errata for the book, and chapter notes.

Prerequisites

This course has been designed with a strong emphasis on the computational aspects of Bayesian data analysis and on using the latest computational tools. The project brings together the overall Bayesian workflow aspects.

If you find BDA3 too difficult to start with, I recommend

Communication channels

  • MyCourses is used for some initial announcements, linking to Zulip and Peergrade, and some questionnaires.
  • The primary communication channel is the Zulip course chat (link in MyCourses, login with Aalto account)
    • Don’t ask via email or direct messages. By asking via common streams in the course chat, more eyes will see your question, it will get answered faster and it’s likely that other students benefit from the answer.
    • Login with Aalto account to the Zulip course chat. You can adjust the notifications in the settings.
    • If you have any questions, please ask in the public streams and get answers from course staff or other students (active students helping others will get bonus points).
    • In the chat system, we will have separate streams for each assignment and the project.
    • Stream #general can be used for any kind of general discussions and questions related to the course.
    • All important announcements will be posted to #announcements (no discussion on this stream).
    • Any kind of feedback is welcome on stream #feedback.
    • We have also streams #r, #python, and #stan for questions that are not specific to assignments or the project.
    • Stream #queue is used as a queue for getting help during TA sessions.
    • The lecturer and teaching assistants have “(staff)” or “(TA)” at the end of their names.
  • A weekly lecture time on campus includes times for questions and answers
  • If you need one-to-one help, please take part in the TA sessions and ask there.
  • If you find errors in the material, post in the #feedback stream or submit an issue on GitHub.
  • Peergrade alerts: If you are worried about forgetting the deadlines, you can set Peergrade to send you an email when an assignment opens for submission, 24 hours before an assignment closes for submission, when an assignment opens for reviewing, 24 hours before an assignment closes for reviewing if you haven’t started yet, and when someone likes your feedback (once a day). Click your name -> User Settings to choose which alerts you want.

Assessment

Assignments (60%) and project work with a presentation (40%). A minimum of 50% of the points must be obtained from both the assignments and the project work. You can get bonus points from chat activity (e.g. helping other students and reporting typos in the material) and from answering time usage questionnaires.

Schedule 2023

The course consists of 12 lectures, 9 assignments, a project work, and a project presentation in periods I and II. It’s good to start reading the material for the next lecture and assignment while working on the assignment related to the previous lecture. Because there are 9 assignments and a project work with presentation, the assignments are not in one-to-one correspondence with the lectures. The schedule below lists the lectures and how they connect to the topics, book chapters and assignments.

Schedule overview

Here is an overview of the schedule. Scroll down the page to see detailed instructions for each block. When you are working on the assignment related to the previous lecture, it is good to start reading the book chapters related to the next lecture and assignment.

| Block | Readings | Lectures | Assignment | Lecture date | Assignment due date |
|---|---|---|---|---|---|
| 1. Introduction | BDA3 Chapter 1 | 2023 Lecture 1.1 Introduction, 2023 Lecture 1.2 Course practicalities, Slides 1.1, Slides 1.2 | Assignment 1 | 2023-09-04 | 2023-09-10 |
| 2. Basics of Bayesian inference | BDA3 Chapter 1, BDA3 Chapter 2 | 2023 Lecture 2.1, 2023 Lecture 2.2, Slides 2 | Assignment 2 | 2023-09-11 | 2023-09-17 |
| 3. Multidimensional posterior | BDA3 Chapter 3 | 2023 Lecture 3.1, 2023 Lecture 3.2, Slides 3 | Assignment 3 | 2023-09-18 | 2023-09-24 |
| 4. Monte Carlo | BDA3 Chapter 10 | 2023 Lecture 4.1, 2023 Lecture 4.2, Slides 4 | Assignment 4 | 2023-09-25 | 2023-10-01 |
| 5. Markov chain Monte Carlo | BDA3 Chapter 11 | 2023 Lecture 5.1, 2023 Lecture 5.2, Slides 5 | Assignment 5 | 2023-10-02 | 2023-10-08 |
| 6. Stan, HMC, PPL | BDA3 Chapter 12 + extra material on Stan | 2023 Lecture 6.1, 2023 Lecture 6.2, Slides 6 | Assignment 6 | 2023-10-09 | 2023-10-22 |
| 7. Hierarchical models and exchangeability | BDA3 Chapter 5 | 2023 Lecture 7.1, 2023 Lecture 7.2, 2022 Project info, Slides 7 | Assignment 7 | 2023-10-23 | 2023-11-05 |
| 8. Model checking & cross-validation | BDA3 Chapter 6, BDA3 Chapter 7, Visualization in Bayesian workflow, Practical Bayesian cross-validation | 2023 Lecture 8.1, 2023 Lecture 8.2, Slides 8a, Slides 8b | Start project work | 2023-10-30 | N/A |
| 9. Model comparison, selection, and hypothesis testing | BDA3 Chapter 7 (not 7.2 and 7.3), Practical Bayesian cross-validation | 2023 Lecture 9.1, 2023 Lecture 9.2, Slides 9 | Assignment 8 | 2023-11-06 | 2023-11-12 |
| 10. Decision analysis | BDA3 Chapter 9 | 2023 Lecture 10.1, 2023 Lecture 10.2, Slides 10a, Slides 10b | Assignment 9 | 2023-11-13 | 2023-11-19 |
| 11. Variable selection with projpred, project presentation example | BDA3 Chapter 4 | 2023 Lecture 11.1, 2023 Lecture 11.2, 2023 Lecture 11.3, Slides 11a, Slides Project Presentation, Slides 11 extra | Project work | 2023-11-20 | N/A |
| 12. TBA | Optional: | | Project work | 2023-11-27 | N/A |
| 13. Project evaluation | | | Project presentations: 11.-15.12. | Evaluation week | |

1) Course introduction, BDA3 Ch 1, prerequisites assignment

Course practicalities, material, assignments, project work, peergrading, QA sessions, TA sessions, prerequisites, chat, etc.

  • Login with Aalto account to the Zulip course chat with link in MyCourses
  • Sign in to Peergrade with link in MyCourses.
  • Introduction/practicalities lecture Monday 2023-09-04 14:15-16, hall C, Otakaari 1
  • Read BDA3 Chapter 1
  • There are no R/Python demos for Chapter 1
  • Complete and submit Assignment 1. Deadline Sunday 2023-09-10 23:59
    • We highly recommend submitting all assignments on Friday before 3pm so that you can get TA help before submission. As the course has students who work on weekdays (e.g. FiTech students), late submission until Sunday night is allowed, but we can’t provide support during the weekends.
    • This assignment checks that you have sufficient prerequisite skills (basic probability calculus, and R or Python)
    • General information about assignments
  • Get help in TA sessions 2023-09-06 14-16, 2023-09-07 12-14, 2023-09-08 10-12
    • in Sisu these are marked as exercise sessions, but we call them TA sessions
    • these are optional and you can choose which one to join
    • see more info about TA sessions
  • Optional: Do BDA3 exercises 1.1-1.4, 1.6-1.8 (model solutions available for 1.1-1.6)
  • Start reading Chapters 1+2, see instructions below

2) BDA3 Ch 1+2, basics of Bayesian inference

BDA3 Chapters 1+2, basics of Bayesian inference, observation model, likelihood, posterior and binomial model, predictive distribution and benefit of integration, priors and prior information, and one parameter normal model.
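
As a small illustration of the binomial model listed above, here is a minimal Python sketch of the conjugate update: with a Beta(a, b) prior and y successes in n trials, the posterior is Beta(a + y, b + n - y). The function names and the example numbers are illustrative assumptions and are not taken from the course material.

```python
def beta_binomial_posterior(a, b, y, n):
    """Return the (alpha, beta) parameters of the Beta posterior
    for a binomial likelihood with a Beta(a, b) prior."""
    if not 0 <= y <= n:
        raise ValueError("y must be between 0 and n")
    return a + y, b + (n - y)


def beta_mean(alpha, beta):
    """Mean of a Beta(alpha, beta) distribution."""
    return alpha / (alpha + beta)


# Hypothetical data: 7 successes in 10 trials, uniform Beta(1, 1) prior.
alpha_post, beta_post = beta_binomial_posterior(1, 1, 7, 10)

# For this model, the posterior predictive probability that the next
# trial is a success equals the posterior mean.
posterior_mean = beta_mean(alpha_post, beta_post)
print(alpha_post, beta_post, round(posterior_mean, 3))  # 8 4 0.667
```

The predictive probability here comes from integrating over the posterior rather than plugging in a point estimate, which is one way to see the "benefit of integration" mentioned in the topic list.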

3) BDA3 Ch 3, multidimensional posterior

Multiparameter models, joint, marginal and conditional distribution, normal model, bioassay example, grid sampling and grid evaluation. BDA3 Ch 3.
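
Grid evaluation and grid sampling can be sketched in a few lines of Python. The example below is a rough illustration under stated assumptions: a normal model with unknown mean and standard deviation, a prior proportional to 1/sigma, made-up data, and hand-picked grid ranges. None of these come from the course assignments.

```python
import math
import random


def log_posterior(mu, sigma, y):
    # Unnormalized log posterior: normal likelihood (constants dropped)
    # plus an assumed prior p(mu, sigma) proportional to 1 / sigma.
    log_lik = sum(-math.log(sigma) - 0.5 * ((yi - mu) / sigma) ** 2 for yi in y)
    return log_lik - math.log(sigma)


y = [93, 112, 122, 135, 122, 150, 118, 90, 124, 114]  # made-up data

step = 0.5
mu_grid = [90 + step * i for i in range(101)]    # 90, 90.5, ..., 140
sigma_grid = [5 + step * j for j in range(91)]   # 5, 5.5, ..., 50

# Grid evaluation: compute the unnormalized posterior in every cell.
cells = [(mu, sigma) for mu in mu_grid for sigma in sigma_grid]
log_p = [log_posterior(mu, sigma, y) for mu, sigma in cells]
max_log_p = max(log_p)  # subtract the maximum to avoid underflow in exp
weights = [math.exp(lp - max_log_p) for lp in log_p]
total = sum(weights)
probs = [w / total for w in weights]

# Grid sampling: draw cells with probability proportional to the posterior,
# then add uniform jitter within each cell.
rng = random.Random(1)
draws = rng.choices(cells, weights=probs, k=1000)
draws = [
    (mu + rng.uniform(-step / 2, step / 2), sigma + rng.uniform(-step / 2, step / 2))
    for mu, sigma in draws
]
post_mean_mu = sum(mu for mu, _ in draws) / len(draws)
print(f"posterior mean of mu is roughly {post_mean_mu:.1f}")
```

The number of grid cells grows exponentially with the number of parameters, so this approach is practical only for low-dimensional posteriors.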

4) BDA3 Ch 10, Monte Carlo

Numerical issues, Monte Carlo, how many simulation draws are needed, how many digits to report, direct simulation, curse of dimensionality, rejection sampling, and importance sampling. BDA3 Ch 10.
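
The question of how many simulation draws are needed can be made concrete with the Monte Carlo standard error (MCSE), which for independent draws is the sample standard deviation divided by the square root of the number of draws. The sketch below is an illustrative assumption, not course code: it estimates E[theta^2] for a standard normal theta (true value 1) with two different numbers of draws.

```python
import math
import random
import statistics


def mc_estimate(draws):
    """Return the Monte Carlo mean and its standard error (independent draws assumed)."""
    mean = statistics.mean(draws)
    mcse = statistics.stdev(draws) / math.sqrt(len(draws))
    return mean, mcse


rng = random.Random(42)
results = {}
for n_draws in (100, 10_000):
    # Direct simulation: draw theta from N(0, 1) and evaluate theta^2.
    draws = [rng.gauss(0.0, 1.0) ** 2 for _ in range(n_draws)]
    results[n_draws] = mc_estimate(draws)
    mean, mcse = results[n_draws]
    print(f"S = {n_draws:>6}: estimate = {mean:.3f}, MCSE = {mcse:.3f}")
```

Because the MCSE shrinks only with the square root of the number of draws, 100 times more draws gives roughly one more reliable digit. Reporting more digits than the MCSE supports suggests precision that the simulation does not have.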

5) BDA3 Ch 11, Markov chain Monte Carlo

Markov chain Monte Carlo, Gibbs sampling, Metropolis algorithm, warm-up, convergence diagnostics, R-hat, and effective sample size. BDA3 Ch 11.
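
The following Python sketch shows a random-walk Metropolis algorithm with several chains, warm-up removal, and a basic split-R-hat computed from between-chain and within-chain variances. The standard normal target, the proposal scale, the initial values, and the chain lengths are all assumptions chosen for illustration. The R-hat here is the basic split version, not the rank-normalized variant used by more recent software.

```python
import math
import random
import statistics


def log_target(theta):
    # Unnormalized log density; a standard normal target is assumed for illustration.
    return -0.5 * theta * theta


def metropolis(n_iter, init, step, rng):
    """Random-walk Metropolis with a symmetric normal proposal."""
    theta = init
    chain = []
    for _ in range(n_iter):
        proposal = theta + rng.gauss(0.0, step)
        log_ratio = log_target(proposal) - log_target(theta)
        # Accept with probability min(1, target(proposal) / target(theta)).
        if rng.random() < math.exp(min(0.0, log_ratio)):
            theta = proposal
        chain.append(theta)
    return chain


def split_rhat(chains):
    """Basic split-R-hat: split each chain in half, then compare
    between-chain and within-chain variances."""
    halves = []
    for chain in chains:
        half = len(chain) // 2
        halves.extend([chain[:half], chain[half:2 * half]])
    n = len(halves[0])
    within = statistics.mean([statistics.variance(h) for h in halves])
    between = n * statistics.variance([statistics.mean(h) for h in halves])
    var_plus = (n - 1) / n * within + between / n
    return math.sqrt(var_plus / within)


rng = random.Random(2023)
n_iter, n_warmup = 2000, 1000
# Overdispersed initial values; the first n_warmup draws are discarded as warm-up.
chains = [metropolis(n_iter, init, 2.0, rng)[n_warmup:] for init in (-10.0, -3.0, 3.0, 10.0)]
rhat = split_rhat(chains)
pooled_mean = statistics.mean([x for chain in chains for x in chain])
print(f"split R-hat = {rhat:.3f}, pooled mean = {pooled_mean:.3f}")
```

Values of R-hat close to 1 suggest that the chains have mixed. Values clearly above 1 indicate that more iterations or a better proposal are needed.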

6) BDA3 Ch 12 + extra material, HMC, PPL, Stan

HMC, NUTS, dynamic HMC and HMC-specific convergence diagnostics, probabilistic programming and Stan. BDA3 Ch 12 + extra material.

7) BDA3 Ch 5, hierarchical models

Hierarchical models and exchangeability. BDA3 Ch 5.

8) BDA3 Ch 6+7 + extra material, model checking, cross-validation

Model checking and cross-validation.

9) BDA3 Ch 7, extra material, model comparison and selection

PSIS-LOO, K-fold-CV, model comparison and selection. Extra lecture on variable selection with projection predictive variable selection.

10) BDA3 Ch 9, decision analysis + BDA3 Ch 4 Laplace approximation and asymptotics

Decision analysis (BDA3 Ch 9), plus Laplace approximation and asymptotics (BDA3 Ch 4).

11) Variable selection with projpred, project presentation example, extra

12) TBA

  • Lecture Monday 2023-11-27 14:15-16, hall T2, CS building
  • TBA
  • Work on project. TAs help with projects. Project deadline 3.12. 23:59
  • TA sessions 2023-11-29 14-16, 2023-11-30 12-14, 2023-12-01 10-12.

13) Project evaluation

  • Project report deadline 3.12. 23:59 (submit to peergrade).
    • Review project reports done by your peers before 7.12. 23:59, and reflect on your feedback.
  • Project presentations 11.-15.12. (evaluation week)

R and Python

We strongly recommend using R in the course, as there are more packages for Stan and statistical analysis in R. If you are already fluent in Python, but not in R, then using Python may be easier, but it can still be useful to also learn R. Unless you are already experienced and have figured out your preferred way to work with R, we recommend

See FAQ for frequently asked questions about R problems in this course. The demo codes provide useful starting points for all the assignments.

English-Finnish-English statistics dictionary

Excellent online English-Finnish-English statistics dictionary:

Shorter English-Finnish dictionary for the terms specific for this course

Several different spellings of the Finnish word for “Bayesian” are in use in Finland. I have used the form “bayesilainen”, which is formed according to the general rule for inflecting foreign names: “If a name is back-vocalic as written but front-vocalic as pronounced, a back vowel is usually written in the ending instead of a front vowel, e.g. Birminghamissa, Thamesilla.” Terho Itkonen, Kieliopas, 6th edition, Kirjayhtymä, 1997.

The Finnish Statistical Society (Suomen tilastoseura), in contrast, recommends the form “bayesiläinen”. Their reasoning can be found in the statistics glossary Tilastotieteen sanasto (see the link above). The online version of the glossary has a search function, and the PDF version includes justifications for the translations as well as some early history of statistics in Finland.