ONLINE COURSE – Introduction to Stan for Bayesian Data Analysis (ISBD01)
This course will be delivered live
2 June 2021 - 3 June 2021
£275.00
This course will now be delivered live by video link in light of travel restrictions due to the COVID-19 (Coronavirus) outbreak.
This is a ‘LIVE COURSE’ – the instructor will be delivering lectures and coaching attendees through the accompanying computer practicals via video link, so a good internet connection is essential.
TIME ZONE – UK local time (GMT+0) – however all sessions will be recorded and made available allowing attendees from different time zones to follow a day behind with an additional 1/2 days support after the official course finish date (please email firstname.lastname@example.org for full details or to discuss how we can accommodate you).
Stan (https://mc-stan.org) is “a state-of-the-art platform for statistical modeling and high-performance statistical
computation. Thousands of users rely on Stan for statistical modeling, data analysis, and prediction in the social,
biological, and physical sciences, engineering, and business.” Stan is a powerful programming language for developing
and fitting custom Bayesian statistical models. In this course, we provide a general introduction to the Stan language,
and describe how to use it to develop and run Bayesian models. We begin with the theory behind Stan:
Bayesian inference, Markov Chain Monte Carlo (MCMC) for sampling from probability distributions, and
the efficient Hamiltonian Monte Carlo (HMC) method that Stan implements. Next, we learn how to write Stan models by
creating simple Bayesian models, such as binomial models and models using normal distributions. In so doing, the basics of the
Stan language will become apparent. Although Stan can be used from many different statistical programming environments (Python,
Julia, Matlab, Stata), we will use Stan with R exclusively, specifically using the rstan or cmdstanr packages. Using these
packages, we can compile and sample from an HMC sampler for the Bayesian models we have defined, plot and
summarize the results, evaluate the models, etc. We then cover some widely used and practically useful models
including linear regression, logistic regression, and multilevel and mixed effects models. We will end by covering some
more complex models, including probabilistic mixture models.
THIS IS ONE COURSE IN OUR R SERIES – LOOK OUT FOR COURSES WITH THE SAME COURSE IMAGE TO FIND MORE IN THIS SERIES
This course is aimed at anyone who is interested in doing advanced Bayesian data analysis using Stan. Stan is a state-of-the-art
tool for advanced analysis across all academic scientific disciplines, engineering, and business, and other
Venue – Delivered remotely
Time zone – GMT+0
Availability – TBC
Duration – 2 days
Contact hours – Approx. 15 hours
ECTS – Equal to 1 ECTS credit
Language – English
PLEASE READ – CANCELLATION POLICY: Cancellations are accepted up to 28 days before the course start date, subject to a 25% cancellation fee. Cancellations later than this may be considered; contact email@example.com. Failure to attend will result in the full cost of the course being charged. In the unfortunate event that a course is cancelled due to unforeseen circumstances, a full refund of the course fees will be credited.
Dr. Mark Andrews
Works at – Senior Lecturer, Psychology Department, Nottingham Trent University, England
Teaches – Introduction to statistics using R and RStudio; Introduction to data visualization using ggplot2; Introduction to data wrangling using R and RStudio; Introduction to generalised linear models using R and RStudio; Introduction to mixed models using R and RStudio; Introduction to Bayesian data analysis for social and behavioural sciences using R and Stan; Structural Equation Models, Path Analysis, Causal Modelling and Latent Variable Models Using R; Generalised Linear, Nonlinear and General Additive Models; Python for data science, machine learning, and scientific computing
Mark Andrews is a Senior Lecturer in the Psychology Department at Nottingham Trent University in Nottingham, England. Mark is a graduate of the National University of Ireland and obtained an MA and PhD from Cornell University in New York. Mark’s research focuses on developing and testing Bayesian models of human cognition, with particular focus on human language processing and human memory. Mark’s research also focuses on general Bayesian data analysis, particularly as applied to data from the social and behavioural sciences. Since 2015, he and his colleague Professor Thom Baguley have been funded by the UK’s ESRC funding body to provide intensive workshops on Bayesian data analysis for researchers in the social sciences.
This course will be largely practical, hands-on, and workshop based. For each topic, there will first be some lecture style presentation, i.e., using slides or blackboard, to introduce and explain key concepts and theories. Then, we will cover how to perform the various statistical analyses using R. Any code that the instructor produces during these sessions will be uploaded to a publicly available GitHub site after each session. For the breaks between sessions, and between days, optional exercises will be provided. Solutions to these exercises will be presented and briefly discussed after each break.
The course will take place online using Zoom. On each day, the live video broadcasts will occur during UK local time (GMT+0) at:
12pm-2pm
3pm-5pm
6pm-8pm
All sessions will be video recorded and made available to all attendees as soon as possible, hopefully soon after each 2hr session.
If some sessions are not at a convenient time due to different time zones, attendees are encouraged to join as many of the live broadcasts as possible. For example, attendees from North America may be able to join the live sessions from 3pm-5pm and 6pm-8pm, and then catch up with the 12pm-2pm recorded session once it is uploaded. Joining whichever live sessions are possible will allow attendees to benefit from asking questions and having discussions, rather than just watching prerecorded sessions.
At the start of the first day, we will ensure that everyone is comfortable with how Zoom works, and we’ll discuss the procedure for asking questions and raising comments.
Although not strictly required, using a large monitor or preferably even a second monitor will make the learning experience better, as you will be able to see my RStudio and your own RStudio simultaneously.
All the sessions will be video recorded, and made available immediately on a private video hosting website. Any materials, such as slides, data sets, etc., will be shared via GitHub.
Assumed quantitative knowledge
We assume familiarity with inferential statistics concepts like hypothesis testing and statistical significance, and practical experience with linear regression, logistic regression, and mixed effects models using R.
Assumed computer background
Some experience and familiarity with R is required. No prior experience with Stan itself is required.
Equipment and software requirements
A computer with a working installation of R and RStudio is required. R and RStudio are both available as free and open
source software for PCs, Macs, and Linux computers. In addition to R and RStudio, some R packages, particularly rstan and cmdstanr, are required. Stan itself is a standalone program, separate from R, and is installed as part of setting up these packages. Instructions on how to install R/RStudio and all required R packages will be provided before the course begins.
UNSURE ABOUT SUITABILITY? THEN PLEASE ASK firstname.lastname@example.org
Wednesday 2nd – Classes from 12:00 to 20:00
Topic 1: Hamiltonian Monte Carlo for Bayesian inference. We begin by describing Bayesian inference, whose
objective is the calculation of a probability distribution over a high dimensional space, namely the posterior
distribution. In general, this posterior distribution cannot be described analytically, and so to summarize or
make predictions from the posterior distribution, we must draw samples from it. For this, we can use Markov
Chain Monte Carlo (MCMC) methods including the Metropolis sampler, sometimes known as random-walk
Metropolis. Hamiltonian Monte Carlo (HMC), which Stan implements, is ultimately an efficient version of the
Metropolis sampler that does not involve random walk behaviour. In this introductory section of the course, we
will go through these major theoretical topics in sufficient detail to be able to understand how Stan works.
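As a rough illustration of the random-walk Metropolis sampler mentioned in Topic 1, here is a minimal sketch in plain Python (the course itself uses R and Stan). The function names, the standard normal target, and the step size are assumptions chosen for illustration only; this is not Stan's HMC algorithm, just the simpler sampler it improves upon.

```python
import math
import random


def std_normal_logpdf(x):
    """Log density of a standard normal, up to an additive constant."""
    return -0.5 * x * x


def random_walk_metropolis(log_density, start, n_samples, step_size=1.0, seed=0):
    """Draw samples from an unnormalised 1-D density with random-walk Metropolis.

    Returns the list of samples and the fraction of proposals accepted.
    """
    rng = random.Random(seed)
    current = start
    current_logp = log_density(current)
    samples = []
    n_accepted = 0
    for _ in range(n_samples):
        # Propose a symmetric Gaussian step around the current position.
        proposal = current + rng.gauss(0.0, step_size)
        proposal_logp = log_density(proposal)
        # Accept with probability min(1, p(proposal) / p(current)).
        accept_prob = math.exp(min(0.0, proposal_logp - current_logp))
        if rng.random() < accept_prob:
            current, current_logp = proposal, proposal_logp
            n_accepted += 1
        # A rejected proposal repeats the current value in the chain.
        samples.append(current)
    return samples, n_accepted / n_samples


if __name__ == "__main__":
    draws, acceptance_rate = random_walk_metropolis(std_normal_logpdf, 0.0, 20000)
    print("mean of draws:", sum(draws) / len(draws))
    print("acceptance rate:", acceptance_rate)
```

Because each proposal is a small random step from the current value, successive draws are correlated; this is the random walk behaviour that HMC is designed to avoid.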
Topic 2: Univariate models. To learn the Stan language and how to use it to develop Bayesian models, we will
start with simple models. In particular, we will look at binomial models and models involving univariate normal
distributions. These models will allow us to explore many of the major features of the Stan language, including
how to specify priors, in conceptually easy examples. Here, we will also learn how to use rstan and cmdstanr to
compile the HMC sampler from the defined Stan model, and draw samples from it.
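For the binomial model in Topic 2, the posterior happens to be available in closed form when the prior is a Beta distribution, which gives a handy check on any sampler's output. The sketch below, in plain Python rather than Stan or R, assumes a Beta(1, 1) prior and hypothetical data of 14 successes in 20 trials; the function names are illustrative and not part of any course material.

```python
import math


def beta_binomial_posterior(successes, trials, prior_a=1.0, prior_b=1.0):
    """Conjugate update: a Beta(a, b) prior with binomial data gives
    a Beta(a + successes, b + failures) posterior."""
    return prior_a + successes, prior_b + (trials - successes)


def beta_mean(a, b):
    """Mean of a Beta(a, b) distribution."""
    return a / (a + b)


def grid_posterior_mean(successes, trials, prior_a=1.0, prior_b=1.0, n_grid=1001):
    """Approximate the posterior mean on an evenly spaced grid over (0, 1)."""
    total_weight = 0.0
    weighted_sum = 0.0
    # Skip the endpoints 0 and 1, where log(theta) or log(1 - theta) is undefined.
    for i in range(1, n_grid - 1):
        theta = i / (n_grid - 1)
        # Unnormalised log posterior = log prior + log likelihood.
        log_weight = ((prior_a - 1 + successes) * math.log(theta)
                      + (prior_b - 1 + trials - successes) * math.log(1 - theta))
        weight = math.exp(log_weight)
        total_weight += weight
        weighted_sum += theta * weight
    return weighted_sum / total_weight


if __name__ == "__main__":
    # Hypothetical data: 14 successes in 20 trials, uniform Beta(1, 1) prior.
    a, b = beta_binomial_posterior(14, 20)
    print("posterior: Beta(%g, %g)" % (a, b))
    print("exact posterior mean:", beta_mean(a, b))
    print("grid posterior mean:", grid_posterior_mean(14, 20))
```

The mean of posterior draws from a correctly specified sampler for the same model and data should be close to both of these values.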
Thursday 3rd – Classes from 12:00 to 20:00
Topic 3: Regression models. Having learned the basics of Stan using simple models, we now turn to more
practically useful examples including linear regression, general linear models with categorical predictor
variables, logistic regression, Poisson regression, etc. All of these examples involve the use of similar
programming features and specifications, and so they are easily extensible to other regression models.
Topic 4: Multilevel and mixed effects models. As an extension of the regression models that we consider in the
previous topic, here we consider multilevel and mixed effects models. We primarily concentrate on linear mixed
effects models, and consider the different ways to specify these models in Stan.
Topic 5: Because Stan is a programming language, it essentially gives us the means to create any bespoke or
custom statistical model, and not just those that are widely used. In this final topic, we will cover some more
complex cases to illustrate its power. In particular, we will cover probabilistic mixture models, which are a type
of latent variable model.
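To make the latent variable idea concrete: in a mixture model each observation belongs to an unobserved component, and a common approach is to sum over the possible components when computing the likelihood, working on the log scale for numerical stability. The following is a minimal Python sketch of that calculation for a mixture of normal distributions. The function names and the example data are assumptions for illustration; it is not the course's Stan code.

```python
import math


def normal_logpdf(x, mu, sigma):
    """Log density of a normal distribution with mean mu and sd sigma."""
    return (-0.5 * math.log(2.0 * math.pi) - math.log(sigma)
            - 0.5 * ((x - mu) / sigma) ** 2)


def log_sum_exp(values):
    """Compute log(sum(exp(v))) without overflow or underflow."""
    largest = max(values)
    if largest == float("-inf"):
        return largest
    return largest + math.log(sum(math.exp(v - largest) for v in values))


def mixture_loglik(data, weights, mus, sigmas):
    """Log likelihood of a normal mixture with the latent component summed out.

    The weights must be strictly positive and sum to one.
    """
    total = 0.0
    for x in data:
        # One term per component: log weight + log density under that component.
        terms = [math.log(w) + normal_logpdf(x, mu, sigma)
                 for w, mu, sigma in zip(weights, mus, sigmas)]
        total += log_sum_exp(terms)
    return total


if __name__ == "__main__":
    # Hypothetical data and parameters for a two-component mixture.
    data = [-2.1, -1.8, 0.2, 2.9, 3.3]
    print(mixture_loglik(data, [0.5, 0.5], [-2.0, 3.0], [1.0, 1.0]))
```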