Program

Daily Schedule

  • 9:30-12:00 == Lectures
  • 12:00-13:30 == Break
  • 13:30-16:00 == Lectures

Monday, May 31

  • Morning == Analysis of Administrative Health Care Data
  • Afternoon == States and events: an orientation to risks and hazards

Tuesday, June 1

  • Morning == Bayesian inference and Markov Chain Monte Carlo
  • Afternoon == Data analysis using penalized regression methods

Wednesday, June 2

  • Morning == An introduction to correlated data models
  • Afternoon == An introduction to Bayesian adaptive designs for clinical trials

Thursday, June 3

  • Morning == Bayesian disease mapping

Course Descriptions

Title: Analysis of Administrative Health Care Data

Instructor: Robert Platt

Administrative health care data are an important resource for research in health services, pharmacoepidemiology, and clinical medicine. The course will describe the different sources of administrative data, differentiating electronic medical record data from health care claims data, and discuss some of the statistical challenges involved in using these data. Students will learn core analytic skills for working with these data. Using examples from pharmacoepidemiology and a realistic simulation of a health care claims database, we will discuss several key aspects of analyses of these data. Students will be provided with sample data and code.

Title: States and events: an orientation to risks and hazards

Instructor: James A. Hanley

I will begin with the fundamental scientific concept of ‘state’ (exemplified by the term ‘status’ in a popular social media network) and introduce the statistical parameters and distributions used to describe and compare the speeds with which events (transitions between states) occur. To do so, I will use historical and contemporary examples from demography, industrial testing, and medical and epidemiological research. The statistical techniques used to address these research questions are often referred to as ‘survival analysis’, an overly narrow term that misses the unity that comes from handling ‘censored’/‘interval’ data of any type by likelihood methods and by conditioning. I will describe some of the subtleties involved in comparisons of time durations or event rates, and some embarrassing/serious statistical blunders. I will explain how most of the statistical information in large databases can be extracted using a ‘smart-sampling’ approach widely used in epidemiology/economics.

Title: Bayesian inference and Markov Chain Monte Carlo

Instructor: David A. Stephens

This course will introduce students to applied techniques in Bayesian inference, with a focus on Bayesian solutions to classical statistical problems such as regression and hierarchical modelling, and more advanced topics such as spatial and non-parametric modelling. It will also demonstrate how sampling-based computational solutions, such as those provided by Markov chain Monte Carlo, can be used when exact or analytically tractable solutions are not readily available.

Title: Data analysis using penalized regression methods

Instructor: Sahir Bhatnagar

In high-dimensional (HD) data, where the number of covariates (p) greatly exceeds the number of observations (n), estimation can benefit from the bet-on-sparsity principle, i.e., the assumption that only a small number of predictors are relevant to the response. This assumption can lead to more interpretable models, improved predictive accuracy, and algorithms that are computationally efficient. In medical data, where the sample sizes are particularly small due to high data collection costs, we must often assume a sparse model because there isn’t enough information to estimate p parameters. For these reasons, penalized regression methods have generated substantial interest over the past decade since they can set model coefficients exactly to zero. We will provide an overview of these methods from both the frequentist and Bayesian points of view. We will provide details on both the theoretical and computational aspects of these methods and demonstrate a real-data example with R code.

Title: An introduction to correlated data models

Instructor: Erica E. M. Moodie

This course will provide a basic introduction to methods for the analysis of correlated data. These data arise when observations are not gathered independently, as in longitudinal studies or in data collected within households or classrooms. We will examine exploratory methods and introduce regression approaches for continuous outcomes.

Title: An introduction to Bayesian adaptive designs for clinical trials

Instructor: Shirin Golchi

This course will give a brief overview of conventional clinical trials and introduce Bayesian adaptive trials as flexible alternatives in various phases of clinical trials. Various adaptations and stopping rules will be explored, and a variety of adaptive design examples will be covered. The course will include hands-on simulations for assessing design operating characteristics in Bayesian adaptive trials.

Title: Bayesian disease mapping

Instructor: Alexandra M. Schmidt

The mapping of disease incidence and prevalence has long been a part of public health, epidemiology, and the study of disease in human populations (Koch 2005). This short course provides an introduction to Bayesian disease mapping. I will start by describing some exploratory tools for analyzing areal data, and introduce the conditional autoregressive (CAR) model. CAR specifications are commonly used as latent structures in hierarchical models for areal-level data. I will also discuss alternative models to the CAR structure, and introduce a couple of packages available in R to fit such models. Examples include the analysis of the number of cases of dengue fever across the districts of the city of Rio de Janeiro, and the number of cases of diabetes across neighbourhoods of Montreal as a function of an estimated neighbourhood soda-consumption index.