Program

Daily schedule

9:00 - 10:30 am: first half of morning course

10:30 - 10:45 am: break

10:45 - 12:15 pm: second half of morning course

12:15 - 1:30 pm: lunch

1:30 - 3:00 pm: first half of afternoon course

3:00 - 3:30 pm: coffee break

3:30 - 5:00 pm: second half of afternoon course

May 28

Analysis of Administrative Health Care Data
Robert Platt

May 29

Morning
States and events: an orientation to rates, risks, and hazards
James A. Hanley

Afternoon
Flexible modeling of survival data: challenges, methods and applications
Michal Abrahamowicz

May 30

Morning
A Tutorial on ADMM algorithms
Yi Yang

Afternoon
An Introduction to Bayesian Inference and MCMC
David A. Stephens

May 31

An introduction to causal inference and propensity score methods
David A. Stephens and Erica E. M. Moodie

June 1

Analysis of spatially structured data
Alexandra M. Schmidt

Description of courses

Title: Analysis of Administrative Health Care Data
Presenter: Robert Platt

Administrative health care data are an important resource for research in health services, pharmacoepidemiology, and clinical medicine. The course will describe the different sources of administrative data, differentiating electronic medical record data from health care claims data, and discuss some of the statistical challenges involved in using these data. Students will learn core analytic skills for working with these data. Using examples from pharmacoepidemiology and a realistic simulation of a health care claims database, we will discuss several key aspects of analyses of these data. Students will be provided with sample data and code. Topics include:
-Description of data sources
-Types of questions typically addressed using administrative data (description, prediction, estimation of causal effects)
-Data structure and management
-Key challenges
-Design challenges
-Analysis methods/worked examples (e.g., estimation using high-dimensional propensity scores)

Title: States and events: an orientation to rates, risks, and hazards
Presenter: James A. Hanley

I will begin with the fundamental scientific concept of ‘state’ (exemplified by the term ‘status’ in a popular social media network) and introduce the statistical parameters and distributions used to describe and compare the speeds with which events (transitions between states) occur. To do so, I will use historical and contemporary examples from demography, industrial testing, and medical and epidemiological research. The statistical techniques used to address these research questions are often referred to as ‘survival’ analysis, an overly narrow term that misses the unity that comes from handling ‘censored’/‘interval’ data of any type by likelihood methods and by conditioning. I will describe some of the subtleties involved in comparisons of time durations or event rates, and some embarrassing/serious statistical blunders. I will explain how most of the statistical information in large databases can be extracted using a ‘smart-sampling’ approach widely used in epidemiology/economics. In case circumstances do not permit us to enact a live version of the ‘Kaplan-Meier Theatre’ (Gerds, 2016), please read his article ahead of time. See also http://www.biostat.mcgill.ca/hanley, and in particular the link to the ‘Bridge of Life’ material. Several articles relevant to the lecture can be found by entering the terms ‘longevity’, ‘Titanic’, ‘Oscar’, ‘tumbler’, ‘avalanche’, ‘HIV’, ‘HPV’, and ‘screening’ in the search box on JH’s home page.

Title: Flexible modeling of survival data: challenges, methods and applications
Presenter: Michal Abrahamowicz

Cox’s proportional hazards (PH) model is one of the most popular statistical methods, with more than 20,000 references in the medical literature alone. Yet most users are not aware of its underlying assumptions and do not realize the implications of potential violations of these assumptions. Recent statistical research in this area provides an excellent opportunity both to introduce students to more advanced modeling techniques and to illustrate how applications of state-of-the-art statistical methods may yield new insights into complex processes studied in clinical and public health research.
Survival or time-to-event analysis focuses on dynamic processes that evolve over time. The class will focus on two aspects of time-related changes that require more advanced methods. Firstly, we will deal with associations that vary over time. To this end, we will explore the reasons for frequent violations of the PH assumption, which postulates that the effects of risk factors remain constant over time, and introduce flexible methods that address this issue while accounting for potential non-linear effects of continuous variables. Secondly, we will introduce the concept of time-varying covariates and explain the challenges in modeling the effects of variables that change their values during the study period. Then, we will introduce flexible methods for modeling time-varying covariates. Simulation results will be used to assess the performance of the proposed methods. Guidance on the use of R programs that implement these methods will also be provided.
The practical relevance of the flexible methods will be illustrated using real-life examples of prediction of survival after lung cancer diagnosis and septic shock, and adverse effects of medications.

Title: A Tutorial on ADMM algorithms
Presenter: Yi Yang

The alternating direction method of multipliers (ADMM) is regarded as a very flexible approach for efficiently solving large-scale sparse feature learning problems. When the number of observations and/or the feature dimension is large, a consensus (distributed) version of ADMM can be used, which distributes the computation and the data set across multiple computing nodes. In this short course, I will give a tutorial on the ADMM algorithm and its applications in solving sparse learning models.

Title: An Introduction to Bayesian Inference and MCMC
Presenter: David A. Stephens

This module will give an introduction to applied Bayesian inference techniques, focussing on Bayesian solutions to classical statistical problems, including regression and hierarchical modelling, and more advanced topics such as spatial and non-parametric modelling. It will also demonstrate how sampling-based computational solutions, such as those provided by Markov chain Monte Carlo, can be used when exact or analytically tractable solutions are not readily available.

Title: An introduction to causal inference and propensity score methods
Presenters: David A. Stephens and Erica E. M. Moodie

Causal inference attempts to uncover the structure of the data and eliminate all non-causative explanations for an observed association. Most inference problems in biostatistics seek to uncover causal relationships, a task that is hindered by issues such as confounding in non-experimental data or non-compliance in randomized studies. This workshop will introduce fundamental principles in causal inference, with a particular focus on the propensity score and covariate balance.

Title: Analysis of spatially structured data
Presenter: Alexandra M. Schmidt

This course aims to give an introduction to spatial modelling of point-referenced and areal-level data under the Bayesian paradigm. The course is divided into three parts. The first will introduce spatially referenced data, define Gaussian processes, discuss stationarity and isotropy, and then define variograms and the most commonly used correlation functions. Bayesian kriging will also be discussed in this part. The second part will introduce modelling of areal data and disease mapping. In the third part, recent topics in spatial statistics, such as spatial confounding and the modelling of skewed processes, will be discussed. We will use the Stan platform to analyze several real data sets.