Biostatistics is the study of statistical estimation and inference in the context of biology and the health sciences. The domain encompasses a wide range of topics, such as statistical genetics, the design and analysis of randomized controlled trials, and the modelling of time-to-event data. Research in biostatistics is often motivated by particular biomedical questions or applications, which in turn prompt methodological investigations into a general mathematical or statistical framework for the problem. Thus, biostatistics is a discipline that requires a solid foundation in inference and asymptotic theory, together with versatility and interdisciplinary collaborative skills. Areas of interest of the group members include the following topics:

- survival analysis
- statistical genetics
- causal inference
- methodology for longitudinal data

Specific techniques developed or extended in the group include semi- and non-parametric response modelling, data quality in genomic studies, Bayesian techniques for diagnostic testing, g-estimation and inference of dynamic treatment regimes, and high-dimensional propensity scores.

- Belkacem Abdous (Laval)
- Michal Abrahamowicz (McGill)
- Andrea Benedetti (McGill)
- Antonio Ciampi (McGill)
- Celia Greenwood (McGill)
- James A. Hanley (McGill)
- Lawrence Joseph (McGill)
- Aurélie Labbe (McGill)
- Fabrice Larribe (UQAM)
- Erica Moodie (McGill)
- Robert Platt (McGill)

Examples of applications of statistics and probability in epidemiologic research. Sources of epidemiologic data (surveys, experimental and non-experimental studies). Elementary data analysis for single and comparative epidemiologic parameters.

Statistical methods for multinomial outcomes, overdispersion, and continuous and categorical correlated data; approaches to inference (estimating equations, likelihood-based methods, semi-parametric methods); analysis of longitudinal data; theoretical content and applications.

Common data-analytic problems. Practical approaches to complex data. Graphical and tabular presentation of results. Writing reports for scientific journals, research collaborators, consulting clients.

Introduction to practical Bayesian methods. Topics will include Bayesian philosophy, simple Bayesian models including linear and logistic regression, hierarchical models, and numerical techniques, including an introduction to the Gibbs sampler. Programming in R and WinBUGS.
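
As a rough illustration of the Gibbs sampler mentioned above, the following Python sketch alternates between the two full conditional distributions of a normal model with unknown mean and precision. It is not course material: the course itself uses R and WinBUGS, and the function name `gibbs_normal`, the prior settings, and the simulated data are all assumptions made for this example.

```python
import random
import statistics


def gibbs_normal(y, n_iter=5000, burn_in=1000, seed=1,
                 m0=0.0, p0=1e-4, a0=0.01, b0=0.01):
    """Gibbs sampler for y_i ~ Normal(mu, 1/tau).

    Assumed priors (illustrative choices only):
      mu  ~ Normal(m0, 1/p0), where p0 is a prior precision
      tau ~ Gamma(shape=a0, rate=b0)
    Returns the post-burn-in draws as a list of (mu, tau) pairs.
    """
    rng = random.Random(seed)
    n = len(y)
    ybar = statistics.fmean(y)
    mu, tau = ybar, 1.0  # starting values
    draws = []
    for iteration in range(n_iter):
        # mu | tau, y is normal: precisions add, means are precision-weighted.
        precision = p0 + n * tau
        mean = (p0 * m0 + n * tau * ybar) / precision
        mu = rng.gauss(mean, precision ** -0.5)
        # tau | mu, y is gamma with updated shape and rate.
        shape = a0 + n / 2
        rate = b0 + 0.5 * sum((yi - mu) ** 2 for yi in y)
        tau = rng.gammavariate(shape, 1.0 / rate)  # second argument is a scale
        if iteration >= burn_in:
            draws.append((mu, tau))
    return draws


# Hypothetical data: 200 observations with true mean 5 and true SD 2.
data_rng = random.Random(42)
y = [data_rng.gauss(5.0, 2.0) for _ in range(200)]

draws = gibbs_normal(y)
posterior_mean_mu = statistics.fmean([mu for mu, _ in draws])
posterior_mean_sd = statistics.fmean([tau ** -0.5 for _, tau in draws])
print(posterior_mean_mu, posterior_mean_sd)
```

With priors this vague, the posterior mean of `mu` should sit very close to the sample mean, which gives a quick sanity check on the sampler.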

Multivariable regression models for proportions, rates, and their differences/ratios; conditional logistic regression; proportional hazards and other parametric/semi-parametric models; unmatched, nested, and self-matched case-control studies; links to Cox's method; rate ratio estimation with "time-dependent" membership in contrasted categories.

Advanced applied biostatistics course dealing with flexible modelling of non-linear effects of continuous covariates in multivariable analyses, and with survival data, including, for example, time-varying covariates and time-dependent or cumulative effects. Focus on the concepts, limitations and advantages of specific methods, and on the interpretation of their results. In addition to 3 hours of weekly lectures, shared with epidemiology students, a further hour per week focuses on statistical inference and complex simulation methods. Students get hands-on experience in designing and implementing simulations for survival analyses through individual term projects.
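
To give a flavour of what a simulation for survival analysis can look like, here is a minimal Python sketch; it is not taken from the course. It assumes exponential event times under a proportional hazards model with one binary exposure, uniform censoring, and a simple events-per-person-time estimator of the rate ratio. All function names and parameter values are hypothetical.

```python
import math
import random


def simulate_cohort(n, base_hazard, hazard_ratio, max_follow_up, rng):
    """Simulate (exposed, observed_time, had_event) for n subjects."""
    records = []
    for i in range(n):
        exposed = i % 2  # alternate allocation: half exposed, half unexposed
        hazard = base_hazard * (hazard_ratio if exposed else 1.0)
        event_time = rng.expovariate(hazard)
        censor_time = rng.uniform(0.0, max_follow_up)  # assumed censoring model
        records.append((exposed, min(event_time, censor_time),
                        event_time <= censor_time))
    return records


def log_rate_ratio(records):
    """Log of (events / person-time) in the exposed over the unexposed."""
    events = [0, 0]
    person_time = [0.0, 0.0]
    for exposed, time, had_event in records:
        events[exposed] += had_event
        person_time[exposed] += time
    rates = [events[g] / person_time[g] for g in (0, 1)]
    return math.log(rates[1] / rates[0])


def run_simulation(n_reps=200, n=400, base_hazard=0.1, hazard_ratio=0.5,
                   max_follow_up=10.0, seed=2024):
    """Repeat the experiment; return mean and empirical SD of the estimates."""
    rng = random.Random(seed)
    estimates = [
        log_rate_ratio(simulate_cohort(n, base_hazard, hazard_ratio,
                                       max_follow_up, rng))
        for _ in range(n_reps)
    ]
    mean = sum(estimates) / n_reps
    variance = sum((e - mean) ** 2 for e in estimates) / (n_reps - 1)
    return mean, math.sqrt(variance)


mean_log_rr, empirical_se = run_simulation()
print(math.exp(mean_log_rr), empirical_se)
```

Comparing the back-transformed mean with the true hazard ratio of 0.5 gives a rough check on bias, and the empirical standard deviation across replicates estimates the sampling variability of the log rate ratio. A term project would typically replace the exponential model and the crude estimator with the methods under study.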

Bayesian design and analysis with applications specifically geared towards epidemiological research. Topics may include multilevel hierarchical models, diagnostic tests, Bayesian sample size methods, issues in clinical trials, measurement error and missing data problems. Programming in R and WinBUGS.

Foundations of causal inference in biostatistics. Statistical methods based on potential outcomes; propensity scores, marginal structural models, instrumental variables, structural nested models. Introduction to semiparametric theory.
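
As a small illustration of how propensity scores are used, the Python sketch below estimates an average treatment effect by inverse probability weighting and compares it with an unadjusted difference in means. It is an example under stated assumptions, not course content: there is a single binary confounder, the propensity score is estimated by the treated proportion within each confounder level, and the data-generating values (including the true effect of 2.0) are invented for the example.

```python
import random


def simulate_observational(n, rng, effect=2.0):
    """Simulate (x, a, y): binary confounder, treatment, continuous outcome."""
    data = []
    for _ in range(n):
        x = int(rng.random() < 0.5)
        p_treat = 0.8 if x else 0.2          # x makes treatment more likely
        a = int(rng.random() < p_treat)
        y = effect * a + 3.0 * x + rng.gauss(0.0, 1.0)  # x also raises y
        data.append((x, a, y))
    return data


def naive_difference(data):
    """Unadjusted difference in mean outcome, treated minus untreated."""
    treated = [y for _, a, y in data if a == 1]
    untreated = [y for _, a, y in data if a == 0]
    return sum(treated) / len(treated) - sum(untreated) / len(untreated)


def ipw_ate(data):
    """Inverse-probability-weighted effect estimate with normalized weights."""
    # Propensity score: proportion treated within each level of x.
    n_x = {0: 0, 1: 0}
    n_treated = {0: 0, 1: 0}
    for x, a, _ in data:
        n_x[x] += 1
        n_treated[x] += a
    ps = {x: n_treated[x] / n_x[x] for x in n_x}

    sum_wy = [0.0, 0.0]  # indexed by treatment arm
    sum_w = [0.0, 0.0]
    for x, a, y in data:
        weight = 1.0 / ps[x] if a == 1 else 1.0 / (1.0 - ps[x])
        sum_wy[a] += weight * y
        sum_w[a] += weight
    return sum_wy[1] / sum_w[1] - sum_wy[0] / sum_w[0]


data = simulate_observational(5000, random.Random(7))
naive = naive_difference(data)
ipw = ipw_ate(data)
print(naive, ipw)
```

Because the confounder raises both the chance of treatment and the outcome, the unadjusted difference overstates the effect, while the weighted estimate should land near the value used to generate the data.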

This course will provide a basic introduction to methods for the analysis of correlated, or dependent, data. These data arise when observations are not gathered independently; examples include longitudinal data, household data, and cluster samples. Basic descriptive methods and an introduction to regression methods for both continuous and discrete outcomes.
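
One basic descriptive quantity for clustered data is the intraclass correlation. The Python sketch below, offered only as an illustration and not as course material, simulates balanced clusters from an assumed random-intercept model and applies the one-way ANOVA estimator. The function names, cluster sizes, and variance components are hypothetical.

```python
import random
import statistics


def simulate_clusters(n_clusters, cluster_size, sd_between, sd_within, rng):
    """Each cluster shares one random intercept, which induces correlation."""
    clusters = []
    for _ in range(n_clusters):
        shared = rng.gauss(0.0, sd_between)
        clusters.append([shared + rng.gauss(0.0, sd_within)
                         for _ in range(cluster_size)])
    return clusters


def anova_icc(clusters):
    """One-way ANOVA estimator of the ICC for equal-sized clusters."""
    k = len(clusters)
    m = len(clusters[0])
    grand_mean = statistics.fmean([v for c in clusters for v in c])
    cluster_means = [statistics.fmean(c) for c in clusters]
    # Between-cluster and within-cluster mean squares.
    msb = m * sum((cm - grand_mean) ** 2 for cm in cluster_means) / (k - 1)
    msw = sum((v - cm) ** 2
              for c, cm in zip(clusters, cluster_means)
              for v in c) / (k * (m - 1))
    return (msb - msw) / (msb + (m - 1) * msw)


# Equal between- and within-cluster SDs imply a true ICC of 0.5.
clusters = simulate_clusters(500, 5, sd_between=1.0, sd_within=1.0,
                             rng=random.Random(11))
icc = anova_icc(clusters)
design_effect = 1 + (len(clusters[0]) - 1) * icc
print(icc, design_effect)
```

The design effect printed at the end is the usual variance inflation factor for a mean from equal-sized clusters, which shows why treating such observations as independent understates uncertainty.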