4th Summer Institute in Statistics for Clinical Research

Module 14: Modern Statistical Learning Methods for Observational Biomedical Data and Applications to Comparative Effectiveness Research

Session 4: Thursday, July 27, 8:30 AM - 5:00 PM

While clinical trials provide the highest level of evidence for comparing clinical treatments or public health interventions, they are often not feasible due to ethical, logistical, or economic constraints. Observational studies provide an opportunity to learn about the effect of interventions for which little or no trial data are available. These studies constitute a potentially rich and relatively cheap source of information. However, in such studies, treatment or intervention allocation may be strongly confounded by other important patient characteristics, and much care is needed to disentangle observed relationships and infer causal effects.

In this course, we will provide an overview of modern statistical techniques for analyzing observational data. We will focus primarily on recent advances in the field of targeted learning, which facilitate the use of state-of-the-art machine learning tools to flexibly adjust for confounding while still yielding valid statistical inference. This is in contrast to conventional techniques for confounding adjustment, which generally rely on restrictive statistical models and may therefore lead to severely biased inference.

We will discuss methods for comparative effectiveness studies involving both single and multiple time-point interventions, and we will also address the problem of missing data. Methods will be illustrated using data from recent observational studies and data extracted from electronic medical records. Analyses will be illustrated in R, but knowledge of R is not required for this module.