2018 SISMID Modules

Scholarship applications open January 5. Registration opens February 1.

Session 1 – Monday, July 9, 8:30 a.m.-5 p.m.; Tuesday, July 10, 8:30 a.m.-5 p.m., and Wednesday, July 11, 8:30 a.m.-Noon

Module 1: Probability and Statistical Inference

Instructor(s): Hughes, James; Willis, Amy

  • The laws of probability and the binomial, multinomial, and normal distributions.

  • Descriptive statistics and methods of inference including maximum likelihood, confidence intervals and simple Bayes methods.

  • Classical hypothesis testing topics, including type I and II errors, two-sample tests, chi-square tests and contingency table analysis, and exact and permutation tests.

  • Resampling methods such as the bootstrap and jackknife are covered as well.

Also offered as part of the Summer Institute in Statistical Genetics (SISG 2018).

Module 2: Mathematical Models of Infectious Diseases

Instructor(s): Rohani, Pejman; Drake, John

Prerequisites: Students are expected to have a working knowledge of the R computing environment. Programming will be in R. Students new to R should complete a tutorial before the module.

This module covers the principles of dynamic models of infectious diseases. This module will focus on developing and analyzing compartmental models such as the susceptible-infected-recovered (SIR) model. Topics include deriving the basic reproductive ratio using the next generation method, incorporating different mechanisms of heterogeneity in transmission (for instance, age-structure, behavior or seasonality), formulating exact stochastic birth-death models, carrying out sensitivity analysis and statistical fitting of simple models to data. The module will alternate between lectures and computer labs.

Background Reading: Keeling & Rohani (2008) Modeling Infectious Diseases in Humans and Animals, Princeton University Press.
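
As a rough illustration of the kind of compartmental model this module covers, the following Python sketch integrates a deterministic SIR model with simple forward Euler steps. It is not course material: the function names, the parameter values, and the use of Python (the module itself uses R) are assumptions made purely for illustration. For this simple model the basic reproductive ratio is beta/gamma.

```python
def simulate_sir(beta, gamma, s0, i0, r0, days, dt=0.01):
    """Integrate a deterministic SIR model with forward Euler steps.

    beta  : transmission rate per day (illustrative value, not from the course)
    gamma : recovery rate per day (illustrative value, not from the course)
    s0, i0, r0 : initial susceptible, infected and recovered fractions
    Returns a list of (time, S, I, R) tuples.
    """
    s, i, r = s0, i0, r0
    trajectory = [(0.0, s, i, r)]
    steps = int(round(days / dt))
    for k in range(1, steps + 1):
        new_infections = beta * s * i * dt   # flow S -> I during this step
        new_recoveries = gamma * i * dt      # flow I -> R during this step
        s -= new_infections
        i += new_infections - new_recoveries
        r += new_recoveries
        trajectory.append((k * dt, s, i, r))
    return trajectory


def basic_reproductive_ratio(beta, gamma):
    """Basic reproductive ratio of the simple SIR model above."""
    return beta / gamma


if __name__ == "__main__":
    path = simulate_sir(beta=0.5, gamma=0.2, s0=0.999, i0=0.001, r0=0.0, days=100)
    t_end, s_end, i_end, r_end = path[-1]
    print("R0 =", basic_reproductive_ratio(0.5, 0.2))
    print("fraction recovered by day", t_end, "=", round(r_end, 3))
```

A forward Euler scheme is used only to keep the sketch dependency-free; a proper ODE solver would normally be preferred.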

Module 3: Introduction to R

Instructor(s): Rice, Ken; Thornton, Timothy

This module introduces the R statistical environment, assuming no prior knowledge. It provides a foundation for the use of R for computation in later modules.

In addition to discussing basic data management tasks in R, such as reading in data and producing summaries through R scripts, we will also introduce R’s graphics functions, its powerful package system, and simple methods of looping.

Examples and exercises will use data drawn from biological and medical applications including infectious diseases and genetics. Hands-on use of R is a major component of this module; users require a laptop and will use it in all sessions.

Also offered as part of the Summer Institute in Statistical Genetics (SISG 2018).



Session 2 – Wednesday, July 11, 1:30-5 p.m.; Thursday, July 12, 8:30 a.m.-5 p.m., and Friday, July 13, 8:30 a.m.-5 p.m.

Module 4: MCMC I for Infectious Diseases

Instructor(s): Auranen, Kari; Halloran, M. Elizabeth; Minin, Vladimir

Prerequisites: Students are expected to have a working knowledge of the R computing environment. Programming will be in R. Students new to R should complete a tutorial before the module. This module assumes knowledge of the material in Module 1: Probability and Statistical Inference, though not necessarily from taking that module.

This module is an introduction to Markov chain Monte Carlo (MCMC) methods. The course includes a general introduction to Bayesian statistics, Monte Carlo, and MCMC. Some relevant facts from Markov chain theory are reviewed. Algorithms include Gibbs sampling and Metropolis-Hastings. A practical introduction to convergence diagnostics is included. Motivating practical examples range from generic toy problems to infectious disease applications, which include chain-binomial and general epidemic models. A hierarchical model will be covered. The module will alternate between lectures and computer labs. Individuals already familiar with MCMC methods and R programming should consider Module 10: MCMC II for Infectious Diseases.
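
For readers unfamiliar with the algorithms named above, here is a minimal toy sketch of a random-walk Metropolis sampler (the special case of Metropolis-Hastings with a symmetric proposal). It is not taken from the course, which uses R. The example targets the posterior of a binomial probability under a uniform prior, and the function names, data values, step size, and seed are all hypothetical choices for illustration.

```python
import math
import random


def log_posterior(p, successes, trials):
    """Unnormalized log posterior of a binomial probability with a uniform prior."""
    if not 0.0 < p < 1.0:
        return -math.inf
    return successes * math.log(p) + (trials - successes) * math.log(1.0 - p)


def metropolis_binomial(successes, trials, n_samples=20000, step=0.1, seed=1):
    """Random-walk Metropolis sampler; returns the full chain of p values."""
    rng = random.Random(seed)
    p = 0.5
    current = log_posterior(p, successes, trials)
    samples = []
    for _ in range(n_samples):
        proposal = p + rng.gauss(0.0, step)        # symmetric Gaussian proposal
        candidate = log_posterior(proposal, successes, trials)
        u = 1.0 - rng.random()                     # uniform on (0, 1], avoids log(0)
        if math.log(u) < candidate - current:      # accept with prob min(1, ratio)
            p, current = proposal, candidate
        samples.append(p)                          # a rejection repeats the current value
    return samples


if __name__ == "__main__":
    chain = metropolis_binomial(successes=7, trials=20)
    kept = chain[2000:]                            # discard an illustrative burn-in
    print("posterior mean estimate:", sum(kept) / len(kept))
    print("exact Beta posterior mean:", (7 + 1) / (20 + 2))
```

Because the exact posterior here is a Beta distribution, the sampler's output can be checked against a known answer, which is the usual reason for starting with a toy problem.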

Module 5: Infectious Diseases, Immunology and Within-Host Models

Instructor(s): Handel, Andreas; Thomas, Paul

Recommended but not required: Prior knowledge of R.

This module provides an introduction to infectious diseases, the main components of the immune system, and mathematical modeling. Using pathogens such as HIV, TB, malaria, influenza and others, this module will introduce basic immunological concepts and explain how to use mathematical models to study aspects of within-host infection dynamics.

The focus will be on simple compartmental deterministic models. The module covers the use of those models to analyze the dynamics of pathogens and of innate and adaptive immune responses, and to design and evaluate intervention strategies such as vaccines and drug treatments. Hands-on exercises using the programming language R will show how to construct and implement models.

Module 6: Stochastic Epidemic Models with Inference

Instructor(s): Britton, Tom; Longini, Jr., Ira

Prerequisites: This module assumes knowledge of the material in Module 1: Probability and Statistical Inference, though not necessarily from taking that module.

The course first studies some basic stochastic models for the spread of an infectious disease and presents large-population results for them, including the threshold phenomenon (R0), the distribution of the final number infected, and the critical vaccination coverage (the fraction of the population that must be vaccinated to avoid future epidemics). Several extensions towards realism are then discussed: different types of individuals and social structures in the community, including households and networks.

The focus then shifts to statistics and how to obtain estimates of relevant model parameters from epidemic data. The course will give the theoretical background as well as numerous examples from empirical situations, including estimation of various vaccine efficacies. There will be class exercises during the course.
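
To make two of the large-population quantities mentioned above concrete, here is a minimal Python sketch under standard textbook assumptions that the course description does not itself spell out: a homogeneously mixing population and a perfectly protective vaccine. Under those assumptions the critical vaccination coverage is 1 - 1/R0, and the final fraction infected z in a large outbreak solves z = 1 - exp(-R0 z). The function names are hypothetical.

```python
import math


def critical_vaccination_coverage(r0):
    """Fraction to vaccinate so the effective reproduction number drops to 1.

    Assumes homogeneous mixing and a perfectly protective vaccine.
    """
    if r0 <= 1.0:
        return 0.0          # below threshold, no vaccination is needed
    return 1.0 - 1.0 / r0


def final_size(r0, tol=1e-12, max_iter=10_000):
    """Solve z = 1 - exp(-r0 * z) for the positive root by fixed-point iteration."""
    if r0 <= 1.0:
        return 0.0          # only the trivial solution exists at or below threshold
    z = 0.5                 # any starting point in (0, 1] leads to the positive root
    for _ in range(max_iter):
        z_next = 1.0 - math.exp(-r0 * z)
        if abs(z_next - z) < tol:
            return z_next
        z = z_next
    return z


if __name__ == "__main__":
    for r0 in (1.5, 2.0, 4.0):
        print(r0, critical_vaccination_coverage(r0), round(final_size(r0), 4))
```

These are deterministic limits only; the stochastic models in the module additionally describe the probability of a large outbreak and the variability around these values.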



Session 3 – Monday, July 16, 8:30 a.m.-5 p.m.; Tuesday, July 17, 8:30 a.m.-5 p.m., and Wednesday, July 18, 8:30 a.m.-Noon

Module 7: Simulation-based Inference for Epidemiological Dynamics

Instructor(s): Ionides, Edward; King, Aaron

Prerequisites: Students are expected to have a working knowledge of the R computing environment. Programming will be in R. Students new to R should complete a tutorial before the module. This module assumes knowledge of the material in Module 1: Probability and Statistical Inference, though not necessarily from taking that module.

This module introduces statistical inference techniques and computational methods for dynamic models of epidemiological systems. The course will explore deterministic and stochastic formulations of epidemiological dynamics and develop inference methods appropriate for a range of models. Special emphasis will be on exact and approximate likelihood as the key elements in parameter estimation, hypothesis testing, and model selection. Specifically, the course will cover sequential Monte Carlo, iterated filtering, and model criticism techniques. Students will learn to implement these in R to carry out maximum likelihood and Bayesian inference.

Module 8: Microbiome Data Analysis

Instructors: Alekseyenko, Alexander; McMurdie, Paul

Prerequisites: Programming will be done in R, and fluency at the level of Module 3: Introduction to R will be expected, though not necessarily from taking that module. This module assumes knowledge of the material in Module 1: Probability and Statistical Inference, though not necessarily from taking that module.

This course is concerned with multivariate statistical analysis of microbiome data. We will briefly cover foundational concepts in microbial ecology, molecular biology, bioinformatics, and DNA sequencing.

The main focus of the course will be on developing an understanding of multivariate analysis of microbiome data. Practical skills to be developed in this course include managing high-dimensional and structured data in metagenomics, visualization and representation of high-dimensional data, normalization, filtering, and mixture-model noise modeling of count data, as well as clustering and predictive model building.

Module 9: Statistics and Modeling with Novel Data Streams

Instructor(s): Nsoesie, Elaine; Santillana, Mauricio; Vespignani, Alessandro

Prerequisites: This module assumes knowledge of the material in Module 1: Probability and Statistical Inference and Module 2: Mathematical Models of Infectious Diseases, though not necessarily from taking those modules. Familiarity with a programming language (Python, R, Matlab or other) is expected.

This module focuses on digital data sources and novel data streams such as geo-localized population and mobility data, wearable devices, web participatory platforms, web search data and social media updates. We will provide an introduction to different digital data sources and the technical challenges in their collection, storage, and analysis. We will review the integration of digital data sources with statistical and mechanistic modeling of infectious diseases. The course will provide an introduction to the use of time series from novel data streams for epidemic forecasting. We will describe the construction of synthetic populations and the calibration of highly detailed individual-based models.



Session 4 – Wednesday, July 18, 1:30-5 p.m.; Thursday, July 19, 8:30 a.m.-5 p.m., and Friday, July 20, 8:30 a.m.-5 p.m.

Module 10: MCMC II for Infectious Diseases

Instructors: Kypraios, Theodore; O'Neill, Philip

Prerequisites: The course assumes all the material in Module 4: MCMC I for Infectious Diseases or the equivalent knowledge of MCMC. Students are expected to have a working knowledge of the R computing environment. Programming will be in R. Students new to R should complete an extensive tutorial before the module. This module assumes knowledge of the material in Module 1: Probability and Statistical Inference, though not necessarily from taking that module.

Recommended, but not required: Knowledge of the material from Module 2: Mathematical Models of Infectious Diseases or Module 6: Stochastic Epidemic Models with Inference, or the equivalent.

This module continues from Module 4 by looking in detail at practical implementation issues for MCMC methods when applied to data from infectious disease outbreaks. The main focus will be on inference for the SIR (susceptible-infected-removed) model. Topics include parameterization, methods for improving convergence, assessing MCMC output, and data augmentation methods. Programming will be carried out in R.

Module 11: Contact Network Epidemiology

Instructors: Hladish, Thomas; Miller, Joel

Prerequisite: Previous programming experience in some language is expected.

Recommended but not required: Familiarity with Python, which will be used for programming.

Interpreting population interactions as contact networks provides powerful mathematical and computational frameworks for modeling infectious diseases. This course introduces network concepts (e.g., nodes, degree, clustering, and modularity), and analytical and simulation-based approaches to contact network epidemiology. We will discuss both idealized and empirical networks and how modeling assumptions affect the tractability of different methods.

Students will use simple analytical models to predict disease properties such as threshold conditions, final sizes, epidemic probability, and epidemic dynamics. They will use the Python programming language and the NetworkX software library to represent and analyze networks, construct epidemic simulations, and model various intervention strategies.

Module 12: Evolutionary Dynamics and Molecular Epidemiology of Viruses

Instructors: Lemey, Philippe; Suchard, Marc

Prerequisites: This module assumes knowledge of the material in Module 1: Probability and Statistical Inference, though not necessarily from taking that module.

This module covers the use of phylogenetic and bioinformatic tools to analyze pathogen genetic variation and to gain insight into the processes that shape pathogen diversity. The module focuses on phylogenies and how these relate to population genetic processes in infectious diseases.

In particular, the module will cover Bayesian Evolutionary Analysis by Sampling Trees (BEAST). This software will be used in class exercises that are mainly focused on estimating epidemic time scales, reconstructing changes in viral population sizes through time, and inferring the spatial diffusion of viruses. Evolutionary processes including recombination and selection will also be considered.

 



Session 5 – Monday, July 23, 8:30 a.m.-5 p.m.; Tuesday, July 24, 8:30 a.m.-5 p.m., and Wednesday, July 25, 8:30 a.m.-Noon

Module 13: Causal Inference

Instructors: Halloran, M. Elizabeth; Richardson, Thomas

Prerequisites: This module assumes knowledge of the material in Module 1: Probability and Statistical Inference, though not necessarily from taking that module. A working knowledge of R or SAS would be helpful.

This module provides an introduction to causal inference. Topics to be covered include potential outcomes, directed acyclic graphs, confounding, g-methods, instrumental variables, mediation, principal stratification, and interference. The methods will be illustrated using infectious disease examples, with analysis carried out in SAS and/or R.

Module 14: Spatial Statistics in Epidemiology and Public Health

Instructors: Wakefield, Jonathan; Waller, Lance

Prerequisites: This module assumes knowledge of the material in Module 1: Probability and Statistical Inference, though not necessarily from taking that module. Some prior knowledge of R would be helpful.

Spatial methods are now used in many disciplines and play an important role in epidemiology and public health. This module gives an introduction to spatial methods. In particular, we will present methods for assessment of clustering, cluster detection, spatial regression, small area estimation, and disease mapping. Methods will be described for both point data, in which cases and non-cases (or a sample thereof) have an associated point location, and count data, in which the numbers of cases and non-cases in a set of geographical areas are available.

An introduction to Geographic Information Systems (GIS) will be provided. The important extension to space-time analysis will be described, which is crucial for the analysis of infectious disease data with a spatial component.

Many examples will be presented, with analysis carried out in the R programming environment.

Reference: Waller, L. and Gotway, C. (2004). Applied Spatial Statistics for Public Health Data. New York, John Wiley and Sons.

Module 15: Pathogen Evolution, Selection, and Immunity

Instructors: Bedford, Trevor; Cobey, Sarah

Prerequisites: This module assumes knowledge of the material from Module 2: Mathematical Models of Infectious Diseases, though not necessarily from taking that module.

Recommended but not required: Knowledge of the material from Module 12: Evolutionary Dynamics and Molecular Epidemiology of Viruses. Programming exercises will be conducted in Python; some familiarity would be helpful.

This module provides an introduction to modeling antigenically diverse pathogen populations. Complementary epidemiological and evolutionary approaches will be covered.

The first part of the course will introduce multistrain compartmental models and potential mechanisms of competition. These simple models will be contrasted with models with more complex assumptions (e.g., multiple forms of immunity and spatial structure). We will review how to statistically investigate multistrain models with longitudinal data from individuals and time series data from populations.

The second part of the course will show how, using the coalescent as a neutral expectation, evolutionary pressures can be quantified using sequence data. We will detail bioinformatic methods to build phylogenies, quantify selective pressures and estimate pathogen population structure. Methods to measure pathogen phenotypic similarity and antigenic evolution, such as antigenic cartography, will be introduced.

