22nd Summer Institute in Statistical Genetics

Module 11: Mixed Models in Quantitative Genetics

Week 2, Session 4, Wednesday 1:30 PM - Friday 5:00 PM: Wed Jul 19 to Fri Jul 21
Instructor(s):

Assumes the material in Module 1: Probability and Statistical Inference and Module 4: Regression and Analysis of Variance.

Provides a foundation for Module 14: Advanced Quantitative Genetics and Module 18: Statistical & Quantitative Genetics of Disease.

“Mixed models” refers to the analysis of linear models with arbitrary (co)variance structures among and within random effects. Such structures may arise from factors such as relationships or shared environments, cytoplasm, maternal effects, and history. Mixed models are used in complex data analysis where the usual assumptions of independence and/or homogeneous variances fail.

Mixed models allow effects of nature to be separated from those of nurture and are emerging as the default method of analysis for human data. Such confounding is pervasive in human studies because subjects cannot be randomized with respect to household, choice, or prior history.

In plant breeding, growth and yield data are correlated due to shared locations, but the correlation diminishes with distance, resulting in spatial correlations. In animal breeding, performance data are correlated because individuals may be related and may share a common maternal environment as well as common pens or cages. Further, when individuals share a common space, they may experience indirect genetic effects (IGEs), which are inherited effects in one individual experienced as environmental effects in an associated individual. The evolution of cooperation and competition is based on IGEs, the estimation of which requires mixed model analysis. Detection of cytoplasmic and epigenetic effects relies heavily on mixed model methods because of shared maternal or parental histories.

Topics to be discussed include a basic matrix algebra review, the general linear model, derivation of the mixed model, BLUP and REML estimation, estimation and design issues, and Bayesian formulations.

Applications to be discussed include estimation of breeding values and genetic variances in general pedigrees, association mapping, genomic selection, spatial correlations and corrections, maternal genetic effects, detecting selection from genomic data, admixture detection and correction, direct and indirect genetic effects, models of general group and kin selection, and genotype by environment interaction models.