There will be two presentations at this seminar.
Speaker: Angela Zhang, Graduate Student, UW Biostatistics
Abstract: Due to the high-dimensional nature of microbiome data, its analysis often relies on dimension-reduced graphical displays and penalized methods that account for the multicollinearity between taxa in each sample. Double principal coordinates analysis (DPCoA), an extension of principal components analysis (PCA), is a commonly used ordination method that uses a non-Euclidean distance matrix to incorporate ecologically defined structure in the analysis of microbiome data. Here, we extend these distance-based methods and introduce a framework of high-dimensional regression models that incorporate a variety of extrinsic information from our data. Through kernel-based methods, we can include similarities between samples in the dual space as well as similarities between taxa in the primal space to estimate the association between the composition of a subject’s microbiome and their clinical outcome or phenotype. Additionally, we adjust for the compositional structure of microbiome data, an important but commonly ignored issue, through a centered log-ratio transformation of our data. Using data from a recent study on the gut microbiome, we compare our kernel-penalized regression model against two popular regularization methods: ridge and lasso regression. This framework allows us to incorporate phylogeny into our analysis as well as account for the compositional nature of microbiome data.
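For readers unfamiliar with the centered log-ratio transformation mentioned in the abstract, here is a minimal sketch in Python. It is an illustration only and not the speaker's implementation: the function name `clr`, the pseudocount of 0.5 used to handle zero counts, and the toy count matrix are all assumptions made for this example.

```python
import numpy as np

def clr(counts, pseudocount=0.5):
    """Centered log-ratio transform of a samples-by-taxa count matrix.

    The pseudocount is an assumption of this sketch; it avoids log(0)
    for taxa that are absent from a sample.
    """
    x = np.asarray(counts, dtype=float) + pseudocount
    log_x = np.log(x)
    # Subtract each sample's mean log abundance, i.e. the log of the
    # geometric mean of that sample's (shifted) counts.
    return log_x - log_x.mean(axis=1, keepdims=True)

# Hypothetical toy data: 2 samples (rows) by 3 taxa (columns).
counts = np.array([[10, 0, 5],
                   [3, 7, 20]])
z = clr(counts)
```

Each row of `z` sums to zero, which reflects the fact that only the relative abundances within a sample carry information.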
Presentation #2: In this talk we’ll review a recent paper by James Johndrow, Kristian Lum, and David Dunson that develops theoretical results concerning the performance of record linkage in a mixture modeling framework.
Speaker: Serge Aleshin-Guendel, Graduate Student, UW Biostatistics