Presentation: Robust Model-based Clustering for Heterogeneous Populations
Faculty Candidate Speaker: Briana Stephenson, Ph.D., Postdoctoral Research Associate, UNC Collaborative Studies Coordinating Center, University of North Carolina
Abstract: In a large, heterogeneous population, traditional clustering methods can produce a large number of clusters because of factors such as study size and regional diversity. Interpretability suffers as a result, since many of the resulting patterns differ from one another only in minor ways.
We address these data complexities by introducing a new method, Robust Profile Clustering (RPC). Built from a local partition process framework, RPC allows participants to cluster at two levels: (1) globally, with participants assigned to overall population-level clusters via an over-fitted mixture model, and (2) locally, with regional variations accommodated via a beta-Bernoulli process dependent on subpopulation differences. These clusters can then be linked to a probit response to generate a joint predictive clustering model, Supervised Robust Profile Clustering, which helps cluster global and local profiles according to the outcome of interest.
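As a rough illustration of the two-level idea described above, the following Python sketch simulates categorical responses in which each variable either follows a participant's global profile or deviates to a site-specific local profile. It is only a toy generative sketch under assumed settings, not the speaker's implementation: the function name, the Beta(1, 1) and Dirichlet hyperparameters, the uniform local cluster assignment, and all dimensions are hypothetical choices made for illustration, and the probit outcome model is left out.

```python
import numpy as np


def simulate_two_level_profiles(n_sites=3, n_per_site=50, n_items=8,
                                n_levels=3, k_global=5, k_local=2, seed=0):
    """Toy simulation of global/local profile clustering (all settings assumed)."""
    rng = np.random.default_rng(seed)

    # Over-fitted mixture: more global components than expected, with a
    # sparse Dirichlet prior so that some components receive little weight.
    pi = rng.dirichlet(np.full(k_global, 1.0 / k_global))

    # Response-level probabilities for each (cluster, item) pair.
    theta_global = rng.dirichlet(np.ones(n_levels), size=(k_global, n_items))
    theta_local = rng.dirichlet(np.ones(n_levels),
                                size=(n_sites, k_local, n_items))

    # Beta-Bernoulli piece: nu[s, j] is the probability that item j follows
    # the global profile in site s (Beta(1, 1) is an assumed prior).
    nu = rng.beta(1.0, 1.0, size=(n_sites, n_items))

    site = np.repeat(np.arange(n_sites), n_per_site)
    n = site.size
    global_cluster = rng.choice(k_global, size=n, p=pi)
    local_cluster = rng.integers(k_local, size=n)  # uniform, for simplicity

    # Bernoulli indicator per participant and item: True means "use global".
    follows_global = rng.random((n, n_items)) < nu[site]

    x = np.empty((n, n_items), dtype=int)
    for i in range(n):
        for j in range(n_items):
            if follows_global[i, j]:
                probs = theta_global[global_cluster[i], j]
            else:
                probs = theta_local[site[i], local_cluster[i], j]
            x[i, j] = rng.choice(n_levels, p=probs)

    return {"x": x, "site": site, "global_cluster": global_cluster,
            "local_cluster": local_cluster, "follows_global": follows_global}


if __name__ == "__main__":
    sim = simulate_two_level_profiles()
    # Share of items that follow the global profile, per site.
    for s in range(3):
        print(s, sim["follows_global"][sim["site"] == s].mean().round(2))
```

Fitting such a model would run in the opposite direction, inferring the cluster labels and the global/local indicators from the observed responses, typically by posterior sampling.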
Using data from large multi-site studies on birth defects and migrant population health, we discuss the application, impact, and utility of these methods for improving dietary pattern analysis in a large, diverse population such as that of the United States.