Presentation: Statistical Paradises and Paradoxes in Big Data (II): Multi-resolution Inference, Simpson's Paradox, and Individualized Treatments
Speaker: Xiao-Li Meng, Ph.D., Dean, Graduate School of Arts and Sciences, Whipple V. N. Jones Professor of Statistics, Harvard University
Abstract: One of the Big-Data buzzwords is personalized treatment, which sounds heavenly. But where on earth could anyone find enough guinea pigs to verify a treatment’s efficacy for “me”? More generally, any statistical inference is a process of “transition to similar,” namely, transferring our knowledge about a group of entities to a group of similar entities. As pondered by philosophers from Galen to Hume, how similar is similar? Wavelet-inspired Multi-resolution (MR) inference (Meng, 2014, COPSS 50th Anniversary Volume) allows us to frame this question theoretically, with the primary resolution defining the appropriate level of similarity. The search for the appropriate primary resolution level is a quest for a sensible bias-variance trade-off: estimating a less relevant treatment effect more precisely versus estimating a more relevant treatment effect for “me” less precisely. The MR framework also reveals that, whereas we must use the same resolution level in defining treatment estimands in order to avoid comparing apples and oranges, we can employ different resolution levels in forming their estimators to achieve a better mean-squared error when estimating treatment effects (Liu and Meng, 2014, The American Statistician). A real-life Simpson’s paradox from comparing kidney stone treatments will be used to engage the audience.