Presentation: Efficient Estimation, Robust Test and Design Optimality for Two-Phase Studies
Speaker: Donglin Zeng, Ph.D., Professor of Biostatistics, Co-Director of the Carolina Survey Research Laboratory, University of North Carolina Gillings School of Global Public Health
Abstract: Two-phase designs are cost-effective sampling strategies when some covariates are too expensive to measure on all study subjects. Well-known examples include case-control, case-cohort, nested case-control and extreme-tail sampling designs. In this talk, I will discuss three important aspects of two-phase studies: estimation, hypothesis testing and design optimality. First, I will discuss efficient estimation methods we have developed for two-phase studies. We allow expensive covariates to be correlated with inexpensive covariates collected in the first phase. Our proposed estimation approach is based on maximization of a modified nonparametric likelihood function through a generalization of the expectation-maximization algorithm. The resulting estimators are shown to be consistent, asymptotically normal and asymptotically efficient, with easily estimated variances. Second, I will focus on hypothesis testing in two-phase studies. We propose a robust test procedure based on imputation. The proposed procedure guarantees preservation of the type I error, allows high-dimensional inexpensive covariates, and yields higher power than alternative imputation approaches. Finally, I will present some recent developments in design optimality. We show that for general outcomes, the most efficient design is an extreme-tail sampling design based on certain residuals. This conclusion also explains the high efficiency of extreme-tail sampling for continuous outcomes and of the balanced case-control design for binary outcomes. Throughout the talk, I will present numerical evidence from simulation studies and illustrate our methods using different applications.