Presentation: False Discovery Rate Control for Ising Variable Selection
Candidate: Yuxiang Xie, Graduate Student, UW Biostatistics
Committee Members: Gary Chan (Chair), Peter Gilbert, Michael Wu, Daniel Enquobahrie (GSR)
Abstract: In high-dimensional data analysis, it is important to control the fraction of false discoveries while retaining sufficient power for variable selection. In many contemporary applications, a large share of the covariates are discrete. In this paper, we propose the Ising knockoff (IKF) for variable selection in high-dimensional regression with discrete covariates. Under certain conditions, we show that the false discovery rate (FDR) is controlled below a target level in finite samples if the underlying Ising model of the covariates is known. In addition to an exact construction of knockoffs, we provide a second-order approximate construction for Ising variables intended for practical use. In simulations, we show that IKF controls the FDR well and has higher power than existing knockoff procedures, which are mostly tailored to continuous covariate distributions.
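
For readers unfamiliar with knockoff-based selection, the sketch below illustrates the generic knockoff+ selection step that knockoff procedures commonly use once knockoffs have been built. It is not taken from the talk: the abstract does not state which feature statistics or threshold rule IKF uses, so the function name `knockoff_plus_select`, the statistics `W`, and the choice of the knockoff+ rule are all assumptions made for illustration. The construction of the Ising knockoffs themselves is not shown.

```python
import numpy as np

def knockoff_plus_select(W, q):
    """Generic knockoff+ selection (illustrative; not necessarily the IKF rule).

    W : feature statistics, one per covariate. A large positive W[j] is
        evidence that covariate j matters; the sign is assumed to be
        symmetric for null covariates.
    q : target FDR level.
    Returns the data-dependent threshold and the selected indices.
    """
    W = np.asarray(W, dtype=float)
    # Candidate thresholds are the distinct nonzero magnitudes of W.
    candidates = np.sort(np.unique(np.abs(W[W != 0])))
    for t in candidates:
        # Estimated false discovery proportion at threshold t:
        # negatives beyond -t stand in for false positives beyond +t.
        fdp_hat = (1 + np.sum(W <= -t)) / max(1, np.sum(W >= t))
        if fdp_hat <= q:
            return t, np.flatnonzero(W >= t)
    # No threshold meets the target level: select nothing.
    return np.inf, np.array([], dtype=int)

# Hypothetical statistics for ten covariates (made-up values).
W_example = [5, 4, 3, 2.5, -1, 2, 1.5, -0.5, 0.2, 3.5]
threshold, selected = knockoff_plus_select(W_example, q=0.2)
```

The smallest threshold whose estimated false discovery proportion falls at or below q is chosen, which keeps as many covariates as the target level allows.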