Presentation: Tuning Multiple Penalty Parameters: Methods and Theory
Speaker: Jean Feng, Graduate Student, UW Biostatistics
Abstract: In high-dimensional and/or non-parametric regression problems, penalization is commonly used to control model complexity and induce desired structure. Each penalty has a weight parameter that indicates how strongly the corresponding structure should be enforced. Typically the parameters are chosen to minimize the validation loss from a training/validation split or a cross-validation procedure using a simple grid search, but this quickly becomes computationally intractable as the number of penalty parameters grows. In this talk, we consider two questions: (1) how can we efficiently tune many penalty parameters, and (2) how does the tuning procedure affect the generalization error of the selected model? To address the first question, we show how to calculate the gradient of the validation loss with respect to the penalty parameters, even for problems with non-smooth penalty functions, so a modified gradient descent algorithm can be used to tune them. Next, we show that if the penalty parameters minimize the validation loss, the error incurred from tuning an additional penalty parameter is roughly equivalent to that from adding a parameter to the model itself in parametric problems; the additional error is negligible in semi- and non-parametric problems. Through simulation studies, we show that our method can efficiently tune a hundred penalty parameters and that increasing the number of penalty parameters can decrease the generalization error. These results encourage the development of regularization methods with many penalty parameters.
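The core idea of gradient-based tuning can be illustrated on a smooth special case. Below is a minimal sketch (not the speaker's actual method, which also handles non-smooth penalties): a group-wise ridge regression with two penalty parameters, where the gradient of the validation loss with respect to each penalty parameter is obtained by implicit differentiation of the fitted coefficients, and the parameters are updated by gradient descent on the log scale. All names and the simulated data are illustrative assumptions.

```python
import numpy as np

# Simulated data: 10 features, only the first 5 carry signal.
rng = np.random.default_rng(0)
n, p = 80, 10
X = rng.normal(size=(n, p))
beta_true = np.concatenate([np.ones(5), np.zeros(5)])
y = X @ beta_true + 0.5 * rng.normal(size=n)

# Training/validation split.
Xt, yt = X[:50], y[:50]
Xv, yv = X[50:], y[50:]

# Two penalty parameters, one per feature group.
groups = [np.arange(5), np.arange(5, 10)]

def fit(lams):
    """Group-wise ridge: beta = (Xt'Xt + D)^{-1} Xt'yt, D diagonal with lam_g per group."""
    d = np.empty(p)
    for lam, idx in zip(lams, groups):
        d[idx] = lam
    A = Xt.T @ Xt + np.diag(d)
    beta = np.linalg.solve(A, Xt.T @ yt)
    return beta, A

def val_loss_and_grad(log_lams):
    """Validation MSE and its gradient w.r.t. log penalty parameters."""
    lams = np.exp(log_lams)
    beta, A = fit(lams)
    resid = Xv @ beta - yv
    loss = resid @ resid / len(yv)
    grad = np.empty(len(lams))
    for j, idx in enumerate(groups):
        # Implicit differentiation: d beta / d lam_j = -A^{-1} (e_j * beta),
        # where e_j masks the coordinates in group j.
        v = np.zeros(p)
        v[idx] = beta[idx]
        dbeta = -np.linalg.solve(A, v)
        # Chain rule (extra lam_j factor from the log parameterization).
        grad[j] = (2.0 / len(yv)) * (resid @ (Xv @ dbeta)) * lams[j]
    return loss, grad

# Gradient descent on the penalty parameters to minimize validation loss.
log_lams = np.log(np.array([1.0, 1.0]))
for _ in range(200):
    loss, g = val_loss_and_grad(log_lams)
    log_lams -= 0.1 * g
```

With many penalty parameters this replaces an exponentially large grid search with one descent trajectory; each step costs one model fit plus one linear solve per parameter.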