Module Description: Modern medicine has moved from broad-spectrum treatments toward targeted therapeutics. With advances in high-throughput biotechnology, whole-genome assays have become inexpensive to run, and genetic features from these assays can be developed into biomarkers used to target therapies. A major difficulty is that the number of potential features is generally much greater than the number of patients. The goal of this module is to introduce ideas in high-dimensional predictive modeling, to discuss model validation and testing, and to give hands-on experience using these tools to build predictive biomarkers on real, high-dimensional datasets. Over the course of the module, participants will become familiar with various methods in penalized regression, applications of cross-validation, and ideas in multiplicity and selection bias. Participants will gain experience applying these methods in R.