BIOST 558 Statistical Machine Learning for Data Scientists (5)
Bias-variance trade-off; training versus test error; overfitting; cross-validation; subset selection methods; regularized approaches for linear/logistic regression: ridge and lasso; non-parametric regression: trees, bagging, random forests; local regression and splines; generalized additive models; support vector machines; k-means and hierarchical clustering; principal components analysis. Offered: jointly with DATA 558/STAT 558; Spring