Boosting with the $L_2$-Loss: Regression and Classification
This paper investigates a variant of boosting, $L_2$Boost, which is constructed from a functional gradient descent algorithm with the $L_2$-loss function. Based on an explicit stagewise refitting expression of $L_2$Boost, the case of (symmetric) linear weak learners is studied in detail in both regression and two-class classification. In particular, with the boosting iteration $m$ acting as the smoothing or regularization parameter, a new exponential bias-variance trade-off is found, in which the variance (complexity) term remains bounded as $m$ tends to infinity. When the weak learner is a smoothing spline, an optimal rate of convergence result holds for both regression and two-class classification, and the boosted smoothing spline adapts to higher-order, unknown smoothness. Moreover, a simple expansion of the 0-1 loss function is derived to reveal the importance of the decision boundary, the role of bias reduction, and the impossibility of an additive bias-variance decomposition in classification. Finally, simulation and real-data results demonstrate the attractiveness of $L_2$Boost, particularly with a novel component-wise cubic smoothing spline as an effective and practical weak learner.
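To make the stagewise refitting idea concrete, the following is a minimal sketch, not the paper's implementation. It assumes a linear weak learner given by a symmetric smoother matrix $S$ (here a ridge-type hat matrix on a polynomial basis, chosen purely for illustration), and repeatedly refits the current residuals. Unrolling that recursion gives the operator $I - (I - S)^{m+1}$ after $m$ boosting iterations, which the code checks numerically. The function names `l2boost_linear` and `boosting_operator`, the basis, and the penalty value are hypothetical choices.

```python
import numpy as np


def l2boost_linear(S, y, m):
    """Residual refitting with a linear learner S for m boosting iterations."""
    fit = S @ y  # initial fit (iteration 0)
    for _ in range(m):
        # Fit the learner to the current residuals and add the result.
        fit = fit + S @ (y - fit)
    return fit


def boosting_operator(S, m):
    """Closed form implied by the recursion above: I - (I - S)^(m+1)."""
    identity = np.eye(S.shape[0])
    return identity - np.linalg.matrix_power(identity - S, m + 1)


# Illustrative data and smoother (assumptions, not taken from the paper).
rng = np.random.default_rng(0)
n, p, lam = 50, 6, 1.0
x = np.linspace(0.0, 1.0, n)
y = np.sin(2.0 * np.pi * x) + 0.3 * rng.standard_normal(n)

X = np.vander(x, p)  # polynomial basis
# Ridge-type hat matrix: symmetric, with eigenvalues in [0, 1).
S = X @ np.linalg.solve(X.T @ X + lam * np.eye(p), X.T)

m = 20
iterative_fit = l2boost_linear(S, y, m)
closed_form_fit = boosting_operator(S, m) @ y
agree = np.allclose(iterative_fit, closed_form_fit)
```

In this sketch, `agree` compares the iterative fit with the closed-form operator applied to `y`. Because the eigenvalues of this particular `S` lie in $[0, 1)$, those of `boosting_operator(S, m)` stay in $[0, 1)$ for every `m`, which is consistent with a complexity term that remains bounded as $m$ grows.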