Minimax-optimal rates for sparse additive models over kernel classes via convex programming

Report Number
795
Authors
Garvesh Raskutti, Martin J. Wainwright, Bin Yu
Abstract

Sparse additive models are families of $d$-variate functions that have the additive decomposition \mbox{$f^* = \sum_{j \in S} f^*_j$,} where $S$ is an unknown subset of cardinality $s \ll d$. We consider the case where each component function $f^*_j$ lies in a reproducing kernel Hilbert space, and analyze a simple kernel-based convex program for estimating the unknown function $f^*$. Working within a high-dimensional framework that allows both the dimension $d$ and sparsity $s$ to scale, we derive convergence rates in the $L^2(\mathbb{P})$ and $L^2(\mathbb{P}_n)$ norms. These rates consist of two terms: a \emph{subset selection term} of the order $\frac{s \log d}{n}$, corresponding to the difficulty of finding the unknown $s$-sized subset, and an \emph{estimation error} term of the order $s \, \nu_n^2$, where $\nu_n^2$ is the optimal rate for estimating a univariate function within the RKHS. We complement these achievability results by deriving minimax lower bounds on the $L^2(\mathbb{P})$ error, thereby showing that our method is optimal up to constant factors for sub-linear sparsity $s = o(d)$. Thus, we obtain optimal minimax rates for many interesting classes of sparse additive models, including polynomials, splines, and finite-rank kernel classes, as well as Sobolev smoothness classes.
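To make the kernel-based convex program concrete, the sketch below shows one natural instance of such an estimator (an illustrative formulation, not the paper's code): a penalized kernel least-squares problem in which each component $f_j$ is regularized through both its empirical norm $\|f_j\|_n$ and its RKHS norm $\|f_j\|_{\mathcal{H}}$. By the representer theorem each component can be written as $f_j(\cdot) = \sum_i \alpha_{ji} k(\cdot, x_{ij})$, so the program reduces to a finite-dimensional second-order cone problem in the coefficient vectors. The Gaussian kernel, penalty levels `lam` and `rho`, and the synthetic data are illustrative assumptions.

```python
# Minimal sketch of a doubly-penalized kernel least-squares estimator for a
# sparse additive model; kernel and tuning parameters are illustrative choices.
import numpy as np
import cvxpy as cp

def gaussian_kernel(x, bandwidth=0.5):
    """Gram matrix of a univariate Gaussian kernel (illustrative choice)."""
    diff = x[:, None] - x[None, :]
    return np.exp(-diff**2 / (2 * bandwidth**2))

def fit_sparse_additive(X, y, lam, rho, bandwidth=0.5):
    """Solve min over (alpha_1, ..., alpha_d) of
        (1/2n) * ||y - sum_j K_j alpha_j||^2
        + lam * sum_j ||f_j||_n + rho * sum_j ||f_j||_H,
    where ||f_j||_n = ||K_j alpha_j||_2 / sqrt(n) and
    ||f_j||_H = sqrt(alpha_j^T K_j alpha_j)."""
    n, d = X.shape
    Ks = [gaussian_kernel(X[:, j], bandwidth) for j in range(d)]
    # Cholesky factors K_j = L_j L_j^T, so that ||f_j||_H = ||L_j^T alpha_j||_2.
    Ls = [np.linalg.cholesky(K + 1e-8 * np.eye(n)) for K in Ks]

    alphas = [cp.Variable(n) for _ in range(d)]
    fitted = sum(Ks[j] @ alphas[j] for j in range(d))
    loss = cp.sum_squares(y - fitted) / (2 * n)
    emp_norms = sum(cp.norm(Ks[j] @ alphas[j], 2) / np.sqrt(n) for j in range(d))
    rkhs_norms = sum(cp.norm(Ls[j].T @ alphas[j], 2) for j in range(d))

    prob = cp.Problem(cp.Minimize(loss + lam * emp_norms + rho * rkhs_norms))
    prob.solve()
    return [a.value for a in alphas]

# Tiny synthetic example: only the first s = 2 of d = 10 coordinates are active.
rng = np.random.default_rng(0)
n, d = 100, 10
X = rng.uniform(-1, 1, size=(n, d))
y = np.sin(np.pi * X[:, 0]) + X[:, 1]**2 + 0.1 * rng.standard_normal(n)
alphas = fit_sparse_additive(X, y, lam=0.1, rho=0.01)
```

The group-type penalties on the empirical norms encourage most components to be exactly zero, which is the mechanism behind the $\frac{s \log d}{n}$ subset selection term, while the RKHS-norm penalties control the smoothness of the selected components, matching the $s \, \nu_n^2$ estimation error term.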
