Union support recovery in high-dimensional multivariate regression

August 1, 2008

Report Number

761

Authors

Guillaume Obozinski, Martin J. Wainwright, Michael I. Jordan

Abstract

In the problem of multivariate regression, a K-dimensional response vector is regressed upon a common set of p covariates, with a matrix B* of regression coefficients. We study the behavior of the group Lasso using l1/l2 regularization for the union support problem, meaning that the set of s rows for which B* is non-zero is recovered exactly. Studying this problem under high-dimensional scaling, we show that group Lasso recovers the exact row pattern with high probability over the random design and noise for scalings of (n; p; s) such that the sample complexity parameter given by theta(n,p,s)=n/(2 psi(B*) log(p-s)) exceeds a critical threshold. Here n is the sample size, p is the ambient dimension of the regression model, s is the number of non-zero rows, and psi(B*) is a sparsity-overlap function that measures a combination of the sparsities and overlaps of the K-regression coefficient vectors that constitute the model. This sparsity-overlap function reveals that, if the design is uncorrelated on the active rows, block l1/l2 regularization for multivariate regression never harms performance relative to an ordinary Lasso approach, and can yield substantial improvements in sample complexity (up to a factor of K) when the regression vectors are suitably orthogonal. For more general designs, it is possible for the ordinary Lasso to outperform the group Lasso. We complement our analysis with simulations that demonstrate the sharpness of our theoretical results, even for relatively small problems.

PDF File

761.pdf