Joint covariate selection for grouped classification

October 1, 2007

Report Number

743

Authors

Guillaume Obozinski, Ben Taskar, Michael Jordan

Abstract

We address the problem of recovering a common set of covariates that are simultaneously relevant to several classification problems. We propose a joint measure of complexity for the group of problems that couples covariate selection across them. By penalizing the sum of L2 norms of the blocks of coefficients associated with each covariate across the different classification problems, we encourage similar sparsity patterns in all models. To fit parameters under this regularization, we propose a blockwise boosting scheme that follows the regularization path. As the regularization coefficient decreases, the algorithm concurrently maintains and updates a growing set of covariates that are simultaneously active for all problems.

We show empirically that this approach outperforms independent L1-based covariate selection on several data sets, in both accuracy and the number of selected covariates.
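
To illustrate the penalty described in the abstract, the following minimal sketch computes the sum of L2 norms over per-covariate blocks of coefficients and compares it with a plain L1 penalty. It is not the report's implementation: the function names, the layout of the coefficient matrix (one row per covariate, one column per classification problem), and the toy values are all assumptions made for this example, and the blockwise boosting scheme itself is not shown.

```python
import math


def block_l1_l2_penalty(W):
    """Sum, over covariates (rows), of the L2 norm of that covariate's
    coefficients across all classification problems (columns)."""
    return sum(math.sqrt(sum(w * w for w in row)) for row in W)


def l1_penalty(W):
    """Plain L1 penalty: sum of absolute values of all coefficients."""
    return sum(abs(w) for row in W for w in row)


def active_covariates(W, tol=0.0):
    """Indices of covariates with a nonzero coefficient in any problem."""
    return [j for j, row in enumerate(W) if any(abs(w) > tol for w in row)]


# Hypothetical coefficients: 2 covariates (rows) x 2 problems (columns).
# Both problems use covariate 0.
shared = [[3.0, 4.0],
          [0.0, 0.0]]
# Each problem uses a different covariate.
scattered = [[3.0, 0.0],
             [0.0, 4.0]]

for name, W in [("shared", shared), ("scattered", scattered)]:
    print(name,
          "L1:", l1_penalty(W),
          "block L1/L2:", block_l1_l2_penalty(W),
          "active:", active_covariates(W))
```

In this toy case both matrices have the same L1 penalty (7.0), but the block penalty is smaller when the two problems share a covariate (5.0) than when they use different ones (7.0), which is the sense in which the penalty encourages a common sparsity pattern.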

PDF File

743.pdf