Regularized estimation of large covariance matrices

September 1, 2006

Report Number

716

Authors

Peter J. Bickel and Elizaveta Levina

Abstract

This paper considers estimating a covariance matrix of p variables from n observations by either banding the sample covariance matrix or estimating a banded version of the inverse of the covariance. We show that these estimates are consistent in the operator norm as long as (log p)^2/n converges to 0, and obtain explicit rates. The results are uniform over some fairly natural well-conditioned families of covariance matrices. We also introduce an analogue of the Gaussian white noise model and show that if the population covariance is embeddable in that model and well-conditioned, then the banded approximations produce consistent estimates of eigenvalues and associated eigenvectors of the covariance matrix. The results can be extended to smooth versions of banding and to non-Gaussian distributions with sufficiently short tails. A resampling approach is proposed for choosing the banding parameter in practice. This approach is illustrated numerically on both simulated and real data.
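
As a rough illustration of what banding the sample covariance matrix can look like, the sketch below assumes that banding with parameter k means keeping entries within k positions of the main diagonal and setting the rest to zero. The function name `band_covariance`, the simulated data, and the values of n, p, and k are hypothetical choices made for this example and are not taken from the report; the resampling procedure for choosing k is not shown.

```python
import numpy as np


def band_covariance(sample_cov, k):
    """Keep entries with |i - j| <= k and set all other entries to zero.

    Hypothetical helper for illustration; it assumes the variables have a
    natural ordering so that distance from the diagonal is meaningful.
    """
    p = sample_cov.shape[0]
    idx = np.arange(p)
    # Boolean mask that is True inside the band of half-width k.
    mask = np.abs(idx[:, None] - idx[None, :]) <= k
    return np.where(mask, sample_cov, 0.0)


# Simulated data (an assumption for this sketch): n observations of p variables.
rng = np.random.default_rng(0)
n, p, k = 50, 8, 2
X = rng.standard_normal((n, p))

# p x p sample covariance; rows of X are observations.
S = np.cov(X, rowvar=False)
S_banded = band_covariance(S, k)
```

Here `S_banded` agrees with `S` on the main diagonal and the first k off-diagonals and is zero elsewhere, so it stays symmetric.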

PDF File

716.pdf