High Dimensional Data Analysis

High-dimensional statistics focuses on data sets in which the number of features is comparable to, or larger than, the number of observations. Data sets of this type present a variety of new challenges, since classical theory and methodology can break down in surprising ways.

Researchers at Berkeley study both the statistical and computational challenges that arise in the high-dimensional setting. On the theoretical side, they bring to bear a range of techniques from statistics, probability, and information theory, including empirical process theory, concentration inequalities, random matrix theory, and free probability. Methodological innovations include new estimators for spectral properties of matrices, randomized procedures for sketching and optimization, and algorithms for decision-making in sequential settings. The work is motivated by, and applied to, various scientific and engineering disciplines, including computational biology, astronomy, recommender systems, financial time series, and climate forecasting.

Researchers

Peter Bickel

statistics, machine learning, semiparametric models, asymptotic theory, hidden Markov models, applications to molecular biology

Sandrine Dudoit

statistics, applied statistics, data science, statistical computing, computational biology and genomics

Will Fithian

theoretical and applied statistics

Vadim Gorin

Integrable probability, random matrices, asymptotic representation theory

Aditya Guntuboyina

nonparametric and high-dimensional statistics, shape constrained statistical estimation, empirical processes, statistical information theory

Haiyan Huang

high dimensional and integrative genomic data analysis, network modeling, hierarchical multi-label classification, translational bioinformatics

Jiantao Jiao

artificial intelligence, control and intelligent systems and robotics, communications and networking

Song Mei

data science, statistics, machine learning

Headshot

computational biology, machine learning, applied statistics, applied probability

P.B. Stark

uncertainty quantification and inference, inverse problems, nonparametrics, risk assessment, earthquake prediction, election auditing, geomagnetism, cosmology, litigation, food/nutrition

Alexander Strang

Stochastic Processes, Hierarchical Bayesian Inference, Random Graph Theory, Multi-Agent Training, Computational Biology, Solution Continuation, Optimization

Ryan Tibshirani

high-dimensional statistics, nonparametric estimation, distribution-free inference, machine learning, convex optimization, numerical methods, tracking and forecasting epidemics

Martin Wainwright

statistical machine learning, statistics, optimization and algorithms, artificial intelligence

Nikita

Mathematical Statistics, Applied Probability, and Statistical Learning Theory