Applications in Biology and Medicine

Statistics has a long and fruitful history of joint development with Biology and Medicine, with data at the core. For instance, Mendel’s fundamental laws of heredity were based entirely on statistical inference applied to data from carefully designed experiments. More recently, the advent of novel high-throughput and high-resolution biological assays has allowed the exploration of biological processes on a genomic scale and at the resolution of single cells. Applications range from addressing fundamental science questions (e.g., how does the brain work?) to disease prevention, diagnosis, and treatment. Statistical methods are essential for making sense of the massive amounts of data generated by these biotechnologies.

Our faculty have been at the forefront of research at the interface of Statistics with Biology and Medicine, contributing statistical methods and software for genome sequencing, the study of stem cell differentiation, neuroscience, evolutionary biology, epidemiology, infectious disease modeling, clinical trials, and personalized medicine, among others. A hallmark of the Berkeley approach is our close collaboration with biologists and clinicians and our engagement throughout the data science pipeline, including the framing of questions, study design, exploratory data analysis, and the interpretation, validation, and translation of the results into domain insight. 

Our faculty have played an essential role in the creation and growth of the Center for Computational Biology and form its largest group (10 of the Center’s 48 faculty).


Peter Bickel

statistics, machine learning, semiparametric models, asymptotic theory, hidden Markov models, applications to molecular biology

Peng Ding

causal inference in experiments and observational studies, with applications to biomedical and social sciences; contaminated data including missing data, measurement error, and selection bias

Sandrine Dudoit

statistics, applied statistics, data science, statistical computing, computational biology and genomics

Steve Evans

large random combinatorial structures, random matrices, superprocesses and other measure-valued processes, probability on algebraic structures (particularly local fields), applications of stochastic processes to biodemography, mathematical…

Adrian Gonzalez Casanova

I'm interested in Probability Theory and its applications in Theoretical Biology, including modelling, simulation and data-driven approaches.

My research primarily focuses on studying Interacting Particle Systems, Stochastic Duality, Coalescent Processes, Seed-…

Haiyan Huang

high dimensional and integrative genomic data analysis, network modeling, hierarchical multi-label classification, translational bioinformatics

Nicholas Jewell

infectious diseases (specifically HIV), chronic disease epidemiology, environmental epidemiology, survival analysis, human rights statistics


Michael Jordan

computer science, artificial intelligence, computational biology, statistics, machine learning, electrical engineering, applied statistics, optimization

Jon McAuliffe

machine learning, statistical prediction, variational inference, statistical computing, optimization

Rasmus Nielsen

evolution, molecular evolution, population genetics, human variation, human genetics, phylogenetics, applied statistics, genetics, evolutionary processes, evolutionary biology


causal inference, health services & policy analysis, biostatistics, discrete optimization

Elizabeth Purdom

computational biology, bioinformatics, statistics, data analysis, sequencing, cancer genomics


computational biology, statistical genetics, applied probability

P.B. Stark

uncertainty quantification and inference, inverse problems, nonparametrics, risk assessment, earthquake prediction, election auditing, geomagnetism, cosmology, litigation, food/nutrition

Alexander Strang

Stochastic Processes, Hierarchical Bayesian Inference, Random Graph Theory, Multi-Agent Training, Computational Biology, Solution Continuation, Optimization

Ryan Tibshirani

high-dimensional statistics, nonparametric estimation, distribution-free inference, machine learning, convex optimization, numerical methods, tracking and forecasting epidemics

Mark van der Laan

statistics, computational biology and genomics, censored data and survival analysis, medical research, inference in longitudinal studies

Bin Yu

statistical inference for high dimensional data and interdisciplinary research in neuroscience, remote sensing, and text summarization