Sandrine Dudoit

Professor and Chair
Primary Research Area: 
Applied & Theoretical Statistics
Sub-Focus: 
Applied Statistics, Bioinformatics/Biostatistics, Machine Learning
Email: 
sandrine [at] stat [dot] berkeley [dot] edu
Office / Location: 
383 Evans Hall
Office Hours: 
Tuesday, 1-2 pm

I obtained a Bachelor's degree (1992) and a Master's degree (1994) in Mathematics from Carleton University, Ottawa, Canada. I first came to UC Berkeley as a graduate student and earned a PhD in 1999 from the Department of Statistics. My doctoral research, under the supervision of Professor Terence P. Speed, concerned the linkage analysis of complex human traits. From 1999 to 2000, I was a postdoctoral fellow at the Mathematical Sciences Research Institute, Berkeley. Before joining the faculty at UC Berkeley in July 2001, I underwent two years of postdoctoral training in genomics in the laboratory of Professor Patrick O. Brown, Department of Biochemistry, Stanford University. My work in the Brown Lab involved the development and application of statistical methods and software for the analysis of microarray gene expression data.

Research Interests: 

My methodological research interests concern high-dimensional inference and include exploratory data analysis (EDA), dimensionality reduction, visualization, loss-based estimation with cross-validation (e.g., density estimation, classification, regression, model selection), cluster analysis, and multiple hypothesis testing.

Much of my methodological work is motivated by statistical inference questions arising in biological research and, in particular, the design and analysis of high-throughput microarray and sequencing gene expression experiments, e.g., single-cell transcriptome sequencing (RNA-Seq) for discovering novel cell types and for the study of stem cell differentiation. My contributions include exploratory data analysis, normalization and expression quantitation, differential expression analysis, class discovery, prediction, inference of cell lineages, and integration of biological annotation metadata (e.g., Gene Ontology (GO) annotation).

I am also interested in statistical computing and, in particular, reproducible research. I am a founding core developer of the Bioconductor Project, an open-source and open-development software project for the analysis of biomedical and genomic data.