### Objectives

Statisticians help to design data collection plans, analyze data appropriately, and interpret and draw conclusions from those analyses. The central objective of the undergraduate major in Statistics is to equip students with the requisite quantitative skills that they can employ and build on in flexible ways.

Majors are expected to learn concepts and tools for working with data and have experience in analyzing real data that goes beyond the content of a service course in statistical methods for non-majors. Majors should understand [1] the fundamentals of probability theory, [2] statistical reasoning and inferential methods, [3] statistical computing, [4] statistical modeling and its limitations, and have skill in [5] description, interpretation and exploratory analysis of data by graphical and other means; [6] graduates are also expected to learn to communicate effectively.

### Information for Students

All majors meet at least once a semester with the undergraduate faculty advisor who reviews their progress, discusses future plans, and approves course selections. Students can also schedule appointments with the Undergraduate Student Affairs Officer, who works in close consultation with the faculty advisor.

Detailed information about the major and about the nature of the statistics profession is available on the web pages for the undergraduate program.

### Core Courses

Numbers in square brackets refer to those objectives enumerated above that are particularly relevant to the individual courses.

**Statistics 133**: Concepts in Computing with Data. [3,5,6] This course focuses on how to use the computer to conduct a statistical analysis of data, including how to acquire, clean and organize data, analyze data using computationally intensive statistical methods, and report findings. Students gain experience in computing as a supporting skill for statistical practice and research. They learn how to use existing high-level general purpose software to create new algorithms and functionality and to express statistical ideas and computations, and they learn about different data technologies and tools, when to use them, and what their trade-offs are. Students acquire skills in basic numeracy, graphics, modern computationally intensive methods, and simulation. Programming concepts include variables, data types, trees, and control flow. Data technologies topics include the digital representation of data, regular expressions for text manipulation, relational database management systems, the eXtensible Markup Language (XML), Web services for distributed functionality and methods, and Web publication. Extensive written reports are an integral part of the course.

**Statistics 134**: Concepts of Probability. [1] This is an introduction to probability theory, aimed at students who have had at least one year of calculus. The course covers the laws of probability, expectation, and conditioning, as well as all the standard distributions of random variables both discrete and continuous. Functions of random variables - sums, order statistics, and so on - are studied thoroughly, as are limit laws such as the law of large numbers and the central limit theorem, and the standard models: Bernoulli trials, sampling with and without replacement, Poisson process, univariate and bivariate normal, covariance and correlation. The course serves as preparation for later more systematic study of mathematical statistics and stochastic processes.

**Statistics 135**: Concepts of Statistics. [2,4,6] This is a comprehensive survey course in statistical theory and methodology, aimed at the goals of understanding the fundamental principles of statistical reasoning, achieving proficiency in data analysis, and developing written communication skills. To these ends, topics include descriptive statistics and data analysis, fundamental concepts of the theory of estimation and hypothesis testing, and methodology such as sampling, goodness-of-fit testing, analysis of variance, and least squares estimation. The laboratory includes computer-based data-analytic applications to a variety of subject matter and requires written reports.

**Statistics 140**: Probability for Data Science (4 units). An introduction to probability, emphasizing the combined use of mathematics and programming to solve problems. Random variables, discrete and continuous families of distributions. Bounds and approximations. Dependence, conditioning, Bayes methods. Convergence, Markov chains. Least squares prediction. Random permutations, symmetry, order statistics. Use of numerical computation, graphics, simulation, and computer algebra.

### Upper Division Electives

**Statistics 150**: Stochastic Processes. This course is especially recommended for students with a strong interest in probability theory or stochastic models, including models in finance, ecology, epidemiology, geophysics and other fields. Random walks, discrete time Markov chains, Poisson processes, continuous time Markov chains, queueing theory, point processes, branching processes, renewal theory, stationary processes, Gaussian processes.

**Statistics 151A**: Linear Modelling: Theory and Applications. This course is especially recommended for students with an interest in economics, social science, or statistical models and data analysis more generally. A coordinated treatment of linear and generalized linear models and their application. Linear regression, analysis of variance and covariance, random effects, design and analysis of experiments, quality improvement, log-linear models for discrete multivariate data, model selection, robustness, graphical techniques, productive use of computers, in-depth case studies.

**Statistics 152**: Sampling Surveys. This course is especially recommended for students with an interest in social science, marketing, and data collection more generally. Theory and practice of sampling from finite populations. Simple random, stratified, cluster, and double sampling. Sampling with unequal probabilities. Properties of various estimators including ratio, regression, and difference estimators. Error estimation for complex samples.

**Statistics 153**: Introduction to Time Series. This course is especially recommended for students with an interest in physical science, communication and information theory, economics, finance, or actuarial work. An introduction to time series analysis in the time domain and spectral domain. Topics will include: estimation of trends and seasonal effects, autoregressive moving average models, forecasting, indicators, harmonic analysis, spectra.

**Statistics 154**: Modern Statistical Prediction and Machine Learning. Theory and practice of statistical prediction. Contemporary methods as extensions of classical methods. Topics: optimal prediction rules, the curse of dimensionality, empirical risk, linear regression and classification, basis expansions, regularization, splines, the bootstrap, model selection, classification and regression trees, boosting, support vector machines. Computational efficiency versus predictive performance. Emphasis on experience with real data and assessing statistical assumptions.

**Statistics 155**: Game Theory. This course is especially recommended for students with an interest in mathematics, optimization or strategy, including business decisions. General theory of zero-sum, two-person games, including games in extensive form and continuous games, and illustrated by detailed study of examples.

**Statistics 157**: Seminar on Topics in Probability and Statistics. Substantial student participation required. The topics to be covered each semester that the course may be offered will be announced by the middle of the preceding semester; see departmental bulletins. Recent topics include: Bayesian statistics, statistics and finance, random matrix theory, high-dimensional statistics.

**Statistics 158**: The Design and Analysis of Experiments. An introduction to the design and analysis of experiments. This course covers planning, conducting, and analyzing statistically designed experiments with an emphasis on hands-on experience. Standard designs studied include factorial designs, block designs, Latin square designs, and repeated measures designs. Other topics covered include the principles of design, randomization, ANOVA, response surface methodology, and computer experiments.

### Options Within the Major

Statistics majors pursue many different careers. Some go to graduate programs in Statistics or other mathematical or scientific disciplines, some to MBA programs, some become actuaries, some go into teaching, and some into industry or government.

The major has a great deal of flexibility for students to acquire the skills and knowledge to take the next step, through upper-division electives and a mandatory three-course "cluster" outside the department. Many Statistics majors are double and triple majors; quantitative upper-division courses in their other majors are often suitable for the cluster.

Students intending to pursue graduate work in Statistics are encouraged to take Mathematics courses for their cluster.

Students intending to become actuaries can take courses in Economics and Demography, and are encouraged to take Statistics 151A, 152 and 153 for their electives. Some of these courses count towards "Validation by Educational Experience."

There is a special track for students who intend to become teachers. Such students take four Mathematics courses to prepare them for teaching a broader Mathematics curriculum, and are required to take only two upper-division Statistics elective courses.

Students interested in MBA programs are encouraged to take Business or Economics courses for their cluster, and to take Statistics 151A, 153 and either 152 or 155 as their upper-division electives.