Bin Yu to give Campion Lecture at RSS 2021 Conference

We are excited to share that Professor Bin Yu has been chosen to be the Campion (President's Invited) Lecturer at this September's Royal Statistical Society Conference!  

Also known as the President's Invited lecture, this yearly lecture was named after the late Sir Harry Campion, who was the first director of the Central Statistical Office, the forerunner of the Office for National Statistics. Campion was also the inaugural director of the United Nations Statistical Office and the Royal Statistical Society’s President from 1957 to 1959.

Read more about Professor Yu's achievement here. Her lecture is entitled "Veridical Data Science: the practice of responsible data analysis and decision-making" and the abstract is below. 


"A.I. is like nuclear energy -- both promising and dangerous" -- Bill Gates, 2019. 

Data Science is a pillar of A.I. and has driven most of the recent cutting-edge discoveries in biomedical research. In practice, Data Science has a life cycle (DSLC) that includes problem formulation, data collection, data cleaning, modeling, result interpretation and the drawing of conclusions. Human judgment calls are ubiquitous at every step of this process, e.g., in choosing data cleaning methods, predictive algorithms and data perturbations. Such judgment calls are often responsible for the "dangers" of A.I. To maximally mitigate these dangers, we developed a framework based on three core principles: Predictability, Computability and Stability (PCS). Through a workflow and documentation (in R Markdown or Jupyter Notebook) that allows one to manage the whole DSLC, the PCS framework unifies, streamlines and expands on the best practices of machine learning and statistics – bringing us a step forward towards veridical Data Science.

In this lecture, we will illustrate the PCS framework through the development of iterative random forests for predictive and stable non-linear interaction discovery and that of epiTree, a pipeline to discover epistasis interactions from genomics data. We will also briefly discuss two on-going PCS-driven software developments: VeridicalFlow and simChef for ease of PCS-compliant data analysis and data-driven simulations, respectively.