Department Seminar Series: Johann Gagnon-Bartsch, Removing Unwanted Variation with Negative Controls
High-throughput biological data, such as microarray data and gene sequencing data, are plagued by unwanted variation: systematic errors introduced by variations in experimental conditions such as temperature or the chemical reagents used. This unwanted variation is often stronger than the biological variation of interest, which makes the data difficult to analyze and severely impedes researchers' ability to capitalize on the promise of the technology. One of the biggest challenges in removing unwanted variation is that the factors causing it (temperature, atmospheric ozone, etc.) are unmeasured or simply unknown. This makes the unwanted variation difficult to identify; the problem is essentially one of unobserved confounders. In my talk, I will discuss the use of negative controls to help solve this problem. A negative control is a variable known a priori to be unassociated with the biological factor of interest. I will begin with an example that introduces the notion of negative controls and demonstrates their effectiveness in dealing with unwanted variation. I will then discuss negative controls more generally, including a comparison with instrumental variables.