Prof. Laura Balzano
Title: Finding low-dimensional structure in messy data
Abstract: To draw inferences from large, high-dimensional datasets, we often seek simple structures that model the phenomena represented in those data. Low-rank linear structure is one of the most flexible and efficient such models, enabling prediction, inference, and anomaly detection. However, classical techniques for learning low-rank models assume the data have only minor corruptions that are uniform over samples. Modern research in optimization has begun to develop new techniques to handle realistic messy data, where data are missing, vary widely in quality, and/or are observed through nonlinear measurement systems.
In this talk we will focus on two problems. In the first, our data are heteroscedastic, i.e., each sample is corrupted by noise with one of several variances. This is common in problems like sensor networks or medical imaging, where different measurements of the same phenomenon are taken with sensing of different quality (e.g., high or low radiation). In this context, learning the low-rank structure via PCA suffers from treating all data samples as if they were equally informative. We will discuss our theoretical results on weighted PCA and new algorithms for the non-convex probabilistic PCA formulation of this problem. In the second part of the talk we will extend the matrix completion problem to cases where the columns are points on low-dimensional nonlinear algebraic varieties. We discuss two optimization approaches to this problem: a kernelized algorithm, and one that leverages existing low-rank matrix completion (LRMC) techniques on a tensorized representation of the data. We also provide a formal mathematical justification for the success of our method and experimental results showing that the new approach outperforms existing state-of-the-art methods for matrix completion in many situations.
Biography: Laura Balzano is an associate professor in Electrical Engineering and Computer Science at the University of Michigan. She is a recipient of the NSF CAREER Award, ARO Young Investigator Award, AFOSR Young Investigator Award, and faculty fellowships from Intel and 3M. She received the Vulcans Education Excellence Award at the University of Michigan. Her main research focus is on modeling with big, messy data (highly incomplete or corrupted data, uncalibrated data, and heterogeneous data) and its applications in a wide range of scientific problems. Her expertise is in statistical signal processing, matrix factorization, and optimization. Laura received a BS from Rice University, an MS from UCLA, and a PhD from the University of Wisconsin in Electrical and Computer Engineering.