EECS Seminar

Learning Mid-Level Vision from Natural Data

Stella Yu
Director of Vision Group
International Computer Science Institute
3725 Beyster Building

Zoom link for remote participants, passcode:  596685

Abstract: Computer vision with deep learning has achieved super-human performance on various benchmarks.  However, deep neural network models are highly specialized for the task and the data they are trained on.  In contrast, human vision is universal: It is a flexible light meter, an instant geometer, a versatile material comparator, and a holistic parser.  More importantly, babies with normal vision eventually all learn to see out of an initial nebulous blur and from their widely different visual experiences.

I attribute this fascinating development of universal visual perception to the ability to learn mid-level visual representations from natural data without any external supervision.  My key insight is that there are structures in the visual data that can be discovered with model bottlenecks and minimal priors.  I will present our stream of efforts on unsupervised learning of visual recognition:  seeing objectness (figure/ground) from watching unlabeled videos, and recognizing individual objects and parsing a visual scene into hierarchical semantic concepts simply from a collection of unlabeled images.  Our data-driven computational modeling not only sheds light on human visual perception, but also opens up exciting new ways for scientists, engineers, and clinicians to look at their data and make novel discoveries.

Bio: Stella Yu received her Ph.D. from Carnegie Mellon University, where she studied robotics at the Robotics Institute and vision science at the Center for the Neural Basis of Cognition.  She is currently the Director of Vision Group at the International Computer Science Institute, a Senior Fellow at the Berkeley Institute for Data Science, and on the faculty of Computer Science, Vision Science, Cognitive and Brain Sciences at UC Berkeley.  Dr. Yu is interested not only in understanding visual perception from multiple perspectives, but also in using computer vision and machine learning to automate and exceed human expertise in practical applications.  Her group currently focuses on complex-valued deep learning, sound-vision integration, and actionable mid-level representation learning from non-curated data with minimal human annotations.  Dr. Yu leads multiple interdisciplinary projects and has a strong track record of joint research and successful product deployment.


Cindy Estell


Linda Scovel

Faculty Host

David Fouhey (CSE) and Andrew Owens (ECE)