Faculty Candidate Seminar

Towards Scalable Representation Learning for Visual Recognition

Saining Xie, Research Scientist, Facebook

Zoom link, passcode: 890941

Joint CSE & ECE Faculty Candidate


A powerful biological and cognitive representation is essential for humans’ remarkable visual recognition abilities. Deep learning has achieved unprecedented success in a variety of domains over the last decade. One major driving force is representation learning, which is concerned with learning efficient, accurate, and robust representations from raw data that are useful for a downstream classifier or predictor.

A modern deep learning system is composed of two core and often intertwined components: 1) neural network architectures and 2) representation learning algorithms. In this talk, we will present several studies in both directions. On the neural network modeling side, we will examine modern network design principles and how they affect the scaling behavior of ConvNets and recent Vision Transformers. Additionally, we will demonstrate how we can acquire a better understanding of neural network connectivity patterns through the lens of random graphs. In terms of representation learning algorithms, we will discuss our recent efforts to move beyond the traditional supervised learning paradigm and demonstrate how self-supervised visual representation learning, which does not require human-annotated labels, can outperform its supervised learning counterpart across a variety of visual recognition tasks. The talk will encompass a variety of vision application domains and modalities (e.g., 2D images and 3D scenes). The goal is to show existing connections between the techniques specialized for different input modalities and provide some insights into the diverse challenges that each modality presents. Finally, we will discuss several pressing challenges and opportunities that the “big model era” raises for computer vision research.

Bio: Saining Xie is a research scientist at Facebook AI Research (FAIR). He received his Ph.D. and M.S. degrees in computer science from the University of California San Diego, advised by Zhuowen Tu. Prior to that, he received his Bachelor’s degree from Shanghai Jiao Tong University. He has broad research interests in deep learning and computer vision, with a focus on developing deep representation learning techniques to push the boundaries of core visual recognition. He is a recipient of the Marr Prize Honorable Mention at ICCV 2015.


Cindy Estell


Linda Scovel

Student Host

Mohamed El Banani (CSE) and TBD (ECE)

Faculty Host

Justin Johnson (CSE) and Andrew Owens (ECE)