Faculty Candidate Seminar

Perceiving the 3D World from Images and Videos

Yu Xiang, Post Doc, University of Washington

With recent advances in artificial intelligence, we have witnessed the deployment of AI
systems that are capable of improving our daily lives, such as the Amazon checkout-free
shop and self-driving cars. However, deploying a personal robot that is able to assist
people in accomplishing real world tasks is still very challenging. The difficulty lies in the
complexity of the 3D world we live in, where a robot may encounter thousands of
objects, different scenes and human activities. For a robot to safely operate in such an
environment, it needs to effectively extract, represent and interpret information about
the 3D environment from different sensory data.

In this talk, I will present my efforts towards designing intelligent visual models that
perceive the 3D world from images and videos. I will start by describing a novel 3D
scene understanding framework that jointly reconstructs the geometry of a scene and
recognizes objects in the scene. Then, I will elaborate on the design of a new
convolutional neural network for recognizing the 3D location and 3D pose of objects in
cluttered scenes. The network is very robust to occlusions between objects and handles
symmetric objects elegantly. I will conclude this talk by demonstrating that our methods
for 3D object recognition and scene understanding provide useful information for
intelligent systems to conduct tasks in the real world, such as autonomous driving and
robot manipulation.

Yu Xiang is a postdoctoral researcher in the Robotics Research Lab at Nvidia. He
received his Ph.D. in electrical engineering from the University of Michigan at Ann Arbor
in 2016, advised by Prof. Silvio Savarese. He was a postdoctoral researcher with Prof.
Dieter Fox in Computer Science & Engineering at the University of Washington from
2016 to 2017 and was a visiting student researcher in the artificial intelligence lab at
Stanford University from 2013 to 2016. He received his M.S. and B.S. degrees, both
in computer science, from Fudan University in 2010 and 2007, respectively. His research
interests primarily focus on computer vision and perception for robotics, with an emphasis
on studying how an intelligent system or a robot can understand its 3D environment
from sensing.

Sponsored by