Faculty Candidate Seminar

Toward Richer Visual Recognition: Attributes, Visual Phrases, and Sentences

Ali Farhadi
Postdoctoral Fellow
Carnegie Mellon University

What does it mean to do object recognition? My ultimate goal is to have a machine generate a human-quality description of images. Humans can form complete sentences describing images. These sentences identify the most interesting objects, the actions being performed, and the scene where the action occurs. Emulating this skill demands answers to fundamental questions about recognition: How can a recognition system deal with the vast number of objects in the real world? What should a recognition system report when it sees an unfamiliar object? What are the right quanta of recognition? In this talk, I will explore novel representations that try to answer these questions. First, I will describe the notion of "visual attributes" and show the benefits of adopting an attribute-centric framework for cross-category generalization and for providing richer image descriptions. I will also introduce "visual phrases": chunks of meaning bigger than objects but smaller than scenes. Finally, I will show that using visual phrases significantly improves the performance of current recognition systems.
Ali Farhadi is a Postdoctoral Fellow at the Robotics Institute at Carnegie Mellon University, working with Martial Hebert and Alexei Efros. He received his PhD from the Department of Computer Science at the University of Illinois at Urbana-Champaign under the supervision of David Forsyth. His work focuses on computer vision and machine learning. More specifically, he is interested in cross-category generalization, attribute-based object representations, deeper image understanding, and transfer learning and its applications to human activity and object recognition. Ali has been awarded the inaugural Google Fellowship in computer vision and image interpretation, the C.W. Gear Outstanding Graduate Award, the University of Illinois CS Fellowship, the Beckman CS/AI Award, and the CVPR 2011 Best Student Paper Award.

Sponsored by