Learning Syntactic Structures from Visually Grounded Text and Speech
Meeting ID: 935 3938 9558
Language is highly structured: humans learn language naturally and efficiently, and use it to interact with the world. Even more impressively, humans implicitly develop and exploit this structure in everyday language processing, even though the explicit structure of sentences is almost never given to them.
In this talk, I will present our work on modeling syntax acquisition by learning from visually grounded text and speech. I will introduce the task of visually grounded grammar induction and our proposed solution based on visual concreteness estimation of text spans, as well as our recent work extending the framework to visually grounded speech. I will conclude by discussing remaining challenges and potential directions for future work.
Freda Shi is currently a Ph.D. candidate at the Toyota Technological Institute at Chicago. She will join the David R. Cheriton School of Computer Science at the University of Waterloo as an Assistant Professor and the Vector Institute as a Faculty Member in July 2024. Her research interests are in computational linguistics, natural language processing, and related aspects of machine learning. Her work has been recognized with best paper nominations at ACL 2019 and ACL 2021, as well as a Google PhD Fellowship. She received her B.S. in Intelligence Science and Technology from the School of EECS at Peking University, with a minor in Sociology.