Semantic World Knowledge for NLP
Semantic world knowledge is crucial for resolving a variety of deep, complex ambiguities in natural language understanding. Annotated NLP corpora such as treebanks are too small to encode much of this knowledge, so instead we harvest such semantics from external unlabeled sources and from non-language modalities. I will first discuss our work on using Web-based knowledge features to improve dependency parsing, constituent parsing, and structured taxonomy induction. Next, I will talk about learning various types of dense, continuous, task-tailored representations (i.e., embeddings) for improved syntactic parsing. Finally, I will discuss current work on using other modalities as knowledge sources, e.g., cues from visual recognition and speech prosody.
This is joint work with various collaborators from UC Berkeley and TTI-Chicago.
Dr. Mohit Bansal is a research assistant professor at TTI-Chicago. He received a Ph.D. in CS from UC Berkeley in 2013 (advised by Dan Klein), an M.S. in CS from UC Berkeley, and a B.Tech. in CSE from the Indian Institute of Technology Kanpur. His research interests are statistical natural language processing and machine learning, with a focus on semantics (lexical, compositional, multimodal), ontologies, syntactic parsing, and coreference resolution. He has received an IBM Faculty Award (2015), a Google Faculty Research Award (2014), an ACL Long Paper Honorable Mention for Best Paper (2014), a Qualcomm Innovation Fellowship (2011), and a UC Berkeley Outstanding Graduate Student Instructor Award (2011). He has also spent time at Google Research, Microsoft Research, and Cornell University.