Faculty Candidate Seminar
From Specialists to Generalists: Inductive Biases of Deep Learning for Higher Level Cognition
This event is free and open to the public
Zoom link, passcode: 543496
Abstract: Have the main principles required for deep learning to achieve human-level performance been discovered, with the main remaining obstacle being to scale up? Or do we need to follow a completely different research direction not built only on the principles already discovered with deep learning, in order to achieve the kind of cognitive competence displayed by humans?
My goal is to better understand the gap between current deep learning and human cognitive abilities so as to help answer these questions and suggest research directions for deep learning with the aim of bridging the gap towards human-level AI. My thesis is that deep learning brought remarkable progress but needs to be extended in qualitative and not just quantitative ways (larger datasets and more computing resources). I argue that having larger and more diverse datasets is important but insufficient without good architectural inductive biases. My main hypothesis is that deep learning has succeeded in part due to an appropriate set of inductive biases and that additional ones are needed to take us from where we are now to human-level intelligence, i.e., from good in-distribution generalization in highly supervised learning tasks (or where strong and dense rewards are available), such as object recognition in images, to strong out-of-distribution generalization and transfer learning to new tasks with low sample complexity. To make that concrete, I consider some of the inductive biases humans may exploit for higher-level and highly sequential cognitive processing. In addition to thinking about the learning advantage, my work focuses on knowledge representation in neural networks, with the idea that by decomposing knowledge into small pieces which can be recomposed dynamically as needed (to reason, imagine or explain at an explicit level), one may achieve the kind of systematic generalization which humans enjoy and which is obvious in natural language. This research program is both ambitious and practical, yielding concrete algorithms as well as a cohesive vision for long-term research towards generalization in a complex and changing world.