AI Seminar

Rethinking State, Action, and Reward in Reinforcement Learning

Prof. Satinder Singh

Over the last decade and more, there has been rapid theoretical and empirical progress in reinforcement learning (RL) using the well-established formalisms of Markov decision processes (MDPs) and partially observable MDPs (POMDPs). At the core of these formalisms are particular formulations of the elemental notions of state, action, and reward that have served the field of RL well. In this talk, I will describe recent progress in rethinking these basic elements to take the field beyond (PO)MDPs. In particular, I will briefly describe older work on flexible notions of action called options, then some recent work on intrinsic rather than extrinsic rewards, and then spend the bulk of my time on recent work on predictive representations of state. I will conclude by arguing that, taken together, these advances point the way for RL to address the many challenges of building an artificial intelligence.

Sponsored by

AI Lab