AI Seminar

Set-Valued Dynamic Treatment Regimes for Competing Outcomes

Dan Lizotte
Assistant Professor, Department of Computer Science and Department of Epidemiology & Biostatistics
University of Western Ontario

Dynamic treatment regimes ("policies" in the reinforcement learning
literature) operationalize clinical decision-making as a sequence of
functions, one for each clinical decision, where each function maps patient
features to a recommended treatment. Reinforcement learning (RL) methods for
learning optimal dynamic treatment regimes, for example Q-learning, require
the specification of a single outcome or "reward" that measures the quality
of the decisions. However, in practice, clinical decision-making aims to
balance several potentially competing outcomes, e.g., symptom relief and
side-effect burden. When there are competing outcomes and patients do not
know or cannot communicate the relative importance of each of them, forming
a single reward that captures "optimal decision-making" is not possible. I
will discuss recent developments in RL for learning dynamic treatment
regimes that accommodate competing outcomes by recommending sets of
treatments at each decision point. The methods will be illustrated using
data from the CATIE schizophrenia study.

Professor Lizotte is interested in the areas of machine learning,
reinforcement learning, and statistics, particularly as they apply to
problems in health informatics. We are now seeing the development of
electronic data sources that record how thousands or even millions of
patients respond to different sequences of treatments over time, and these
have the potential to inform evidence-based non-myopic medical decision
making more effectively than previous studies. However, current techniques
are not always well-suited to this task. Professor Lizotte's basic research
aims to adapt and improve reinforcement learning, machine learning, and
statistical techniques so they can be applied to these new sources of
sequential medical data, and can in turn provide doctors with the best
available evidence for non-myopic decision making.
