ePoster

Optimal reward-rate in multi-task environments, and its consequences for behavior

Lucas Silva Simões, Alex Pouget, Peter Latham
COSYNE 2022 (2022)
Lisbon, Portugal

Abstract

Consider a task where you're accumulating noisy evidence about two options, and the longer you collect evidence the more likely you are to choose the correct one. You get $1,000 for choosing the correct option and $900 for choosing the incorrect one. How long should you wait before making a decision? Tasks like this have been studied for decades, but typically in isolation. In the real world, however, you always have the option of switching tasks. This can have a large effect on behavior: if, after making a choice, you are able to switch to an even more rewarding task, you won't take long at all, but if most tasks you encounter yield much lower rewards, you're likely to take a very long time. So the answer to how long you should take is: it depends on the reward statistics of other tasks. Here we provide a formulation for the problem of maximizing reward rate in a multi-task setting, and present an efficient reinforcement learning algorithm for solving it; the algorithm extends results in foraging theory to stochastic environments. We argue that human behavior aligns with what is expected from our algorithm. We illustrate this for two-task environments, and show that the amount of time spent on one task depends strongly on the reward structure of the other task and on the probability that the other task occurs. Our theory makes several experimentally testable predictions about human and animal behavior.
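
The dependence of deliberation time on the other task can be illustrated with a minimal sketch. It is not the reinforcement learning algorithm described in the abstract: it is a brute-force grid search over a fixed deliberation time, under several assumptions of our own. Accuracy is assumed to grow as a Gaussian cumulative distribution in the square root of elapsed time, the other task is assumed to have a fixed reward and duration, and the long-run reward rate is taken as mean reward per trial divided by mean time per trial. All function names and parameter values other than the $1,000 and $900 payoffs are hypothetical.

```python
import math


def accuracy(t, snr=1.0):
    """Assumed P(correct) after accumulating evidence for t seconds."""
    return 0.5 * (1.0 + math.erf(snr * math.sqrt(t) / math.sqrt(2.0)))


def reward_rate(t, other_reward, other_duration, other_prob,
                r_correct=1000.0, r_incorrect=900.0, iti=1.0):
    """Long-run reward per second when every evidence trial lasts t seconds.

    With probability other_prob a trial is the other task (fixed reward and
    duration, both assumed); otherwise it is the evidence task. iti is an
    assumed fixed gap between trials.
    """
    evidence_reward = r_incorrect + (r_correct - r_incorrect) * accuracy(t)
    mean_reward = (1.0 - other_prob) * evidence_reward + other_prob * other_reward
    mean_time = ((1.0 - other_prob) * (t + iti)
                 + other_prob * (other_duration + iti))
    return mean_reward / mean_time


def best_deliberation_time(other_reward, other_duration, other_prob,
                           t_max=20.0, steps=20000):
    """Grid search for the deliberation time that maximizes reward rate."""
    grid = [t_max * i / steps for i in range(steps + 1)]
    return max(grid, key=lambda t: reward_rate(
        t, other_reward, other_duration, other_prob))


# Other task is quick and lucrative: deliberating is costly.
rich = best_deliberation_time(other_reward=5000.0, other_duration=1.0,
                              other_prob=0.9)
# Other task is slow and nearly worthless: deliberating is cheap.
poor = best_deliberation_time(other_reward=1.0, other_duration=100.0,
                              other_prob=0.9)
print(f"rich alternative: {rich:.3f} s, poor alternative: {poor:.3f} s")
```

In this toy setting, the best deliberation time is much shorter when the alternative task is rich than when it is poor, which matches the qualitative claim above. The exact values depend entirely on the assumed accuracy curve and parameters.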

Unique ID: cosyne-22/optimal-rewardrate-multitask-environments-1289864b