ePoster

Deliberation gated by opportunity cost adapts to context with urgency in non-human primates

Maximilian Puelma Touzel, Paul Cisek, Guillaume Lajoie
COSYNE 2022 (2022)
Lisbon, Portugal
Presented: Mar 17, 2022

Abstract

Finding the right amount of deliberation, between insufficient and excessive, is a hard decision-making problem that depends on the value we place on our time. In existing reinforcement learning (RL) theory, average reward, putatively encoded by tonic dopamine, serves as the stationary opportunity cost of time. This cost often varies with context, however, and context changes over time, so current RL approaches do not handle task non-stationarity efficiently. Yet the brain's representation of, and computation with, the value of time, including its impact on the neural dynamics of deliberation, must account for this variation. Using non-human primates as a model, we offer a two-part proposal for how the brain achieves time-sensitive deliberation: the opportunity cost of time is (1) estimated adaptively and on multiple timescales from reward history and (2) represented directly as urgency, a previously characterized neural signal that lowers the threshold for decisions as within-trial deliberation goes on. We show that this simple, value-free strategy, which we call Performance-Gated Deliberation (PGD), is a heuristic approximation of the optimal average-reward reinforcement learning (AR-RL) strategy. We highlight that the context variation of urgency in both PMd and LIP recordings from separate tasks favors a trial-aware over a trial-unaware cost of time. Using the trial-aware version, we fit a PGD agent to decision times from the recorded behaviour of two non-human primates in a prediction task with non-stationary reward context. This PGD agent outperforms AR-RL optimal solutions in explaining the state-dependence of the behaviour, with the model of the hastier subject having a shorter inferred memory window and a larger inferred reward bias. The opportunity cost profiles also match the urgency signals extracted from simultaneous PMd recordings.
Our integrated research approach spanning cognitive and systems neuroscience grounds the value of time in its neural representation by revealing its impact on the dynamics of decision-making brain areas.
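
The two-part proposal in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. Everything beyond what the abstract states is an assumption made for illustration: the exponential (leaky) form of the reward-rate filters, the mixing weight `w_fast`, the linear growth of cost with elapsed time, the stopping rule "commit once expected reward reaches `r_max` minus the opportunity cost", and all function and parameter names.

```python
import math
from dataclasses import dataclass


@dataclass
class RewardRateEstimator:
    """Leaky average of reward per unit time (assumed filter form).

    `tau` is a hypothetical memory timescale: a small tau tracks recent
    context, a large tau tracks the long-run average.
    """
    tau: float
    rate: float = 0.0

    def update(self, reward: float, duration: float) -> float:
        # Longer trials carry more weight in the running estimate.
        alpha = 1.0 - math.exp(-duration / self.tau)
        self.rate += alpha * (reward / duration - self.rate)
        return self.rate


def cost_rate(fast: RewardRateEstimator, slow: RewardRateEstimator,
              w_fast: float = 0.5) -> float:
    """Combine two timescales into one cost per unit time (assumed mixing rule)."""
    return w_fast * fast.rate + (1.0 - w_fast) * slow.rate


def decision_time(expected_reward, rate, dt=1.0, r_max=1.0):
    """Return the first time at which expected reward meets the lowered threshold.

    The opportunity cost `rate * t` plays the role of urgency: it lowers the
    threshold `r_max - rate * t` as within-trial deliberation goes on.
    """
    for step, r in enumerate(expected_reward):
        t = step * dt
        if r >= r_max - rate * t:
            return t
    # No crossing within the trial: commit at the end (assumed fallback).
    return len(expected_reward) * dt


# Usage: update both estimators after each trial, then gate the next decision.
fast, slow = RewardRateEstimator(tau=5.0), RewardRateEstimator(tau=50.0)
for reward, duration in [(1.0, 4.0), (0.0, 6.0), (1.0, 3.0)]:
    fast.update(reward, duration)
    slow.update(reward, duration)

belief = [0.5, 0.6, 0.7, 0.8, 0.9, 1.0]  # hypothetical within-trial expected reward
t_commit = decision_time(belief, cost_rate(fast, slow))
```

In this sketch a richer recent reward history raises the cost rate, so the threshold falls faster and the agent commits earlier. With a cost rate of zero, the agent waits until the expected reward reaches `r_max`.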

Unique ID: cosyne-22/deliberation-gated-opportunity-cost-6c63bad8