BELIEF-BASED REINFORCEMENT LEARNING EXPLAINS THE DYNAMICS OF MEMORY-DEPENDENT NAVIGATION UNDER UNCERTAINTY
Cajal Neuroscience Center (CNC), CSIC
Presentation
Date TBA
Event Information
Poster Board
PS07-10AM-354
Poster
Abstract
Reward-driven spatial navigation is widely used to study the factors involved in long-term memory. However, memory-guided spatial navigation also depends on variables such as internal representations, environmental structure, and decision-making under uncertainty. In a foraging task, animals’ decisions are affected not just by the memory of past reward locations but also by memory-independent factors such as the exploration-exploitation tradeoff. This complicates the interpretation of experimental results.
To dissociate the contribution of memory from other factors affecting decision-making, we used a partially observable Markov decision process (POMDP) framework. We modeled mouse behavior from the 8-port spatial navigation task of Morales et al. (2020), in which mice made choices in a high-throughput setting. In this framework, each mouse is treated as a reward-maximizing agent making decisions under uncertainty and with imperfect memory. In each session, a mouse starts with prior beliefs (probability distributions over reward ports) that represent imperfect memories of past reward locations. Belief states are updated via Bayesian inference from the action-outcome history and represent the animal's current uncertain knowledge of the reward location and reward availability.
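As an illustration of the belief update described above, the following is a minimal sketch and not the model used in the poster. It assumes that exactly one port is baited, that the baited port pays out with a fixed probability, and that the other ports never pay out. The function name `update_belief` and the value `p_reward=0.8` are hypothetical.

```python
import numpy as np

N_PORTS = 8  # the task has 8 ports; everything else here is an assumption


def update_belief(belief, port, rewarded, p_reward=0.8):
    """Return the posterior over the baited port after one port visit.

    Assumes exactly one port is baited, that it pays out with probability
    p_reward (a hypothetical value), and that all other ports never pay out.
    """
    belief = np.asarray(belief, dtype=float)
    if rewarded:
        # A reward can only come from the baited port.
        likelihood = np.zeros_like(belief)
        likelihood[port] = p_reward
    else:
        # A miss is certain if another port is baited, and has
        # probability 1 - p_reward if the visited port is baited.
        likelihood = np.ones_like(belief)
        likelihood[port] = 1.0 - p_reward
    posterior = likelihood * belief  # Bayes' rule, unnormalized
    total = posterior.sum()
    if total == 0.0:
        raise ValueError("outcome has zero probability under the current belief")
    return posterior / total


# Start from a uniform prior (a stand-in for an imperfect memory).
prior = np.full(N_PORTS, 1.0 / N_PORTS)
after_miss = update_belief(prior, port=2, rewarded=False)
after_hit = update_belief(after_miss, port=5, rewarded=True)
```

In this sketch an unrewarded visit shifts probability mass away from the visited port and toward the others, while a rewarded visit concentrates the belief on the visited port. A non-uniform prior would play the role of the imperfect memory carried over from earlier sessions.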
We compared a fixed greedy policy with a deep reinforcement-learning algorithm trained on empirical choice and outcome data, enabling likelihood-based policy comparison. Preliminary analyses show that policies that only maximize water reward are insufficient to capture mouse behavior, whereas agents incorporating a physical distance penalty provide substantially better fits.
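A likelihood-based comparison of this kind can be sketched as follows. This is a simplified stand-in, not the deep reinforcement-learning agent from the poster: it assumes a softmax choice rule over a utility equal to the believed reward probability minus a linear distance cost. The names `choice_log_probs`, `sequence_log_likelihood`, `distance_cost`, and `inv_temp`, and the toy port layout, are all hypothetical.

```python
import numpy as np


def choice_log_probs(belief, distances, distance_cost=0.0, inv_temp=5.0):
    """Log-probabilities of choosing each port under a softmax policy.

    utility = believed reward probability - distance_cost * travel distance.
    The softmax form and the linear cost are assumptions for illustration.
    """
    utility = np.asarray(belief, dtype=float) - distance_cost * np.asarray(
        distances, dtype=float
    )
    z = inv_temp * utility
    z = z - z.max()  # subtract the max for numerical stability
    return z - np.log(np.exp(z).sum())


def sequence_log_likelihood(beliefs, positions, choices, dist_matrix,
                            distance_cost, inv_temp=5.0):
    """Sum the log-probabilities of the observed choices across trials."""
    total = 0.0
    for belief, pos, choice in zip(beliefs, positions, choices):
        log_p = choice_log_probs(belief, dist_matrix[pos], distance_cost, inv_temp)
        total += log_p[choice]
    return total


# Toy layout: 4 ports on a line with unit spacing (not the real apparatus).
idx = np.arange(4)
dist_matrix = np.abs(idx[:, None] - idx[None, :]).astype(float)

# Toy data: two trials with uniform beliefs (not empirical data).
beliefs = [np.full(4, 0.25), np.full(4, 0.25)]
positions = [0, 1]  # port occupied before each choice
choices = [1, 2]    # port chosen on each trial

ll_reward_only = sequence_log_likelihood(
    beliefs, positions, choices, dist_matrix, distance_cost=0.0
)
ll_with_distance = sequence_log_likelihood(
    beliefs, positions, choices, dist_matrix, distance_cost=0.5
)
```

Under these assumptions, setting `distance_cost=0.0` gives a reward-only policy, and the two log-likelihoods can be compared on the same choice sequence. With real data, the free parameters would need to be fitted and the comparison should account for the difference in model complexity.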
Our results demonstrate that interpreting spatial navigation as a direct readout of memory is insufficient, as belief-based decision-making under uncertainty, and not reward maximization alone, shapes observed behavior.