ePoster
A Model of Place Field Reorganization During Reward Maximization
M Ganesh Kumar and 3 co-authors
COSYNE 2025
Montreal, Canada
Presentation
Date TBA
Abstract
When rodents learn to navigate in a new environment, a high density of place fields emerges at reward locations, fields elongate against the trajectory, and individual fields change their spatial selectivity or drift even while behavior remains stable. Why place fields exhibit these characteristic phenomena during learning remains elusive. We develop a model that optimizes a normative goal: maximizing cumulative reward. Place fields are modelled as Gaussian basis functions that represent the spatial information of an environment and synapse directly onto an actor-critic for policy learning in 1D and 2D environments. Each field's amplitude, center, and width, as well as the actor weights, are updated online to maximize cumulative reward, while the critic minimizes the temporal-difference (TD) error. We demonstrate that place fields near the target increase their firing rates and move closer to the target during learning, and we argue analytically that each place field's dynamics are modulated by the value of its location. Next, we show that in both our normative model and fields trained to learn the successor representation, fields increase in size and their centers of mass shift backwards towards the start location. Interestingly, each model's spatial representation evolves differently during early learning before the two become aligned. Furthermore, we show that within a certain noise regime, the population vector correlation decreases while the representational similarity remains fairly stable, and fields that are important for stable navigation performance drift less. Finally, we show that incorporating these place field phenomena speeds up policy convergence when learning to navigate to a single target and when the target is shifted to a new location, suggesting a functional role for place field reorganization and noise in continual learning.
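The architecture described in the abstract can be sketched in a few dozen lines. This is a minimal illustrative 1D version, not the authors' implementation: the track layout, hyperparameters, softmax policy, and the particular chain-rule parameterization of the field updates are all assumptions made for the sketch. The key idea it shows is that the same TD error that trains the critic and actor also flows into each Gaussian field's amplitude, center, and width.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical 1D track: positions in [0, 1], reward at the goal.
N_FIELDS, N_ACTIONS = 16, 2        # place cells; actions: left / right
GOAL, STEP, GAMMA = 0.9, 0.05, 0.95
ALPHA_W, ALPHA_PHI = 0.1, 0.01     # illustrative learning rates

# Learnable place field parameters: amplitude a, center c, width s.
a = np.ones(N_FIELDS)
c = np.linspace(0.0, 1.0, N_FIELDS)
s = np.full(N_FIELDS, 0.1)

w_critic = np.zeros(N_FIELDS)              # value weights
w_actor = np.zeros((N_ACTIONS, N_FIELDS))  # policy weights

def fields(x):
    """Gaussian basis activity at position x."""
    return a * np.exp(-0.5 * ((x - c) / s) ** 2)

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

for episode in range(200):
    x = 0.0
    for t in range(100):
        phi = fields(x)
        pi = softmax(w_actor @ phi)
        act = rng.choice(N_ACTIONS, p=pi)
        x_new = np.clip(x + (STEP if act == 1 else -STEP), 0.0, 1.0)
        r = 1.0 if x_new >= GOAL else 0.0

        # TD error: the critic evaluates, and the same scalar error
        # gates the critic, actor, and place field updates.
        v = w_critic @ phi
        v_new = 0.0 if r else w_critic @ fields(x_new)
        delta = r + GAMMA * v_new - v

        w_critic += ALPHA_W * delta * phi
        grad_pi = -pi
        grad_pi[act] += 1.0                          # d log pi / d logits
        w_actor += ALPHA_W * delta * np.outer(grad_pi, phi)

        # Online field updates via the chain rule through the critic's
        # value estimate (one possible parameterization, not the only one).
        dphi_da = phi / a
        dphi_dc = phi * (x - c) / s ** 2
        dphi_ds = phi * (x - c) ** 2 / s ** 3
        a = np.maximum(a + ALPHA_PHI * delta * w_critic * dphi_da, 1e-3)
        c += ALPHA_PHI * delta * w_critic * dphi_dc
        s = np.maximum(s + ALPHA_PHI * delta * w_critic * dphi_ds, 1e-3)

        if r:
            break
        x = x_new
```

Under this sketch, fields whose activity predicts high value receive the largest parameter updates, which is the mechanism behind the reported drift of field centers toward the reward and the backward shift of the population's center of mass.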
To conclude, we develop a normative model that recapitulates three aspects of place field learning dynamics, unifies their underlying mechanisms, and offers testable predictions for future experiments.