Authors & Affiliations
Masud Ehsani, Sen Cheng
Abstract
Experimental studies have demonstrated that spontaneous, task-independent replay of sequences encoding a learned route during sharp-wave–ripple (SWR) events can be beneficial for consolidating path memory [1] and discovering new behavioral strategies, such as forming novel space-action representations, finding shortcuts, and integrating and associating different routes [2]. Additionally, it has been shown experimentally that hippocampal theta sequences, generated during active exploration of the environment, play a crucial role in selecting appropriate actions in goal-directed tasks [3,4]. However, the mechanism by which the activation of a sequence of upcoming positions in each theta cycle and the replay of sequences during SWRs alter and shape action selection remains unclear.
In a closed-loop model of spatial navigation based on place cells and action selection cells [5], we investigate how the emergence of the theta code and sequence replay can modulate the action selection strategy. The model comprises a recurrent inhibitory-excitatory layer, in which place cells are formed from spatially tuned input received from the environment, connected to an action-selection neuronal population modeled as a ring attractor. The activity of the action selection neurons determines the agent's actions in the environment. The agent's task is to find a reward region and learn the optimal path through plasticity of place-to-action and place-to-place cell connections.
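As a rough illustration of how activity on a ring of action cells could be read out as a movement direction, the following minimal sketch uses a population-vector readout. It is not the model's implementation: the ring size `N_ACTION`, the weight layout `weights[k][i]`, and the function names are hypothetical, and the recurrent ring-attractor dynamics are omitted entirely.

```python
import math

N_ACTION = 8  # assumed number of action cells on the ring (illustrative)
# Preferred movement direction of each action cell, evenly spaced on the ring.
PREFERRED = [2 * math.pi * k / N_ACTION for k in range(N_ACTION)]


def action_drive(place_rates, weights):
    """Feedforward input to each action cell: sum_i weights[k][i] * rate_i."""
    return [sum(w * r for w, r in zip(row, place_rates)) for row in weights]


def decode_direction(action_rates):
    """Population-vector estimate of the bump position, in radians [0, 2*pi)."""
    x = sum(a * math.cos(p) for a, p in zip(action_rates, PREFERRED))
    y = sum(a * math.sin(p) for a, p in zip(action_rates, PREFERRED))
    return math.atan2(y, x) % (2 * math.pi)


# Toy place-to-action weights: place cell 0 drives the "east" action cell
# (index 0), place cell 1 drives the "north" action cell (index 2).
weights = [[0.0, 0.0] for _ in range(N_ACTION)]
weights[0][0] = 1.0
weights[2][1] = 1.0

east_only = decode_direction(action_drive([1.0, 0.0], weights))
both = decode_direction(action_drive([1.0, 1.0], weights))
```

With only place cell 0 active the decoded direction is east; with both place cells active the readout falls between east and north, which is the kind of blending that a bump on the ring would express.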
We observe that during rest periods, the random replay of place cell sequences representing a learned non-optimal path changes the connection strength among place cells through the spike-timing-dependent plasticity (STDP) rule. In the presence of connection delays, competition between shorter and longer paths leads to the shortening of place cell sequences, resulting in a sparser code. STDP also adjusts the connections from place cells to action cells, enabling the formation of representations of shortcuts. After replay, the direction of action associated with each location becomes a vector sum of the upcoming actions toward the goal. We further hypothesize that during active exploration, the activation of upcoming place cells in each theta cycle results in the simultaneous activation of the actions associated with each pair of states. This activation pattern may cause a bump to form in the ring attractor, directing the agent toward the shortcut path based on the summation of action vectors.
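The vector-sum idea can be illustrated with a small geometric sketch. It assumes unit-length action vectors and adds a hypothetical `discount` parameter for down-weighting more distant upcoming actions; neither assumption comes from the model itself, and no plasticity is simulated here.

```python
import math


def shortcut_direction(upcoming_angles, discount=1.0):
    """Direction (radians) of the weighted sum of upcoming unit action vectors.

    `discount` is a hypothetical factor: the n-th upcoming action is weighted
    by discount**n, so 1.0 means all upcoming actions count equally.
    """
    x = y = 0.0
    weight = 1.0
    for theta in upcoming_angles:
        x += weight * math.cos(theta)
        y += weight * math.sin(theta)
        weight *= discount
    return math.atan2(y, x)


# An L-shaped learned path: three steps east, then three steps north.
learned_path = [0.0] * 3 + [math.pi / 2] * 3

# Equal weighting points roughly along the diagonal toward the goal.
diagonal = shortcut_direction(learned_path)

# Stronger discounting keeps the direction closer to the first (eastward) steps.
near_sighted = shortcut_direction(learned_path, discount=0.5)
```

For the L-shaped path, equal weighting gives a direction of about 45 degrees, that is, straight along the diagonal shortcut, whereas discounting distant actions tilts the direction back toward the originally learned first step.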