Authors & Affiliations
Olivier Codol, Nanda H. Krishna, Guillaume Lajoie, Matthew G. Perich
Abstract
During development, neural circuits are shaped continuously as we learn to control our bodies, with the goal of producing neural dynamics that enable the rich repertoire of behaviors we perform with our limbs. However, the nature of the teaching signal underlying this normative learning process remains elusive. Here, we test two well-established and biologically plausible theories, supervised learning (SL) and reinforcement learning (RL), that could explain how neural circuits develop the capacity for skilled movements. We trained recurrent neural networks to control a biomechanical model of a primate arm using either SL or RL and compared the resulting neural dynamics to populations of neurons recorded from the motor cortex of monkeys performing the same movements. Intriguingly, RL-trained networks, in contrast to SL-trained networks, produced neural activity that best matched their biological counterparts in terms of both the geometry and the dynamics of population activity. We show that the similarity between RL-trained networks and biological brains depends critically on matching the biomechanical properties of the limb. We then demonstrate that monkeys and RL-trained networks, but not SL-trained networks, show a strikingly similar capacity for robust short-term behavioral adaptation to a movement perturbation, indicating a fundamental and general commonality in the neural control policy. Together, our results support the hypothesis that neural dynamics for behavioral control emerge through a process akin to reinforcement learning. The resulting neural circuits offer numerous advantages for adaptable behavior over those produced by simpler and more efficient learning rules, and they expand our understanding of how developmental processes shape neural dynamics.