World Wide relies on analytics signals to operate securely and keep research services available. Accept to continue, or leave the site.
Review the Privacy Policy for details about analytics processing.
Graduate Student
University of Bristol
Showing your local timezone
Schedule
Wednesday, December 1, 2021
4:30 AM America/New_York
Recording provided by the organiser.
Domain
NeuroscienceOriginal Event
View sourceHost
Neuromatch 4
Duration
15 minutes
In the deep reinforcement learning (RL) community, motor control problems are usually approached from a reward-based learning perspective. However, humans are often believed to learn motor control through directed error-based learning. Within this learning setting, the control system is assumed to have access to exact error signals and their gradients with respect to the control signal. This is unlike reward-based learning, in which errors are assumed to be unsigned, encoding relative successes and failures. Here, we try to understand the relation between these two approaches, reward- and error- based learning, and ballistic arm reaches. To do so, we test canonical (deep) RL algorithms on a well-known sensorimotor perturbation in neuroscience: mirror-reversal of visual feedback during arm reaching. This test leads us to propose a potentially novel RL algorithm, denoted as model-based deterministic policy gradient (MB-DPG). This RL algorithm draws inspiration from error-based learning to qualitatively reproduce human reaching performance under mirror-reversal. Next, we show MB-DPG outperforms the other canonical (deep) RL algorithms on a single- and a multi- target ballistic reaching task, based on a biomechanical model of the human arm. Finally, we propose MB-DPG may provide an efficient computational framework to help explain error-based learning in neuroscience.
Michele Garibbo
Graduate Student
University of Bristol
Contact & Resources