Authors & Affiliations
Yuhan Helena Liu, Guillaume Lajoie
Abstract
Neuroscientists are increasingly turning to the mathematical framework of artificial neural network
(ANN) training for insights into biological learning mechanisms. This has motivated an influx of biologically
plausible learning rules that approximate backpropagation [1-9]. Despite achieving impressive performance
as quantified by accuracy, these studies have not examined the breadth of solution characteristics found by these
rules. In this work, we leverage established theoretical tools from deep learning to investigate the robustness
of the solutions these rules find, and to gain insight into the generalization properties of biologically relevant learning ingredients.
For complex tasks learned by overparameterized neural networks, there typically exist many solutions
(loss minima in parameter space) that result in similar accuracy but can differ drastically in generalization
performance and robustness to perturbations. Theoretical work from machine learning establishes that the
curvature of such minima matters: flat minima can yield better generalization [10-14]. Leveraging this
theory, we ask: how do proposed biologically motivated gradient approximations affect solution quality?
In recurrent networks, we demonstrate that several state-of-the-art biologically plausible learning rules tend
to approach high-curvature regions in synaptic weight space, which leads to worse generalization properties
compared to their machine learning counterparts. We track loss landscape curvature, as measured by the
eigenspectrum of the loss Hessian, in numerical experiments, and verify that this curvature informs generalization
performance. We derive analytical expressions explaining this phenomenon; these expressions predict,
and our numerical results confirm, that a large learning rate early in training, followed by gradual decay to avoid instabilities,
can help these rules avoid or escape narrow minima. We discuss how such learning-rate regulation
could be implemented biologically via neuromodulation [15], and formulate experimental predictions for
experiments in behaving animals. To our knowledge, our analysis is the first to highlight and study this gap in
solution quality between artificial and biological learning rules, thereby motivating further research into
how the brain learns robust solutions.
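
As an illustration of the curvature measure above: the top eigenvalue of the loss Hessian can be estimated without ever materializing the Hessian, using power iteration over Hessian-vector products. The following is a minimal sketch in PyTorch with a toy feedforward model and synthetic data; the model, data, and iteration count are illustrative assumptions, not the authors' setup (the paper's experiments use recurrent networks).

```python
import torch

def top_hessian_eigenvalue(loss, params, iters=50):
    """Estimate the dominant eigenvalue of the loss Hessian (a sharpness proxy).

    Illustrative sketch: power iteration with Hessian-vector products
    computed by double backward, so the full Hessian is never formed.
    """
    # First backward pass; keep the graph so we can differentiate again.
    grads = torch.autograd.grad(loss, params, create_graph=True)
    flat = torch.cat([g.reshape(-1) for g in grads])
    # Random unit starting vector for power iteration.
    v = torch.randn_like(flat)
    v /= v.norm()
    eig = 0.0
    for _ in range(iters):
        # Hessian-vector product: d(grad . v)/dtheta = H v.
        hv = torch.autograd.grad(flat @ v, params, retain_graph=True)
        hv = torch.cat([h.reshape(-1) for h in hv])
        eig = torch.dot(v, hv).item()  # Rayleigh quotient estimate
        v = hv / (hv.norm() + 1e-12)
    return eig

# Toy usage: sharpness of a small network's loss at its current weights.
torch.manual_seed(0)
model = torch.nn.Sequential(torch.nn.Linear(10, 32), torch.nn.Tanh(),
                            torch.nn.Linear(32, 1))
x, y = torch.randn(64, 10), torch.randn(64, 1)
loss = torch.nn.functional.mse_loss(model(x), y)
print(top_hessian_eigenvalue(loss, list(model.parameters())))
```

Power iteration converges to the largest-magnitude eigenvalue; near a minimum this dominant eigenvalue is a common proxy for sharpness, with larger values indicating a narrower (higher-curvature) minimum.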
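
The learning-rate schedule suggested above (a large rate early in training, then gradual decay) can likewise be written in a few lines. This is a generic hold-then-decay sketch; the constants eta_max, hold_steps, and decay are assumptions chosen for illustration, not values from the paper.

```python
import math

def learning_rate(step, eta_max=0.1, hold_steps=500, decay=1e-3):
    # Hold a large rate early in training (helps escape narrow,
    # high-curvature minima), then decay exponentially to avoid
    # late-training instabilities.
    if step < hold_steps:
        return eta_max
    return eta_max * math.exp(-decay * (step - hold_steps))
```

In a training loop, this value would set the optimizer's step size at each update; the abstract's hypothesis is that neuromodulation could regulate an analogous gain in biological circuits.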