Authors & Affiliations
Christian Schmid, James Murray
Abstract
The ability of a brain or a neural network to learn efficiently depends crucially on both the task structure and the learning rule. Previous work has analyzed the dynamical equations describing learning in the relatively simplified context of the perceptron under the assumptions of a student-teacher framework or a linearized output. While these assumptions have facilitated theoretical insights, they have precluded a detailed exploration of the roles of the nonlinearity and the input-data distribution in determining the learning dynamics, limiting the applicability of the theories to real biological or artificial neural networks. Here, we use a stochastic-process approach to derive flow equations describing learning in a general setting. We then apply this framework to the case of a nonlinear perceptron performing a Gaussian binary classification task. We characterize the effects of the learning rule (supervised or reinforcement learning, SL/RL) and the input noise covariance structure on the perceptron's learning curve and on the forgetting curve as subsequent tasks are learned. In particular, we find that input-data noise affects the learning speed differently under SL vs. RL and determines how quickly learning of a task is overwritten by subsequent learning. Additionally, we verify our approach with real data using the MNIST dataset and accurately capture non-trivial learning dynamics for structured input distributions in a high-dimensional setting. This approach points a way toward analyzing learning dynamics for more-complex circuit architectures and provides a path for designing experiments to distinguish between learning rules that the brain might be using.
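The setting described in the abstract can be illustrated with a minimal numerical sketch (not the authors' actual derivation or code): a nonlinear (sigmoidal) perceptron trained on a Gaussian binary classification task, comparing a supervised delta-rule update against a simple reward-modulated (REINFORCE-style) update. All parameter values below (dimension, cluster separation, learning rate) are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def make_task(n, d=10, sep=2.0, noise=1.0):
    """Gaussian binary classification: two isotropic clusters at +/- mu."""
    mu = np.zeros(d)
    mu[0] = sep / 2
    labels = rng.integers(0, 2, n)  # class labels in {0, 1}
    x = rng.normal(0.0, noise, (n, d)) + np.where(labels[:, None] == 1, mu, -mu)
    return x, labels

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train(x, labels, rule="SL", eta=0.05):
    """Online training of a nonlinear perceptron under SL or RL."""
    w = np.zeros(x.shape[1])
    for xi, yi in zip(x, labels):
        p = sigmoid(w @ xi)  # model's probability that label = 1
        if rule == "SL":
            # supervised delta rule (cross-entropy gradient)
            w += eta * (yi - p) * xi
        else:
            # RL: sample a binary action, receive a reward, apply a
            # score-function (REINFORCE-style) weight update
            a = rng.random() < p
            r = 1.0 if a == yi else -1.0
            w += eta * r * (a - p) * xi
    return w

def accuracy(w, x, labels):
    return np.mean((sigmoid(x @ w) > 0.5) == labels)

x_tr, y_tr = make_task(4000)
x_te, y_te = make_task(1000)
w_sl = train(x_tr, y_tr, "SL")
w_rl = train(x_tr, y_tr, "RL")
print(f"SL accuracy: {accuracy(w_sl, x_te, y_te):.3f}")
print(f"RL accuracy: {accuracy(w_rl, x_te, y_te):.3f}")
```

In such a simulation, raising the input noise level slows learning under both rules, but (per the abstract's findings) the effect differs between SL and RL; averaging learning curves over many random seeds would make the comparison visible.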