Authors & Affiliations
Friedrich Schuessler, Francesca Mastrogiuseppe, Srdjan Ostojic, Omri Barak
Abstract
Trained artificial neural networks have become essential models in neuroscience. However, the robustness of these models to seemingly arbitrary design or initialization choices is currently debated: some studies report universality across solutions found by training, while others report considerable variability. One such choice, the magnitude of the output weights, has recently received much attention in machine learning, as it induces two very different classes of solutions in feed-forward networks. How this parameter affects solutions in recurrent neural networks (RNNs) trained on neuroscience tasks is not well understood.
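To make the design choice concrete, here is a minimal, hypothetical sketch (standard vanilla-RNN dynamics in NumPy, with arbitrary parameter values; not the authors' code) in which the initial magnitude of the readout weights enters as an explicit hyperparameter:

```python
import numpy as np

rng = np.random.default_rng(0)
N = 256                      # network size (hypothetical value)
dt, tau, T = 0.1, 1.0, 200   # Euler step, time constant, number of steps

def simulate(output_scale):
    """Simulate an autonomous vanilla RNN and read out its activity through
    output weights whose overall magnitude is set by `output_scale`."""
    W = rng.normal(0.0, 1.0 / np.sqrt(N), size=(N, N))                   # recurrent weights
    w_out = output_scale * rng.normal(0.0, 1.0 / np.sqrt(N), size=(N,))  # readout weights
    x = rng.normal(size=N)                                               # initial state
    states, outputs = [], []
    for _ in range(T):
        r = np.tanh(x)
        x = x + dt / tau * (-x + W @ r)   # Euler step of  tau dx/dt = -x + W tanh(x)
        states.append(x.copy())
        outputs.append(r @ w_out)         # scalar output z(t) = w_out . r(t)
    return np.array(states), np.array(outputs), w_out

# The design choice in question: small vs. large initial output weights.
states_small, z_small, w_small = simulate(output_scale=0.1)
states_large, z_large, w_large = simulate(output_scale=10.0)
```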
We first approached this question with an example: a cycling task inspired by recent experiments.
We found two qualitatively different classes of solutions. For large output weights, the internal dynamics were largely orthogonal to the output weight vector; we refer to such dynamics as oblique. For small output weights, the dynamics were instead aligned with the output weights. Only the oblique solution shared key features with the experimental data.
We developed a theory to understand the two solution types. Our key result is that stability constraints allow for two classes of solutions, distinguished by the correlation between dynamics and output weights: oblique dynamics for large output weights and aligned dynamics for small ones. Training RNNs across a variety of neuroscience tasks, we observed the two classes predicted by our theory. Solutions often differed qualitatively between the two classes and, for oblique solutions, also within a class. Finally, the two classes differed from those found in feed-forward networks, precisely because stability plays no role there.
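As a rough illustration of the distinction, the following sketch computes the fraction of activity variance lying along the readout direction; it is an assumed, simplified diagnostic for a single readout, not necessarily the measure used in the paper. Values near one indicate aligned dynamics, values near zero oblique ones:

```python
import numpy as np

def alignment(states, w_out):
    """Fraction of activity variance along the readout direction
    (states: time x neurons, w_out: neurons)."""
    w = w_out / np.linalg.norm(w_out)
    centered = states - states.mean(axis=0)
    return np.var(centered @ w) / centered.var(axis=0).sum()

# Toy check on synthetic trajectories (hypothetical data):
rng = np.random.default_rng(1)
w_out = rng.normal(size=100)
aligned = rng.normal(size=(500, 1)) * w_out    # activity concentrated along w_out
oblique = rng.normal(size=(500, 100))          # activity spread across all directions
print(alignment(aligned, w_out))   # close to 1
print(alignment(oblique, w_out))   # close to 1/100
```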
Beyond characterizing the effect of a single model choice, our results offer a new perspective on the relation between internal dynamics and output in the context of learning, and contribute to a better understanding of the ubiquitous observation of neural dynamics in orthogonal subspaces.