Mathematical Theory
“Brain theory, what is it or what should it be?”
In the neurosciences the need for some 'overarching' theory is sometimes expressed, but it is not always obvious what is meant by this. One can perhaps agree that in modern science observation and experimentation are normally complemented by 'theory', i.e. the development of theoretical concepts that help guide and evaluate experiments and measurements. A deeper discussion of 'brain theory' will require the clarification of some further distinctions, in particular: theory vs. model, and brain research (and its theory) vs. neuroscience. Other questions are: Does a theory require mathematics? Or even differential equations? Today it is often taken for granted that the whole universe, including everything in it (for example humans, animals, and plants), can be adequately treated by physics, and that theoretical physics is therefore the overarching theory. Even if this is the case, it has turned out that in some parts of physics (the historical example is thermodynamics) it may be useful to simplify the theory by introducing additional theoretical concepts that can in principle be 'reduced' to more complex descriptions on the 'microscopic' level of basic physical particles and forces. In this sense, brain theory may be regarded as part of theoretical neuroscience, which lies inside biophysics and therefore inside physics, or theoretical physics. Still, neuroscience and brain research typically use additional concepts 'outside' physics to describe results and help guide experimentation, beginning with neurons and synapses and names of brain parts and areas, up to concepts like 'learning', 'motivation', and 'attention'. Certainly, we do not yet have one theory that includes all these concepts. So 'brain theory' is still in a 'pre-Newtonian' state.
However, it may still be useful to understand in general the relations between a larger theory and its 'parts', or between microscopic and macroscopic theories, or between theories at different 'levels' of description. This is what I plan to do.
Unifying the mechanisms of hippocampal episodic memory and prefrontal working memory
Remembering events in the past is crucial to intelligent behaviour. Flexible memory retrieval, beyond simple recall, requires a model of how events relate to one another. Two key brain systems are implicated in this process: the hippocampal episodic memory (EM) system and the prefrontal working memory (WM) system. While an understanding of the hippocampal system, from computation to algorithm and representation, is emerging, less is understood about how the prefrontal WM system can give rise to flexible computations beyond simple memory retrieval, and even less is understood about how the two systems relate to each other. Here we develop a mathematical theory relating the algorithms and representations of EM and WM by showing a duality between storing memories in synapses versus neural activity. In doing so, we develop a formal theory of the algorithm and representation of prefrontal WM as structured, and controllable, neural subspaces (termed activity slots). By building models using this formalism, we elucidate the differences, similarities, and trade-offs between the hippocampal and prefrontal algorithms. Lastly, we show that several prefrontal representations in tasks ranging from list learning to cue-dependent recall are unified as controllable activity slots. Our results unify frontal and temporal representations of memory, and offer a new basis for understanding the prefrontal representation of WM.
Geometry of concept learning
Understanding the human ability to learn novel concepts from just a few sensory experiences is a fundamental problem in cognitive neuroscience. I will describe recent work with Ben Sorscher and Surya Ganguli (PNAS, October 2022) in which we propose a simple, biologically plausible, and mathematically tractable neural mechanism for few-shot learning of naturalistic concepts. We posit that the concepts that can be learned from few examples are defined by tightly circumscribed manifolds in the neural firing-rate space of higher-order sensory areas. Discrimination between novel concepts is performed by downstream neurons implementing a ‘prototype’ decision rule, in which a test example is classified according to the nearest prototype constructed from the few training examples. We show that prototype learning achieves high few-shot accuracy on natural visual concepts using both macaque inferotemporal cortex representations and deep neural network (DNN) models of these representations. We develop a mathematical theory that links few-shot learning to the geometric properties of the neural concept manifolds and demonstrate its agreement with our numerical simulations across different DNNs as well as different layers. Intriguingly, we observe striking mismatches between the geometry of manifolds in intermediate stages of the primate visual pathway and in trained DNNs. Finally, we show that linguistic descriptors of visual concepts can be used to discriminate images belonging to novel concepts, without any prior visual experience of these concepts (a task known as ‘zero-shot’ learning), indicating a remarkable alignment of manifold representations of concepts in visual and language modalities. I will discuss ongoing efforts to extend this work to other high-level cognitive tasks.
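The ‘prototype’ decision rule described above can be illustrated with a minimal sketch. It assumes that each prototype is the mean of the few training vectors for a concept and that "nearest" means smallest Euclidean distance. The function names, the concept labels, and the three-dimensional feature vectors are hypothetical and chosen only for illustration; they are not taken from the work itself.

```python
import math


def prototype(examples):
    """Average a list of equal-length feature vectors into one prototype."""
    n = len(examples)
    return [sum(column) / n for column in zip(*examples)]


def classify(x, prototypes):
    """Return the label whose prototype is closest to x (Euclidean distance)."""
    return min(prototypes, key=lambda label: math.dist(x, prototypes[label]))


# Hypothetical "firing-rate" vectors: two training examples per novel concept.
train = {
    "concept_a": [[1.0, 0.2, 0.1], [0.8, 0.0, 0.3]],
    "concept_b": [[0.1, 0.9, 1.0], [0.3, 1.1, 0.8]],
}
prototypes = {label: prototype(examples) for label, examples in train.items()}

# Classify two held-out test vectors by their nearest prototype.
label_1 = classify([0.9, 0.1, 0.2], prototypes)
label_2 = classify([0.2, 1.0, 0.9], prototypes)
print(label_1, label_2)
```

With only two prototypes, this rule amounts to a linear decision boundary: the hyperplane that perpendicularly bisects the segment joining them.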
Modularity of attractors in inhibition-dominated TLNs
Threshold-linear networks (TLNs) display a wide variety of nonlinear dynamics including multistability, limit cycles, quasiperiodic attractors, and chaos. Over the past few years, we have developed a detailed mathematical theory relating stable and unstable fixed points of TLNs to graph-theoretic properties of the underlying network. In particular, we have discovered that a special type of unstable fixed point, corresponding to "core motifs," is predictive of dynamic attractors. Recently, we have used these ideas to classify dynamic attractors in a two-parameter family of inhibition-dominated TLNs spanning all 9608 directed graphs of size n=5. Remarkably, we find a striking modularity in the dynamic attractors, with identical or near-identical attractors arising in networks that are otherwise dynamically inequivalent. This suggests that, just as one can store multiple static patterns as stable fixed points in a Hopfield model, a variety of dynamic attractors can also be embedded in a TLN in a modular fashion.
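As a rough illustration of how a directed graph can determine TLN dynamics, the sketch below simulates a small network. Everything specific in it is an assumption, not something stated in the abstract. The dynamics are taken to be the standard TLN form dx_i/dt = -x_i + max(0, sum_j W_ij x_j + theta). The weights follow the combinatorial rule from the published TLN literature: 0 on the diagonal, -1 + eps if there is an edge j -> i, and -1 - delta otherwise. The values eps = 0.25, delta = 0.5, theta = 1, the forward-Euler integrator, and the function names are likewise illustrative choices.

```python
def ctln_weights(n, edges, eps=0.25, delta=0.5):
    """Build W from a directed graph given as a set of (source, target) pairs.

    Assumed rule: W[i][j] = 0 if i == j, -1 + eps if j -> i is an edge,
    and -1 - delta otherwise, so every off-diagonal weight is inhibitory.
    """
    W = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                W[i][j] = -1.0 + eps if (j, i) in edges else -1.0 - delta
    return W


def simulate(W, x0, theta=1.0, dt=0.01, steps=5000):
    """Forward-Euler integration of dx/dt = -x + max(0, W x + theta)."""
    n = len(x0)
    x = list(x0)
    trajectory = [list(x)]
    for _ in range(steps):
        drive = [sum(W[i][j] * x[j] for j in range(n)) + theta for i in range(n)]
        x = [x[i] + dt * (-x[i] + max(0.0, drive[i])) for i in range(n)]
        trajectory.append(list(x))
    return trajectory


# Hypothetical example graph: the 3-cycle 0 -> 1 -> 2 -> 0.
edges = {(0, 1), (1, 2), (2, 0)}
W = ctln_weights(3, edges)
trajectory = simulate(W, [0.2, 0.1, 0.0])
print(trajectory[-1])
```

Because all off-diagonal weights are negative, activity that starts in [0, theta] stays in that range. Plotting the columns of `trajectory` against time is one way to check whether a given graph settles to a fixed point or keeps oscillating.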