Recurrent Networks
recurrent networks
The role of sub-population structure in computations through neural dynamics
Neural computations are currently conceptualised using two separate approaches: sorting neurons into functional sub-populations or examining distributed collective dynamics. Whether and how these two aspects interact to shape computations is currently unclear. Using a novel approach to extract computational mechanisms from recurrent networks trained on neuroscience tasks, we show that the collective dynamics and sub-population structure play fundamentally complementary roles. Although various tasks can be implemented in networks with fully random population structure, we found that flexible input–output mappings instead require a non-random population structure that can be described in terms of multiple sub-populations. Our analyses revealed that such a sub-population organisation enables flexible computations through a mechanism based on gain-controlled modulations that flexibly shape the collective dynamics.
Convex neural codes in recurrent networks and sensory systems
Neural activity in many sensory systems is organized on low-dimensional manifolds by means of convex receptive fields. Neural codes in these areas are constrained by this organization, as not every neural code is compatible with convex receptive fields. The same codes are also constrained by the structure of the underlying neural network. In my talk I will attempt to provide answers to the following natural questions: (i) How do recurrent circuits generate codes that are compatible with the convexity of receptive fields? (ii) How can we utilize the constraints imposed by the convex receptive field to understand the underlying stimulus space. To answer question (i), we describe the combinatorics of the steady states and fixed points of recurrent networks that satisfy the Dale’s law. It turns out the combinatorics of the fixed points are completely determined by two distinct conditions: (a) the connectivity graph of the network and (b) a spectral condition on the synaptic matrix. We give a characterization of exactly which features of connectivity determine the combinatorics of the fixed points. We also find that a generic recurrent network that satisfies Dale's law outputs convex combinatorial codes. To address question (ii), I will describe methods based on ideas from topology and geometry that take advantage of the convex receptive field properties to infer the dimension of (non-linear) neural representations. I will illustrate the first method by inferring basic features of the neural representations in the mouse olfactory bulb.
Training Dynamic Spiking Neural Network via Forward Propagation Through Time
With recent advances in learning algorithms, recurrent networks of spiking neurons are achieving performance competitive with standard recurrent neural networks. Still, these learning algorithms are limited to small networks of simple spiking neurons and modest-length temporal sequences, as they impose high memory requirements, have difficulty training complex neuron models, and are incompatible with online learning.Taking inspiration from the concept of Liquid Time-Constant (LTCs), we introduce a novel class of spiking neurons, the Liquid Time-Constant Spiking Neuron (LTC-SN), resulting in functionality similar to the gating operation in LSTMs. We integrate these neurons in SNNs that are trained with FPTT and demonstrate that thus trained LTC-SNNs outperform various SNNs trained with BPTT on long sequences while enabling online learning and drastically reducing memory complexity. We show this for several classical benchmarks that can easily be varied in sequence length, like the Add Task and the DVS-gesture benchmark. We also show how FPTT-trained LTC-SNNs can be applied to large convolutional SNNs, where we demonstrate novel state-of-the-art for online learning in SNNs on a number of standard benchmarks (S-MNIST, R-MNIST, DVS-GESTURE) and also show that large feedforward SNNs can be trained successfully in an online manner to near (Fashion-MNIST, DVS-CIFAR10) or exceeding (PS-MNIST, R-MNIST) state-of-the-art performance as obtained with offline BPTT. Finally, the training and memory efficiency of FPTT enables us to directly train SNNs in an end-to-end manner at network sizes and complexity that was previously infeasible: we demonstrate this by training in an end-to-end fashion the first deep and performant spiking neural network for object localization and recognition. Taken together, we out contribution enable for the first time training large-scale complex spiking neural network architectures online and on long temporal sequences.
Flexible multitask computation in recurrent networks utilizes shared dynamical motifs
Flexible computation is a hallmark of intelligent behavior. Yet, little is known about how neural networks contextually reconfigure for different computations. Humans are able to perform a new task without extensive training, presumably through the composition of elementary processes that were previously learned. Cognitive scientists have long hypothesized the possibility of a compositional neural code, where complex neural computations are made up of constituent components; however, the neural substrate underlying this structure remains elusive in biological and artificial neural networks. Here we identified an algorithmic neural substrate for compositional computation through the study of multitasking artificial recurrent neural networks. Dynamical systems analyses of networks revealed learned computational strategies that mirrored the modular subtask structure of the task-set used for training. Dynamical motifs such as attractors, decision boundaries and rotations were reused across different task computations. For example, tasks that required memory of a continuous circular variable repurposed the same ring attractor. We show that dynamical motifs are implemented by clusters of units and are reused across different contexts, allowing for flexibility and generalization of previously learned computation. Lesioning these clusters resulted in modular effects on network performance: a lesion that destroyed one dynamical motif only minimally perturbed the structure of other dynamical motifs. Finally, modular dynamical motifs could be reconfigured for fast transfer learning. After slow initial learning of dynamical motifs, a subsequent faster stage of learning reconfigured motifs to perform novel tasks. This work contributes to a more fundamental understanding of compositional computation underlying flexible general intelligence in neural systems. We present a conceptual framework that establishes dynamical motifs as a fundamental unit of computation, intermediate between the neuron and the network. As more whole brain imaging studies record neural activity from multiple specialized systems simultaneously, the framework of dynamical motifs will guide questions about specialization and generalization across brain regions.
Computation in the neuronal systems close to the critical point
It was long hypothesized that natural systems might take advantage of the extended temporal and spatial correlations close to the critical point to improve their computational capabilities. However, on the other side, different distances to criticality were inferred from the recordings of nervous systems. In my talk, I discuss how including additional constraints on the processing time can shift the optimal operating point of the recurrent networks. Moreover, the data from the visual cortex of the monkeys during the attentional task indicate that they flexibly change the closeness to the critical point of the local activity. Overall it suggests that, as we would expect from common sense, the optimal state depends on the task at hand, and the brain adapts to it in a local and fast manner.
Taming chaos in neural circuits
Neural circuits exhibit complex activity patterns, both spontaneously and in response to external stimuli. Information encoding and learning in neural circuits depend on the ability of time-varying stimuli to control spontaneous network activity. In particular, variability arising from the sensitivity to initial conditions of recurrent cortical circuits can limit the information conveyed about the sensory input. Spiking and firing rate network models can exhibit such sensitivity to initial conditions that are reflected in their dynamic entropy rate and attractor dimensionality computed from their full Lyapunov spectrum. I will show how chaos in both spiking and rate networks depends on biophysical properties of neurons and the statistics of time-varying stimuli. In spiking networks, increasing the input rate or coupling strength aids in controlling the driven target circuit, which is reflected in both a reduced trial-to-trial variability and a decreased dynamic entropy rate. With sufficiently strong input, a transition towards complete network state control occurs. Surprisingly, this transition does not coincide with the transition from chaos to stability but occurs at even larger values of external input strength. Controllability of spiking activity is facilitated when neurons in the target circuit have a sharp spike onset, thus a high speed by which neurons launch into the action potential. I will also discuss chaos and controllability in firing-rate networks in the balanced state. For these, external control of recurrent dynamics strongly depends on correlations in the input. This phenomenon was studied with a non-stationary dynamic mean-field theory that determines how the activity statistics and the largest Lyapunov exponent depend on frequency and amplitude of the input, recurrent coupling strength, and network size. This shows that uncorrelated inputs facilitate learning in balanced networks. The results highlight the potential of Lyapunov spectrum analysis as a diagnostic for machine learning applications of recurrent networks. They are also relevant in light of recent advances in optogenetics that allow for time-dependent stimulation of a select population of neurons.
Norse: A library for gradient-based learning in Spiking Neural Networks
We introduce Norse: An open-source library for gradient-based training of spiking neural networks. In contrast to neuron simulators which mainly target computational neuroscientists, our library seamlessly integrates with the existing PyTorch ecosystem using abstractions familiar to the machine learning community. This has immediate benefits in that it provides a familiar interface, hardware accelerator support and, most importantly, the ability to use gradient-based optimization. While many parallel efforts in this direction exist, Norse emphasizes flexibility and usability in three ways. Users can conveniently specify feed-forward (convolutional) architectures, as well as arbitrarily connected recurrent networks. We strictly adhere to a functional and class-based API such that neuron primitives and, for example, plasticity rules composes. Finally, the functional core API ensures compatibility with the PyTorch JIT and ONNX infrastructure. We have made progress to support network execution on the SpiNNaker platform and plan to support other neuromorphic architectures in the future. While the library is useful in its present state, it also has limitations we will address in ongoing work. In particular, we aim to implement event-based gradient computation, using the EventProp algorithm, which will allow us to support sparse event-based data efficiently, as well as work towards support of more complex neuron models. With this library, we hope to contribute to a joint future of computational neuroscience and neuromorphic computing.
Frontal circuit specialisations for decision making
During primate evolution, prefrontal cortex (PFC) expanded substantially relative to other cortical areas. The expansion of PFC circuits likely supported the increased cognitive abilities of humans and anthropoids to plan, evaluate, and decide between different courses of action. But what do these circuits compute as a decision is being made, and how can they be related to anatomical specialisations within and across PFC? To address this, we recorded PFC activity during value-based decision making using single unit recording in non-human primates and magnetoencephalography in humans. At a macrocircuit level, we found that value correlates differ substantially across PFC subregions. They are heavily shaped by each subregion’s anatomical connections and by the decision-maker’s current locus of attention. At a microcircuit level, we found that the temporal evolution of value correlates can be predicted using cortical recurrent network models that temporally integrate incoming decision evidence. These models reflect the fact that PFC circuits are highly recurrent in nature and have synaptic properties that support persistent activity across temporally extended cognitive tasks. Our findings build upon recent work describing economic decision making as a process of attention-weighted evidence integration across time.
A theory for Hebbian learning in recurrent E-I networks
The Stabilized Supralinear Network is a model of recurrently connected excitatory (E) and inhibitory (I) neurons with a supralinear input-output relation. It can explain cortical computations such as response normalization and inhibitory stabilization. However, the network's connectivity is designed by hand, based on experimental measurements. How the recurrent synaptic weights can be learned from the sensory input statistics in a biologically plausible way is unknown. Earlier theoretical work on plasticity focused on single neurons and the balance of excitation and inhibition but did not consider the simultaneous plasticity of recurrent synapses and the formation of receptive fields. Here we present a recurrent E-I network model where all synaptic connections are simultaneously plastic, and E neurons self-stabilize by recruiting co-tuned inhibition. Motivated by experimental results, we employ a local Hebbian plasticity rule with multiplicative normalization for E and I synapses. We develop a theoretical framework that explains how plasticity enables inhibition balanced excitatory receptive fields that match experimental results. We show analytically that sufficiently strong inhibition allows neurons' receptive fields to decorrelate and distribute themselves across the stimulus space. For strong recurrent excitation, the network becomes stabilized by inhibition, which prevents unconstrained self-excitation. In this regime, external inputs integrate sublinearly. As in the Stabilized Supralinear Network, this results in response normalization and winner-takes-all dynamics: when two competing stimuli are presented, the network response is dominated by the stronger stimulus while the weaker stimulus is suppressed. In summary, we present a biologically plausible theoretical framework to model plasticity in fully plastic recurrent E-I networks. While the connectivity is derived from the sensory input statistics, the circuit performs meaningful computations. Our work provides a mathematical framework of plasticity in recurrent networks, which has previously only been studied numerically and can serve as the basis for a new generation of brain-inspired unsupervised machine learning algorithms.
Recurrent network dynamics lead to interference in sequential learning
Learning in real life is often sequential: A learner first learns task A, then task B. If the tasks are related, the learner may adapt the previously learned representation instead of generating a new one from scratch. Adaptation may ease learning task B but may also decrease the performance on task A. Such interference has been observed in experimental and machine learning studies. In the latter case, it is mediated by correlations between weight updates for the different tasks. In typical applications, like image classification with feed-forward networks, these correlated weight updates can be traced back to input correlations. For many neuroscience tasks, however, networks need to not only transform the input, but also generate substantial internal dynamics. Here we illuminate the role of internal dynamics for interference in recurrent neural networks (RNNs). We analyze RNNs trained sequentially on neuroscience tasks with gradient descent and observe forgetting even for orthogonal tasks. We find that the degree of interference changes systematically with tasks properties, especially with emphasis on input-driven over autonomously generated dynamics. To better understand our numerical observations, we thoroughly analyze a simple model of working memory: For task A, a network is presented with an input pattern and trained to generate a fixed point aligned with this pattern. For task B, the network has to memorize a second, orthogonal pattern. Adapting an existing representation corresponds to the rotation of the fixed point in phase space, as opposed to the emergence of a new one. We show that the two modes of learning – rotation vs. new formation – are directly linked to recurrent vs. input-driven dynamics. We make this notion precise in a further simplified, analytically tractable model, where learning is restricted to a 2x2 matrix. In our analysis of trained RNNs, we also make the surprising observation that, across different tasks, larger random initial connectivity reduces interference. Analyzing the fixed point task reveals the underlying mechanism: The random connectivity strongly accelerates the learning mode of new formation, and has less effect on rotation. The prior thus wins the race to zero loss, and interference is reduced. Altogether, our work offers a new perspective on sequential learning in recurrent networks, and the emphasis on internally generated dynamics allows us to take the history of individual learners into account.
Generalizing theories of cerebellum-like learning
Since the theories of Marr, Ito, and Albus, the cerebellum has provided an attractive well-characterized model system to investigate biological mechanisms of learning. In recent years, theories have been developed that provide a normative account for many features of the anatomy and function of cerebellar cortex and cerebellum-like systems, including the distribution of parallel fiber-Purkinje cell synaptic weights, the expansion in neuron number of the granule cell layer and their synaptic in-degree, and sparse coding by granule cells. Typically, these theories focus on the learning of random mappings between uncorrelated inputs and binary outputs, an assumption that may be reasonable for certain forms of associative conditioning but is also quite far from accounting for the important role the cerebellum plays in the control of smooth movements. I will discuss in-progress work with Marjorie Xie, Samuel Muscinelli, and Kameron Decker Harris generalizing these learning theories to correlated inputs and general classes of smooth input-output mappings. Our studies build on earlier work in theoretical neuroscience as well as recent advances in the kernel theory of wide neural networks. They illuminate the role of pre-expansion structures in processing input stimuli and the significance of sparse granule cell activity. If there is time, I will also discuss preliminary work with Jack Lindsey extending these theories beyond cerebellum-like structures to recurrent networks.
A geometric framework to predict structure from function in neural networks
The structural connectivity matrix of synaptic weights between neurons is a critical determinant of overall network function. However, quantitative links between neural network structure and function are complex and subtle. For example, many networks can give rise to similar functional responses, and the same network can function differently depending on context. Whether certain patterns of synaptic connectivity are required to generate specific network-level computations is largely unknown. Here we introduce a geometric framework for identifying synaptic connections required by steady-state responses in recurrent networks of rectified-linear neurons. Assuming that the number of specified response patterns does not exceed the number of input synapses, we analytically calculate all feedforward and recurrent connectivity matrices that can generate the specified responses from the network inputs. We then use this analytical characterization to rigorously analyze the solution space geometry and derive certainty conditions guaranteeing a non-zero synapse between neurons.
Residual population dynamics as a window into neural computation
Neural activity in frontal and motor cortices can be considered to be the manifestation of a dynamical system implemented by large neural populations in recurrently connected networks. The computations emerging from such population-level dynamics reflect the interaction between external inputs into a network and its internal, recurrent dynamics. Isolating these two contributions in experimentally recorded neural activity, however, is challenging, limiting the resulting insights into neural computations. I will present an approach to addressing this challenge based on response residuals, i.e. variability in the population trajectory across repetitions of the same task condition. A complete characterization of residual dynamics is well-suited to systematically compare computations across brain areas and tasks, and leads to quantitative predictions about the consequences of small, arbitrary causal perturbations.
The emergence and modulation of time in neural circuits and behavior
Spontaneous behavior in animals and humans shows a striking amount of variability both in the spatial domain (which actions to choose) and temporal domain (when to act). Concatenating actions into sequences and behavioral plans reveals the existence of a hierarchy of timescales ranging from hundreds of milliseconds to minutes. How do multiple timescales emerge from neural circuit dynamics? How do circuits modulate temporal responses to flexibly adapt to changing demands? In this talk, we will present recent results from experiments and theory suggesting a new computational mechanism generating the temporal variability underlying naturalistic behavior. We will show how neural activity from premotor areas unfolds through temporal sequences of attractors, which predict the intention to act. These sequences naturally emerge from recurrent cortical networks, where correlated neural variability plays a crucial role in explaining the observed variability in action timing. We will then discuss how reaction times in these recurrent circuits can be accelerated or slowed down via gain modulation, induced by neuromodulation or perturbations. Finally, we will present a general mechanism producing a reservoir of multiple timescales in recurrent networks.
E-prop: A biologically inspired paradigm for learning in recurrent networks of spiking neurons
Transformative advances in deep learning, such as deep reinforcement learning, usually rely on gradient-based learning methods such as backpropagation through time (BPTT) as a core learning algorithm. However, BPTT is not argued to be biologically plausible, since it requires to a propagate gradients backwards in time and across neurons. Here, we propose e-prop, a novel gradient-based learning method with local and online weight update rules for recurrent neural networks, and in particular recurrent spiking neural networks (RSNNs). As a result, e-prop has the potential to provide a substantial fraction of the power of deep learning to RSNNs. In this presentation, we will motivate e-prop from the perspective of recent insights in neuroscience and show how these have to be combined to form an algorithm for online gradient descent. The mathematical results will be supported by empirical evidence in supervised and reinforcement learning tasks. We will also discuss how limitations that are inherited from gradient-based learning methods, such as sample-efficiency, can be addressed by considering an evolution-like optimization that enhances learning on particular task families. The emerging learning architecture can be used to learn tasks by a single demonstration, hence enabling one-shot learning.
Effect of experience on context-dependent learning in recurrent networks
Bernstein Conference 2024
A family of synaptic plasticity rules based on spike times produces a diversity of triplet motifs in recurrent networks
Bernstein Conference 2024
Linking Neural Manifolds to Circuit Structure in Recurrent Networks
Bernstein Conference 2024
Response variability can accelerate learning in feedforward-recurrent networks
Bernstein Conference 2024
Synaptic Upscaling Amplifies Chaotic Dynamics in Recurrent Networks of Rate Neurons
Bernstein Conference 2024
Multitask computation in recurrent networks utilizes shared dynamical motifs
COSYNE 2022
Multitask computation in recurrent networks utilizes shared dynamical motifs
COSYNE 2022
Revisiting the flexibility-stability dilemma in recurrent networks using a multiplicative plasticity rule
COSYNE 2022
Revisiting the flexibility-stability dilemma in recurrent networks using a multiplicative plasticity rule
COSYNE 2022
Inhibitory control of plasticity promotes stability and competitive learning in recurrent networks
COSYNE 2023
Locally coupled oscillatory recurrent networks learn traveling waves and topographic organization
COSYNE 2023
Spiking and bursting activity in stochastic recurrent networks
COSYNE 2023
Synaptic-type-specific clustering optimizes the computational capabilities of balanced recurrent networks
COSYNE 2023
Effect of experience on context-dependent learning in recurrent networks
COSYNE 2025
A family of synaptic plasticity rules shapes triplet motifs in recurrent networks
COSYNE 2025
Nonlinear Manifolds Underlie Robust Representations of Continuous Variables in Recurrent Networks
COSYNE 2025
Predicting neural activity in connectome-constrained recurrent networks
COSYNE 2025
Fading memory as inductive bias in residual recurrent networks
FENS Forum 2024