RNNs
Prof Jakob Macke
How do neural circuits in the human brain recognize objects, persons and actions from complex visual stimuli? To address this question, we will develop deep convolutional neural networks for modelling how neurons in high-level human brain areas respond to complex visual information. We will make use of a unique dataset of neurophysiological recordings of single-unit activity and field potentials recorded from the medial temporal lobe of epilepsy patients. Our tools will open up avenues for a range of new investigations in cognitive and clinical neuroscience, and may inspire new artificial vision systems. The position is part of a collaboration with the 'Dynamic Vision and Learning' Group at TU Munich (Prof. Dr. Laura Leal-Taixé) and the Cognitive and Clinical Neurophysiology Group at University Hospital Bonn (Prof. Dr. Dr. Mormann). Our group develops computational methods that help scientists interpret empirical data, with a focus on basic and clinical neuroscience research. We want to understand how neuronal networks in the brain process sensory information and control intelligent behaviour, and use this knowledge to develop methods for the diagnosis and therapy of neuronal dysfunction. More details at https://uni-tuebingen.de/en/196976
Mark Humphries
A 4-year fully-funded PhD studentship with Professor Mark Humphries and Professor Stephen Coombes is available for an October 2024 start, through the University of Nottingham's BBSRC Doctoral Training Programme. The striatum is central to an extraordinary range of disorders, from Parkinson's disease to OCD, but our best models of its function are outdated and contradicted by recent data. In this project, we will test the hypothesis that the striatum is a special class of recurrent neural network (RNN), one that uses purely inhibitory connections. We will build and analyse this class of networks, deriving predictions for the computations that the striatum performs and for the activity of its neuron populations. We will then test these predictions in two large-scale datasets of population recordings from the striatum of freely-exploring mice, from the studies of Klaus et al. (Neuron, 2017) and Markowitz et al. (Cell, 2018). The DTP offers two lab rotations and wide-ranging training modules. The successful candidate will join the Humphries lab and be part of the School of Psychology's extensive postgraduate support network.
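As a rough illustration of the network class in question (not the project's actual model), a rate RNN whose recurrent weights are constrained to be non-positive can be simulated in a few lines; the dynamics, sparsity level, and constant excitatory drive below are generic assumptions.

```python
import numpy as np

def simulate_inhibitory_rnn(n=200, p=0.1, g=2.0, steps=1000, dt=0.1, seed=0):
    """Simulate a rate RNN whose recurrent weights are purely inhibitory."""
    rng = np.random.default_rng(seed)
    # Sparse, strictly non-positive connectivity.
    mask = rng.random((n, n)) < p
    W = -g * rng.random((n, n)) * mask / np.sqrt(p * n)
    np.fill_diagonal(W, 0.0)
    x = 0.1 * rng.standard_normal(n)
    b = 1.0  # tonic excitatory drive, needed since recurrence only inhibits
    rates = np.empty((steps, n))
    for t in range(steps):
        r = np.maximum(x, 0.0)             # rectified-linear, non-negative rates
        x = x + dt * (-x + W @ r + b)      # leaky rate dynamics
        rates[t] = r
    return rates

rates = simulate_inhibitory_rnn()
print(rates.mean(), rates.std())
```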
Spatially-embedded recurrent neural networks reveal widespread links between structural and functional neuroscience findings
Brain networks exist within the confines of resource limitations. As a result, a brain network must overcome the metabolic costs of growing and sustaining itself within its physical space, while simultaneously implementing its required information processing. To observe the effect of these constraints, we introduce the spatially-embedded recurrent neural network (seRNN). seRNNs learn basic task-related inferences while existing within a 3D Euclidean space, where the communication of constituent neurons is constrained by a sparse connectome. We find that seRNNs, similar to primate cerebral cortices, naturally converge on solving inferences using modular small-world networks, in which functionally similar units spatially configure themselves to utilize an energetically-efficient mixed-selective code. As all these features emerge in unison, seRNNs reveal how many common structural and functional brain motifs are strongly intertwined and can be attributed to basic biological optimization processes. seRNNs can serve as model systems that bridge the structural and functional research communities and move neuroscientific understanding forward.
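A minimal sketch of one plausible spatial wiring cost of this kind: an L1 penalty on recurrent weights scaled by the Euclidean distance between units embedded in 3D. The study's actual regularizer also involves network communicability, which is omitted here; function and variable names are illustrative.

```python
import torch

def spatial_wiring_cost(W, coords, lam=1e-3):
    """L1 penalty on recurrent weights, scaled by 3D Euclidean distance.

    W      : (n, n) recurrent weight matrix
    coords : (n, 3) positions of units in Euclidean space
    """
    dist = torch.cdist(coords, coords)       # pairwise distances, (n, n)
    return lam * (W.abs() * dist).sum()

# Example: 125 units on a 5x5x5 grid, random recurrent weights.
grid = torch.stack(torch.meshgrid(*[torch.arange(5.0)] * 3, indexing="ij"), -1)
coords = grid.reshape(-1, 3)
W = torch.randn(125, 125, requires_grad=True)
loss = spatial_wiring_cost(W, coords)
loss.backward()  # gradient shrinks long-range weights fastest, favoring local wiring
```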
Extracting computational mechanisms from neural data using low-rank RNNs
An influential theory in systems neuroscience suggests that brain function can be understood through low-dimensional dynamics [Vyas et al. 2020]. However, a challenge in this framework is that a single computational task may involve a range of dynamical processes. To understand which processes are at play in the brain, it is important to use data on neural activity to constrain models. In this study, we present a method for extracting low-dimensional dynamics from data using low-rank recurrent neural networks (lrRNNs), an expressive yet interpretable class of models [Mastrogiuseppe & Ostojic 2018, Dubreuil, Valente et al. 2022]. We first test our approach on synthetic data generated from full-rank RNNs trained on a variety of neuroscience tasks. We find that lrRNNs fitted to neural activity allow us to identify the collective computational processes and to make new predictions for inactivations in the original RNNs. We then apply our method to data recorded from the prefrontal cortex of primates during a context-dependent decision-making task. Our approach enables us to assign computational roles to the different latent variables and provides a mechanistic model of the recorded dynamics, which can be used to perform in silico experiments such as inactivations and to provide testable predictions.
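A minimal PyTorch sketch of the low-rank parameterization J = m nᵀ / N that defines an lrRNN in the sense of Mastrogiuseppe & Ostojic (2018). The class name, sizes, and Euler discretization are illustrative assumptions, and the study's fitting-to-data procedure is not shown.

```python
import torch
import torch.nn as nn

class LowRankRNN(nn.Module):
    """Rate RNN with rank-R recurrent connectivity J = m @ n.T / N."""
    def __init__(self, n_units=512, rank=2, n_inputs=3, n_outputs=1, tau=10.0):
        super().__init__()
        self.m = nn.Parameter(torch.randn(n_units, rank))       # output directions
        self.n = nn.Parameter(torch.randn(n_units, rank))       # input-selection directions
        self.w_in = nn.Parameter(torch.randn(n_units, n_inputs))
        self.w_out = nn.Parameter(torch.randn(n_outputs, n_units) / n_units)
        self.n_units, self.tau = n_units, tau

    def forward(self, u, dt=1.0):
        # u: (time, batch, n_inputs); latent variables live in span(m), rank-R.
        x = torch.zeros(u.shape[1], self.n_units)
        alpha = dt / self.tau
        outs = []
        for t in range(u.shape[0]):
            r = torch.tanh(x)
            rec = (r @ self.n) @ self.m.T / self.n_units   # rank-R recurrence
            x = (1 - alpha) * x + alpha * (rec + u[t] @ self.w_in.T)
            outs.append(r @ self.w_out.T)
        return torch.stack(outs)   # (time, batch, n_outputs)
```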
Flexible selection of task-relevant features through population gating
Brains can gracefully weed out irrelevant stimuli to guide behavior. This feat is believed to rely on a progressive selection of task-relevant stimuli across the cortical hierarchy, but the specific across-area interactions enabling stimulus selection are still unclear. Here, we propose that population gating, occurring within A1 but controlled by top-down inputs from mPFC, can support across-area stimulus selection. Examining single-unit activity recorded while rats performed an auditory context-dependent task, we found that A1 encoded relevant and irrelevant stimuli along a common dimension of its neural space. Yet, the relevant stimulus encoding was enhanced along an extra dimension. In turn, mPFC encoded only the stimulus relevant to the ongoing context. To identify candidate mechanisms for stimulus selection within A1, we reverse-engineered low-rank RNNs trained on a similar task. Our analyses predicted that two context-modulated neural populations gated their preferred stimulus in opposite contexts, which we confirmed in further analyses of A1. Finally, we show in a two-region RNN how population gating within A1 could be controlled by top-down inputs from PFC, enabling flexible across-area communication despite fixed inter-areal connectivity.
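A toy caricature of the population-gating idea above (purely illustrative; in the study the mechanism is reverse-engineered from trained low-rank RNNs and confirmed in A1 data): two populations whose gains depend on context pass on only the contextually relevant stimulus.

```python
def gated_readout(stim_a, stim_b, context):
    """Two populations gate their preferred stimulus in opposite contexts.

    context: +1 makes stimulus A relevant, -1 makes stimulus B relevant.
    """
    gain_a = (1 + context) / 2   # population A gain: 1 in context A, 0 in context B
    gain_b = (1 - context) / 2   # population B gain: 0 in context A, 1 in context B
    return gain_a * stim_a + gain_b * stim_b

print(gated_readout(0.8, -0.3, context=+1))   # -> 0.8  (A is relevant)
print(gated_readout(0.8, -0.3, context=-1))   # -> -0.3 (B is relevant)
```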
Online Training of Spiking Recurrent Neural Networks With Memristive Synapses
Spiking recurrent neural networks (RNNs) are a promising tool for solving a wide variety of complex cognitive and motor tasks, thanks to their rich temporal dynamics and sparse processing. However, training spiking RNNs on dedicated neuromorphic hardware is still an open challenge. This is mainly due to the lack of local, hardware-friendly learning mechanisms that can solve the temporal credit assignment problem and ensure stable network dynamics, even when the weight resolution is limited. These challenges are further accentuated if one resorts to memristive devices for in-memory computing to resolve the von Neumann bottleneck, at the expense of a substantial increase in variability in both the computation and the working memory of the spiking RNNs. In this talk, I will present our recent work introducing a PyTorch simulation framework for memristive crossbar arrays that enables accurate investigation of such challenges. I will show that the recently proposed e-prop learning rule can be used to train spiking RNNs whose weights are emulated in the presented simulation framework. Although e-prop locally approximates the ideal synaptic updates, the updates are difficult to implement on the memristive substrate due to substantial device non-idealities. I will discuss several widely adopted weight-update schemes that primarily aim to cope with these device non-idealities, and demonstrate that accumulating gradients can enable online and efficient training of spiking RNNs on memristive substrates.
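One way to picture the gradient-accumulation scheme mentioned at the end (a conceptual sketch, not the hardware implementation): keep a high-precision accumulator off-device and commit only whole device-resolution steps to the coarse memristive weights. All names and constants are illustrative.

```python
import torch

def accumulated_update(w_device, grad, accumulator, lr=1e-3, step=0.01):
    """Accumulate high-precision gradients; apply only quantized steps on-device.

    w_device    : coarse weights stored on the device (multiples of `step`)
    grad        : gradient from e.g. an e-prop-style learning rule
    accumulator : high-precision buffer holding the uncommitted remainder
    """
    accumulator -= lr * grad                    # accumulate the ideal update
    n_steps = torch.round(accumulator / step)   # whole device steps now due
    w_device += n_steps * step                  # coarse in-memory update
    accumulator -= n_steps * step               # keep the sub-step remainder
    return w_device, accumulator

# Toy usage: only weights with enough accumulated change actually move.
w, acc = torch.zeros(4), torch.zeros(4)
g = torch.tensor([0.5, -0.5, 5.0, 0.0])
for _ in range(10):
    w, acc = accumulated_update(w, g, acc)
print(w)   # large-gradient weight has taken quantized steps; tiny ones have not
```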
NMC4 Short Talk: A theory for the population rate of adapting neurons disambiguates mean vs. variance-driven dynamics and explains log-normal response statistics
Recently, the field of computational neuroscience has seen an explosion in the use of trained recurrent network models (RNNs) to model patterns of neural activity. These RNN models are typically characterized by tuned recurrent interactions between rate 'units' whose dynamics are governed by smooth, continuous differential equations. However, the response of biological single neurons is better described by all-or-none events - spikes - that are triggered when the complex dynamics of the membrane process the synaptic input. One line of research has attempted to resolve this discrepancy by linking the average firing probability of a population of simplified spiking neuron models to rate dynamics similar to those used for RNN units. However, challenges remain in accounting for complex temporal dependencies in the biological single-neuron response and for the heterogeneity of synaptic input across the population. Here, we make progress by showing how to derive dynamic rate equations for a population of spiking neurons with multi-timescale adaptation - a property shown to accurately model the response of biological neurons - while the neurons receive independent time-varying inputs, leading to plausible asynchronous activity in the network. The resulting rate equations yield an insightful segregation of the population's response into dynamics driven by the mean signal received by the neural population and dynamics driven by the variance of the input across neurons, with respective timescales that are in agreement with slice experiments. Further, these equations explain how input variability can shape log-normal instantaneous rate distributions across neurons, as observed in vivo. Our results help interpret properties of the neural population response and open the way to investigating whether the more biologically plausible and dynamically complex rate model we derive could provide useful inductive biases if used in an RNN to solve specific tasks.
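Schematically (the precise equations are derived in the work itself; this form is only meant to convey the structure of the result), the population rate depends separately on the mean and the across-neuron variance of the input, together with slow multi-timescale adaptation variables:

```latex
% Schematic structure only; the derived equations in the work are more detailed.
r(t) = F\!\big(\mu(t),\, \sigma^2(t),\, a_1(t), \dots, a_K(t)\big), \qquad
\tau_k\, \dot a_k(t) = -a_k(t) + c_k\, r(t), \quad k = 1, \dots, K,
```

where mu(t) is the mean input across the population, sigma^2(t) the across-neuron input variance, and the a_k are adaptation variables with distinct timescales tau_k. The mean-driven and variance-driven contributions evolve on the different timescales referred to in the abstract.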
NMC4 Short Talk: Different hypotheses on the role of the PFC in solving simple cognitive tasks
Low-dimensional population dynamics can be observed in neural activity recorded from the prefrontal cortex (PFC) of subjects performing simple cognitive tasks. Many studies have shown that recurrent neural networks (RNNs) trained on the same tasks can qualitatively reproduce these state-space trajectories, and have used them as models of how neuronal dynamics implement task computations. The PFC is also viewed as a conductor that organizes communication between cortical areas and provides contextual information. Its role in solving simple cognitive tasks therefore remains unclear. Do the low-dimensional trajectories observed in the PFC really correspond to the computations it performs? Or do they indirectly reflect computations occurring within the cortical areas projecting to the PFC? To address these questions, we modelled cortical areas with a modular RNN and equipped it with a PFC-like cognitive system. When trained on cognitive tasks, this multi-system brain model reproduces the low-dimensional population responses observed in neuronal activity as well as classical RNNs do. Qualitatively different mechanisms can emerge from the training process when varying details of the architecture, such as the time constants. In particular, there is one class of models in which the dynamics of the cognitive system implement the task computations, and another in which the cognitive system is only needed to provide contextual information about the task rule: in the latter, task performance is not impaired when the cognitive system is prevented from accessing the task inputs. These constitute two different hypotheses about the causal role of the PFC in solving simple cognitive tasks, which could motivate further experiments on the brain.
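A schematic of the architecture class described above (module sizes, coupling, and the input-lesion switch are assumptions for illustration, not the study's exact model): a 'cortical' module receiving the stimuli, reciprocally coupled to a smaller 'cognitive' module that can be prevented from seeing the task inputs.

```python
import torch
import torch.nn as nn

class CorticalPlusCognitiveRNN(nn.Module):
    """Two coupled rate-RNN modules: 'cortex' gets stimuli; 'PFC' may not."""
    def __init__(self, n_ctx=256, n_pfc=64, n_in=4, pfc_sees_stimulus=False):
        super().__init__()
        s_c, s_p = n_ctx ** 0.5, n_pfc ** 0.5
        self.Wcc = nn.Parameter(torch.randn(n_ctx, n_ctx) / s_c)   # cortex -> cortex
        self.Wpp = nn.Parameter(torch.randn(n_pfc, n_pfc) / s_p)   # PFC -> PFC
        self.Wcp = nn.Parameter(torch.randn(n_ctx, n_pfc) / s_p)   # PFC -> cortex
        self.Wpc = nn.Parameter(torch.randn(n_pfc, n_ctx) / s_c)   # cortex -> PFC
        self.Win_c = nn.Parameter(torch.randn(n_ctx, n_in))
        self.Win_p = nn.Parameter(torch.randn(n_pfc, n_in))
        self.pfc_sees_stimulus = pfc_sees_stimulus

    def step(self, xc, xp, u, alpha=0.1):
        rc, rp = torch.tanh(xc), torch.tanh(xp)
        # The lesion test from the abstract: cut task inputs to the PFC module.
        u_pfc = u if self.pfc_sees_stimulus else torch.zeros_like(u)
        xc = (1 - alpha) * xc + alpha * (rc @ self.Wcc.T + rp @ self.Wcp.T + u @ self.Win_c.T)
        xp = (1 - alpha) * xp + alpha * (rp @ self.Wpp.T + rc @ self.Wpc.T + u_pfc @ self.Win_p.T)
        return xc, xp

model = CorticalPlusCognitiveRNN()
xc, xp = torch.zeros(1, 256), torch.zeros(1, 64)
xc, xp = model.step(xc, xp, torch.randn(1, 4))
```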
Neural dynamics of probabilistic information processing in humans and recurrent neural networks
In nature, sensory inputs are often highly structured, and statistical regularities of these signals can be extracted to form expectations about future sensorimotor associations, thereby optimizing behavior. One of the fundamental questions in neuroscience concerns the neural computations that underlie this probabilistic sensorimotor processing. Through a recurrent neural network (RNN) model together with human psychophysics and electroencephalography (EEG), the present study investigates circuit mechanisms for processing probabilistic structures of sensory signals to guide behavior. We first constructed and trained a biophysically constrained RNN model to perform a series of probabilistic decision-making tasks similar to paradigms designed for humans. Specifically, the training environment was probabilistic, such that one stimulus was more probable than the others. We show that both humans and the RNN model successfully extract information about stimulus probability and integrate this knowledge into their decisions and task strategy in a new environment. Specifically, the performance of both humans and the RNN model varied with the degree to which the stimulus probability of the new environment matched the formed expectation. In both cases, this expectation effect was more prominent when the strength of sensory evidence was low, suggesting that, like humans, our RNNs placed more emphasis on prior expectation (top-down signals) when the available sensory information (bottom-up signals) was limited, thereby optimizing task performance. Finally, by dissecting the trained RNN model, we demonstrate how competitive inhibition and recurrent excitation form the basis of a neural circuitry optimized for probabilistic information processing.
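A toy version of the competitive-inhibition/recurrent-excitation motif named in the last sentence, with a prior-expectation bias injected as extra input to the favored population. All constants are illustrative, and the study's RNN is trained rather than hand-wired like this sketch.

```python
import numpy as np

def wta_decision(evidence_a, evidence_b, prior_a=0.0, steps=2000, dt=0.5):
    """Two excitatory populations with self-excitation and cross-inhibition."""
    w_exc, w_inh, tau = 2.0, 2.5, 20.0
    ra, rb = 0.1, 0.1
    f = lambda x: np.maximum(np.tanh(x), 0.0)   # saturating, non-negative gain
    for _ in range(steps):
        ia = evidence_a + prior_a + w_exc * ra - w_inh * rb
        ib = evidence_b + w_exc * rb - w_inh * ra
        ra += dt / tau * (-ra + f(ia))
        rb += dt / tau * (-rb + f(ib))
    return "A" if ra > rb else "B"

# With weak sensory evidence, the prior tips the competition, as in the abstract.
print(wta_decision(0.05, 0.06, prior_a=0.2))   # typically "A" despite B's evidence
```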
Recurrent network dynamics lead to interference in sequential learning
Learning in real life is often sequential: a learner first learns task A, then task B. If the tasks are related, the learner may adapt the previously learned representation instead of generating a new one from scratch. Adaptation may ease learning task B but may also decrease performance on task A. Such interference has been observed in experimental and machine learning studies. In the latter case, it is mediated by correlations between weight updates for the different tasks. In typical applications, like image classification with feed-forward networks, these correlated weight updates can be traced back to input correlations. For many neuroscience tasks, however, networks need to not only transform the input, but also generate substantial internal dynamics. Here we illuminate the role of internal dynamics in interference in recurrent neural networks (RNNs). We analyze RNNs trained sequentially on neuroscience tasks with gradient descent and observe forgetting even for orthogonal tasks. We find that the degree of interference changes systematically with task properties, especially with the emphasis on input-driven over autonomously generated dynamics. To better understand our numerical observations, we thoroughly analyze a simple model of working memory: for task A, a network is presented with an input pattern and trained to generate a fixed point aligned with this pattern; for task B, the network has to memorize a second, orthogonal pattern. Adapting an existing representation corresponds to the rotation of the fixed point in phase space, as opposed to the emergence of a new one. We show that the two modes of learning – rotation vs. new formation – are directly linked to recurrent vs. input-driven dynamics. We make this notion precise in a further simplified, analytically tractable model, where learning is restricted to a 2x2 matrix. In our analysis of trained RNNs, we also make the surprising observation that, across different tasks, larger random initial connectivity reduces interference. Analyzing the fixed-point task reveals the underlying mechanism: the random connectivity strongly accelerates the learning mode of new formation, and has less effect on rotation. New formation thus wins the race to zero loss, and interference is reduced. Altogether, our work offers a new perspective on sequential learning in recurrent networks, and the emphasis on internally generated dynamics allows us to take the history of individual learners into account.
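A stripped-down version of the fixed-point task described above, assuming gradient descent on a small recurrent matrix (the sizes, learning rate, and loss are illustrative; the study's analytically tractable model restricts learning to a 2x2 matrix in a related but not identical way). Training on the orthogonal pattern B after A generically perturbs the task-A solution through the recurrent dynamics.

```python
import torch

def run(W, u, T=20):
    """Input-driven recurrent dynamics x -> tanh(W x + u), started from rest."""
    x = torch.zeros_like(u)
    for _ in range(T):
        x = torch.tanh(W @ x + u)
    return x

def train_task(W, u, target, lr=0.05, steps=500):
    """Train the late-time state to align with the input pattern."""
    for _ in range(steps):
        loss = ((run(W, u) - target) ** 2).sum()
        loss.backward()
        with torch.no_grad():
            W -= lr * W.grad
            W.grad = None
    return W

torch.manual_seed(0)
n, g = 2, 0.5                                   # vary g to probe initial connectivity
W = (g * torch.randn(n, n) / n ** 0.5).requires_grad_()
pA, pB = torch.tensor([0.6, 0.0]), torch.tensor([0.0, 0.6])  # orthogonal patterns
W = train_task(W, pA, pA)                       # learn task A first
loss_A = lambda: ((run(W, pA) - pA) ** 2).sum().item()
before = loss_A()
W = train_task(W, pB, pB)                       # then learn task B
print(f"task A loss before / after learning B: {before:.4f} / {loss_A():.4f}")
```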
Inferring brain-wide current flow using data-constrained neural network models
The Rajan lab designs neural network models constrained by experimental data, and reverse-engineers them to figure out how brain circuits function in health and disease. Recently, we have been developing a powerful new theory-based framework for “in-vivo tract tracing” from multi-regional neural activity collected experimentally. We call this framework CURrent-Based Decomposition (CURBD). CURBD employs recurrent neural networks (RNNs) directly constrained, from the outset, by time-series measurements acquired experimentally, such as Ca2+ imaging or electrophysiological data. Once trained, these data-constrained RNNs let us infer matrices quantifying the interactions between all pairs of modeled units. Such model-derived “directed interaction matrices” can then be used to separately compute the excitatory and inhibitory input currents that drive a given neuron from all other neurons. Different current sources can therefore be de-mixed – either within the same region or from other regions, potentially brain-wide – which collectively give rise to the population dynamics observed experimentally. Source-demixed currents obtained through CURBD allow an unprecedented view into multi-region mechanisms inaccessible from measurements alone. We have applied this method successfully to several types of neural data from our experimental collaborators, e.g., zebrafish (Deisseroth lab, Stanford), mice (Harvey lab, Harvard), monkeys (Rudebeck lab, Sinai), and humans (Rutishauser lab, Cedars Sinai), where we have discovered both brain-wide directed interactions and inter-area currents during different types of behaviors. With this framework based on data-constrained multi-region RNNs, we can ask whether there are conserved multi-region mechanisms across species, as well as identify key divergences.
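A minimal sketch of the decomposition step itself, assuming a trained interaction matrix J and the model's unit activity are already in hand (the training procedure is not shown, and the variable names and toy data are illustrative): the current into region A contributed by region B is the corresponding block of J applied to B's rates.

```python
import numpy as np

def decompose_currents(J, rates, regions):
    """Source-demix recurrent currents from a trained directed interaction matrix.

    J       : (n, n) trained interaction matrix (target <- source convention)
    rates   : (n, time) modeled unit activity
    regions : dict mapping region name -> index array of that region's units
    Returns dict[(target, source)] -> (n_target, time) current traces.
    """
    currents = {}
    for tgt, ti in regions.items():
        for src, si in regions.items():
            currents[(tgt, src)] = J[np.ix_(ti, si)] @ rates[si]
    return currents

# Toy usage with random stand-ins for a trained network and its activity.
rng = np.random.default_rng(1)
J = rng.standard_normal((30, 30)) / 30 ** 0.5
rates = rng.standard_normal((30, 200))
regions = {"A": np.arange(0, 10), "B": np.arange(10, 20), "C": np.arange(20, 30)}
curr = decompose_currents(J, rates, regions)
print(curr[("A", "B")].shape)   # (10, 200): current into A sourced from B
```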
Theory of gating in recurrent neural networks
Recurrent neural networks (RNNs) are powerful dynamical models, widely used in machine learning (ML) for processing sequential data, and in neuroscience for understanding the emergent properties of networks of real neurons. Prior theoretical work on the properties of RNNs has focused on models with additive interactions. However, real neurons can have gating, i.e. multiplicative, interactions, and gating is also a central feature of the best-performing RNNs in machine learning. Here, we develop a dynamical mean-field theory (DMFT) to study the consequences of gating in RNNs. We use random matrix theory to show how gating robustly produces marginal stability and line attractors – important mechanisms for biologically relevant computations requiring long memory. The long-time behavior of the gated network is studied using its Lyapunov spectrum, and the DMFT is used to provide a novel analytical expression for the maximum Lyapunov exponent, demonstrating its close relation to the relaxation time of the dynamics. Gating is also shown to give rise to a novel, discontinuous transition to chaos, in which the proliferation of critical points (topological complexity) is decoupled from the appearance of chaotic dynamics (dynamical complexity), contrary to a seminal result for additive RNNs. Critical surfaces and regions of marginal stability in the parameter space are indicated in phase diagrams, thus providing a map for principled parameter choices by ML practitioners. Finally, we develop a field theory for the gradients that arise in training, by incorporating the adjoint sensitivity framework from control theory into the DMFT. This paves the way for the use of powerful field-theoretic techniques to study training and gradients in large RNNs.
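As a schematic of what 'gating' means here (one generic gated-rate form; the specific gated architecture analyzed in the work may differ in detail), a multiplicative gate modulates the recurrent drive to each unit, in contrast to purely additive interactions:

```latex
% Generic gated-rate dynamics; sigma is a sigmoid and phi a rate nonlinearity.
\tau\,\dot{x}_i = -x_i + g_i \sum_{j=1}^{N} J_{ij}\,\phi(x_j),
\qquad
g_i = \sigma\!\Big(\sum_{j=1}^{N} J^{(g)}_{ij}\,\phi(x_j)\Big),
```

so the gate g_i multiplies the recurrent input rather than adding to it; this multiplicative structure is the departure from the additive RNNs of prior theory.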
Effective and Efficient Computation with Multiple-timescale Spiking Recurrent Neural Networks
The emergence of brain-inspired neuromorphic computing as a paradigm for edge AI is motivating the search for high-performance and efficient spiking neural networks to run on this hardware. However, compared to classical neural networks in deep learning, current spiking neural networks lack competitive performance in compelling areas. Here, for sequential and streaming tasks, we demonstrate how spiking recurrent neural networks (SRNNs) using adaptive spiking neurons are able to achieve state-of-the-art performance compared to other spiking neural networks, and to almost reach, or even exceed, the performance of classical recurrent neural networks (RNNs) while exhibiting sparse activity. From this, we calculate a 100x energy improvement for our SRNNs over classical RNNs on the harder tasks. We find in particular that adapting the timescales of spiking neurons is crucial for achieving such performance, and we demonstrate this performance for SRNNs with different spiking neuron models.
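A sketch of an adaptive spiking neuron of the kind at the heart of such SRNNs: each spike raises a slowly decaying firing threshold. The constants here are generic placeholders; per the abstract, it is the timescales themselves (here tau_m and tau_adp) that are adapted during training, which this sketch does not show.

```python
import numpy as np

def alif_step(v, a, s, i_in, dt=1.0, tau_m=20.0, tau_adp=150.0, b0=1.0, beta=1.8):
    """One step of an adaptive LIF (ALIF) neuron with a spike-driven threshold.

    v, a, s : membrane potentials, adaptation variables, last spikes (arrays)
    """
    a = a * np.exp(-dt / tau_adp) + s        # slow, spike-driven adaptation
    v = v + dt / tau_m * (i_in - v)          # leaky membrane integration
    theta = b0 + beta * a                    # adaptive firing threshold
    s = (v >= theta).astype(float)
    v = v * (1.0 - s)                        # reset membrane on spike
    return v, a, s

# Toy usage: constant drive produces sparse, adapting spike trains.
v, a, s = np.zeros(5), np.zeros(5), np.zeros(5)
for t in range(100):
    v, a, s = alif_step(v, a, s, i_in=np.full(5, 1.5))
```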
Estimating flexible across-area communication with neurally-constrained RNNs
Bernstein Conference 2024
Initialization choice leads to different solutions in trained RNNs
COSYNE 2022
Reduced dynamics - a tool for describing RNN activity as a directed graph
COSYNE 2022
Top-down optimization recovers biological coding principles of single-neuron adaptation in RNNs
COSYNE 2022
Multi-task representations of RNNs show a simplicity bias
COSYNE 2023
Phase remembers: trained RNNs develop phase-locked limit cycles in a working memory task
COSYNE 2023
Constructing biologically constrained RNNs with Dale’s backprop and topologically-informed pruning
COSYNE 2025
Not so griddy: Internal representations of RNNs path integrating more than one agent
COSYNE 2025
Measuring and Controlling Solution Degeneracy across Task-Trained RNNs
COSYNE 2025
Orthogonal line attractors in the monkey frontoparietal cortex and RNNs support hierarchical decisions
COSYNE 2025
Second-order forward-mode optimization of RNNs for neuroscience
COSYNE 2025
Harmonic oscillator RNNs: Single node dynamics, resonance and the role of feedback connections
FENS Forum 2024