Recurrent
Computational Mechanisms of Predictive Processing in Brains and Machines
Predictive processing offers a unifying view of neural computation, proposing that brains continuously anticipate sensory input and update internal models based on prediction errors. In this talk, I will present converging evidence for the computational mechanisms underlying this framework across human neuroscience and deep neural networks. I will begin with recent work showing that large-scale distributed prediction-error encoding in the human brain directly predicts how sensory representations reorganize through predictive learning. I will then turn to PredNet, a popular predictive-coding-inspired deep network that has been widely used to model real-world biological vision systems. Using dynamic stimuli generated with our Spatiotemporal Style Transfer algorithm, we demonstrate that PredNet relies primarily on low-level spatiotemporal structure and remains insensitive to high-level content, revealing limits in its generalization capacity. Finally, I will discuss new recurrent vision models that integrate top-down feedback connections with intrinsic neural variability, uncovering a dual mechanism for robust sensory coding in which neural variability decorrelates unit responses, while top-down feedback stabilizes network dynamics. Together, these results outline how prediction-error signaling and top-down feedback pathways shape adaptive sensory processing in biological and artificial systems.
Neurobiological constraints on learning: bug or feature?
Understanding how brains learn requires bridging evidence across scales—from behaviour and neural circuits to cells, synapses, and molecules. In our work, we use computational modelling and data analysis to explore how the physical properties of neurons and neural circuits constrain learning. These include limits imposed by brain wiring, energy availability, molecular noise, and the 3D structure of dendritic spines. In this talk I will describe one such project testing if wiring motifs from fly brain connectomes can improve performance of reservoir computers, a type of recurrent neural network. The hope is that these insights into brain learning will lead to improved learning algorithms for artificial systems.
Probing neural population dynamics with recurrent neural networks
Large-scale recordings of neural activity are providing new opportunities to study network-level dynamics with unprecedented detail. However, the sheer volume of data and its dynamical complexity are major barriers to uncovering and interpreting these dynamics. I will present latent factor analysis via dynamical systems, a sequential autoencoding approach that enables inference of dynamics from neuronal population spiking activity on single trials and millisecond timescales. I will also discuss recent adaptations of the method to uncover dynamics from neural activity recorded via two-photon calcium imaging. Finally, time permitting, I will mention recent efforts to improve the interpretability of deep-learning-based dynamical systems models.
Learning produces a hippocampal cognitive map in the form of an orthogonalized state machine
Cognitive maps confer animals with flexible intelligence by representing spatial, temporal, and abstract relationships that can be used to shape thought, planning, and behavior. Cognitive maps have been observed in the hippocampus, but their algorithmic form and the processes by which they are learned remain obscure. Here, we employed large-scale, longitudinal two-photon calcium imaging to record activity from thousands of neurons in the CA1 region of the hippocampus while mice learned to efficiently collect rewards from two subtly different versions of linear tracks in virtual reality. The results provide a detailed view of the formation of a cognitive map in the hippocampus. Throughout learning, both the animal behavior and hippocampal neural activity progressed through multiple intermediate stages, gradually revealing improved task representation that mirrored improved behavioral efficiency. The learning process led to progressive decorrelations in initially similar hippocampal neural activity within and across tracks, ultimately resulting in orthogonalized representations resembling a state machine capturing the inherent structure of the task. We show that a Hidden Markov Model (HMM) and a biologically plausible recurrent neural network trained using Hebbian learning can both capture core aspects of the learning dynamics and the orthogonalized representational structure in neural activity. In contrast, we show that gradient-based learning of sequence models such as Long Short-Term Memory networks (LSTMs) and Transformers does not naturally produce such orthogonalized representations. We further demonstrate that mice exhibited adaptive behavior in novel task settings, with neural activity reflecting flexible deployment of the state machine. These findings shed light on the mathematical form of cognitive maps, the learning rules that sculpt them, and the algorithms that promote adaptive behavior in animals.
The work thus charts a course toward a deeper understanding of biological intelligence and offers insights toward developing more robust learning algorithms in artificial intelligence.
Prefrontal mechanisms involved in learning distractor-resistant working memory in a dual task
Working memory (WM) is a cognitive function that allows the short-term maintenance and manipulation of information when no longer accessible to the senses. It relies on temporarily storing stimulus features in the activity of neuronal populations. To preserve these dynamics from distraction, it has been proposed that pre- and post-distraction population activity decomposes into orthogonal subspaces. If orthogonalization is necessary to avoid WM distraction, it should emerge as performance in the task improves. We sought evidence of WM orthogonalization learning and the underlying mechanisms by analyzing calcium imaging data from the prelimbic (PrL) and anterior cingulate (ACC) cortices of mice as they learned to perform an olfactory dual task. The dual task combines an outer Delayed Paired-Association task (DPA) with an inner Go-NoGo task. We examined how neuronal activity reflected the process of protecting the DPA sample information against Go/NoGo distractors. As mice learned the task, we measured the overlap of the neural activity with the low-dimensional subspaces that encode sample or distractor odors. Early in the training, pre-distraction activity overlapped with both sample and distractor subspaces. Later in the training, pre-distraction activity was strictly confined to the sample subspace, resulting in a more robust sample code. To gain mechanistic insight into how these low-dimensional WM representations evolve with learning, we built a recurrent spiking network model of excitatory and inhibitory neurons with low-rank connections. The model links learning to (1) the orthogonalization of sample and distractor WM subspaces and (2) the orthogonalization of each subspace with irrelevant inputs. We validated (1) by measuring the angular distance between the sample and distractor subspaces through learning in the data.
Prediction (2) was validated in PrL through the photoinhibition of ACC to PrL inputs, which induced early-training neural dynamics in well-trained animals. In the model, learning drives the network from a double-well attractor toward a more continuous ring attractor regime. We tested signatures for this dynamical evolution in the experimental data by estimating the energy landscape of the dynamics on a one-dimensional ring. In sum, our study defines network dynamics underlying the process of learning to shield WM representations from distracting tasks.
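The angular distance mentioned in this abstract can be illustrated with a minimal sketch. It assumes each subspace is one-dimensional, so the angle between subspaces reduces to the angle between two coding directions; the vectors below are hypothetical toy values, not data from the study, and `angle_deg` is an illustrative helper rather than the authors' analysis code.

```python
import math

def angle_deg(u, v):
    """Angle in degrees between the lines spanned by u and v (0 to 90)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    # abs() ignores the sign of each axis; min() guards against rounding above 1
    cosine = min(1.0, abs(dot) / (norm_u * norm_v))
    return math.degrees(math.acos(cosine))

# Hypothetical coding directions in a 3-neuron toy population
sample_axis = [1.0, 0.0, 0.0]
distractor_axis_early = [1.0, 1.0, 0.0]  # overlaps with the sample axis
distractor_axis_late = [0.0, 1.0, 0.0]   # orthogonal to the sample axis

early_angle = angle_deg(sample_axis, distractor_axis_early)
late_angle = angle_deg(sample_axis, distractor_axis_late)
print(f"early training: {early_angle:.1f} deg, late training: {late_angle:.1f} deg")
```

For subspaces of higher dimension, the same idea generalizes to principal angles, which are usually computed from a singular value decomposition of the product of orthonormal bases.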
A recurrent network model of planning predicts hippocampal replay and human behavior
When interacting with complex environments, humans can rapidly adapt their behavior to changes in task or context. To facilitate this adaptation, we often spend substantial periods of time contemplating possible futures before acting. For such planning to be rational, the benefits of planning to future behavior must at least compensate for the time spent thinking. Here we capture these features of human behavior by developing a neural network model where not only actions, but also planning, are controlled by prefrontal cortex. This model consists of a meta-reinforcement learning agent augmented with the ability to plan by sampling imagined action sequences drawn from its own policy, which we refer to as 'rollouts'. Our results demonstrate that this agent learns to plan when planning is beneficial, explaining the empirical variability in human thinking times. Additionally, the patterns of policy rollouts employed by the artificial agent closely resemble patterns of rodent hippocampal replays recently recorded in a spatial navigation task, in terms of both their spatial statistics and their relationship to subsequent behavior. Our work provides a new theory of how the brain could implement planning through prefrontal-hippocampal interactions, where hippocampal replays are triggered by, and in turn adaptively affect, prefrontal dynamics.
A recurrent network model of planning explains hippocampal replay and human behavior
When interacting with complex environments, humans can rapidly adapt their behavior to changes in task or context. To facilitate this adaptation, we often spend substantial periods of time contemplating possible futures before acting. For such planning to be rational, the benefits of planning to future behavior must at least compensate for the time spent thinking. Here we capture these features of human behavior by developing a neural network model where not only actions, but also planning, are controlled by prefrontal cortex. This model consists of a meta-reinforcement learning agent augmented with the ability to plan by sampling imagined action sequences drawn from its own policy, which we refer to as 'rollouts'. Our results demonstrate that this agent learns to plan when planning is beneficial, explaining the empirical variability in human thinking times. Additionally, the patterns of policy rollouts employed by the artificial agent closely resemble patterns of rodent hippocampal replays recently recorded in a spatial navigation task, in terms of both their spatial statistics and their relationship to subsequent behavior. Our work provides a new theory of how the brain could implement planning through prefrontal-hippocampal interactions, where hippocampal replays are triggered by, and in turn adaptively affect, prefrontal dynamics.
Internal representation of musical rhythm: transformation from sound to periodic beat
When listening to music, humans readily perceive and move along with a periodic beat. Critically, perception of a periodic beat is commonly elicited by rhythmic stimuli with physical features arranged in a way that is not strictly periodic. Hence, beat perception must capitalize on mechanisms that transform stimulus features into a temporally recurrent format with emphasized beat periodicity. Here, I will present a line of work that aims to clarify the nature and neural basis of this transformation. In these studies, electrophysiological activity was recorded as participants listened to rhythms known to induce perception of a consistent beat across healthy Western adults. The results show that the human brain selectively emphasizes beat representation when it is not acoustically prominent in the stimulus, and this transformation (i) can be captured non-invasively using surface EEG in adult participants, (ii) is already in place in 5- to 6-month-old infants, and (iii) cannot be fully explained by subcortical auditory nonlinearities. Moreover, as revealed by human intracerebral recordings, a prominent beat representation emerges already in the primary auditory cortex. Finally, electrophysiological recordings from the auditory cortex of a rhesus monkey show a significant enhancement of beat periodicities in this area, similar to humans. Taken together, these findings indicate an early, general auditory cortical stage of processing by which rhythmic inputs are rendered more temporally recurrent than they are in reality. Already present in non-human primates and human infants, this "periodized" default format could then be shaped by higher-level associative sensory-motor areas and guide movement in individuals with strongly coupled auditory and motor systems. 
Together, this highlights the multiplicity of neural processes supporting coordinated musical behaviors widely observed across human cultures.
The experiments herein include: a motor timing task comparing the effects of movement vs non-movement with and without feedback (Exp. 1A & 1B), a transcranial magnetic stimulation (TMS) study on the role of the supplementary motor area (SMA) in transforming temporal information (Exp. 2), and a perceptual timing task investigating the effect of noisy movement on time perception with both visual and auditory modalities (Exp. 3A & 3B). Together, the results of these studies support the Bayesian cue combination framework, in that: movement improves the precision of time perception not only in perceptual timing tasks but also in motor timing tasks (Exp. 1A & 1B), stimulating the SMA appears to disrupt the transformation of temporal information (Exp. 2), and when movement becomes unreliable or noisy there is no longer an improvement in precision of time perception (Exp. 3A & 3B). Although there is support for the proposed framework, more studies (e.g., fMRI, TMS, EEG) need to be conducted in order to better understand where and how this may be instantiated in the brain; however, this work provides a starting point for better understanding the intrinsic connection between time and movement.
The role of sub-population structure in computations through neural dynamics
Neural computations are currently conceptualised using two separate approaches: sorting neurons into functional sub-populations or examining distributed collective dynamics. Whether and how these two aspects interact to shape computations is currently unclear. Using a novel approach to extract computational mechanisms from recurrent networks trained on neuroscience tasks, we show that the collective dynamics and sub-population structure play fundamentally complementary roles. Although various tasks can be implemented in networks with fully random population structure, we found that flexible input–output mappings instead require a non-random population structure that can be described in terms of multiple sub-populations. Our analyses revealed that such a sub-population organisation enables flexible computations through a mechanism based on gain-controlled modulations that flexibly shape the collective dynamics.
The centrality of population-level factors to network computation is demonstrated by a versatile approach for training spiking networks
Neural activity is often described in terms of population-level factors extracted from the responses of many neurons. Factors provide a lower-dimensional description with the aim of shedding light on network computations. Yet, mechanistically, computations are performed not by continuously valued factors but by interactions among neurons that spike discretely and variably. Models provide a means of bridging these levels of description. We developed a general method for training model networks of spiking neurons by leveraging factors extracted from either data or firing-rate-based networks. In addition to providing a useful model-building framework, this formalism illustrates how reliable and continuously valued factors can arise from seemingly stochastic spiking. Our framework establishes procedures for embedding this property in network models with different levels of realism. The relationship between spikes and factors in such networks provides a foundation for interpreting (and subtly redefining) commonly used quantities such as firing rates.
From spikes to factors: understanding large-scale neural computations
It is widely accepted that human cognition is the product of spiking neurons. Yet even for basic cognitive functions, such as the ability to make decisions or prepare and execute a voluntary movement, the gap between spikes and computation is vast. Only for very simple circuits and reflexes can one explain computations neuron-by-neuron and spike-by-spike. This approach becomes infeasible when neurons are numerous and the flow of information is recurrent. To understand computation, one thus requires appropriate abstractions. An increasingly common abstraction is the neural ‘factor’. Factors are central to many explanations in systems neuroscience. Factors provide a framework for describing computational mechanism, and offer a bridge between data and concrete models. Yet there remains some discomfort with this abstraction, and with any attempt to provide mechanistic explanations above that of spikes, neurons, cell-types, and other comfortingly concrete entities. I will explain why, for many networks of spiking neurons, factors are not only a well-defined abstraction, but are critical to understanding computation mechanistically. Indeed, factors are as real as other abstractions we now accept: pressure, temperature, conductance, and even the action potential itself. I use recent empirical results to illustrate how factor-based hypotheses have become essential to the forming and testing of scientific hypotheses. I will also show how embracing factor-level descriptions affords remarkable power when decoding neural activity for neural engineering purposes.
The strongly recurrent regime of cortical networks
Modern electrophysiological recordings simultaneously capture single-unit spiking activities of hundreds of neurons. These neurons exhibit highly complex coordination patterns. Where does this complexity stem from? One candidate is the ubiquitous heterogeneity in connectivity of local neural circuits. Studying neural network dynamics in the linearized regime and using tools from statistical field theory of disordered systems, we derive relations between structure and dynamics that are readily applicable to subsampled recordings of neural circuits: Measuring the statistics of pairwise covariances allows us to infer statistical properties of the underlying connectivity. Applying our results to spontaneous activity of macaque motor cortex, we find that the underlying network operates in a strongly recurrent regime. In this regime, network connectivity is highly heterogeneous, as quantified by a large radius of bulk connectivity eigenvalues. Being close to the point of linear instability, this dynamical regime predicts a rich correlation structure, a large dynamical repertoire, long-range interaction patterns, relatively low dimensionality and a sensitive control of neuronal coordination. These predictions are verified in analyses of spontaneous activity of macaque motor cortex and mouse visual cortex. Finally, we show that even microscopic features of connectivity, such as connection motifs, systematically scale up to determine the global organization of activity in neural circuits.
Myelin Formation and Oligodendrocyte Biology in Epilepsy
Epilepsy is one of the most common neurological diseases according to the World Health Organization (WHO), affecting around 70 million people worldwide [WHO]. Patients who suffer from epilepsy also suffer from a variety of neuro-psychiatric co-morbidities, which they can experience as being as crippling as the seizure condition itself. Adequate organization of cerebral white matter is critically important for cognitive development. The failure of integration of neurologic function with cognition is reflected in neuro-psychiatric disease, such as autism spectrum disorder (ASD). However, in epilepsy we know little about the importance of white matter abnormalities in epilepsy-associated co-morbidities. Epilepsy surgery is an important therapy strategy in patients where conventional anti-epileptic drug treatment fails. On histology of the resected brain samples, malformations of cortical development (MCD) are common among the epilepsy surgery population, especially focal cortical dysplasia (FCD) and tuberous sclerosis complex (TSC). Both pathologies are associated with constitutive activation of the mTOR pathway. Interestingly, some types of FCD are morphologically similar to TSC cortical tubers, including the abnormalities of the white matter. Hypomyelination with lack of myelin-producing cells, the oligodendrocytes, within the lesional area is a striking phenomenon. Impairment of the complex myelination process can have a major impact on brain function, in the worst case leading to distorted or interrupted neurotransmission. It is still unclear whether the observed myelin pathology in epilepsy surgical specimens is primarily related to the underlying malformation process or is just a secondary phenomenon of recurrent epileptic seizures creating a toxic micro-environment which hampers myelin formation. Interestingly, mTORC1 has been implicated as a key signal for myelination, thus promoting the maturation of oligodendrocytes. These results, however, remain controversial.
Regardless of the underlying pathophysiologic mechanism, alterations of myelin dynamics, depending on their severity, are known to be linked to various kinds of developmental disorders or neuropsychiatric manifestations.
Spatially-embedded recurrent neural networks reveal widespread links between structural and functional neuroscience findings
Brain networks exist within the confines of resource limitations. As a result, a brain network must overcome metabolic costs of growing and sustaining the network within its physical space, while simultaneously implementing its required information processing. To observe the effect of these processes, we introduce the spatially-embedded recurrent neural network (seRNN). seRNNs learn basic task-related inferences while existing within a 3D Euclidean space, where the communication of constituent neurons is constrained by a sparse connectome. We find that seRNNs, similar to primate cerebral cortices, naturally converge on solving inferences using modular small-world networks, in which functionally similar units spatially configure themselves to utilize an energetically-efficient mixed-selective code. As all these features emerge in unison, seRNNs reveal how many common structural and functional brain motifs are strongly intertwined and can be attributed to basic biological optimization processes. seRNNs can serve as model systems to bridge between structural and functional research communities to move neuroscientific understanding forward.
Meta-learning functional plasticity rules in neural networks
Synaptic plasticity is known to be a key player in the brain’s life-long learning abilities. However, due to experimental limitations, the nature of the local changes at individual synapses and their link with emerging network-level computations remain unclear. I will present a numerical, meta-learning approach to deduce plasticity rules from neuronal activity data and/or prior knowledge about the network's computation. I will first show how to recover known rules, given a human-designed loss function in rate networks, or directly from data, using an adversarial approach. Then I will present how to scale up this approach to recurrent spiking networks using simulation-based inference.
Extracting computational mechanisms from neural data using low-rank RNNs
An influential theory in systems neuroscience suggests that brain function can be understood through low-dimensional dynamics [Vyas et al 2020]. However, a challenge in this framework is that a single computational task may involve a range of dynamic processes. To understand which processes are at play in the brain, it is important to use data on neural activity to constrain models. In this study, we present a method for extracting low-dimensional dynamics from data using low-rank recurrent neural networks (lrRNNs), a highly expressive and understandable type of model [Mastrogiuseppe & Ostojic 2018, Dubreuil, Valente et al. 2022]. We first test our approach using synthetic data created from full-rank RNNs that have been trained on various brain tasks. We find that lrRNNs fitted to neural activity allow us to identify the collective computational processes and make new predictions for inactivations in the original RNNs. We then apply our method to data recorded from the prefrontal cortex of primates during a context-dependent decision-making task. Our approach enables us to assign computational roles to the different latent variables and provides a mechanistic model of the recorded dynamics, which can be used to perform in silico experiments like inactivations and provide testable predictions.
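Why a low-rank network yields low-dimensional dynamics can be seen in a minimal rank-one sketch. It assumes rate dynamics dx/dt = -x + J tanh(x) with connectivity J = m nᵀ / N, so the recurrent input always points along m and is summarized by a single latent variable kappa. The network size, the overlap strength 2.0, and the Euler step are arbitrary illustrative choices; this is not the fitting procedure described in the abstract.

```python
import math
import random

rng = random.Random(1)
N = 300

# Rank-one connectivity J = m n^T / N, stored only through the vectors m and n
m = [rng.gauss(0.0, 1.0) for _ in range(N)]
# n overlaps with m (assumed strength 2.0) so that kappa = 0 is not the only attractor
n = [2.0 * m_i + rng.gauss(0.0, 1.0) for m_i in m]

def euler_step(x, dt=0.1):
    """One Euler step of dx/dt = -x + J tanh(x), using J tanh(x) = kappa * m."""
    kappa = sum(n_i * math.tanh(x_i) for n_i, x_i in zip(n, x)) / N
    new_x = [x_i + dt * (-x_i + kappa * m_i) for x_i, m_i in zip(x, m)]
    return new_x, kappa  # kappa is the latent value before the update

x = [rng.gauss(0.0, 1.0) for _ in range(N)]  # random high-dimensional start
for _ in range(500):
    x, kappa = euler_step(x)

# Split the final state into its projection on m and the orthogonal remainder
m_norm_sq = sum(m_i * m_i for m_i in m)
projection = sum(x_i * m_i for x_i, m_i in zip(x, m)) / m_norm_sq
residual = math.sqrt(
    sum((x_i - projection * m_i) ** 2 for x_i, m_i in zip(x, m)) / N
)
print(f"latent kappa = {kappa:.3f}, off-axis residual = {residual:.2e}")
```

The component of the state orthogonal to m receives no recurrent input and simply decays, so after a transient the N-dimensional activity is described by one number. Fitting such a model to recordings amounts to estimating m, n, and any input vectors from data.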
Analyzing artificial neural networks to understand the brain
In the first part of this talk I will present work showing that recurrent neural networks can replicate broad behavioral patterns associated with dynamic visual object recognition in humans. An analysis of these networks shows that different types of recurrence use different strategies to solve the object recognition problem. The similarities between artificial neural networks and the brain present another opportunity, beyond using them just as models of biological processing. In the second part of this talk, I will discuss, and solicit feedback on, a proposed research plan for testing a wide range of analysis tools frequently applied to neural data on artificial neural networks. I will present the motivation for this approach as well as the form the results could take and how this would benefit neuroscience.
Convex neural codes in recurrent networks and sensory systems
Neural activity in many sensory systems is organized on low-dimensional manifolds by means of convex receptive fields. Neural codes in these areas are constrained by this organization, as not every neural code is compatible with convex receptive fields. The same codes are also constrained by the structure of the underlying neural network. In my talk I will attempt to provide answers to the following natural questions: (i) How do recurrent circuits generate codes that are compatible with the convexity of receptive fields? (ii) How can we utilize the constraints imposed by the convex receptive field to understand the underlying stimulus space? To answer question (i), we describe the combinatorics of the steady states and fixed points of recurrent networks that satisfy Dale’s law. It turns out the combinatorics of the fixed points are completely determined by two distinct conditions: (a) the connectivity graph of the network and (b) a spectral condition on the synaptic matrix. We give a characterization of exactly which features of connectivity determine the combinatorics of the fixed points. We also find that a generic recurrent network that satisfies Dale's law outputs convex combinatorial codes. To address question (ii), I will describe methods based on ideas from topology and geometry that take advantage of the convex receptive field properties to infer the dimension of (non-linear) neural representations. I will illustrate the first method by inferring basic features of the neural representations in the mouse olfactory bulb.
Training Dynamic Spiking Neural Network via Forward Propagation Through Time
With recent advances in learning algorithms, recurrent networks of spiking neurons are achieving performance competitive with standard recurrent neural networks. Still, these learning algorithms are limited to small networks of simple spiking neurons and modest-length temporal sequences, as they impose high memory requirements, have difficulty training complex neuron models, and are incompatible with online learning. Taking inspiration from the concept of Liquid Time-Constants (LTCs), we introduce a novel class of spiking neurons, the Liquid Time-Constant Spiking Neuron (LTC-SN), resulting in functionality similar to the gating operation in LSTMs. We integrate these neurons in spiking neural networks (SNNs) that are trained with Forward Propagation Through Time (FPTT) and demonstrate that the resulting LTC-SNNs outperform various SNNs trained with backpropagation through time (BPTT) on long sequences while enabling online learning and drastically reducing memory complexity. We show this for several classical benchmarks that can easily be varied in sequence length, like the Add Task and the DVS-gesture benchmark. We also show how FPTT-trained LTC-SNNs can be applied to large convolutional SNNs, where we demonstrate a new state of the art for online learning in SNNs on a number of standard benchmarks (S-MNIST, R-MNIST, DVS-GESTURE) and also show that large feedforward SNNs can be trained successfully in an online manner to near (Fashion-MNIST, DVS-CIFAR10) or exceeding (PS-MNIST, R-MNIST) state-of-the-art performance as obtained with offline BPTT. Finally, the training and memory efficiency of FPTT enables us to directly train SNNs in an end-to-end manner at network sizes and complexity that were previously infeasible: we demonstrate this by training in an end-to-end fashion the first deep and performant spiking neural network for object localization and recognition. Taken together, our contributions enable, for the first time, training large-scale complex spiking neural network architectures online and on long temporal sequences.
Beyond Biologically Plausible Spiking Networks for Neuromorphic Computing
Biologically plausible spiking neural networks (SNNs) are an emerging architecture for deep learning tasks due to their energy efficiency when implemented on neuromorphic hardware. However, many of the biological features are at best irrelevant and at worst counterproductive when evaluated in the context of task performance and suitability for neuromorphic hardware. In this talk, I will present an alternative paradigm to design deep learning architectures with good task performance in real-world benchmarks while maintaining all the advantages of SNNs. We do this by focusing on two main features – event-based computation and activity sparsity. Starting from the performant gated recurrent unit (GRU) deep learning architecture, we modify it to make it event-based and activity-sparse. The resulting event-based GRU (EGRU) is extremely efficient for both training and inference. At the same time, it achieves performance close to conventional deep learning architectures in challenging tasks such as language modelling, gesture recognition and sequential MNIST.
Nonlinear computations in spiking neural networks through multiplicative synapses
The brain efficiently performs nonlinear computations through its intricate networks of spiking neurons, but how this is done remains elusive. While recurrent spiking networks implementing linear computations can be directly derived and easily understood (e.g., in the spike coding network (SCN) framework), the connectivity required for nonlinear computations can be harder to interpret, as they require additional non-linearities (e.g., dendritic or synaptic) weighted through supervised training. Here we extend the SCN framework to directly implement any polynomial dynamical system. This results in networks requiring multiplicative synapses, which we term the multiplicative spike coding network (mSCN). We demonstrate how the required connectivity for several nonlinear dynamical systems can be directly derived and implemented in mSCNs, without training. We also show how to precisely carry out higher-order polynomials with coupled networks that use only pair-wise multiplicative synapses, and provide expected numbers of connections for each synapse type. Overall, our work provides an alternative method for implementing nonlinear computations in spiking neural networks, while keeping all the attractive features of standard SCNs such as robustness, irregular and sparse firing, and interpretable connectivity. Finally, we discuss the biological plausibility of mSCNs, and how the high accuracy and robustness of the approach may be of interest for neuromorphic computing.
A multi-level account of hippocampal function in concept learning from behavior to neurons
A complete neuroscience requires multi-level theories that address phenomena ranging from higher-level cognitive behaviors to activities within a cell. Unfortunately, we don't have cognitive models of behavior whose components can be decomposed into the neural dynamics that give rise to behavior, leaving an explanatory gap. Here, we decompose SUSTAIN, a clustering model of concept learning, into neuron-like units (SUSTAIN-d; decomposed). Instead of abstract constructs (clusters), SUSTAIN-d has a pool of neuron-like units. With millions of units, a key challenge is how to bridge from abstract constructs such as clusters to neurons, whilst retaining high-level behavior. How does the brain coordinate neural activity during learning? Inspired by algorithms that capture flocking behavior in birds, we introduce a neural flocking learning rule that coordinates units so that they collectively form higher-level mental constructs ("virtual clusters") and neural representations (concept, place and grid cell-like assemblies), and that parallels recurrent hippocampal activity. The decomposed model shows how brain-scale neural populations coordinate to form assemblies encoding concept and spatial representations, and why many neurons are required for robust performance. Our account provides a multi-level explanation for how cognition and symbol-like representations are supported by coordinated neural assemblies formed through learning.
Associative memory of structured knowledge
A long-standing challenge in biological and artificial intelligence is to understand how new knowledge can be constructed from known building blocks in a way that is amenable to computation by neuronal circuits. Here we focus on the task of storage and recall of structured knowledge in long-term memory. Specifically, we ask how recurrent neuronal networks can store and retrieve multiple knowledge structures. We model each structure as a set of binary relations between events and attributes (attributes may represent e.g., temporal order, spatial location, role in semantic structure), and map each structure to a distributed neuronal activity pattern using a vector symbolic architecture (VSA) scheme. We then use associative memory plasticity rules to store the binarized patterns as fixed points in a recurrent network. By a combination of signal-to-noise analysis and numerical simulations, we demonstrate that our model allows for efficient storage of these knowledge structures, such that the memorized structures as well as their individual building blocks (e.g., events and attributes) can be subsequently retrieved from partial retrieving cues. We show that long-term memory of structured knowledge relies on a new principle of computation beyond the memory basins. Finally, we show that our model can be extended to store sequences of memories as single attractors.
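To make the idea of storing binarized patterns as fixed points of a recurrent network concrete, here is a minimal sketch. It assumes a classical Hopfield-style Hebbian outer-product rule with synchronous sign updates; the abstract does not say which plasticity rule the authors use, and the VSA encoding step is not shown. The function names `store` and `recall` and the two toy patterns are hypothetical.

```python
def store(patterns):
    """Build a weight matrix with a Hebbian outer-product rule (assumed, not from the talk)."""
    n = len(patterns[0])
    w = [[0.0] * n for _ in range(n)]
    for p in patterns:
        for i in range(n):
            for j in range(n):
                if i != j:  # no self-connections
                    w[i][j] += p[i] * p[j] / n
    return w


def recall(w, cue, steps=5):
    """Iterate synchronous sign updates from a cue until the state stops changing."""
    state = list(cue)
    for _ in range(steps):
        new = [1 if sum(wij * s for wij, s in zip(row, state)) >= 0 else -1
               for row in w]
        if new == state:
            break
        state = new
    return state


# Two orthogonal +/-1 toy patterns standing in for binarized VSA vectors.
p1 = [1, 1, 1, 1, -1, -1, -1, -1]
p2 = [1, -1, 1, -1, 1, -1, 1, -1]
w = store([p1, p2])

# A partial (corrupted) cue: p1 with its first element flipped.
cue = [-1] + p1[1:]
recovered = recall(w, cue)
```

In this toy setting each stored pattern is a fixed point of the update, and the corrupted cue is pulled back to `p1`. The patterns are chosen to be orthogonal so that retrieval is clean; random patterns at higher load would show the crosstalk that a signal-to-noise analysis quantifies.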
Myelin Formation and Oligodendrocyte Biology in Epilepsy
According to the World Health Organization (WHO), epilepsy is one of the most common neurological diseases, affecting around 70 million people worldwide [WHO]. Patients who suffer from epilepsy also suffer from a variety of neuro-psychiatric co-morbidities, which they can experience as being as crippling as the seizure condition itself. Adequate organization of cerebral white matter is essential for cognitive development. The failure of integration of neurologic function with cognition is reflected in neuro-psychiatric disease, such as autism spectrum disorder (ASD). However, in epilepsy we know little about the importance of white matter abnormalities in epilepsy-associated co-morbidities. Epilepsy surgery is an important therapy strategy in patients where conventional anti-epileptic drug treatment fails. On histology of the resected brain samples, malformations of cortical development (MCD) are common among the epilepsy surgery population, especially focal cortical dysplasia (FCD) and tuberous sclerosis complex (TSC). Both pathologies are associated with constitutive activation of the mTOR pathway. Interestingly, some types of FCD are morphologically similar to TSC cortical tubers, including the abnormalities of the white matter. Hypomyelination with lack of myelin-producing cells, the oligodendrocytes, within the lesional area is a striking phenomenon. Impairment of the complex myelination process can have a major impact on brain function, in the worst case leading to distorted or interrupted neurotransmission. It is still unclear whether the observed myelin pathology in epilepsy surgical specimens is primarily related to the underlying malformation process or is just a secondary phenomenon of recurrent epileptic seizures creating a toxic micro-environment which hampers myelin formation. Interestingly, mTORC1 has been implicated as a key signal for myelination, thus promoting the maturation of oligodendrocytes. These results, however, remain controversial.
Regardless of the underlying pathophysiologic mechanism, alterations of myelin dynamics, depending on their severity, are known to be linked to various kinds of developmental disorders or neuropsychiatric manifestations.
Towards multi-system network models for cognitive neuroscience
Artificial neural networks can be useful for studying brain functions. In cognitive neuroscience, recurrent neural networks are often used to model cognitive functions. I will first offer my opinion on what is missing in the classical use of recurrent neural networks. Then I will discuss two lines of ongoing efforts in our group to move beyond the classical recurrent neural networks by studying multi-system neural networks (the talk will focus on two-system networks). These are networks that combine modules for several neural systems, such as the visual, auditory, prefrontal, and hippocampal systems. I will showcase how multi-system networks can potentially be constrained by experimental data in fundamental ways and at scale.
Aligned and Oblique Dynamics in Recurrent Neural Networks
Talk & Tutorial
Building System Models of Brain-Like Visual Intelligence with Brain-Score
Research in the brain and cognitive sciences attempts to uncover the neural mechanisms underlying intelligent behavior in domains such as vision. Due to the complexities of brain processing, studies necessarily had to start with a narrow scope of experimental investigation and computational modeling. I argue that it is time for our field to take the next step: build system models that capture a range of visual intelligence behaviors along with the underlying neural mechanisms. To make progress on system models, we propose integrative benchmarking – integrating experimental results from many laboratories into suites of benchmarks that guide and constrain those models at multiple stages and scales. We showcase this approach by developing Brain-Score benchmark suites for neural (spike rates) and behavioral experiments in the primate visual ventral stream. By systematically evaluating a wide variety of model candidates, we not only identify models beginning to match a range of brain data (~50% explained variance), but also discover that models’ brain scores are predicted by their object categorization performance (up to 70% ImageNet accuracy). Using the integrative benchmarks, we develop improved state-of-the-art system models that more closely match shallow recurrent neuroanatomy and early visual processing; these models predict primate temporal processing, are more robust, and require fewer supervised synaptic updates. Taken together, these integrative benchmarks and system models are first steps to modeling the complexities of brain processing in an entire domain of intelligence.
General purpose event-based architectures for deep learning
Biologically plausible spiking neural networks (SNNs) are an emerging architecture for deep learning tasks due to their energy efficiency when implemented on neuromorphic hardware. However, many of the biological features are at best irrelevant and at worst counterproductive when evaluated in the context of task performance and suitability for neuromorphic hardware. In this talk, I will present an alternative paradigm to design deep learning architectures with good task performance in real-world benchmarks while maintaining all the advantages of SNNs. We do this by focusing on two main features: event-based computation and activity sparsity. Starting from the performant gated recurrent unit (GRU) deep learning architecture, we modify it to make it event-based and activity-sparse. The resulting event-based GRU (EGRU) is extremely efficient for both training and inference. At the same time, it achieves performance close to conventional deep learning architectures in challenging tasks such as language modelling, gesture recognition and sequential MNIST.
Flexible multitask computation in recurrent networks utilizes shared dynamical motifs
Flexible computation is a hallmark of intelligent behavior. Yet, little is known about how neural networks contextually reconfigure for different computations. Humans are able to perform a new task without extensive training, presumably through the composition of elementary processes that were previously learned. Cognitive scientists have long hypothesized the possibility of a compositional neural code, where complex neural computations are made up of constituent components; however, the neural substrate underlying this structure remains elusive in biological and artificial neural networks. Here we identified an algorithmic neural substrate for compositional computation through the study of multitasking artificial recurrent neural networks. Dynamical systems analyses of networks revealed learned computational strategies that mirrored the modular subtask structure of the task-set used for training. Dynamical motifs such as attractors, decision boundaries and rotations were reused across different task computations. For example, tasks that required memory of a continuous circular variable repurposed the same ring attractor. We show that dynamical motifs are implemented by clusters of units and are reused across different contexts, allowing for flexibility and generalization of previously learned computation. Lesioning these clusters resulted in modular effects on network performance: a lesion that destroyed one dynamical motif only minimally perturbed the structure of other dynamical motifs. Finally, modular dynamical motifs could be reconfigured for fast transfer learning. After slow initial learning of dynamical motifs, a subsequent faster stage of learning reconfigured motifs to perform novel tasks. This work contributes to a more fundamental understanding of compositional computation underlying flexible general intelligence in neural systems. 
We present a conceptual framework that establishes dynamical motifs as a fundamental unit of computation, intermediate between the neuron and the network. As more whole brain imaging studies record neural activity from multiple specialized systems simultaneously, the framework of dynamical motifs will guide questions about specialization and generalization across brain regions.
The role of astroglia-neuron interactions in generation and spread of seizures
Astroglia-neuron interactions are involved in multiple processes, regulating development, excitability and connectivity of neural circuits. A growing body of evidence links aberrant astroglial genetics and physiology to various forms of epilepsy. Using zebrafish seizure models, we showed that neurons and astroglia follow different spatiotemporal dynamics during transitions from pre-ictal to ictal activity. We observed that during the pre-ictal period, neurons exhibit local synchrony and a low level of activity, whereas astroglia exhibit global synchrony and a high level of calcium signals that are anti-correlated with neural activity. In contrast, generalized seizures are marked by a massive release of astroglial glutamate as well as a drastic increase in astroglial and neuronal activity and synchrony across the entire brain. Knocking out astroglial glutamate transporters leads to recurrent spontaneous generalized seizures accompanied by massive astroglial glutamate release. We are currently using a combination of genetic and pharmacological approaches to perturb astroglial glutamate signalling and astroglial gap junctions to further investigate their role in the generation and spread of epileptic seizures across the brain.
Online Training of Spiking Recurrent Neural Networks With Memristive Synapses
Spiking recurrent neural networks (RNNs) are a promising tool for solving a wide variety of complex cognitive and motor tasks, due to their rich temporal dynamics and sparse processing. However, training spiking RNNs on dedicated neuromorphic hardware is still an open challenge. This is due mainly to the lack of local, hardware-friendly learning mechanisms that can solve the temporal credit assignment problem and ensure stable network dynamics, even when the weight resolution is limited. These challenges are further accentuated if one resorts to using memristive devices for in-memory computing to resolve the von Neumann bottleneck problem, at the expense of a substantial increase in variability in both the computation and the working memory of the spiking RNNs. In this talk, I will present our recent work where we introduced a PyTorch simulation framework of memristive crossbar arrays that enables accurate investigation of such challenges. I will show that the recently proposed e-prop learning rule can be used to train spiking RNNs whose weights are emulated in the presented simulation framework. Although e-prop locally approximates the ideal synaptic updates, it is difficult to implement the updates on the memristive substrate due to substantial device non-idealities. I will mention several widely adopted weight update schemes that primarily aim to cope with these device non-idealities and demonstrate that accumulating gradients can enable online and efficient training of spiking RNNs on memristive substrates.
From Computation to Large-scale Neural Circuitry in Human Belief Updating
Many decisions under uncertainty entail dynamic belief updating: multiple pieces of evidence informing about the state of the environment are accumulated across time to infer the environmental state, and choose a corresponding action. Traditionally, this process has been conceptualized as a linear and perfect (i.e., without loss) integration of sensory information along purely feedforward sensory-motor pathways. Yet, natural environments can undergo hidden changes in their state, which requires a non-linear accumulation of decision evidence that strikes a tradeoff between stability and flexibility in response to change. How this adaptive computation is implemented in the brain has remained unknown. In this talk, I will present an approach that my laboratory has developed to identify evidence accumulation signatures in human behavior and neural population activity (measured with magnetoencephalography, MEG), across a large number of cortical areas. Applying this approach to data recorded during visual evidence accumulation tasks with change-points, we find that behavior and neural activity in frontal and parietal regions involved in motor planning exhibit hallmark signatures of adaptive evidence accumulation. The same signatures of adaptive behavior and neural activity emerge naturally from simulations of a biophysically detailed model of a recurrent cortical microcircuit. The MEG data further show that decision dynamics in parietal and frontal cortex are mirrored by a selective modulation of the state of early visual cortex. This state modulation is (i) specifically expressed in the alpha frequency-band, (ii) consistent with feedback of evolving belief states from frontal cortex, (iii) dependent on the environmental volatility, and (iv) amplified by pupil-linked arousal responses during evidence accumulation.
Together, our findings link normative decision computations to recurrent cortical circuit dynamics and highlight the adaptive nature of decision-related long-range feedback processing in the brain.
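The contrast between perfect linear integration and non-linear accumulation under hidden state changes can be illustrated with a short sketch. It assumes a commonly used normative update for a two-state environment that switches with a fixed hazard rate per sample; the abstract does not specify the exact model used in the talk, and the names `prior_log_odds`, `accumulate` and `hazard` are hypothetical.

```python
import math


def prior_log_odds(prev_log_odds, hazard):
    """Non-linear transform of the previous belief, assuming a two-state
    environment that switches with probability `hazard` per sample."""
    k = (1.0 - hazard) / hazard
    return (prev_log_odds
            + math.log(k + math.exp(-prev_log_odds))
            - math.log(k + math.exp(prev_log_odds)))


def accumulate(llrs, hazard):
    """Accumulate log-likelihood ratios, discounting old evidence via the hazard rate."""
    belief = 0.0
    trace = []
    for llr in llrs:
        belief = prior_log_odds(belief, hazard) + llr
        trace.append(belief)
    return trace


evidence = [1.0, 1.0, 1.0]
near_perfect = accumulate(evidence, 1e-9)  # almost stable world: close to a running sum
memoryless = accumulate(evidence, 0.5)     # maximally volatile world: only the latest sample counts
```

Under this assumed model, a very small hazard rate recovers near-perfect linear integration, a hazard rate of 0.5 discards all past evidence, and intermediate values saturate the belief at roughly log((1 - hazard) / hazard), which is one way to express the tradeoff between stability and flexibility.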
Optimal information loading into working memory in prefrontal cortex
Working memory involves the short-term maintenance of information and is critical in many tasks. The neural circuit dynamics underlying working memory remain poorly understood, with different aspects of prefrontal cortical (PFC) responses explained by different putative mechanisms. By mathematical analysis, numerical simulations, and using recordings from monkey PFC, we investigate a critical but hitherto ignored aspect of working memory dynamics: information loading. We find that, contrary to common assumptions, optimal information loading involves inputs that are largely orthogonal, rather than similar, to the persistent activities observed during memory maintenance. Using a novel, theoretically principled metric, we show that PFC exhibits the hallmarks of optimal information loading and we find that such dynamics emerge naturally as a dynamical strategy in task-optimized recurrent neural networks. Our theory unifies previous, seemingly conflicting theories of memory maintenance based on attractor or purely sequential dynamics, and reveals a normative principle underlying the widely observed phenomenon of dynamic coding in PFC.
Feedforward and feedback processes in visual recognition
Progress in deep learning has spawned great successes in many engineering applications. As a prime example, convolutional neural networks, a type of feedforward neural network, are now approaching – and sometimes even surpassing – human accuracy on a variety of visual recognition tasks. In this talk, however, I will show that these neural networks and their recent extensions exhibit a limited ability to solve seemingly simple visual reasoning problems involving incremental grouping, similarity, and spatial relation judgments. Our group has developed a recurrent network model of classical and extra-classical receptive field circuits that is constrained by the anatomy and physiology of the visual cortex. The model was shown to account for diverse visual illusions, providing computational evidence for a novel canonical circuit that is shared across visual modalities. I will show that this computational neuroscience model can be turned into a modern end-to-end trainable deep recurrent network architecture that addresses some of the shortcomings exhibited by state-of-the-art feedforward networks for solving complex visual reasoning tasks. This suggests that neuroscience may contribute powerful new ideas and approaches to computer science and artificial intelligence.
An investigation of perceptual biases in spiking recurrent neural networks trained to discriminate time intervals
Magnitude estimation and stimulus discrimination tasks are affected by perceptual biases that cause the stimulus parameter to be perceived as shifted toward the mean of its distribution. These biases have been extensively studied in psychophysics and, more recently and to a lesser extent, with neural activity recordings. New computational techniques allow us to train spiking recurrent neural networks on the tasks used in the experiments. This provides us with another valuable tool with which to investigate the network mechanisms responsible for the biases and how behavior could be modeled. As an example, in this talk I will consider networks trained to discriminate the durations of temporal intervals. The trained networks exhibited the contraction bias, even though they were trained with a stimulus sequence without temporal correlations. The neural activity during the delay period carried information about the stimuli of the current trial and previous trials, this being one of the mechanisms that gave rise to the contraction bias. The population activity described trajectories in a low-dimensional space and their relative locations depended on the prior distribution. The results can be modeled as an ideal observer that during the delay period sees a combination of the current and the previous stimuli. Finally, I will describe how the neural trajectories in state space encode an estimate of the interval duration. The approach could be applied to other cognitive tasks.
Neural Circuit Mechanisms of Pattern Separation in the Dentate Gyrus
The ability to discriminate different sensory patterns by disentangling their neural representations is an important property of neural networks. While a variety of learning rules are known to be highly effective at fine-tuning synapses to achieve this, less is known about how different cell types in the brain can facilitate this process by providing architectural priors that bias the network towards sparse, selective, and discriminable representations. We studied this by simulating a neuronal network modelled on the dentate gyrus—an area characterised by sparse activity associated with pattern separation in spatial memory tasks. To test the contribution of different cell types to these functions, we presented the model with a wide dynamic range of input patterns and systematically added or removed different circuit elements. We found that recruiting feedback inhibition indirectly via recurrent excitatory neurons proved particularly helpful in disentangling patterns, and show that simple alignment principles for excitatory and inhibitory connections are a highly effective strategy.
Synthetic and natural images unlock the power of recurrency in primary visual cortex
During perception the visual system integrates current sensory evidence with previously acquired knowledge of the visual world. Presumably this computation relies on internal recurrent interactions. We record populations of neurons from the primary visual cortex of cats and macaque monkeys and find evidence for adaptive internal responses to structured stimulation that change on both slow and fast timescales. In the first experiment, we present abstract images only briefly, a protocol known to produce strong and persistent recurrent responses in the primary visual cortex. We show that repetitive presentations of a large randomized set of images lead to enhanced stimulus encoding on a timescale of minutes to hours. The enhanced encoding preserves the representational details required for image reconstruction and can be detected in post-exposure spontaneous activity. In a second experiment, we show that the encoding of natural scenes across populations of V1 neurons is improved, over a timescale of hundreds of milliseconds, with the allocation of spatial attention. Given the hierarchical organization of the visual cortex, contextual information from the higher levels of the processing hierarchy, reflecting high-level image regularities, can inform the activity in V1 through feedback. We hypothesize that these fast attentional boosts in stimulus encoding rely on recurrent computations that capitalize on the presence of high-level visual features in natural scenes. We design control images dominated by low-level features and show that, in agreement with our hypothesis, the attentional benefits in stimulus encoding vanish. We conclude that, in the visual system, powerful recurrent processes optimize neuronal responses, already at the earliest stages of cortical processing.
Timescales of neural activity: their inference, control, and relevance
Timescales characterize how fast observables change in time. In neuroscience, they can be estimated from the measured activity and can be used, for example, as a signature of the memory trace in the network. I will first discuss the inference of timescales from neuroscience data comprising short trials and introduce a new unbiased method. Then, I will apply the method to data recorded from a local population of cortical neurons in visual area V4. I will demonstrate that the ongoing spiking activity unfolds across at least two distinct timescales, fast and slow, and that the slow timescale increases when monkeys attend to the location of the receptive field. Which models can give rise to such behavior? Random balanced networks are known for their fast timescales; thus, a change in the neuron or network properties is required to mimic the data. I will propose a set of models that can control effective timescales and demonstrate that only the model with strong recurrent interactions fits the neural data. Finally, I will discuss the timescales' relevance for behavior and cortical computations.
Computation in the neuronal systems close to the critical point
It has long been hypothesized that natural systems might take advantage of the extended temporal and spatial correlations close to the critical point to improve their computational capabilities. However, different distances to criticality have been inferred from recordings of nervous systems. In my talk, I discuss how including additional constraints on the processing time can shift the optimal operating point of recurrent networks. Moreover, data from the visual cortex of monkeys performing an attentional task indicate that the closeness of the local activity to the critical point changes flexibly. Overall, this suggests that, as we would expect from common sense, the optimal state depends on the task at hand, and the brain adapts to it in a local and fast manner.
Extrinsic control and autonomous computation in the hippocampal CA1 circuit
In understanding circuit operations, a key issue is the extent to which neuronal spiking reflects local computation or responses to upstream inputs. Because pyramidal cells in CA1 do not have local recurrent projections, it is currently assumed that firing in CA1 is inherited from its inputs – thus, entorhinal inputs provide communication with the rest of the neocortex and the outside world, whereas CA3 inputs provide internal and past memory representations. Several studies have attempted to prove this hypothesis by lesioning or silencing either area CA3 or the entorhinal cortex and examining the effect on the firing of CA1 pyramidal cells. Despite the intense and careful work in this research area, the magnitudes and types of the reported physiological impairments vary widely across experiments. At least part of the existing variability and conflicts is due to the different behavioral paradigms, designs and evaluation methods used by different investigators. Simultaneous manipulations in the same animal or even separate manipulations of the different inputs to the hippocampal circuits in the same experiment are rare. To address these issues, I used optogenetic silencing of the unilateral and bilateral medial entorhinal cortex (mEC) and of the local CA1 region, and performed bilateral pharmacogenetic silencing of the entire CA3 region. I combined this with high spatial resolution recording of local field potentials (LFP) in the CA1-dentate axis and simultaneously collected firing pattern data from thousands of single neurons. Up to two of these manipulations were performed simultaneously in each experimental animal. Silencing the mEC largely abolished extracellular theta and gamma currents in CA1, without affecting firing rates. In contrast, CA3 and local CA1 silencing strongly decreased firing of CA1 neurons without affecting theta currents. Each perturbation reconfigured the CA1 spatial map.
Yet, the ability of the CA1 circuit to support place field activity persisted, maintaining the same fraction of spatially tuned place fields, and reliable assembly expression as in the intact mouse. Thus, the CA1 network can maintain autonomous computation to support coordinated place cell assemblies without reliance on its inputs, yet these inputs can effectively reconfigure and assist in maintaining stability of the CA1 map.
Recurrent brainstem-forebrain loops in the control of vocal production in songbirds
Parametric control of flexible timing through low-dimensional neural manifolds
Biological brains possess an exceptional ability to infer relevant behavioral responses to a wide range of stimuli from only a few examples. This capacity to generalize beyond the training set has proven particularly challenging to realize in artificial systems. How neural processes enable this capacity to extrapolate to novel stimuli is a fundamental open question. A prominent but underexplored hypothesis suggests that generalization is facilitated by a low-dimensional organization of collective neural activity, yet evidence for the underlying neural mechanisms remains wanting. Combining network modeling, theory and neural data analysis, we tested this hypothesis in the framework of flexible timing tasks, which rely on the interplay between inputs and recurrent dynamics. We first trained recurrent neural networks on a set of timing tasks while minimizing the dimensionality of neural activity by imposing low-rank constraints on the connectivity, and compared the performance and generalization capabilities with networks trained without any constraint. We then examined the trained networks, characterized the dynamical mechanisms underlying the computations, and verified their predictions in neural recordings. Our key finding is that low-dimensional dynamics strongly increase the ability to extrapolate to inputs outside of the range used in training. Critically, this capacity to generalize relies on controlling the low-dimensional dynamics by a parametric contextual input. We found that this parametric control of extrapolation was based on a mechanism where tonic inputs modulate the dynamics along non-linear manifolds in activity space while preserving their geometry. Comparisons with neural recordings in the dorsomedial frontal cortex of macaque monkeys performing flexible timing tasks confirmed the geometric and dynamical signatures of this mechanism.
Altogether, our results tie together a number of previous experimental findings and suggest that the low-dimensional organization of neural dynamics plays a central role in generalizable behaviors.
Taming chaos in neural circuits
Neural circuits exhibit complex activity patterns, both spontaneously and in response to external stimuli. Information encoding and learning in neural circuits depend on the ability of time-varying stimuli to control spontaneous network activity. In particular, variability arising from the sensitivity to initial conditions of recurrent cortical circuits can limit the information conveyed about the sensory input. Spiking and firing rate network models can exhibit such sensitivity to initial conditions, which is reflected in their dynamic entropy rate and attractor dimensionality computed from their full Lyapunov spectrum. I will show how chaos in both spiking and rate networks depends on biophysical properties of neurons and the statistics of time-varying stimuli. In spiking networks, increasing the input rate or coupling strength aids in controlling the driven target circuit, which is reflected in both a reduced trial-to-trial variability and a decreased dynamic entropy rate. With sufficiently strong input, a transition towards complete network state control occurs. Surprisingly, this transition does not coincide with the transition from chaos to stability but occurs at even larger values of external input strength. Controllability of spiking activity is facilitated when neurons in the target circuit have a sharp spike onset, that is, a high speed at which neurons launch into the action potential. I will also discuss chaos and controllability in firing-rate networks in the balanced state. For these, external control of recurrent dynamics strongly depends on correlations in the input. This phenomenon was studied with a non-stationary dynamic mean-field theory that determines how the activity statistics and the largest Lyapunov exponent depend on frequency and amplitude of the input, recurrent coupling strength, and network size. This shows that uncorrelated inputs facilitate learning in balanced networks.
The results highlight the potential of Lyapunov spectrum analysis as a diagnostic for machine learning applications of recurrent networks. They are also relevant in light of recent advances in optogenetics that allow for time-dependent stimulation of a select population of neurons.
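As a rough illustration of the Lyapunov-based diagnostic mentioned above, the following sketch estimates the largest Lyapunov exponent of a toy discrete-time random rate network by propagating a tangent vector along the trajectory and renormalising it at every step. It is not the speaker's model: the map x(t+1) = tanh(g J x(t) + noise), the function name `largest_lyapunov`, and all parameter values are assumptions chosen for brevity, and independent white noise merely stands in for an uncorrelated time-varying input.

```python
import numpy as np


def largest_lyapunov(g, drive_std=0.0, n=200, steps=2000, burn_in=200, seed=0):
    """Estimate the largest Lyapunov exponent of x <- tanh(g J x + drive).

    Assumed toy model: J has i.i.d. Gaussian entries with variance 1/n and
    the drive is independent white noise of standard deviation drive_std.
    """
    rng = np.random.default_rng(seed)
    J = rng.normal(0.0, 1.0 / np.sqrt(n), size=(n, n))
    x = rng.normal(0.0, 1.0, size=n)
    v = rng.normal(0.0, 1.0, size=n)  # tangent (perturbation) vector
    v /= np.linalg.norm(v)
    log_growth = 0.0
    for t in range(burn_in + steps):
        h = g * (J @ x) + drive_std * rng.normal(size=n)
        x_next = np.tanh(h)
        # Jacobian of the map is diag(1 - tanh(h)^2) @ (g J); apply it to v.
        v = (1.0 - x_next**2) * (g * (J @ v))
        norm = np.linalg.norm(v)
        v /= norm  # renormalise so the perturbation stays infinitesimal
        if t >= burn_in:
            log_growth += np.log(norm)
        x = x_next
    return log_growth / steps  # average log expansion per time step


if __name__ == "__main__":
    print("weak coupling:     ", largest_lyapunov(0.5))
    print("strong coupling:   ", largest_lyapunov(3.0))
    print("strong + drive:    ", largest_lyapunov(3.0, drive_std=10.0))
```

A positive estimate indicates sensitivity to initial conditions, while a negative one indicates that perturbations die out. In this toy setting a sufficiently strong uncorrelated drive is expected to push the exponent down, which is only loosely analogous to the input-driven suppression of chaos described in the abstract.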
Frontal circuit specialisations for information search and decision making
During primate evolution, prefrontal cortex (PFC) expanded substantially relative to other cortical areas. The expansion of PFC circuits likely supported the increased cognitive abilities of humans and anthropoids to sample information about their environment, evaluate that information, plan, and decide between different courses of action. What quantities do these circuits compute as information is being sampled and a decision is being made? And how can they be related to anatomical specialisations within and across PFC? To address this, we recorded PFC activity during value-based decision making using single unit recording in non-human primates and magnetoencephalography in humans. At a macrocircuit level, we found that value correlates differ substantially across PFC subregions. They are heavily shaped by each subregion’s anatomical connections and by the decision-maker’s current locus of attention. At a microcircuit level, we found that the temporal evolution of value correlates can be predicted using cortical recurrent network models that temporally integrate incoming decision evidence. These models reflect the fact that PFC circuits are highly recurrent in nature and have synaptic properties that support persistent activity across temporally extended cognitive tasks. Our findings build upon recent work describing economic decision making as a process of attention-weighted evidence integration across time.
Theory of recurrent neural networks – from parameter inference to intrinsic timescales in spiking networks
JAK/STAT regulation of the transcriptomic response during epileptogenesis
Temporal lobe epilepsy (TLE) is a progressive disorder mediated by pathological changes in molecular cascades and neural circuit remodeling in the hippocampus, resulting in increased susceptibility to spontaneous seizures and cognitive dysfunction. Targeting these cascades could prevent or reverse symptom progression and has the potential to provide viable disease-modifying treatments that could reduce the proportion of TLE patients (>30%) not responsive to current medical therapies. Changes in GABA(A) receptor subunit expression have been implicated in the pathogenesis of TLE, and the Janus Kinase/Signal Transducer and Activator of Transcription (JAK/STAT) pathway has been shown to be a key regulator of these changes. The JAK/STAT pathway is known to be involved in inflammation and immunity, and to be critical for neuronal functions such as synaptic plasticity and synaptogenesis. Our laboratories have shown that a STAT3 inhibitor, WP1066, could greatly reduce the number of spontaneous recurrent seizures (SRS) in an animal model of pilocarpine-induced status epilepticus (SE). This suggests promise for JAK/STAT inhibitors as disease-modifying therapies; however, the potential adverse effects of systemic or global CNS pathway inhibition limit their use. Development of more targeted therapeutics will require a detailed understanding of JAK/STAT-induced epileptogenic responses in different cell types. To this end, we have developed a new transgenic line where dimer-dependent STAT3 signaling is functionally knocked out (fKO) by tamoxifen-induced Cre expression specifically in forebrain excitatory neurons (eNs) via the Calcium/Calmodulin Dependent Protein Kinase II alpha (CamK2a) promoter.
Most recently, we have demonstrated that STAT3 KO in excitatory neurons (eNSTAT3fKO) markedly reduces the progression of epilepsy (SRS frequency) in the intrahippocampal kainate (IHKA) TLE model and protects mice from kainic acid (KA)-induced memory deficits as assessed by Contextual Fear Conditioning. Using data from bulk hippocampal tissue RNA-sequencing, we further discovered a transcriptomic signature for the IHKA model that contains a substantial number of genes, particularly in synaptic plasticity and inflammatory gene networks, that are down-regulated after KA-induced SE in wild-type but not eNSTAT3fKO mice. Finally, we will review data from other models of brain injury that lead to epilepsy, such as TBI, which implicate activation of the JAK/STAT pathway as a possible contributor to epilepsy development.
A nonlinear shot noise model for calcium-based synaptic plasticity
Activity-dependent synaptic plasticity is considered to be a primary mechanism underlying learning and memory. Yet it is unclear whether plasticity rules such as STDP measured in vitro apply in vivo. Network models with STDP predict that activity patterns (e.g., place-cell spatial selectivity) should change much faster than observed experimentally. We address this gap by investigating a nonlinear calcium-based plasticity rule fit to experiments done in physiological conditions. In this model, LTP and LTD result from intracellular calcium transients arising almost exclusively from synchronous coactivation of pre- and postsynaptic neurons. We analytically approximate the full distribution of nonlinear calcium transients as a function of pre- and postsynaptic firing rates and temporal correlations. This analysis directly relates activity statistics that can be measured in vivo to the changes in synaptic efficacy they cause. Our results highlight that both high firing rates and temporal correlations can lead to significant changes in synaptic efficacy. Using a mean-field theory, we show that the nonlinear plasticity rule, without any fine-tuning, gives a stable, unimodal synaptic weight distribution characterized by many strong synapses which remain stable over long periods of time, consistent with electrophysiological and behavioral studies. Moreover, our theory explains how memories encoded by strong synapses can be preferentially stabilized by the plasticity rule. We confirmed our analytical results in a spiking recurrent network. Interestingly, although most synapses are weak and undergo rapid turnover, the fraction of strong synapses is sufficient for supporting realistic spiking dynamics and serves to maintain the network’s cluster structure. Our results provide a mechanistic understanding of how stable memories may emerge on the behavioral level from an STDP rule measured in physiological conditions.
Furthermore, the plasticity rule we investigate is mathematically equivalent to other learning rules which rely on the statistics of coincidences, so we expect that our formalism will be useful to study other learning processes beyond the calcium-based plasticity rule.
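To make the idea of a coincidence-driven calcium rule concrete, here is a minimal sketch of a generic threshold-based calcium plasticity rule. It is not the model presented in the talk: the function name `weight_change`, the two thresholds, the jump sizes, and the rates are all illustrative assumptions, picked so that a single spike stays below both thresholds while a near-coincident pre/post pair crosses them.

```python
import math


def weight_change(pre_times, post_times, t_max=200.0, dt=0.1,
                  tau_ca=20.0, c_pre=0.4, c_post=0.4,
                  theta_d=0.5, theta_p=0.7, gamma_d=1.0, gamma_p=5.0):
    """Integrate a toy calcium-threshold plasticity rule (times in ms).

    Calcium decays exponentially and jumps by c_pre / c_post at each
    presynaptic / postsynaptic spike. The weight drifts down while calcium
    exceeds theta_d and up (at a faster rate) while it exceeds theta_p.
    All parameter values are hypothetical.
    """
    n_steps = int(round(t_max / dt))
    pre_idx = {int(round(t / dt)) for t in pre_times}
    post_idx = {int(round(t / dt)) for t in post_times}
    decay = math.exp(-dt / tau_ca)
    ca, dw = 0.0, 0.0
    for k in range(n_steps):
        ca *= decay
        if k in pre_idx:
            ca += c_pre
        if k in post_idx:
            ca += c_post
        # Nonlinear readout: only supra-threshold calcium changes the weight.
        dw += dt * (gamma_p * (ca > theta_p) - gamma_d * (ca > theta_d))
    return dw


if __name__ == "__main__":
    print("isolated pre spike:  ", weight_change([50.0], []))
    print("coincident pre/post: ", weight_change([50.0], [50.0]))
    print("pre then post +20 ms:", weight_change([50.0], [70.0]))
```

With these assumed values an isolated spike leaves the weight unchanged, a coincident pair yields net potentiation, and a pair separated by 20 ms yields net depression, which is the kind of nonlinear dependence on spike timing and rate that the abstract analyses statistically.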
NMC4 Short Talk: Predictive coding is a consequence of energy efficiency in recurrent neural networks
Predictive coding represents a promising framework for understanding brain function, postulating that the brain continuously inhibits predictable sensory input, ensuring a preferential processing of surprising elements. A central aspect of this view on cortical computation is its hierarchical connectivity, involving recurrent message passing between excitatory bottom-up signals and inhibitory top-down feedback. Here we use computational modelling to demonstrate that such architectural hard-wiring is not necessary. Rather, predictive coding is shown to emerge as a consequence of energy efficiency, a fundamental requirement of neural processing. When training recurrent neural networks to minimise their energy consumption while operating in predictive environments, the networks self-organise into prediction and error units with appropriate inhibitory and excitatory interconnections and learn to inhibit predictable sensory input. We demonstrate that prediction units can reliably be identified through biases in their median preactivation, pointing towards a fundamental property of prediction units in the predictive coding framework. Moving beyond the view of purely top-down driven predictions, we demonstrate via virtual lesioning experiments that networks perform predictions on two timescales: fast lateral predictions among sensory units and slower prediction cycles that integrate evidence over time. Our results, which replicate across two separate data sets, suggest that predictive coding can be interpreted as a natural consequence of energy efficiency. More generally, they raise the question of which other computational principles of brain function can be understood as a result of physical constraints posed by the brain, opening up a new area of bio-inspired, machine learning-powered neuroscience research.
NMC4 Short Talk: A theory for the population rate of adapting neurons disambiguates mean vs. variance-driven dynamics and explains log-normal response statistics
Recently, the field of computational neuroscience has seen an explosion of the use of trained recurrent network models (RNNs) to model patterns of neural activity. These RNN models are typically characterized by tuned recurrent interactions between rate 'units' whose dynamics are governed by smooth, continuous differential equations. However, the response of biological single neurons is better described by all-or-none events - spikes - that are triggered in response to the processing of their synaptic input by the complex dynamics of their membrane. One line of research has attempted to resolve this discrepancy by linking the average firing probability of a population of simplified spiking neuron models to rate dynamics similar to those used for RNN units. However, challenges remain to account for complex temporal dependencies in the biological single neuron response and for the heterogeneity of synaptic input across the population. Here, we make progress by showing how to derive dynamic rate equations for a population of spiking neurons with multi-timescale adaptation properties - as this was shown to accurately model the response of biological neurons - while they receive independent time-varying inputs, leading to plausible asynchronous activity in the network. The resulting rate equations yield an insightful segregation of the population's response into dynamics that are driven by the mean signal received by the neural population, and dynamics driven by the variance of the input across neurons, with respective timescales that are in agreement with slice experiments. Further, these equations explain how input variability can shape log-normal instantaneous rate distributions across neurons, as observed in vivo. 
Our results help interpret properties of the neural population response and open the way to investigating whether the more biologically plausible and dynamically complex rate model we derive could provide useful inductive biases if used in an RNN to solve specific tasks.
NMC4 Short Talk: Different hypotheses on the role of the PFC in solving simple cognitive tasks
Low-dimensional population dynamics can be observed in neural activity recorded from the prefrontal cortex (PFC) of subjects performing simple cognitive tasks. Many studies have shown that recurrent neural networks (RNNs) trained on the same tasks can qualitatively reproduce these state space trajectories, and have used them as models of how neuronal dynamics implement task computations. The PFC is also viewed as a conductor that organizes the communication between cortical areas and provides contextual information. It is therefore not clear what its role is in solving simple cognitive tasks. Do the low-dimensional trajectories observed in the PFC really correspond to the computations that it performs? Or do they indirectly reflect the computations occurring within the cortical areas projecting to the PFC? To address these questions, we modelled cortical areas with a modular RNN and equipped it with a PFC-like cognitive system. When trained on cognitive tasks, this multi-system brain model can reproduce the low-dimensional population responses observed in neuronal activity as well as classical RNNs. Qualitatively different mechanisms can emerge from the training process when varying some details of the architecture such as the time constants. In particular, there is one class of models where the dynamics of the cognitive system implement the task computations, and another where the cognitive system is only necessary to provide contextual information about the task rule, as task performance is not impaired when preventing the system from accessing the task inputs. These constitute two different hypotheses about the causal role of the PFC in solving simple cognitive tasks, which could motivate further experiments on the brain.
NMC4 Keynote: An all-natural deep recurrent neural network architecture for flexible navigation
A wide variety of animals and some artificial agents can adapt their behavior to changing cues, contexts, and goals. But what neural network architectures support such behavioral flexibility? Agents with loosely structured network architectures and random connections can be trained over millions of trials to display flexibility in specific tasks, but many animals must adapt and learn with much less experience just to survive. Further, it has been challenging to understand how the structure of trained deep neural networks relates to their functional properties, an important objective for neuroscience. In my talk, I will use a combination of behavioral, physiological and connectomic evidence from the fly to make the case that the built-in modularity and structure of its networks incorporate key aspects of the animal’s ecological niche, enabling rapid flexibility by constraining learning to operate on a restricted parameter set. It is not unlikely that this is also a feature of many biological neural networks across other animals, large and small, and with and without vertebrae.
NMC4 Short Talk: Directly interfacing brain and deep networks exposes non-hierarchical visual processing
A recent approach to understanding the mammalian visual system is to show correspondence between the sequential stages of processing in the ventral stream with layers in a deep convolutional neural network (DCNN), providing evidence that visual information is processed hierarchically, with successive stages containing ever higher-level information. However, correspondence is usually defined as shared variance between brain region and model layer. We propose that task-relevant variance is a stricter test: If a DCNN layer corresponds to a brain region, then substituting the model’s activity with brain activity should successfully drive the model’s object recognition decision. Using this approach on three datasets (human fMRI and macaque neuron firing rates) we found that in contrast to the hierarchical view, all ventral stream regions corresponded best to later model layers. That is, all regions contain high-level information about object category. We hypothesised that this is due to recurrent connections propagating high-level visual information from later regions back to early regions, in contrast to the exclusively feed-forward connectivity of DCNNs. Using task-relevant correspondence with a late DCNN layer akin to a tracer, we used Granger causal modelling to show late-DCNN correspondence in IT drives correspondence in V4. Our analysis suggests, effectively, that no ventral stream region can be appropriately characterised as ‘early’ beyond 70ms after stimulus presentation, challenging hierarchical models. More broadly, we ask what it means for a model component and brain region to correspond: beyond quantifying shared variance, we must consider the functional role in the computation. We also demonstrate that using a DCNN to decode high-level conceptual information from ventral stream produces a general mapping from brain to model activation space, which generalises to novel classes held-out from training data. 
This suggests future possibilities for brain-machine interface with high-level conceptual information, beyond current designs that interface with the sensorimotor periphery.
Refuting the unfolding-argument on the irrelevance of causal structure to consciousness
I will build from Niccolo's discussion of the Blockhead argument to argue that having a feedforward network (FFN) respond like a recurrent network (RN) in a consciousness experiment is not enough to convince us that the two are the same with regard to the possession of mental states and conscious experience. I will then argue that a robust functional equivalence between FFN and RN is also not supported by the mathematical work on the universal approximation theorem, and is also unlikely to hold, as a conjecture, given data in cognitive neuroscience; I will argue that an equivalence of RN and FFN may only apply to static functions between input/output layers and not to the temporal patterns or to the network's reactions to structural perturbations. Finally, I review data indicating that consciousness has functional characteristics, such as a flexible control of behavior, and that cognitive/brain dynamics reveal interacting top-down and bottom-up processes, which are necessary for the mediation of such control processes.
The generation of cortical novelty responses through inhibitory plasticity
Animals depend on fast and reliable detection of novel stimuli in their environment. Neurons in multiple sensory areas respond more strongly to novel in comparison to familiar stimuli. Yet, it remains unclear which circuit, cellular, and synaptic mechanisms underlie those responses. Here, we show that spike-timing-dependent plasticity of inhibitory-to-excitatory synapses generates novelty responses in a recurrent spiking network model. Inhibitory plasticity increases the inhibition onto excitatory neurons tuned to familiar stimuli, while inhibition for novel stimuli remains low, leading to a network novelty response. The generation of novelty responses does not depend on the periodicity but rather on the distribution of presented stimuli. By including tuning of inhibitory neurons, the network further captures stimulus-specific adaptation. Finally, we suggest that disinhibition can control the amplification of novelty responses. Therefore, inhibitory plasticity provides a flexible, biologically plausible mechanism to detect the novelty of bottom-up stimuli, enabling us to make experimentally testable predictions.
Norse: A library for gradient-based learning in Spiking Neural Networks
We introduce Norse: an open-source library for gradient-based training of spiking neural networks. In contrast to neuron simulators which mainly target computational neuroscientists, our library seamlessly integrates with the existing PyTorch ecosystem using abstractions familiar to the machine learning community. This has immediate benefits in that it provides a familiar interface, hardware accelerator support and, most importantly, the ability to use gradient-based optimization. While many parallel efforts in this direction exist, Norse emphasizes flexibility and usability in three ways. Users can conveniently specify feed-forward (convolutional) architectures, as well as arbitrarily connected recurrent networks. We strictly adhere to a functional and class-based API such that neuron primitives and, for example, plasticity rules compose. Finally, the functional core API ensures compatibility with the PyTorch JIT and ONNX infrastructure. We have made progress to support network execution on the SpiNNaker platform and plan to support other neuromorphic architectures in the future. While the library is useful in its present state, it also has limitations we will address in ongoing work. In particular, we aim to implement event-based gradient computation, using the EventProp algorithm, which will allow us to support sparse event-based data efficiently, as well as work towards support of more complex neuron models. With this library, we hope to contribute to a joint future of computational neuroscience and neuromorphic computing.
Efficient GPU training of SNNs using approximate RTRL
Last year’s SNUFA workshop report concluded “Moving toward neuron numbers comparable with biology and applying these networks to real-world data-sets will require the development of novel algorithms, software libraries, and dedicated hardware accelerators that perform well with the specifics of spiking neural networks” [1]. Taking inspiration from machine learning libraries (where techniques such as parallel batch training minimise latency and maximise GPU occupancy) as well as our previous research on efficiently simulating SNNs on GPUs for computational neuroscience [2,3], we are extending our GeNN SNN simulator to pursue this vision. To explore GeNN’s potential, we use the eProp learning rule [4], which approximates RTRL, to train SNN classifiers on the Spiking Heidelberg Digits and the Spiking Sequential MNIST datasets. We find that the performance of these classifiers is comparable to those trained using BPTT [5] and verify that the theoretical advantages of neuron models with adaptation dynamics [5] translate to improved classification performance. We then measured execution times and found that training an SNN classifier using GeNN and eProp becomes faster than SpyTorch and BPTT after fewer than 685 timesteps, and that much larger models can be trained on the same GPU when using GeNN. Furthermore, we demonstrate that our implementation of parallel batch training improves training performance by over 4× and enables near-perfect scaling across multiple GPUs. Finally, we show that performing inference using a recurrent SNN using GeNN uses less energy and has lower latency than a comparable LSTM simulated with TensorFlow [6].
Event-based Backpropagation for Exact Gradients in Spiking Neural Networks
Gradient-based optimization powered by the backpropagation algorithm proved to be the pivotal method in the training of non-spiking artificial neural networks. At the same time, spiking neural networks hold the promise for efficient processing of real-world sensory data by communicating using discrete events in continuous time. We derive the backpropagation algorithm for a recurrent network of spiking (leaky integrate-and-fire) neurons with hard thresholds and show that the backward dynamics amount to an event-based backpropagation of errors through time. Our derivation uses the jump conditions for partial derivatives at state discontinuities found by applying the implicit function theorem, allowing us to avoid approximations or substitutions. We find that the gradient exists and is finite almost everywhere in weight space, up to the null set where a membrane potential is precisely tangent to the threshold. Our presented algorithm, EventProp, computes the exact gradient with respect to a general loss function based on spike times and membrane potentials. Crucially, the algorithm allows for an event-based communication scheme in the backward phase, retaining the potential advantages of temporal sparsity afforded by spiking neural networks. We demonstrate the optimization of spiking networks using gradients computed via EventProp on the Yin-Yang and MNIST datasets with either a spike time-based or voltage-based loss function and report competitive performance. Our work supports the rigorous study of gradient-based optimization in spiking neural networks as well as the development of event-based neuromorphic architectures for the efficient training of spiking neural networks. While we consider the leaky integrate-and-fire model in this work, our methodology generalises to any neuron model defined as a hybrid dynamical system.
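The role of the implicit function theorem can be seen in a one-neuron toy case, sketched below. This is not EventProp itself: it assumes a single leaky integrator driven by a constant input scaled by a weight w, so that V(t) = w (1 - exp(-t/tau)), and differentiates the first threshold-crossing time with respect to w. The function names and parameter values are hypothetical.

```python
import math


def spike_time(w, theta=1.0, tau=10.0):
    """First time at which V(t) = w * (1 - exp(-t / tau)) reaches theta."""
    if w <= theta:
        raise ValueError("no spike: the membrane never reaches threshold")
    return -tau * math.log(1.0 - theta / w)


def spike_time_grad(w, theta=1.0, tau=10.0):
    """d(spike time)/dw from the implicit function theorem.

    The spike time t* solves V(t*, w) - theta = 0, hence
    dt*/dw = -(dV/dw) / (dV/dt), both evaluated at t*.
    """
    t_star = spike_time(w, theta, tau)
    dV_dw = 1.0 - math.exp(-t_star / tau)
    dV_dt = (w / tau) * math.exp(-t_star / tau)
    return -dV_dw / dV_dt


if __name__ == "__main__":
    w, eps = 2.0, 1e-6
    finite_diff = (spike_time(w + eps) - spike_time(w - eps)) / (2 * eps)
    print("implicit-function gradient:", spike_time_grad(w))
    print("finite-difference check:   ", finite_diff)
```

The gradient diverges as dV/dt at the crossing approaches zero, that is, when the membrane potential just grazes the threshold. This corresponds to the tangency case that the abstract excludes as a null set.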
Optimising spiking interneuron circuits for compartment-specific feedback
Cortical circuits process information by rich recurrent interactions between excitatory neurons and inhibitory interneurons. One of the prime functions of interneurons is to stabilize the circuit by feedback inhibition, but the level of specificity on which inhibitory feedback operates is not fully resolved. We hypothesized that inhibitory circuits could enable separate feedback control loops for different synaptic input streams, by means of specific feedback inhibition to different neuronal compartments. To investigate this hypothesis, we adopted an optimization approach. Leveraging recent advances in training spiking network models, we optimized the connectivity and short-term plasticity of interneuron circuits for compartment-specific feedback inhibition onto pyramidal neurons. Over the course of the optimization, the interneurons diversified into two classes that resembled parvalbumin (PV) and somatostatin (SST) expressing interneurons. The resulting circuit can be understood as a neural decoder that inverts the nonlinear biophysical computations performed within the pyramidal cells. Our model provides a proof of concept for studying structure-function relations in cortical circuits by a combination of gradient-based optimization and biologically plausible phenomenological models.
The diachronic account of attentional selectivity
Many models of attention assume that attentional selection takes place at a specific moment in time which demarcates the critical transition from pre-attentive to attentive processing of sensory input. We argue that this intuitively appealing account is not only inaccurate, but has led to substantial conceptual confusion (to the point where some attention researchers offer to abandon the term ‘attention’ altogether). As an alternative, we offer a “diachronic” framework that describes attentional selectivity as a process that unfolds over time. Key to this view is the concept of attentional episodes, brief periods of intense attentional amplification of sensory representations that regulate access to working memory and response-related processes. We describe how attentional episodes are linked to earlier attentional mechanisms and to recurrent processing at the neural level. We present data showing that multiple sequential events can be involuntarily encoded in working memory when they appear during the same attentional episode, whether they are relevant or not. We also discuss the costs associated with processing multiple events within a single episode. Finally, we argue that breaking down the dichotomy between pre-attentive and attentive (as well as early vs. late selection) offers new solutions to old problems in attention research that have never been resolved. It can provide a unified and conceptually coherent account of the network of cognitive and neural processes that produce the goal-directed selectivity in perceptual processing that is commonly referred to as “attention”.
Tuning dumb neurons to task processing - via homeostasis
Homeostatic plasticity plays a key role in stabilizing neural network activity. But what is its role in neural information processing? We showed analytically how homeostasis changes collective dynamics and consequently information flow, depending on the input to the network. We then studied how input and homeostasis in a recurrent network of LIF neurons impact information flow and task performance. We showed how we can tune the working point of the network, and found that, contrary to previous assumptions, there is not one optimal working point for a family of tasks, but each task may require its own working point.
Conditions for sequence replay in recurrent network models of CA3
Bernstein Conference 2024
Effect of experience on context-dependent learning in recurrent networks
Bernstein Conference 2024
Efficient cortical spike train decoding for brain-machine interface implants with recurrent spiking neural networks
Bernstein Conference 2024
Evolutionary algorithms support recurrent plasticity in spiking neural network models of neocortical task learning
Bernstein Conference 2024
Excitatory and inhibitory neurons exhibit distinct roles for task learning, temporal scaling, and working memory in recurrent spiking neural network models of neocortex.
Bernstein Conference 2024
A family of synaptic plasticity rules based on spike times produces a diversity of triplet motifs in recurrent networks
Bernstein Conference 2024
Identifying task-specific dynamics in recurrent neural networks using Dynamical Similarity Analysis
Bernstein Conference 2024
Inferring stochastic low-rank recurrent neural networks from neural data
Bernstein Conference 2024
Investigating the role of recurrent connectivity in connectome-constrained and task-optimized models of the fruit fly’s motion pathway
Bernstein Conference 2024
Linking Neural Manifolds to Circuit Structure in Recurrent Networks
Bernstein Conference 2024
Recurrent Attention Network
Bernstein Conference 2024
Response variability can accelerate learning in feedforward-recurrent networks
Bernstein Conference 2024
Reverse engineering recurrent network models reveals mechanisms for location memory
Bernstein Conference 2024
Shaping Low-Rank Recurrent Neural Networks with Biological Learning Rules
Bernstein Conference 2024
Synaptic Upscaling Amplifies Chaotic Dynamics in Recurrent Networks of Rate Neurons
Bernstein Conference 2024
Theta-modulated memory encoding and retrieval in recurrent hippocampal circuits
Bernstein Conference 2024
Unraveling perceptual biases: Insights from spiking recurrent neural networks
Bernstein Conference 2024
Auxiliary neurons in optimized recurrent neural circuit speed up sampling-based probabilistic inference
COSYNE 2022
Clustered recurrent connectivity promotes the development of E/I co-tuning via synaptic plasticity
COSYNE 2022
A high-throughput pipeline for evaluating recurrent neural networks on multiple datasets
COSYNE 2022
Fitting recurrent spiking network models to study the interaction between cortical areas
COSYNE 2022
Gain-mediated statistical adaptation in recurrent neural networks
COSYNE 2022
Hierarchy of brain oscillations emerges from recurrent error correction
COSYNE 2022
Hippocampal representations emerge when training recurrent neural networks on a memory dependent maze navigation task
COSYNE 2022
Modeling multi-region neural communication during decision making with recurrent switching dynamical systems
COSYNE 2022
Multitask computation in recurrent networks utilizes shared dynamical motifs
COSYNE 2022
Neural Representations of Opponent Strategy Support the Adaptive Behavior of Recurrent Actor-Critics in a Competitive Game
COSYNE 2022
Operative Dimensions in High-Dimensional Connectivity of Recurrent Neural Networks
COSYNE 2022
Phase dependent maintenance of temporal order in biological and artificial recurrent neural networks
COSYNE 2022
Biological-plausible learning with a two compartment neuron model in recurrent neural networks
Bernstein Conference 2024