A recurrent network model of planning predicts hippocampal replay and human behavior
When interacting with complex environments, humans can rapidly adapt their behavior to changes in task or context. To facilitate this adaptation, we often spend substantial periods of time contemplating possible futures before acting. For such planning to be rational, the benefits of planning to future behavior must at least compensate for the time spent thinking. Here we capture these features of human behavior by developing a neural network model where not only actions, but also planning, are controlled by prefrontal cortex. This model consists of a meta-reinforcement learning agent augmented with the ability to plan by sampling imagined action sequences drawn from its own policy, which we refer to as 'rollouts'. Our results demonstrate that this agent learns to plan when planning is beneficial, explaining the empirical variability in human thinking times. Additionally, the patterns of policy rollouts employed by the artificial agent closely resemble patterns of rodent hippocampal replays recently recorded in a spatial navigation task, in terms of both their spatial statistics and their relationship to subsequent behavior. Our work provides a new theory of how the brain could implement planning through prefrontal-hippocampal interactions, where hippocampal replays are triggered by -- and in turn adaptively affect -- prefrontal dynamics.
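The decide-whether-to-plan idea can be caricatured in a few lines. The following is a hedged toy sketch, not the authors' model: the learned meta-RL policy is replaced by a hand-coded stochastic one, and all names (`act_with_optional_planning`, `TIME_COST`) and parameters are illustrative assumptions.

```python
import random

# Toy 1-D track: the agent starts at state 0 and wants to reach GOAL.
# "Thinking" means sampling rollouts from its own policy, at a small
# per-rollout reward cost (TIME_COST). All values here are illustrative.
GOAL, N_STATES, TIME_COST = 9, 10, 0.05

def policy(state, temperature):
    """Stochastic stand-in policy: mostly steps toward the goal."""
    step = 1 if state < GOAL else -1
    return step if random.random() > temperature else -step

def rollout(state, temperature, horizon=10):
    """An imagined action sequence drawn from the agent's own policy."""
    steps = 0
    while state != GOAL and steps < horizon:
        state = min(max(state + policy(state, temperature), 0), N_STATES - 1)
        steps += 1
    return -steps  # return = negative number of steps (shorter is better)

def act_with_optional_planning(state, temperature, n_rollouts=5):
    """Plan only when the policy is uncertain enough for rollouts to pay off."""
    if temperature > 0.2:  # unreliable policy -> planning is worth the time
        samples = [rollout(state, temperature) for _ in range(n_rollouts)]
        return max(samples) - n_rollouts * TIME_COST, n_rollouts
    return rollout(state, temperature), 0

random.seed(0)
value, n_thoughts = act_with_optional_planning(0, temperature=0.3)
print(n_thoughts)  # the uncertain agent "thought" 5 times before acting
```

The key feature mirrored here is that thinking time is variable and rational: rollouts are used only when their expected benefit exceeds their time cost.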
The role of sub-population structure in computations through neural dynamics
Neural computations are currently conceptualised using two separate approaches: sorting neurons into functional sub-populations or examining distributed collective dynamics. Whether and how these two aspects interact to shape computations is currently unclear. Using a novel approach to extract computational mechanisms from recurrent networks trained on neuroscience tasks, we show that the collective dynamics and sub-population structure play fundamentally complementary roles. Although various tasks can be implemented in networks with fully random population structure, we found that flexible input–output mappings instead require a non-random population structure that can be described in terms of multiple sub-populations. Our analyses revealed that such a sub-population organisation enables flexible computations through a mechanism based on gain-controlled modulations that flexibly shape the collective dynamics.
Convex neural codes in recurrent networks and sensory systems
Neural activity in many sensory systems is organized on low-dimensional manifolds by means of convex receptive fields. Neural codes in these areas are constrained by this organization, as not every neural code is compatible with convex receptive fields. The same codes are also constrained by the structure of the underlying neural network. In my talk I will attempt to provide answers to the following natural questions: (i) How do recurrent circuits generate codes that are compatible with the convexity of receptive fields? (ii) How can we utilize the constraints imposed by convex receptive fields to understand the underlying stimulus space? To answer question (i), we describe the combinatorics of the steady states and fixed points of recurrent networks that satisfy Dale’s law. It turns out that the combinatorics of the fixed points are completely determined by two distinct conditions: (a) the connectivity graph of the network and (b) a spectral condition on the synaptic matrix. We give a characterization of exactly which features of connectivity determine the combinatorics of the fixed points. We also find that a generic recurrent network that satisfies Dale’s law outputs convex combinatorial codes. To address question (ii), I will describe methods based on ideas from topology and geometry that take advantage of the convex receptive field properties to infer the dimension of (non-linear) neural representations. I will illustrate the first method by inferring basic features of the neural representations in the mouse olfactory bulb.
Training Dynamic Spiking Neural Network via Forward Propagation Through Time
With recent advances in learning algorithms, recurrent networks of spiking neurons are achieving performance competitive with standard recurrent neural networks. Still, these learning algorithms are limited to small networks of simple spiking neurons and modest-length temporal sequences, as they impose high memory requirements, have difficulty training complex neuron models, and are incompatible with online learning. Taking inspiration from the concept of Liquid Time-Constants (LTCs), we introduce a novel class of spiking neurons, the Liquid Time-Constant Spiking Neuron (LTC-SN), resulting in functionality similar to the gating operation in LSTMs. We integrate these neurons in SNNs that are trained with FPTT and demonstrate that LTC-SNNs trained in this way outperform various SNNs trained with BPTT on long sequences while enabling online learning and drastically reducing memory complexity. We show this for several classical benchmarks that can easily be varied in sequence length, like the Add Task and the DVS-gesture benchmark. We also show how FPTT-trained LTC-SNNs can be applied to large convolutional SNNs, where we demonstrate new state-of-the-art results for online learning in SNNs on a number of standard benchmarks (S-MNIST, R-MNIST, DVS-GESTURE), and also show that large feedforward SNNs can be trained successfully in an online manner to performance near (Fashion-MNIST, DVS-CIFAR10) or exceeding (PS-MNIST, R-MNIST) the state of the art obtained with offline BPTT. Finally, the training and memory efficiency of FPTT enables us to directly train SNNs in an end-to-end manner at network sizes and complexities that were previously infeasible: we demonstrate this by training, in an end-to-end fashion, the first deep and performant spiking neural network for object localization and recognition. Taken together, our contributions enable, for the first time, training large-scale, complex spiking neural network architectures online and on long temporal sequences.
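The core LTC idea can be illustrated with a single neuron. The update below is a guessed minimal form (a sigmoid-gated membrane time constant with a hard threshold and reset), not the published LTC-SN parameterization; names like `w_tau` and `tau_span` are assumptions.

```python
import math

def ltc_sn_step(v, x, w_in=1.0, w_tau=2.0, tau_min=1.0, tau_span=19.0,
                threshold=1.0, dt=1.0):
    """One Euler step of a leaky integrator whose time constant is
    gated by the input -- the LSTM-like 'liquid' ingredient."""
    tau = tau_min + tau_span / (1.0 + math.exp(-w_tau * x))  # input-gated tau
    v = v + dt / tau * (-v + w_in * x)                       # leaky update
    spike = v >= threshold
    if spike:
        v = 0.0                                              # hard reset
    return v, spike

# Drive the neuron with a constant input and count spikes.
v, n_spikes = 0.0, 0
for t in range(50):
    v, s = ltc_sn_step(v, x=1.5)
    n_spikes += s
print(n_spikes > 0)
```

Because the time constant depends on the input, strong inputs are integrated on a different timescale than weak ones, which is the gating behavior the abstract compares to LSTMs.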
Associative memory of structured knowledge
A long-standing challenge in biological and artificial intelligence is to understand how new knowledge can be constructed from known building blocks in a way that is amenable for computation by neuronal circuits. Here we focus on the task of storage and recall of structured knowledge in long-term memory. Specifically, we ask how recurrent neuronal networks can store and retrieve multiple knowledge structures. We model each structure as a set of binary relations between events and attributes (attributes may represent, e.g., temporal order, spatial location, or role in a semantic structure), and map each structure to a distributed neuronal activity pattern using a vector symbolic architecture (VSA) scheme. We then use associative memory plasticity rules to store the binarized patterns as fixed points in a recurrent network. By a combination of signal-to-noise analysis and numerical simulations, we demonstrate that our model allows for efficient storage of these knowledge structures, such that the memorized structures as well as their individual building blocks (e.g., events and attributes) can be subsequently retrieved from partial cues. We show that long-term memory of structured knowledge relies on a new principle of computation beyond the memory basins. Finally, we show that our model can be extended to store sequences of memories as single attractors.
Flexible multitask computation in recurrent networks utilizes shared dynamical motifs
Flexible computation is a hallmark of intelligent behavior. Yet, little is known about how neural networks contextually reconfigure for different computations. Humans are able to perform a new task without extensive training, presumably through the composition of elementary processes that were previously learned. Cognitive scientists have long hypothesized the possibility of a compositional neural code, where complex neural computations are made up of constituent components; however, the neural substrate underlying this structure remains elusive in biological and artificial neural networks. Here we identified an algorithmic neural substrate for compositional computation through the study of multitasking artificial recurrent neural networks. Dynamical systems analyses of networks revealed learned computational strategies that mirrored the modular subtask structure of the task-set used for training. Dynamical motifs such as attractors, decision boundaries and rotations were reused across different task computations. For example, tasks that required memory of a continuous circular variable repurposed the same ring attractor. We show that dynamical motifs are implemented by clusters of units and are reused across different contexts, allowing for flexibility and generalization of previously learned computation. Lesioning these clusters resulted in modular effects on network performance: a lesion that destroyed one dynamical motif only minimally perturbed the structure of other dynamical motifs. Finally, modular dynamical motifs could be reconfigured for fast transfer learning. After slow initial learning of dynamical motifs, a subsequent faster stage of learning reconfigured motifs to perform novel tasks. This work contributes to a more fundamental understanding of compositional computation underlying flexible general intelligence in neural systems. 
We present a conceptual framework that establishes dynamical motifs as a fundamental unit of computation, intermediate between the neuron and the network. As more whole brain imaging studies record neural activity from multiple specialized systems simultaneously, the framework of dynamical motifs will guide questions about specialization and generalization across brain regions.
Feedforward and feedback processes in visual recognition
Progress in deep learning has spawned great successes in many engineering applications. As a prime example, convolutional neural networks, a type of feedforward neural networks, are now approaching – and sometimes even surpassing – human accuracy on a variety of visual recognition tasks. In this talk, however, I will show that these neural networks and their recent extensions exhibit a limited ability to solve seemingly simple visual reasoning problems involving incremental grouping, similarity, and spatial relation judgments. Our group has developed a recurrent network model of classical and extra-classical receptive field circuits that is constrained by the anatomy and physiology of the visual cortex. The model was shown to account for diverse visual illusions providing computational evidence for a novel canonical circuit that is shared across visual modalities. I will show that this computational neuroscience model can be turned into a modern end-to-end trainable deep recurrent network architecture that addresses some of the shortcomings exhibited by state-of-the-art feedforward networks for solving complex visual reasoning tasks. This suggests that neuroscience may contribute powerful new ideas and approaches to computer science and artificial intelligence.
Computation in neuronal systems close to the critical point
It has long been hypothesized that natural systems might take advantage of the extended temporal and spatial correlations close to the critical point to improve their computational capabilities. Recordings from nervous systems, however, have yielded varying estimates of the distance to criticality. In my talk, I discuss how including additional constraints on processing time can shift the optimal operating point of recurrent networks. Moreover, data from the visual cortex of monkeys performing an attentional task indicate that local activity flexibly changes its distance to the critical point. Overall, this suggests that, as one might expect, the optimal state depends on the task at hand, and that the brain adapts to it in a local and fast manner.
Taming chaos in neural circuits
Neural circuits exhibit complex activity patterns, both spontaneously and in response to external stimuli. Information encoding and learning in neural circuits depend on the ability of time-varying stimuli to control spontaneous network activity. In particular, variability arising from the sensitivity to initial conditions of recurrent cortical circuits can limit the information conveyed about the sensory input. Spiking and firing rate network models can exhibit such sensitivity to initial conditions, which is reflected in their dynamic entropy rate and attractor dimensionality computed from their full Lyapunov spectrum. I will show how chaos in both spiking and rate networks depends on biophysical properties of neurons and the statistics of time-varying stimuli. In spiking networks, increasing the input rate or coupling strength aids in controlling the driven target circuit, which is reflected in both a reduced trial-to-trial variability and a decreased dynamic entropy rate. With sufficiently strong input, a transition towards complete network state control occurs. Surprisingly, this transition does not coincide with the transition from chaos to stability but occurs at even larger values of external input strength. Controllability of spiking activity is facilitated when neurons in the target circuit have a sharp spike onset, thus a high speed by which neurons launch into the action potential. I will also discuss chaos and controllability in firing-rate networks in the balanced state. For these, external control of recurrent dynamics strongly depends on correlations in the input. This phenomenon was studied with a non-stationary dynamic mean-field theory that determines how the activity statistics and the largest Lyapunov exponent depend on frequency and amplitude of the input, recurrent coupling strength, and network size. This shows that uncorrelated inputs facilitate learning in balanced networks.
The results highlight the potential of Lyapunov spectrum analysis as a diagnostic for machine learning applications of recurrent networks. They are also relevant in light of recent advances in optogenetics that allow for time-dependent stimulation of a select population of neurons.
Frontal circuit specialisations for information search and decision making
During primate evolution, prefrontal cortex (PFC) expanded substantially relative to other cortical areas. The expansion of PFC circuits likely supported the increased cognitive abilities of humans and anthropoids to sample information about their environment, evaluate that information, plan, and decide between different courses of action. What quantities do these circuits compute as information is being sampled and a decision is being made? And how can they be related to anatomical specialisations within and across PFC? To address this, we recorded PFC activity during value-based decision making using single unit recording in non-human primates and magnetoencephalography in humans. At a macrocircuit level, we found that value correlates differ substantially across PFC subregions. They are heavily shaped by each subregion’s anatomical connections and by the decision-maker’s current locus of attention. At a microcircuit level, we found that the temporal evolution of value correlates can be predicted using cortical recurrent network models that temporally integrate incoming decision evidence. These models reflect the fact that PFC circuits are highly recurrent in nature and have synaptic properties that support persistent activity across temporally extended cognitive tasks. Our findings build upon recent work describing economic decision making as a process of attention-weighted evidence integration across time.
A nonlinear shot noise model for calcium-based synaptic plasticity
Activity-dependent synaptic plasticity is considered to be a primary mechanism underlying learning and memory. Yet it is unclear whether plasticity rules such as STDP measured in vitro apply in vivo. Network models with STDP predict that activity patterns (e.g., place-cell spatial selectivity) should change much faster than observed experimentally. We address this gap by investigating a nonlinear calcium-based plasticity rule fit to experiments done in physiological conditions. In this model, LTP and LTD result from intracellular calcium transients arising almost exclusively from synchronous coactivation of pre- and postsynaptic neurons. We analytically approximate the full distribution of nonlinear calcium transients as a function of pre- and postsynaptic firing rates and temporal correlations. This analysis directly relates activity statistics that can be measured in vivo to the changes in synaptic efficacy they cause. Our results highlight that both high firing rates and temporal correlations can lead to significant changes in synaptic efficacy. Using a mean-field theory, we show that the nonlinear plasticity rule, without any fine-tuning, gives a stable, unimodal synaptic weight distribution characterized by many strong synapses which remain stable over long periods of time, consistent with electrophysiological and behavioral studies. Moreover, our theory explains how memories encoded by strong synapses can be preferentially stabilized by the plasticity rule. We confirmed our analytical results in a spiking recurrent network. Interestingly, although most synapses are weak and undergo rapid turnover, the fraction of strong synapses is sufficient for supporting realistic spiking dynamics and serves to maintain the network’s cluster structure. Our results provide a mechanistic understanding of how stable memories may emerge on the behavioral level from an STDP rule measured in physiological conditions.
Furthermore, the plasticity rule we investigate is mathematically equivalent to other learning rules which rely on the statistics of coincidences, so we expect that our formalism will be useful to study other learning processes beyond the calcium-based plasticity rule.
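A calcium-threshold rule of this family can be sketched in a toy simulation. This is not the authors' fitted rule: the thresholds, amplitudes, and the superlinear coincidence bonus below are illustrative assumptions, chosen only to show that coincident pre/post firing drives efficacy changes while isolated spikes barely do.

```python
import random

def simulate_weight(rate_pre, rate_post, corr=0.0, steps=20000, seed=0):
    """Toy calcium rule: pre and post spikes each add calcium, coincident
    spikes add a superlinear bonus; calcium above theta_d depresses, above
    theta_p potentiates. Rates in Hz, time step 1 ms."""
    rng = random.Random(seed)
    w, ca = 0.5, 0.0
    dt, tau_ca = 0.001, 0.02
    theta_d, theta_p = 1.0, 1.5
    for _ in range(steps):
        shared = rng.random() < corr * min(rate_pre, rate_post) * dt
        pre = shared or rng.random() < rate_pre * dt
        post = shared or rng.random() < rate_post * dt
        ca *= 1 - dt / tau_ca              # calcium decay
        ca += 0.6 * pre + 0.6 * post       # linear single-spike transients
        ca += 1.0 * (pre and post)         # nonlinear coincidence term
        if ca > theta_p:
            w = min(1.0, w + 0.01)         # potentiation
        elif ca > theta_d:
            w = max(0.0, w - 0.005)        # depression
    return w

# Correlated high-rate firing changes the synapse; sparse uncorrelated
# firing leaves it near its initial value.
print(simulate_weight(20, 20, corr=0.5), simulate_weight(2, 2, corr=0.0))
```

The qualitative point matches the abstract: efficacy changes are dominated by synchronous coactivation, so both firing rates and correlations matter.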
NMC4 Short Talk: A theory for the population rate of adapting neurons disambiguates mean vs. variance-driven dynamics and explains log-normal response statistics
Recently, the field of computational neuroscience has seen an explosion of the use of trained recurrent network models (RNNs) to model patterns of neural activity. These RNN models are typically characterized by tuned recurrent interactions between rate 'units' whose dynamics are governed by smooth, continuous differential equations. However, the response of biological single neurons is better described by all-or-none events - spikes - that are triggered in response to the processing of their synaptic input by the complex dynamics of their membrane. One line of research has attempted to resolve this discrepancy by linking the average firing probability of a population of simplified spiking neuron models to rate dynamics similar to those used for RNN units. However, challenges remain to account for complex temporal dependencies in the biological single neuron response and for the heterogeneity of synaptic input across the population. Here, we make progress by showing how to derive dynamic rate equations for a population of spiking neurons with multi-timescale adaptation properties - as this was shown to accurately model the response of biological neurons - while they receive independent time-varying inputs, leading to plausible asynchronous activity in the network. The resulting rate equations yield an insightful segregation of the population's response into dynamics that are driven by the mean signal received by the neural population, and dynamics driven by the variance of the input across neurons, with respective timescales that are in agreement with slice experiments. Further, these equations explain how input variability can shape log-normal instantaneous rate distributions across neurons, as observed in vivo. 
Our results help interpret properties of the neural population response and open the way to investigating whether the more biologically plausible and dynamically complex rate model we derive could provide useful inductive biases if used in an RNN to solve specific tasks.
Refuting the unfolding argument on the irrelevance of causal structure to consciousness
I will build on Niccolo's discussion of the Blockhead argument to argue that having a feedforward network (FFN) respond like a recurrent network (RN) in a consciousness experiment is not enough to convince us that the two are the same with regard to the possession of mental states and conscious experience. I will then argue that a robust functional equivalence between FFN and RN is also not supported by the mathematical work on the universal approximation theorem, and is unlikely to hold, as a conjecture, given data in cognitive neuroscience. I will argue that an equivalence of RN and FFN may only apply to static functions between input/output layers, and not to temporal patterns or to the network's reactions to structural perturbations. Finally, I review data indicating that consciousness has functional characteristics, such as flexible control of behavior, and that cognitive/brain dynamics reveal interacting top-down and bottom-up processes, which are necessary for the mediation of such control.
Norse: A library for gradient-based learning in Spiking Neural Networks
We introduce Norse: An open-source library for gradient-based training of spiking neural networks. In contrast to neuron simulators which mainly target computational neuroscientists, our library seamlessly integrates with the existing PyTorch ecosystem using abstractions familiar to the machine learning community. This has immediate benefits in that it provides a familiar interface, hardware accelerator support and, most importantly, the ability to use gradient-based optimization. While many parallel efforts in this direction exist, Norse emphasizes flexibility and usability in three ways. Users can conveniently specify feed-forward (convolutional) architectures, as well as arbitrarily connected recurrent networks. We strictly adhere to a functional and class-based API such that neuron primitives and, for example, plasticity rules compose. Finally, the functional core API ensures compatibility with the PyTorch JIT and ONNX infrastructure. We have made progress to support network execution on the SpiNNaker platform and plan to support other neuromorphic architectures in the future. While the library is useful in its present state, it also has limitations we will address in ongoing work. In particular, we aim to implement event-based gradient computation, using the EventProp algorithm, which will allow us to support sparse event-based data efficiently, as well as work towards support of more complex neuron models. With this library, we hope to contribute to a joint future of computational neuroscience and neuromorphic computing.
Event-based Backpropagation for Exact Gradients in Spiking Neural Networks
Gradient-based optimization powered by the backpropagation algorithm proved to be the pivotal method in the training of non-spiking artificial neural networks. At the same time, spiking neural networks hold the promise for efficient processing of real-world sensory data by communicating using discrete events in continuous time. We derive the backpropagation algorithm for a recurrent network of spiking (leaky integrate-and-fire) neurons with hard thresholds and show that the backward dynamics amount to an event-based backpropagation of errors through time. Our derivation uses the jump conditions for partial derivatives at state discontinuities found by applying the implicit function theorem, allowing us to avoid approximations or substitutions. We find that the gradient exists and is finite almost everywhere in weight space, up to the null set where a membrane potential is precisely tangent to the threshold. Our presented algorithm, EventProp, computes the exact gradient with respect to a general loss function based on spike times and membrane potentials. Crucially, the algorithm allows for an event-based communication scheme in the backward phase, retaining the potential advantages of temporal sparsity afforded by spiking neural networks. We demonstrate the optimization of spiking networks using gradients computed via EventProp and the Yin-Yang and MNIST datasets with either a spike time-based or voltage-based loss function and report competitive performance. Our work supports the rigorous study of gradient-based optimization in spiking neural networks as well as the development of event-based neuromorphic architectures for the efficient training of spiking neural networks. While we consider the leaky integrate-and-fire model in this work, our methodology generalises to any neuron model defined as a hybrid dynamical system.
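The implicit-function-theorem step at the heart of this derivation can be seen in the simplest possible case: the first spike time of a single LIF neuron driven by a constant input of weight w is an implicit function of w, so its exact derivative follows without any smoothing of the hard threshold. This is an assumed minimal illustration of the idea, not the EventProp algorithm itself.

```python
import math

TAU, THETA = 0.01, 1.0  # membrane time constant (s), threshold

def spike_time(w):
    """Closed-form first spike time for dv/dt = (-v + w)/tau, v(0) = 0,
    valid for w > THETA."""
    return -TAU * math.log(1.0 - THETA / w)

def spike_time_gradient(w):
    """d t*/d w from the threshold condition
    F(w, t) = w (1 - exp(-t/tau)) - theta = 0 (implicit function theorem)."""
    t = spike_time(w)
    dF_dw = 1.0 - math.exp(-t / TAU)        # partial derivative wrt weight
    dF_dt = w * math.exp(-t / TAU) / TAU    # membrane slope at threshold
    return -dF_dw / dF_dt

# The exact gradient matches a central finite difference of the spike time.
w, h = 1.5, 1e-6
numeric = (spike_time(w + h) - spike_time(w - h)) / (2 * h)
print(abs(spike_time_gradient(w) - numeric) < 1e-6)
```

Note the division by the membrane slope at threshold: this is why the gradient is finite almost everywhere but blows up on the null set where the membrane potential is exactly tangent to the threshold.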
Tuning dumb neurons to task processing - via homeostasis
Homeostatic plasticity plays a key role in stabilizing neural network activity. But what is its role in neural information processing? We showed analytically how homeostasis changes collective dynamics and consequently information flow, depending on the input to the network. We then studied how input and homeostasis in a recurrent network of LIF neurons impact information flow and task performance. We showed how the working point of the network can be tuned, and found that, contrary to previous assumptions, there is not one optimal working point for a family of tasks; rather, each task may require its own working point.
A theory for Hebbian learning in recurrent E-I networks
The Stabilized Supralinear Network is a model of recurrently connected excitatory (E) and inhibitory (I) neurons with a supralinear input-output relation. It can explain cortical computations such as response normalization and inhibitory stabilization. However, the network's connectivity is designed by hand, based on experimental measurements. How the recurrent synaptic weights can be learned from the sensory input statistics in a biologically plausible way is unknown. Earlier theoretical work on plasticity focused on single neurons and the balance of excitation and inhibition but did not consider the simultaneous plasticity of recurrent synapses and the formation of receptive fields. Here we present a recurrent E-I network model where all synaptic connections are simultaneously plastic, and E neurons self-stabilize by recruiting co-tuned inhibition. Motivated by experimental results, we employ a local Hebbian plasticity rule with multiplicative normalization for E and I synapses. We develop a theoretical framework that explains how plasticity enables inhibition-balanced excitatory receptive fields that match experimental results. We show analytically that sufficiently strong inhibition allows neurons' receptive fields to decorrelate and distribute themselves across the stimulus space. For strong recurrent excitation, the network becomes stabilized by inhibition, which prevents unconstrained self-excitation. In this regime, external inputs integrate sublinearly. As in the Stabilized Supralinear Network, this results in response normalization and winner-takes-all dynamics: when two competing stimuli are presented, the network response is dominated by the stronger stimulus while the weaker stimulus is suppressed. In summary, we present a biologically plausible theoretical framework to model plasticity in fully plastic recurrent E-I networks. Although the connectivity is derived purely from the sensory input statistics, the circuit performs meaningful computations.
Our work provides a mathematical framework of plasticity in recurrent networks, which has previously only been studied numerically and can serve as the basis for a new generation of brain-inspired unsupervised machine learning algorithms.
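The ingredient "local Hebbian rule with multiplicative normalization" can be demonstrated on a single linear neuron. This is a generic sketch with illustrative parameters, not the paper's full E-I model (it omits inhibition and uses a hand-made input ensemble): Hebbian growth plus multiplicative normalization produces competition between input groups, and the neuron develops a selective receptive field.

```python
import random

def learn_receptive_field(n_inputs=10, steps=3000, eta=0.05, seed=3):
    """Single linear neuron with Hebbian plasticity and multiplicative
    weight normalization, driven by two alternating input groups."""
    rng = random.Random(seed)
    w = [1.0 / n_inputs] * n_inputs
    for _ in range(steps):
        group0 = rng.random() < 0.6        # group 0 is slightly more active
        x = [1.0 if (i < n_inputs // 2) == group0 else 0.0
             for i in range(n_inputs)]
        y = sum(wi * xi for wi, xi in zip(w, x))          # linear response
        w = [wi + eta * y * xi for wi, xi in zip(w, x)]   # Hebbian growth
        total = sum(w)
        w = [wi / total for wi in w]       # multiplicative normalization
    return w

w = learn_receptive_field()
print(sum(w[:5]) > 0.9)  # the more active group captures nearly all weight
```

Normalization keeps the total weight fixed, so growth on one group of synapses necessarily shrinks the other: the competition that, in the full model, inhibition shapes into decorrelated, distributed receptive fields.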
Recurrent network dynamics lead to interference in sequential learning
Learning in real life is often sequential: A learner first learns task A, then task B. If the tasks are related, the learner may adapt the previously learned representation instead of generating a new one from scratch. Adaptation may ease learning task B but may also decrease the performance on task A. Such interference has been observed in experimental and machine learning studies. In the latter case, it is mediated by correlations between weight updates for the different tasks. In typical applications, like image classification with feed-forward networks, these correlated weight updates can be traced back to input correlations. For many neuroscience tasks, however, networks need to not only transform the input, but also generate substantial internal dynamics. Here we illuminate the role of internal dynamics for interference in recurrent neural networks (RNNs). We analyze RNNs trained sequentially on neuroscience tasks with gradient descent and observe forgetting even for orthogonal tasks. We find that the degree of interference changes systematically with task properties, especially the emphasis on input-driven over autonomously generated dynamics. To better understand our numerical observations, we thoroughly analyze a simple model of working memory: For task A, a network is presented with an input pattern and trained to generate a fixed point aligned with this pattern. For task B, the network has to memorize a second, orthogonal pattern. Adapting an existing representation corresponds to the rotation of the fixed point in phase space, as opposed to the emergence of a new one. We show that the two modes of learning – rotation vs. new formation – are directly linked to recurrent vs. input-driven dynamics. We make this notion precise in a further simplified, analytically tractable model, where learning is restricted to a 2x2 matrix.
In our analysis of trained RNNs, we also make the surprising observation that, across different tasks, larger random initial connectivity reduces interference. Analyzing the fixed point task reveals the underlying mechanism: The random connectivity strongly accelerates the learning mode of new formation, and has less effect on rotation. New formation thus wins the race to zero loss, and interference is reduced. Altogether, our work offers a new perspective on sequential learning in recurrent networks, and the emphasis on internally generated dynamics allows us to take the history of individual learners into account.
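The sequential fixed-point task can be sketched in a fully linear toy model (our own illustration, with an arbitrary target gain of 2 and a learning rate chosen for stability; the abstract's analytical model is a 2x2 matrix with nonlinear dynamics):

```python
import numpy as np

def fp(W, u):
    # fixed point of x_{t+1} = W x_t + u (exists and is stable while the
    # spectral radius of W stays below 1)
    return np.linalg.solve(np.eye(len(u)) - W, u)

def train(W, u, gain=2.0, lr=0.02, iters=500):
    # gradient descent on L = ||fp(W, u) - gain * u||^2 with the analytic
    # gradient: A = (I - W)^{-1}, x = A u, r = x - gain*u, dL/dW = 2 A^T r x^T
    for _ in range(iters):
        A = np.linalg.inv(np.eye(len(u)) - W)
        x = A @ u
        r = x - gain * u
        W = W - 2 * lr * A.T @ np.outer(r, x)
    return W

uA, uB = np.array([1.0, 0.0]), np.array([0.0, 1.0])  # orthogonal input patterns
W = train(np.zeros((2, 2)), uA)                      # learn task A
errA = np.linalg.norm(fp(W, uA) - 2 * uA)
W = train(W, uB)                                     # then task B, sequentially
errA_after = np.linalg.norm(fp(W, uA) - 2 * uA)      # forgetting of task A
```

In this linear toy with orthogonal patterns, the gradient updates for the two tasks decouple and task A is preserved; the interference described in the abstract arises once internal dynamics are substantial and nonlinear, which is also where the rotation-vs-new-formation distinction becomes meaningful.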
Generalizing theories of cerebellum-like learning
Since the theories of Marr, Ito, and Albus, the cerebellum has provided an attractive, well-characterized model system to investigate biological mechanisms of learning. In recent years, theories have been developed that provide a normative account for many features of the anatomy and function of cerebellar cortex and cerebellum-like systems, including the distribution of parallel fiber-Purkinje cell synaptic weights, the expansion in neuron number of the granule cell layer and their synaptic in-degree, and sparse coding by granule cells. Typically, these theories focus on the learning of random mappings between uncorrelated inputs and binary outputs, an assumption that may be reasonable for certain forms of associative conditioning but is also quite far from accounting for the important role the cerebellum plays in the control of smooth movements. I will discuss in-progress work with Marjorie Xie, Samuel Muscinelli, and Kameron Decker Harris generalizing these learning theories to correlated inputs and general classes of smooth input-output mappings. Our studies build on earlier work in theoretical neuroscience as well as recent advances in the kernel theory of wide neural networks. They illuminate the role of pre-expansion structures in processing input stimuli and the significance of sparse granule cell activity. If there is time, I will also discuss preliminary work with Jack Lindsey extending these theories beyond cerebellum-like structures to recurrent networks.
A geometric framework to predict structure from function in neural networks
The structural connectivity matrix of synaptic weights between neurons is a critical determinant of overall network function. However, quantitative links between neural network structure and function are complex and subtle. For example, many networks can give rise to similar functional responses, and the same network can function differently depending on context. Whether certain patterns of synaptic connectivity are required to generate specific network-level computations is largely unknown. Here we introduce a geometric framework for identifying synaptic connections required by steady-state responses in recurrent networks of rectified-linear neurons. Assuming that the number of specified response patterns does not exceed the number of input synapses, we analytically calculate all feedforward and recurrent connectivity matrices that can generate the specified responses from the network inputs. We then use this analytical characterization to rigorously analyze the solution space geometry and derive certainty conditions guaranteeing a non-zero synapse between neurons.
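The linear-algebraic core of this setup can be illustrated in a minimal sketch (our construction, not the paper's full characterization): if every neuron is active at the specified steady states, r = max(0, Wr + Bu) reduces to (I - W)r = Bu, and as long as the number of specified patterns does not exceed each neuron's synapse count, many connectivity matrices solve the resulting linear system. Pseudoinversion picks out one of them; the paper's contribution is to characterize the entire solution space and its geometry.

```python
import numpy as np

rng = np.random.default_rng(0)
n, m, p = 4, 3, 2                  # neurons, inputs, specified patterns (p <= n + m)
R = rng.uniform(0.5, 1.5, (n, p))  # desired steady-state rates (all positive)
U = rng.uniform(0.5, 1.5, (m, p))  # the corresponding network inputs

# With every neuron active, r = max(0, W r + B u) linearizes to (I - W) r = B u,
# so each neuron's row of [W, B] must solve a linear system across patterns.
X = np.vstack([R, U])              # stacked rates and inputs, shape (n + m, p)
WB = R @ np.linalg.pinv(X)         # minimum-norm solution: one point in the
W, B = WB[:, :n], WB[:, n:]        # full solution space the framework maps out

# verify: each specified pattern is a steady state of r = relu(W r + B u)
assert np.allclose(np.maximum(0, W @ R + B @ U), R)
```

This single solution says nothing about which synapses are *required*; the certainty conditions in the abstract are statements about all solutions at once.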
Residual population dynamics as a window into neural computation
Neural activity in frontal and motor cortices can be considered the manifestation of a dynamical system implemented by large neural populations in recurrently connected networks. The computations emerging from such population-level dynamics reflect the interaction between external inputs into a network and its internal, recurrent dynamics. Isolating these two contributions in experimentally recorded neural activity, however, is challenging, limiting the resulting insights into neural computations. I will present an approach to addressing this challenge based on response residuals, i.e., variability in the population trajectory across repetitions of the same task condition. A complete characterization of residual dynamics is well-suited to systematically comparing computations across brain areas and tasks, and leads to quantitative predictions about the consequences of small, arbitrary causal perturbations.
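A minimal sketch of the residual-dynamics idea (our construction on synthetic data, not the talk's estimator): subtracting the condition-averaged trajectory removes the shared input-driven component, and the remaining residuals obey the network's local dynamics, which can then be fit by least squares.

```python
import numpy as np

rng = np.random.default_rng(3)
n, T, trials = 5, 50, 200
A_true = 0.9 * np.linalg.qr(rng.normal(size=(n, n)))[0]  # stable dynamics matrix

# simulate repeated trials of one condition: shared input + private noise
drive = rng.normal(0, 1, (T, n))            # condition-specific input, same every trial
X = np.zeros((trials, T, n))
for k in range(trials):
    x = np.zeros(n)
    for t in range(T):
        x = A_true @ x + drive[t] + rng.normal(0, 0.1, n)
        X[k, t] = x

R = X - X.mean(axis=0)                      # residuals: deviation from condition mean
# fit one-step residual dynamics r_{t+1} ~ A r_t by least squares
Rt  = R[:, :-1].reshape(-1, n)
Rt1 = R[:, 1:].reshape(-1, n)
A_hat = np.linalg.lstsq(Rt, Rt1, rcond=None)[0].T
```

The eigenvalues of `A_hat` then characterize the recurrent contribution separately from the input, and predict how small perturbations of the state would evolve.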
The emergence and modulation of time in neural circuits and behavior
Spontaneous behavior in animals and humans shows a striking amount of variability both in the spatial domain (which actions to choose) and temporal domain (when to act). Concatenating actions into sequences and behavioral plans reveals the existence of a hierarchy of timescales ranging from hundreds of milliseconds to minutes. How do multiple timescales emerge from neural circuit dynamics? How do circuits modulate temporal responses to flexibly adapt to changing demands? In this talk, we will present recent results from experiments and theory suggesting a new computational mechanism generating the temporal variability underlying naturalistic behavior. We will show how neural activity from premotor areas unfolds through temporal sequences of attractors, which predict the intention to act. These sequences naturally emerge from recurrent cortical networks, where correlated neural variability plays a crucial role in explaining the observed variability in action timing. We will then discuss how reaction times in these recurrent circuits can be accelerated or slowed down via gain modulation, induced by neuromodulation or perturbations. Finally, we will present a general mechanism producing a reservoir of multiple timescales in recurrent networks.
Using noise to probe recurrent neural network structure and prune synapses
Many networks in the brain are sparsely connected, and the brain eliminates synapses during development and learning. How could the brain decide which synapses to prune? In a recurrent network, determining the importance of a synapse between two neurons is a difficult computational problem, depending on the role that both neurons play and on all possible pathways of information flow between them. Noise is ubiquitous in neural systems, and often considered an irritant to be overcome. In the first part of this talk, I will suggest that noise could play a functional role in synaptic pruning, allowing the brain to probe network structure and determine which synapses are redundant. I will introduce a simple, local, unsupervised plasticity rule that either strengthens or prunes synapses using only synaptic weight and the noise-driven covariance of the neighboring neurons. For a subset of linear and rectified-linear networks, this rule provably preserves the spectrum of the original matrix and hence preserves network dynamics even when the fraction of pruned synapses asymptotically approaches 1. The plasticity rule is biologically plausible and may suggest a new role for noise in neural computation. Time permitting, I will then turn to the problem of extracting structure from neural population data sets using dimensionality reduction methods. I will argue that nonlinear structures naturally arise in neural data and show how these nonlinearities cause linear methods of dimensionality reduction, such as Principal Component Analysis, to fail dramatically in identifying low-dimensional structure.
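One hypothetical way to operationalize noise-probing (our illustration only; the talk's rule is local and provably spectrum-preserving, and its exact form is not given in the abstract): drive the network with noise, estimate the activity covariance, and score each synapse by quantities available at its two endpoint neurons.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 30
W = rng.normal(0, 0.4 / np.sqrt(n), (n, n))    # stable random recurrent weights

# probe the network with noise: x_{t+1} = W x_t + noise, record activity
T = 20000
X = np.zeros((T, n))
x = np.zeros(n)
for t in range(T):
    x = W @ x + rng.normal(0, 1, n)
    X[t] = x
C = np.cov(X[T // 10:].T)                      # noise-driven activity covariance

# hypothetical local score (NOT the talk's provable rule): each synapse j -> i
# sees only its own weight and the covariance of its two endpoint neurons
score = np.abs(W) * np.abs(C)
thresh = np.quantile(score, 0.3)
W_pruned = np.where(score > thresh, W, 0.0)    # prune the lowest-scoring 30%

rho = lambda M: np.abs(np.linalg.eigvals(M)).max()
frac_pruned = (W_pruned == 0).mean()
rho_before, rho_after = rho(W), rho(W_pruned)  # compare how much the spectrum moved
```

Comparing `rho_before` and `rho_after` is a crude stand-in for the talk's much stronger guarantee that the full spectrum, and hence the dynamics, is preserved.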
E-prop: A biologically inspired paradigm for learning in recurrent networks of spiking neurons
Transformative advances in deep learning, such as deep reinforcement learning, usually rely on gradient-based learning methods such as backpropagation through time (BPTT) as a core learning algorithm. However, BPTT is not biologically plausible, since it requires propagating gradients backwards in time and across neurons. Here, we propose e-prop, a novel gradient-based learning method with local and online weight update rules for recurrent neural networks, and in particular recurrent spiking neural networks (RSNNs). As a result, e-prop has the potential to provide a substantial fraction of the power of deep learning to RSNNs. In this presentation, we will motivate e-prop from the perspective of recent insights in neuroscience and show how these have to be combined to form an algorithm for online gradient descent. The mathematical results will be supported by empirical evidence from supervised and reinforcement learning tasks. We will also discuss how limitations inherited from gradient-based learning methods, such as limited sample efficiency, can be addressed by considering an evolution-like optimization that enhances learning on particular task families. The emerging learning architecture can learn tasks from a single demonstration, hence enabling one-shot learning.
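The eligibility-trace idea at the heart of e-prop can be sketched for a single leaky neuron, where the forward-propagated trace gives the exact gradient (for full recurrent spiking networks, e-prop's contribution is a scalable local approximation of such traces):

```python
import numpy as np

rng = np.random.default_rng(2)
alpha, lr = 0.8, 0.01
w_true = np.array([0.5, -0.3, 0.8])   # teacher weights (arbitrary illustration)

# Single leaky neuron y_t = alpha*y_{t-1} + w.x_t. The gradient of the
# instantaneous loss is accumulated forward in time with an eligibility trace
# e_t = alpha*e_{t-1} + x_t, giving the local, online update
# w <- w - lr * (y_t - y*_t) * e_t, with no backpropagation through time.
w = np.zeros(3)
y = y_star = 0.0
e = np.zeros(3)
errs = []
for t in range(3000):
    x = rng.normal(0, 1, 3)
    y_star = alpha * y_star + w_true @ x   # teacher trajectory
    y = alpha * y + w @ x                  # student
    e = alpha * e + x                      # eligibility trace (forward pass only)
    err = y - y_star
    errs.append(err * err)
    w -= lr * err * e                      # local online update

mse_first, mse_last = np.mean(errs[:300]), np.mean(errs[-300:])
```

The squared error falls by orders of magnitude without any backward pass; in e-prop proper, the scalar `err` is replaced by a per-neuron learning signal broadcast to the traces.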
Recurrent network models of adaptive and maladaptive learning
During periods of persistent and inescapable stress, animals can switch from active to passive coping strategies to manage effort-expenditure. Such normally adaptive behavioural state transitions can become maladaptive in disorders such as depression. We developed a new class of multi-region recurrent neural network (RNN) models to infer brain-wide interactions driving such maladaptive behaviour. The models were trained to match experimental data across two levels simultaneously: brain-wide neural dynamics from 10-40,000 neurons and the real-time behaviour of the fish. Analysis of the trained RNN models revealed a specific change in inter-area connectivity between the habenula (Hb) and raphe nucleus during the transition into passivity. We then characterized the multi-region neural dynamics underlying this transition. Using the interaction weights derived from the RNN models, we calculated the input currents from different brain regions to each Hb neuron. We then computed neural manifolds spanning these input currents across all Hb neurons to define subspaces within the Hb activity that captured communication with each other brain region independently. At the onset of stress, there was an immediate response within the Hb/raphe subspace alone. However, RNN models identified no early or fast-timescale change in the strengths of interactions between these regions. As the animal lapsed into passivity, the responses within the Hb/raphe subspace decreased, accompanied by a concomitant change in the interactions between the raphe and Hb inferred from the RNN weights. This innovative combination of network modeling and neural dynamics analysis points to dual mechanisms with distinct timescales driving the behavioural state transition: early response to stress is mediated by reshaping the neural dynamics within a preserved network architecture, while long-term state changes correspond to altered connectivity between neural ensembles in distinct brain regions.
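The input-current subspace analysis can be sketched with random stand-ins (hypothetical region sizes, weights, and activity; the real analysis uses RNN weights fitted to data and separates each source region's contribution independently):

```python
import numpy as np

rng = np.random.default_rng(4)
# hypothetical region sizes and a random stand-in for trained weights
idx = {"Hb": slice(0, 20), "raphe": slice(20, 35), "other": slice(35, 60)}
W = rng.normal(0, 0.1, (60, 60))       # stand-in interaction weights
X = rng.normal(0, 1, (60, 500))        # stand-in activity (neurons x time)

# input current to each Hb neuron from a source region: W[Hb, src] @ x_src
currents = {src: W[idx["Hb"], idx[src]] @ X[idx[src]] for src in ("raphe", "other")}

# a low-dimensional subspace of Hb activity capturing communication with the
# raphe: leading principal components of the raphe-driven input currents
U, S, _ = np.linalg.svd(currents["raphe"], full_matrices=False)
raphe_subspace = U[:, :3]                      # leading 3 modes
proj = raphe_subspace.T @ currents["raphe"]    # Hb/raphe communication signal
```

Tracking the magnitude of `proj` over time is the kind of readout that, in the study, revealed the early response and later decline within the Hb/raphe subspace.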
Conditions for sequence replay in recurrent network models of CA3
Bernstein Conference 2024
Effect of experience on context-dependent learning in recurrent networks
Bernstein Conference 2024
A family of synaptic plasticity rules based on spike times produces a diversity of triplet motifs in recurrent networks
Bernstein Conference 2024
Linking Neural Manifolds to Circuit Structure in Recurrent Networks
Bernstein Conference 2024
Response variability can accelerate learning in feedforward-recurrent networks
Bernstein Conference 2024
Reverse engineering recurrent network models reveals mechanisms for location memory
Bernstein Conference 2024
Synaptic Upscaling Amplifies Chaotic Dynamics in Recurrent Networks of Rate Neurons
Bernstein Conference 2024
Multitask computation in recurrent networks utilizes shared dynamical motifs
COSYNE 2022
Revisiting the flexibility-stability dilemma in recurrent networks using a multiplicative plasticity rule
COSYNE 2022
Inhibitory control of plasticity promotes stability and competitive learning in recurrent networks
COSYNE 2023
Locally coupled oscillatory recurrent networks learn traveling waves and topographic organization
COSYNE 2023
Spiking and bursting activity in stochastic recurrent networks
COSYNE 2023
Synaptic-type-specific clustering optimizes the computational capabilities of balanced recurrent networks
COSYNE 2023
Barcode activity in a recurrent network model of the hippocampus enables efficient memory binding
COSYNE 2025
Effect of experience on context-dependent learning in recurrent networks
COSYNE 2025
Enhancing the causal predictive power in recurrent network models of neural dynamics
COSYNE 2025
A family of synaptic plasticity rules shapes triplet motifs in recurrent networks
COSYNE 2025
Nonlinear Manifolds Underlie Robust Representations of Continuous Variables in Recurrent Networks
COSYNE 2025
Predicting neural activity in connectome-constrained recurrent networks
COSYNE 2025
Fading memory as inductive bias in residual recurrent networks
FENS Forum 2024
Reverse engineering recurrent network models reveals mechanisms for location memory
FENS Forum 2024