Neural Network
Computational Mechanisms of Predictive Processing in Brains and Machines
Predictive processing offers a unifying view of neural computation, proposing that brains continuously anticipate sensory input and update internal models based on prediction errors. In this talk, I will present converging evidence for the computational mechanisms underlying this framework across human neuroscience and deep neural networks. I will begin with recent work showing that large-scale distributed prediction-error encoding in the human brain directly predicts how sensory representations reorganize through predictive learning. I will then turn to PredNet, a popular predictive-coding-inspired deep network that has been widely used to model real-world biological vision systems. Using dynamic stimuli generated with our Spatiotemporal Style Transfer algorithm, we demonstrate that PredNet relies primarily on low-level spatiotemporal structure and remains insensitive to high-level content, revealing limits in its generalization capacity. Finally, I will discuss new recurrent vision models that integrate top-down feedback connections with intrinsic neural variability, uncovering a dual mechanism for robust sensory coding in which neural variability decorrelates unit responses while top-down feedback stabilizes network dynamics. Together, these results outline how prediction-error signaling and top-down feedback pathways shape adaptive sensory processing in biological and artificial systems.
From Spiking Predictive Coding to Learning Abstract Object Representation
In the first part of the talk, I will present Predictive Coding Light (PCL), a novel unsupervised learning architecture for spiking neural networks. In contrast to conventional predictive coding approaches, which only transmit prediction errors to higher processing stages, PCL learns inhibitory lateral and top-down connectivity to suppress the most predictable spikes and passes a compressed representation of the input to higher processing stages. We show that PCL reproduces a range of biological findings and exhibits a favorable tradeoff between energy consumption and downstream classification performance on challenging benchmarks. The second part of the talk will feature our lab’s efforts to explain how infants and toddlers might learn abstract object representations without supervision. I will present deep learning models that exploit the temporal and multimodal structure of their sensory inputs to learn representations of individual objects, object categories, or abstract super-categories such as "kitchen object" in a fully unsupervised fashion. These models offer a parsimonious account of how abstract semantic knowledge may be rooted in children's embodied first-person experiences.
Developmental and evolutionary perspectives on thalamic function
Brain organization and function are complex topics. We are good at establishing correlates of perception and behavior across forebrain circuits, as well as manipulating activity in these circuits to affect behavior. However, we still lack good models for the large-scale organization and function of the forebrain. What are the contributions of the cortex, basal ganglia, and thalamus to behavior? In addressing these questions, we often ascribe function to each area as if it were an independent processing unit. However, we know from the anatomy that the cortex, basal ganglia, and thalamus are massively interconnected in a large network. One way to generate insight into these questions is to consider the evolution and development of forebrain systems. In this talk, I will discuss the developmental and evolutionary (comparative anatomy) data on the thalamus, and how it fits within forebrain networks. I will address questions including when the thalamus appeared in evolution, how the thalamus is organized across the vertebrate lineage, and how changes in the organization of forebrain networks can affect behavioral repertoires.
Neurobiological constraints on learning: bug or feature?
Understanding how brains learn requires bridging evidence across scales—from behaviour and neural circuits to cells, synapses, and molecules. In our work, we use computational modelling and data analysis to explore how the physical properties of neurons and neural circuits constrain learning. These include limits imposed by brain wiring, energy availability, molecular noise, and the 3D structure of dendritic spines. In this talk I will describe one such project testing whether wiring motifs from fly brain connectomes can improve the performance of reservoir computers, a type of recurrent neural network. The hope is that these insights into brain learning will lead to improved learning algorithms for artificial systems.
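As an illustration of the modelling setup, the sketch below is a minimal echo state network (a common form of reservoir computer) in Python/NumPy, run on a toy delayed-recall task. The function names, sizes, and the task are my own illustrative choices; a connectome-derived wiring matrix could be substituted for W to probe the motif question raised in the abstract.

```python
# Minimal echo state network sketch; only the linear readout is trained,
# so different fixed wirings (random vs. connectome-derived) can be compared directly.
import numpy as np

rng = np.random.default_rng(0)

def make_reservoir(n=300, density=0.05, spectral_radius=0.9):
    """Random sparse recurrent weight matrix, rescaled to a target spectral radius."""
    W = rng.normal(0, 1, (n, n)) * (rng.random((n, n)) < density)
    W *= spectral_radius / np.max(np.abs(np.linalg.eigvals(W)))
    return W

def run_reservoir(W, W_in, u, leak=0.3):
    """Drive the reservoir with a 1-D input sequence u and collect hidden states."""
    x = np.zeros(W.shape[0])
    states = []
    for u_t in u:
        x = (1 - leak) * x + leak * np.tanh(W @ x + W_in * u_t)
        states.append(x.copy())
    return np.array(states)

# Toy memory task: reproduce the input delayed by 5 steps.
T, delay = 2000, 5
u = rng.uniform(-1, 1, T)
target = np.roll(u, delay)

W = make_reservoir()
W_in = rng.uniform(-1, 1, W.shape[0])
X = run_reservoir(W, W_in, u)

# Ridge-regression readout: fit on the first half, test on the second half.
split = T // 2
A, b = X[:split], target[:split]
w_out = np.linalg.solve(A.T @ A + 1e-4 * np.eye(A.shape[1]), A.T @ b)
pred = X[split:] @ w_out
print("test correlation:", np.corrcoef(pred, target[split:])[0, 1])
```

Because the recurrent weights stay fixed and only the readout is trained, swapping in different wiring motifs and re-measuring task performance is a cheap, well-controlled comparison.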
Functional Plasticity in the Language Network – evidence from Neuroimaging and Neurostimulation
Efficient cognition requires flexible interactions between distributed neural networks in the human brain. These networks adapt to challenges by flexibly recruiting different regions and connections. In this talk, I will discuss how we study functional network plasticity and reorganization with combined neurostimulation and neuroimaging across the adult life span. I will argue that short-term plasticity enables flexible adaptation to challenges, via functional reorganization. My key hypothesis is that disruption of higher-level cognitive functions such as language can be compensated for by the recruitment of domain-general networks in our brain. Examples from healthy young brains illustrate how neurostimulation can be used to temporarily interfere with efficient processing, probing short-term network plasticity at the systems level. Examples from people with dyslexia help to better understand network disorders in the language domain and outline the potential of facilitatory neurostimulation for treatment. I will also discuss examples from aging brains where plasticity helps to compensate for loss of function. Finally, examples from lesioned brains after stroke provide insight into the brain’s potential for long-term reorganization and recovery of function. Collectively, these results challenge the view of a modular organization of the human brain and argue for a flexible redistribution of function via systems plasticity.
Deepfake emotional expressions trigger the uncanny valley brain response, even when they are not recognised as fake
Facial expressions are inherently dynamic, and our visual system is sensitive to subtle changes in their temporal sequence. However, researchers often use dynamic morphs of photographs—simplified, linear representations of motion—to study the neural correlates of dynamic face perception. To explore the brain's sensitivity to natural facial motion, we constructed a novel dynamic face database using generative neural networks, trained on a verified set of video-recorded emotional expressions. The resulting deepfakes, consciously indistinguishable from videos, enabled us to separate biological motion from photorealistic form. Results showed that conventional dynamic morphs elicit distinct responses in the brain compared to videos and photos, suggesting they violate expectations (N400) and have reduced social salience (late positive potential). This suggests that dynamic morphs misrepresent facial dynamism, resulting in misleading insights about the neural and behavioural correlates of face perception. Deepfakes and videos elicited largely similar neural responses, suggesting they could be used as a proxy for real faces in vision research, where video recordings cannot be experimentally manipulated. And yet, despite being consciously undetectable as fake, deepfakes elicited an expectation violation response in the brain. This points to a neural sensitivity to naturalistic facial motion, beyond conscious awareness. Despite some differences in neural responses, the realism and manipulability of deepfakes make them a valuable asset for research where videos are unfeasible. Using these stimuli, we proposed a novel marker for the conscious perception of naturalistic facial motion – frontal delta activity – which was elevated for videos and deepfakes, but not for photos or dynamic morphs.
Brain Emulation Challenge Workshop
The Brain Emulation Challenge workshop will tackle cutting-edge topics such as ground-truthing for validation, leveraging artificial datasets generated from virtual brain tissue, and the transformative potential of virtual brain platforms, for example as applied to the forthcoming Brain Emulation Challenge.
Memory formation in the hippocampal microcircuit
The centre of memory is the medial temporal lobe (MTL) and especially the hippocampus. In our research, a flexible, brain-inspired computational microcircuit model of the CA1 region of the mammalian hippocampus was upgraded and used to examine how information retrieval is affected under different conditions. Six models were created by modulating different excitatory and inhibitory pathways. The results showed that increasing the strength of feedforward excitation was the most effective way to improve memory recall; in other words, it allows the system to access stored memories more accurately.
Analyzing Network-Level Brain Processing and Plasticity Using Molecular Neuroimaging
Behavior and cognition depend on the integrated action of neural structures and populations distributed throughout the brain. We recently developed a set of molecular imaging tools that enable multiregional processing and plasticity in neural networks to be studied at a brain-wide scale in rodents and nonhuman primates. Here we will describe how a novel genetically encoded activity reporter enables information flow in virally labeled neural circuitry to be monitored by fMRI. Using the reporter to perform functional imaging of synaptically defined neural populations in the rat somatosensory system, we show how activity is transformed within brain regions to yield characteristics specific to distinct output projections. We also show how this approach enables regional activity to be modeled in terms of inputs, in a paradigm that we are extending to address circuit-level origins of functional specialization in marmoset brains. In the second part of the talk, we will discuss how another genetic tool for MRI enables systematic studies of the relationship between anatomical and functional connectivity in the mouse brain. We show that variations in physical and functional connectivity can be dissociated both across individual subjects and over experience. We also use the tool to examine brain-wide relationships between plasticity and activity during an opioid treatment. This work demonstrates the possibility of studying diverse brain-wide processing phenomena using molecular neuroimaging.
The Brain Prize winners' webinar
This webinar brings together three leaders in theoretical and computational neuroscience—Larry Abbott, Haim Sompolinsky, and Terry Sejnowski—to discuss how neural circuits generate fundamental aspects of the mind. Abbott illustrates mechanisms in electric fish that differentiate self-generated electric signals from external sensory cues, showing how predictive plasticity and two-stage signal cancellation mediate a sense of self. Sompolinsky explores attractor networks, revealing how discrete and continuous attractors can stabilize activity patterns, enable working memory, and incorporate chaotic dynamics underlying spontaneous behaviors. He further highlights the concept of object manifolds in high-level sensory representations and raises open questions on integrating connectomics with theoretical frameworks. Sejnowski bridges these motifs with modern artificial intelligence, demonstrating how large-scale neural networks capture language structures through distributed representations that parallel biological coding. Together, their presentations emphasize the synergy between empirical data, computational modeling, and connectomics in explaining the neural basis of cognition—offering insights into perception, memory, language, and the emergence of mind-like processes.
Sensory cognition
This webinar features presentations from SueYeon Chung (New York University) and Srinivas Turaga (HHMI Janelia Research Campus) on theoretical and computational approaches to sensory cognition. Chung introduced a “neural manifold” framework to capture how high-dimensional neural activity is structured into meaningful manifolds reflecting object representations. She demonstrated that manifold geometry—shaped by radius, dimensionality, and correlations—directly governs a population’s capacity for classifying or separating stimuli under nuisance variations. Applying these ideas as a data analysis tool, she showed how measuring object-manifold geometry can explain transformations along the ventral visual stream and suggested that manifold principles also yield better self-supervised neural network models resembling mammalian visual cortex. Turaga described simulating the entire fruit fly visual pathway using its connectome, modeling 64 key cell types in the optic lobe. His team’s systematic approach—combining sparse connectivity from electron microscopy with simple dynamical parameters—recapitulated known motion-selective responses and produced novel testable predictions. Together, these studies underscore the power of combining connectomic detail, task objectives, and geometric theories to unravel neural computations bridging from stimuli to cognitive functions.
Brain-Wide Compositionality and Learning Dynamics in Biological Agents
Biological agents continually reconcile the internal states of their brain circuits with incoming sensory and environmental evidence to evaluate when and how to act. The brains of biological agents, including animals and humans, exploit many evolutionary innovations, chiefly modularity—observable at the level of anatomically-defined brain regions, cortical layers, and cell types among others—that can be repurposed in a compositional manner to endow the animal with a highly flexible behavioral repertoire. Accordingly, their behaviors show their own modularity, yet such behavioral modules seldom correspond directly to traditional notions of modularity in brains. It remains unclear how to link neural and behavioral modularity in a compositional manner. We propose a comprehensive framework—compositional modes—to identify overarching compositionality spanning specialized submodules, such as brain regions. Our framework directly links the behavioral repertoire with distributed patterns of population activity, brain-wide, at multiple concurrent spatial and temporal scales. Using whole-brain recordings of zebrafish brains, we introduce an unsupervised pipeline based on neural network models, constrained by biological data, to reveal highly conserved compositional modes across individuals despite the naturalistic (spontaneous or task-independent) nature of their behaviors. These modes provided a scaffolding for other modes that account for the idiosyncratic behavior of each fish. We then demonstrate experimentally that compositional modes can be manipulated in a consistent manner by behavioral and pharmacological perturbations. Our results demonstrate that even natural behavior in different individuals can be decomposed and understood using a relatively small number of neurobehavioral modules—the compositional modes—and elucidate a compositional neural basis of behavior. This approach aligns with recent progress in understanding how reasoning capabilities and internal representational structures develop over the course of learning or training, offering insights into the modularity and flexibility in artificial and biological agents.
Use case determines the validity of neural systems comparisons
Deep learning provides new data-driven tools to relate neural activity to perception and cognition, aiding scientists in developing theories of neural computation that increasingly resemble biological systems both at the level of behavior and of neural activity. But what in a deep neural network should correspond to what in a biological system? This question is addressed implicitly in the use of comparison measures that relate specific neural or behavioral dimensions via a particular functional form. However, distinct comparison methodologies can give conflicting results in recovering even a known ground-truth model in an idealized setting, leaving open the question of what to conclude from the outcome of a systems comparison using any given methodology. Here, we develop a framework to make explicit and quantitative the effects of both hypothesis-driven aspects—such as details of the architecture of a deep neural network—and methodological choices in a systems comparison setting. We demonstrate via the learning dynamics of deep neural networks that, while the role of the comparison methodology is often de-emphasized relative to hypothesis-driven aspects, this choice can impact and even invert the conclusions to be drawn from a comparison between neural systems. We provide evidence that the right way to adjudicate a comparison depends on the use case—the scientific hypothesis under investigation—which could range from identifying single-neuron or circuit-level correspondences to capturing generalizability to new stimulus properties.
Comparing supervised learning dynamics: Deep neural networks match human data efficiency but show a generalisation lag
Recent research has seen many behavioral comparisons between humans and deep neural networks (DNNs) in the domain of image classification. Often, comparison studies focus on the end-result of the learning process by measuring and comparing the similarities in the representations of object categories once they have been formed. However, the process of how these representations emerge—that is, the behavioral changes and intermediate stages observed during the acquisition—is less often directly and empirically compared. In this talk, I'm going to report a detailed investigation of the learning dynamics in human observers and various classic and state-of-the-art DNNs. We develop a constrained supervised learning environment to align learning-relevant conditions such as starting point, input modality, available input data and the feedback provided. Across the whole learning process we evaluate and compare how well learned representations can be generalized to previously unseen test data. Comparisons across the entire learning process indicate that DNNs demonstrate a level of data efficiency comparable to human learners, challenging some prevailing assumptions in the field. However, our results also reveal representational differences: while DNNs' learning is characterized by a pronounced generalisation lag, humans appear to immediately acquire generalizable representations without a preliminary phase of learning training set-specific information that is only later transferred to novel data.
Error Consistency between Humans and Machines as a function of presentation duration
Within the last decade, Deep Artificial Neural Networks (DNNs) have emerged as powerful computer vision systems that match or exceed human performance on many benchmark tasks such as image classification. But whether current DNNs are suitable computational models of the human visual system remains an open question: while DNNs have proven to be capable of predicting neural activations in primate visual cortex, psychophysical experiments have shown behavioral differences between DNNs and human subjects, as quantified by error consistency. Error consistency is typically measured by briefly presenting natural or corrupted images to human subjects and asking them to perform an n-way classification task under time pressure. But for how long should stimuli ideally be presented to guarantee a fair comparison with DNNs? Here we investigate the influence of presentation time on error consistency, to test the hypothesis that higher-level processing drives behavioral differences. We systematically vary presentation times of backward-masked stimuli from 8.3ms to 266ms and measure human performance and reaction times on natural, lowpass-filtered and noisy images. Our experiment constitutes a fine-grained analysis of human image classification under both image corruptions and time pressure, showing that even drastically time-constrained humans who are exposed to the stimuli for only two frames, i.e. 16.6ms, can still solve our 8-way classification task with success rates well above chance. We also find that human-to-human error consistency is already stable at 16.6ms.
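For readers unfamiliar with the metric, the following is a small sketch of the standard error-consistency measure used in this line of work: trial-by-trial agreement between two observers, corrected for the agreement expected by chance given their accuracies alone (a kappa-like statistic). The toy data are invented for illustration.

```python
# Error consistency between two observers (e.g., a human and a DNN).
import numpy as np

def error_consistency(correct_a, correct_b):
    """correct_a, correct_b: boolean arrays, True where the observer classified correctly."""
    correct_a = np.asarray(correct_a, dtype=bool)
    correct_b = np.asarray(correct_b, dtype=bool)
    c_obs = np.mean(correct_a == correct_b)           # observed trial-by-trial agreement
    p_a, p_b = correct_a.mean(), correct_b.mean()     # the two accuracies
    c_exp = p_a * p_b + (1 - p_a) * (1 - p_b)         # agreement expected if errors were independent
    return (c_obs - c_exp) / (1 - c_exp)

# Toy example: two observers at ~80% accuracy whose errors partially overlap.
rng = np.random.default_rng(1)
human = rng.random(1000) < 0.8
dnn = np.where(rng.random(1000) < 0.7, human, rng.random(1000) < 0.8)
print("error consistency:", round(error_consistency(human, dnn), 3))
```

A value of 0 means the two observers agree only as much as their accuracies predict; values approaching 1 mean they err on the same trials.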
Probing neural population dynamics with recurrent neural networks
Large-scale recordings of neural activity are providing new opportunities to study network-level dynamics with unprecedented detail. However, the sheer volume of data and its dynamical complexity are major barriers to uncovering and interpreting these dynamics. I will present latent factor analysis via dynamical systems (LFADS), a sequential autoencoding approach that enables inference of dynamics from neuronal population spiking activity on single trials and millisecond timescales. I will also discuss recent adaptations of the method to uncover dynamics from neural activity recorded via two-photon calcium imaging. Finally, time permitting, I will mention recent efforts to improve the interpretability of deep-learning-based dynamical systems models.
Learning representations of specifics and generalities over time
There is a fundamental tension between storing discrete traces of individual experiences, which allows recall of particular moments in our past without interference, and extracting regularities across these experiences, which supports generalization and prediction in similar situations in the future. One influential proposal for how the brain resolves this tension is that it separates the processes anatomically into Complementary Learning Systems, with the hippocampus rapidly encoding individual episodes and the neocortex slowly extracting regularities over days, months, and years. But this does not explain our ability to learn and generalize from new regularities in our environment quickly, often within minutes. We have put forward a neural network model of the hippocampus that suggests that the hippocampus itself may contain complementary learning systems, with one pathway specializing in the rapid learning of regularities and a separate pathway handling the region’s classic episodic memory functions. This proposal has broad implications for how we learn and represent novel information of specific and generalized types, which we test across statistical learning, inference, and category learning paradigms. We also explore how this system interacts with slower-learning neocortical memory systems, with empirical and modeling investigations into how the hippocampus shapes neocortical representations during sleep. Together, the work helps us understand how structured information in our environment is initially encoded and how it then transforms over time.
Maintaining Plasticity in Neural Networks
Nonstationarity presents a variety of challenges for machine learning systems. One surprising pathology which can arise in nonstationary learning problems is plasticity loss, whereby making progress on new learning objectives becomes more difficult as training progresses. Networks which are unable to adapt in response to changes in their environment experience plateaus or even declines in performance in highly non-stationary domains such as reinforcement learning, where the learner must quickly adapt to new information even after hundreds of millions of optimization steps. The loss of plasticity manifests in a cluster of related empirical phenomena which have been identified by a number of recent works, including the primacy bias, implicit under-parameterization, rank collapse, and capacity loss. While this phenomenon is widely observed, it is still not fully understood. This talk will present exciting recent results which shed light on the mechanisms driving the loss of plasticity in a variety of learning problems and survey methods to maintain network plasticity in non-stationary tasks, with a particular focus on deep reinforcement learning.
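As a concrete, hedged illustration of how plasticity loss can be measured (my own toy construction, not taken from the talk), the sketch below trains a single small network on a long sequence of unrelated synthetic classification tasks and records the loss it can still reach on each new task; plasticity loss shows up as that final loss creeping upward over tasks.

```python
# Toy plasticity-loss probe: one network, many unrelated tasks in sequence.
import torch
import torch.nn as nn

torch.manual_seed(0)
net = nn.Sequential(nn.Linear(20, 64), nn.ReLU(),
                    nn.Linear(64, 64), nn.ReLU(),
                    nn.Linear(64, 2))
opt = torch.optim.Adam(net.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

final_losses = []
for task in range(30):
    # Each task: fresh random inputs with fresh random labels (maximal nonstationarity).
    x = torch.randn(256, 20)
    y = torch.randint(0, 2, (256,))
    for step in range(200):
        opt.zero_grad()
        loss = loss_fn(net(x), y)
        loss.backward()
        opt.step()
    final_losses.append(loss.item())   # how well the network could still fit this task

# If plasticity is preserved, these values stay flat; a rising trend indicates plasticity loss.
print([round(l, 3) for l in final_losses])
```

Interventions discussed in this literature (for example, resetting or regularizing parts of the network) can then be evaluated by whether they flatten this curve.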
Learning produces a hippocampal cognitive map in the form of an orthogonalized state machine
Cognitive maps confer animals with flexible intelligence by representing spatial, temporal, and abstract relationships that can be used to shape thought, planning, and behavior. Cognitive maps have been observed in the hippocampus, but their algorithmic form and the processes by which they are learned remain obscure. Here, we employed large-scale, longitudinal two-photon calcium imaging to record activity from thousands of neurons in the CA1 region of the hippocampus while mice learned to efficiently collect rewards from two subtly different versions of linear tracks in virtual reality. The results provide a detailed view of the formation of a cognitive map in the hippocampus. Throughout learning, both the animal behavior and hippocampal neural activity progressed through multiple intermediate stages, gradually revealing improved task representations that mirrored improved behavioral efficiency. The learning process led to progressive decorrelations in initially similar hippocampal neural activity within and across tracks, ultimately resulting in orthogonalized representations resembling a state machine capturing the inherent structure of the task. We show that a Hidden Markov Model (HMM) and a biologically plausible recurrent neural network trained using Hebbian learning can both capture core aspects of the learning dynamics and the orthogonalized representational structure in neural activity. In contrast, we show that gradient-based learning of sequence models such as Long Short-Term Memory networks (LSTMs) and Transformers does not naturally produce such orthogonalized representations. We further demonstrate that mice exhibited adaptive behavior in novel task settings, with neural activity reflecting flexible deployment of the state machine. These findings shed light on the mathematical form of cognitive maps, the learning rules that sculpt them, and the algorithms that promote adaptive behavior in animals. The work thus charts a course toward a deeper understanding of biological intelligence and offers insights toward developing more robust learning algorithms in artificial intelligence.
Reimagining the neuron as a controller: A novel model for Neuroscience and AI
We build upon and expand the efficient coding and predictive information models of neurons, presenting a novel perspective that neurons not only predict but also actively influence their future inputs through their outputs. We introduce the concept of neurons as feedback controllers of their environments, a role traditionally considered computationally demanding, particularly when the dynamical system characterizing the environment is unknown. By harnessing a novel data-driven control framework, we illustrate the feasibility of biological neurons functioning as effective feedback controllers. This innovative approach enables us to coherently explain various experimental findings that previously seemed unrelated. Our research has profound implications, potentially revolutionizing the modeling of neuronal circuits and paving the way for the creation of alternative, biologically inspired artificial neural networks.
Mathematical and computational modelling of ocular hemodynamics: from theory to applications
Changes in ocular hemodynamics may be indicative of pathological conditions in the eye (e.g. glaucoma, age-related macular degeneration), but also elsewhere in the body (e.g. systemic hypertension, diabetes, neurodegenerative disorders). Thanks to its transparent fluids and structures that allow light to pass through, the eye offers a unique window on the circulation from large to small vessels, and from arteries to veins. Deciphering the causes that lead to changes in ocular hemodynamics in a specific individual could help prevent vision loss as well as aid in the diagnosis and management of diseases beyond the eye. In this talk, we will discuss how mathematical and computational modelling can help in this regard. We will focus on two main factors, namely blood pressure (BP), which drives the blood flow through the vessels, and intraocular pressure (IOP), which compresses the vessels and may impede the flow. Mechanism-driven models translate fundamental principles of physics and physiology into computable equations that allow for the identification of cause-and-effect relationships among interplaying factors (e.g. BP, IOP, blood flow). While invaluable for causality, mechanism-driven models are often based on simplifying assumptions to make them tractable for analysis and simulation; however, this often brings into question their relevance beyond theoretical explorations. Data-driven models offer a natural remedy to address these shortcomings. Data-driven methods may be supervised (based on labelled training data) or unsupervised (clustering and other data analytics) and they include models based on statistics, machine learning, deep learning and neural networks. Data-driven models naturally thrive on large datasets, making them scalable to a plethora of applications. While invaluable for scalability, data-driven models are often perceived as black boxes, as their outcomes are difficult to explain in terms of fundamental principles of physics and physiology, and this limits the delivery of actionable insights. The combination of mechanism-driven and data-driven models allows us to harness the advantages of both, as mechanism-driven models excel at interpretability but suffer from a lack of scalability, while data-driven models are excellent at scale but suffer in terms of generalizability and insights for hypothesis generation. This combined, integrative approach represents the pillar of the interdisciplinary approach to data science that will be discussed in this talk, with application to ocular hemodynamics and specific examples in glaucoma research.
A recurrent network model of planning predicts hippocampal replay and human behavior
When interacting with complex environments, humans can rapidly adapt their behavior to changes in task or context. To facilitate this adaptation, we often spend substantial periods of time contemplating possible futures before acting. For such planning to be rational, the benefits of planning to future behavior must at least compensate for the time spent thinking. Here we capture these features of human behavior by developing a neural network model where not only actions, but also planning, are controlled by prefrontal cortex. This model consists of a meta-reinforcement learning agent augmented with the ability to plan by sampling imagined action sequences drawn from its own policy, which we refer to as 'rollouts'. Our results demonstrate that this agent learns to plan when planning is beneficial, explaining the empirical variability in human thinking times. Additionally, the patterns of policy rollouts employed by the artificial agent closely resemble patterns of rodent hippocampal replays recently recorded in a spatial navigation task, in terms of both their spatial statistics and their relationship to subsequent behavior. Our work provides a new theory of how the brain could implement planning through prefrontal-hippocampal interactions, where hippocampal replays are triggered by -- and in turn adaptively affect -- prefrontal dynamics.
Loss shaping enhances exact gradient learning with EventProp in Spiking Neural Networks
In vivo direct imaging of neuronal activity at high temporospatial resolution
Advanced noninvasive neuroimaging methods provide valuable information on brain function, but they have obvious pros and cons in terms of temporal and spatial resolution. Functional magnetic resonance imaging (fMRI) using the blood-oxygenation-level-dependent (BOLD) effect provides good spatial resolution on the order of millimeters, but has a poor temporal resolution on the order of seconds due to slow hemodynamic responses to neuronal activation, providing only indirect information on neuronal activity. In contrast, electroencephalography (EEG) and magnetoencephalography (MEG) provide excellent temporal resolution in the millisecond range, but spatial information is limited to centimeter scales. Therefore, there has been a longstanding demand for noninvasive brain imaging methods capable of detecting neuronal activity at both high temporal and spatial resolution. In this talk, I will introduce a novel approach that enables Direct Imaging of Neuronal Activity (DIANA) using MRI, which can dynamically image neuronal spiking activity with millisecond precision, achieved by a data-acquisition scheme of rapid 2D line scans synchronized with periodically applied functional stimuli. DIANA was demonstrated through in vivo mouse brain imaging on a 9.4T animal scanner during electrical whisker-pad stimulation. DIANA, with millisecond temporal resolution, correlated highly with neuronal spiking activity and could also capture the sequential propagation of neuronal activity along the thalamocortical pathway of brain networks. In terms of the contrast mechanism, DIANA was almost unaffected by hemodynamic responses, but was sensitive to changes in membrane-potential-associated tissue relaxation times such as the T2 relaxation time. DIANA is expected to break new ground in brain science by providing an in-depth understanding of the hierarchical functional organization of the brain, including the spatiotemporal dynamics of neural networks.
Feedback control in the nervous system: from cells and circuits to behaviour
The nervous system is fundamentally a closed loop control device: the output of actions continually influences the internal state and subsequent actions. This is true at the single cell and even the molecular level, where “actions” take the form of signals that are fed back to achieve a variety of functions, including homeostasis, excitability and various kinds of multistability that allow switching and storage of memory. It is also true at the behavioural level, where an animal’s motor actions directly influence sensory input on short timescales, and higher level information about goals and intended actions are continually updated on the basis of current and past actions. Studying the brain in a closed loop setting requires a multidisciplinary approach, leveraging engineering and theory as well as advances in measuring and manipulating the nervous system. I will describe our recent attempts to achieve this fusion of approaches at multiple levels in the nervous system, from synaptic signalling to closed loop brain machine interfaces.
Quasicriticality and the quest for a framework of neuronal dynamics
Critical phenomena abound in nature, from forest fires and earthquakes to avalanches in sand and neuronal activity. Since the 2003 publication by Beggs & Plenz on neuronal avalanches, a growing body of work suggests that the brain homeostatically regulates itself to operate near a critical point where information processing is optimal. At this critical point, incoming activity is neither amplified (supercritical) nor damped (subcritical), but approximately preserved as it passes through neural networks. Departures from the critical point have been associated with conditions of poor neurological health like epilepsy, Alzheimer's disease, and depression. One complication that arises from this picture is that the critical point assumes no external input. But biological neural networks are constantly bombarded by external input. How, then, is the brain able to adapt homeostatically near the critical point? We’ll see that the theory of quasicriticality, an organizing principle for brain dynamics, can account for this paradoxical situation. As external stimuli drive the cortex, quasicriticality predicts a departure from criticality while maintaining optimal properties for information transmission. We’ll see that simulations and experimental data confirm these predictions and describe new ones that could be tested soon. More importantly, we will see how this organizing principle could help in the search for biomarkers that could soon be tested in clinical studies.
Signatures of criticality in efficient coding networks
The critical brain hypothesis states that the brain can benefit from operating close to a second-order phase transition. While it has been shown that several computational aspects of sensory information processing (e.g., sensitivity to input) are optimal in this regime, it is still unclear whether these computational benefits of criticality can be leveraged by neural systems performing behaviorally relevant computations. To address this question, we investigate signatures of criticality in networks optimized to perform efficient encoding. We consider a network of leaky integrate-and-fire neurons with synaptic transmission delays and input noise. Previously, it was shown that the performance of such networks varies non-monotonically with the noise amplitude. Interestingly, we find that in the vicinity of the optimal noise level for efficient coding, the network dynamics exhibits signatures of criticality, namely, the distribution of avalanche sizes follows a power law. When the noise amplitude is too low or too high for efficient coding, the network appears either super-critical or sub-critical, respectively. This result suggests that two influential and previously disparate theories of neural processing optimization—efficient coding and criticality—may be intimately related.
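To make the avalanche analysis concrete, here is a minimal sketch of how avalanche sizes are typically extracted from a binned spike raster: an avalanche is a run of consecutive non-empty time bins, and its size is the total spike count in that run. The Poisson raster used here is a stand-in and should not show power-law statistics.

```python
# Avalanche-size extraction from a binned population spike raster.
import numpy as np

def avalanche_sizes(raster):
    """raster: (time_bins, neurons) array of spike counts."""
    activity = raster.sum(axis=1)          # population spike count per time bin
    sizes, current = [], 0
    for a in activity:
        if a > 0:
            current += a                   # still inside an avalanche
        elif current > 0:
            sizes.append(current)          # an empty bin ends the avalanche
            current = 0
    if current > 0:
        sizes.append(current)
    return np.array(sizes)

# Toy raster: independent Poisson spiking (no criticality expected).
rng = np.random.default_rng(0)
raster = rng.poisson(0.02, size=(100_000, 100))
sizes = avalanche_sizes(raster)

# Empirical size distribution; near criticality, P(size) would follow a power law
# (a straight line on log-log axes) over a broad range of sizes.
vals, counts = np.unique(sizes, return_counts=True)
for v, c in zip(vals[:10], counts[:10]):
    print(f"size {v}: {c / sizes.size:.4f}")
```

In the study described above, the same analysis would be applied to the spiking output of the efficient-coding network at different input-noise amplitudes.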
The centrality of population-level factors to network computation is demonstrated by a versatile approach for training spiking networks
Neural activity is often described in terms of population-level factors extracted from the responses of many neurons. Factors provide a lower-dimensional description with the aim of shedding light on network computations. Yet, mechanistically, computations are performed not by continuously valued factors but by interactions among neurons that spike discretely and variably. Models provide a means of bridging these levels of description. We developed a general method for training model networks of spiking neurons by leveraging factors extracted from either data or firing-rate-based networks. In addition to providing a useful model-building framework, this formalism illustrates how reliable and continuously valued factors can arise from seemingly stochastic spiking. Our framework establishes procedures for embedding this property in network models with different levels of realism. The relationship between spikes and factors in such networks provides a foundation for interpreting (and subtly redefining) commonly used quantities such as firing rates.
Learning through the eyes and ears of a child
Young children have sophisticated representations of their visual and linguistic environment. Where do these representations come from? How much knowledge arises through generic learning mechanisms applied to sensory data, and how much requires more substantive (possibly innate) inductive biases? We examine these questions by training neural networks solely on longitudinal data collected from a single child (Sullivan et al., 2020), consisting of egocentric video and audio streams. Our principal findings are as follows: 1) Based on visual only training, neural networks can acquire high-level visual features that are broadly useful across categorization and segmentation tasks. 2) Based on language only training, networks can acquire meaningful clusters of words and sentence-level syntactic sensitivity. 3) Based on paired visual and language training, networks can acquire word-referent mappings from tens of noisy examples and align their multi-modal conceptual systems. Taken together, our results show how sophisticated visual and linguistic representations can arise through data-driven learning applied to one child’s first-person experience.
Assigning credit through the "other" connectome
Learning in neural networks requires assigning the right values to anywhere from thousands to trillions of individual connections, so that the network as a whole produces the desired behavior. Neuroscientists have gained insights into this “credit assignment” problem through decades of experimental, modeling, and theoretical studies. This has suggested key roles for synaptic eligibility traces and top-down feedback signals, among other factors. Here we study the potential contribution of another type of signaling that is being revealed with greater and greater fidelity by ongoing molecular and genomics studies. This is the set of modulatory pathways local to a given circuit, which form an intriguing second type of connectome overlaid on top of synaptic connectivity. We will share ongoing modeling and theoretical work that explores the possible roles of this local modulatory connectome in network learning.
The strongly recurrent regime of cortical networks
Modern electrophysiological recordings simultaneously capture single-unit spiking activities of hundreds of neurons. These neurons exhibit highly complex coordination patterns. Where does this complexity stem from? One candidate is the ubiquitous heterogeneity in connectivity of local neural circuits. Studying neural network dynamics in the linearized regime and using tools from statistical field theory of disordered systems, we derive relations between structure and dynamics that are readily applicable to subsampled recordings of neural circuits: Measuring the statistics of pairwise covariances allows us to infer statistical properties of the underlying connectivity. Applying our results to spontaneous activity of macaque motor cortex, we find that the underlying network operates in a strongly recurrent regime. In this regime, network connectivity is highly heterogeneous, as quantified by a large radius of bulk connectivity eigenvalues. Being close to the point of linear instability, this dynamical regime predicts a rich correlation structure, a large dynamical repertoire, long-range interaction patterns, relatively low dimensionality and a sensitive control of neuronal coordination. These predictions are verified in analyses of spontaneous activity of macaque motor cortex and mouse visual cortex. Finally, we show that even microscopic features of connectivity, such as connection motifs, systematically scale up to determine the global organization of activity in neural circuits.
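The link between connectivity strength and covariance statistics in a linearized rate network can be sketched directly: for dynamics dx/dt = (-I + J)x + noise, the stationary covariance solves a Lyapunov equation, so one can ask how its statistics change as the spectral radius of J grows. The parameters below are my own illustrative choices.

```python
# Covariances of a linearized rate network as a function of recurrence strength.
import numpy as np
from scipy.linalg import solve_continuous_lyapunov

rng = np.random.default_rng(0)
N = 200

for radius in [0.3, 0.6, 0.9]:              # spectral radius of the connectivity bulk
    J = rng.normal(0, radius / np.sqrt(N), (N, N))
    A = -np.eye(N) + J                       # dx/dt = A x + xi, with <xi xi^T> = D
    D = np.eye(N)                            # unit-variance white input noise
    C = solve_continuous_lyapunov(A, -D)     # stationary covariance: A C + C A^T + D = 0
    off_diag = C[~np.eye(N, dtype=bool)]
    dim = np.trace(C) ** 2 / np.sum(C * C)   # participation-ratio dimensionality
    print(f"radius {radius}: mean |cov| = {np.mean(np.abs(off_diag)):.4f}, "
          f"dimensionality = {dim:.1f}")
```

As the radius approaches 1 (the strongly recurrent regime described above), off-diagonal covariances grow and the effective dimensionality of activity drops, which is the qualitative signature the abstract refers to.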
Are place cells just memory cells? Probably yes
Neurons in the rodent hippocampus appear to encode the position of the animal in physical space during movement. Individual "place cells" fire in restricted sub-regions of an environment, a feature often taken as evidence that the hippocampus encodes a map of space that subserves navigation. But these same neurons exhibit complex responses to many other variables that defy explanation by position alone, and the hippocampus is known to be more broadly critical for memory formation. Here we elaborate and test a theory of hippocampal coding which produces place cells as a general consequence of efficient memory coding. We constructed neural networks that actively exploit the correlations between memories in order to learn compressed representations of experience. Place cells readily emerged in the trained model, due to the correlations in sensory input between experiences at nearby locations. Notably, these properties were highly sensitive to the compressibility of the sensory environment, with place field size and population coding level in dynamic opposition to optimally encode the correlations between experiences. The effects of learning were also strongly biphasic: nearby locations are represented more similarly following training, while locations with intermediate similarity become increasingly decorrelated, both distance-dependent effects that scaled with the compressibility of the input features. Using virtual reality and two-photon functional calcium imaging in head-fixed mice, we recorded the simultaneous activity of thousands of hippocampal neurons during virtual exploration to test these predictions. Varying the compressibility of sensory information in the environment produced systematic changes in place cell properties that reflected the changing input statistics, consistent with the theory. We similarly identified representational plasticity during learning, which produced a distance-dependent exchange between compression and pattern separation. These results motivate a more domain-general interpretation of hippocampal computation, one that is naturally compatible with earlier theories on the circuit's importance for episodic memory formation. Work done in collaboration with James Priestley, Lorenzo Posani, Marcus Benna, and Attila Losonczy.
Learning to see stuff
Humans are very good at visually recognizing materials and inferring their properties. Without touching surfaces, we can usually tell what they would feel like, and we enjoy vivid visual intuitions about how they typically behave. This is impressive because the retinal image that the visual system receives as input is the result of complex interactions between many physical processes. Somehow the brain has to disentangle these different factors. I will present some recent work in which we show that an unsupervised neural network trained on images of surfaces spontaneously learns to disentangle reflectance, lighting and shape. However, the disentanglement is not perfect, and we find that as a result the network not only predicts the broad successes of human gloss perception, but also the specific pattern of errors that humans exhibit on an image-by-image basis. I will argue this has important implications for thinking about appearance and vision more broadly.
Deep learning applications in ophthalmology
Deep learning techniques have revolutionized the field of image analysis and played a disruptive role in the ability to quickly and efficiently train image analysis models that perform as well as human beings. This talk will cover the beginnings of the application of deep learning in the field of ophthalmology and vision science, and cover a variety of applications of using deep learning as a method for scientific discovery and latent associations.
Understanding Machine Learning via Exactly Solvable Statistical Physics Models
The affinity between statistical physics and machine learning has a long history. I will describe the main lines of this long-lasting friendship in the context of current theoretical challenges and open questions about deep learning. Theoretical physics often proceeds in terms of solvable synthetic models; I will describe the related line of work on solvable models of simple feed-forward neural networks. I will highlight a path forward to capture the subtle interplay between the structure of the data, the architecture of the network, and the optimization algorithms commonly used for learning.
Spatially-embedded recurrent neural networks reveal widespread links between structural and functional neuroscience findings
Brain networks exist within the confines of resource limitations. As a result, a brain network must overcome metabolic costs of growing and sustaining the network within its physical space, while simultaneously implementing its required information processing. To observe the effect of these processes, we introduce the spatially-embedded recurrent neural network (seRNN). seRNNs learn basic task-related inferences while existing within a 3D Euclidean space, where the communication of constituent neurons is constrained by a sparse connectome. We find that seRNNs, similar to primate cerebral cortices, naturally converge on solving inferences using modular small-world networks, in which functionally similar units spatially configure themselves to utilize an energetically-efficient mixed-selective code. As all these features emerge in unison, seRNNs reveal how many common structural and functional brain motifs are strongly intertwined and can be attributed to basic biological optimization processes. seRNNs can serve as model systems to bridge between structural and functional research communities to move neuroscientific understanding forward.
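A hedged sketch of the core mechanism as I read it: units are assigned fixed positions in 3-D space and recurrent weights are penalized in proportion to the wiring length they span, so sparsity and spatial organization emerge jointly with task training. The task, sizes, and regularization strength below are illustrative stand-ins, not the authors' settings.

```python
# Distance-weighted L1 penalty on recurrent weights of an RNN embedded in 3-D space.
import torch
import torch.nn as nn

torch.manual_seed(0)
n_units = 100
positions = torch.rand(n_units, 3)                 # each unit gets a fixed 3-D location
dist = torch.cdist(positions, positions)           # pairwise Euclidean wiring distances

rnn = nn.RNN(input_size=10, hidden_size=n_units, batch_first=True)
readout = nn.Linear(n_units, 2)
opt = torch.optim.Adam(list(rnn.parameters()) + list(readout.parameters()), lr=1e-3)

def spatial_penalty(strength=1e-3):
    # Long connections cost more: |W_ij| scaled by the distance between units i and j.
    return strength * (rnn.weight_hh_l0.abs() * dist).sum()

# One illustrative training step on a toy task (classify the sign of the summed input).
x = torch.randn(64, 20, 10)
y = (x.sum(dim=(1, 2)) > 0).long()
out, _ = rnn(x)
loss = nn.functional.cross_entropy(readout(out[:, -1]), y) + spatial_penalty()
opt.zero_grad(); loss.backward(); opt.step()
print("loss:", loss.item())
```

Over many such steps, the distance-weighted penalty tends to prune long-range weights first, which is the kind of pressure under which modular, small-world structure can emerge.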
Meta-learning functional plasticity rules in neural networks
Synaptic plasticity is known to be a key player in the brain’s life-long learning abilities. However, due to experimental limitations, the nature of the local changes at individual synapses and their link with emerging network-level computations remain unclear. I will present a numerical, meta-learning approach to deduce plasticity rules from neuronal activity data and/or prior knowledge about the network's computation. I will first show how to recover known rules, given a human-designed loss function in rate networks, or directly from data, using an adversarial approach. Then I will present how to scale up this approach to recurrent spiking networks using simulation-based inference.
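A toy version of the meta-learning idea, under my own simplifications: parameterize a candidate plasticity rule, simulate the weight change it produces on recorded pre- and postsynaptic activity, and adjust the rule's meta-parameters by gradient descent so the simulated weight trajectory matches a target trajectory (here generated by a known Hebbian rule standing in for data).

```python
# Meta-learning the coefficients of a parameterized plasticity rule from a weight trajectory.
import torch

torch.manual_seed(0)
theta_target = torch.tensor([1.0, 0.0, 0.0])     # ground-truth rule: dw ~ pre * post (Hebbian)
theta = torch.zeros(3, requires_grad=True)       # meta-parameters to be learned
opt = torch.optim.Adam([theta], lr=0.05)

def run_plasticity(theta, pre, post, steps=20, lr=0.1):
    """Apply dw = th0*pre*post + th1*pre + th2*post over time; return the weight trajectory."""
    w, traj = torch.zeros(()), []
    for t in range(steps):
        dw = theta[0] * pre[t] * post[t] + theta[1] * pre[t] + theta[2] * post[t]
        w = w + lr * dw
        traj.append(w)
    return torch.stack(traj)

pre, post = torch.rand(20), torch.rand(20)        # stand-in pre/post activity traces
target_traj = run_plasticity(theta_target, pre, post).detach()

for step in range(500):
    opt.zero_grad()
    loss = ((run_plasticity(theta, pre, post) - target_traj) ** 2).mean()
    loss.backward()
    opt.step()
print("recovered rule coefficients:", theta.detach().numpy().round(2))
```

The recovered coefficients should approach the target rule; in the actual work, richer rule parameterizations and simulation-based inference replace this simple gradient fit.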
Extracting computational mechanisms from neural data using low-rank RNNs
An influential theory in systems neuroscience suggests that brain function can be understood through low-dimensional dynamics [Vyas et al 2020]. However, a challenge in this framework is that a single computational task may involve a range of dynamic processes. To understand which processes are at play in the brain, it is important to use data on neural activity to constrain models. In this study, we present a method for extracting low-dimensional dynamics from data using low-rank recurrent neural networks (lrRNNs), a highly expressive and understandable type of model [Mastrogiuseppe & Ostojic 2018, Dubreuil, Valente et al. 2022]. We first test our approach using synthetic data created from full-rank RNNs that have been trained on various brain tasks. We find that lrRNNs fitted to neural activity allow us to identify the collective computational processes and make new predictions for inactivations in the original RNNs. We then apply our method to data recorded from the prefrontal cortex of primates during a context-dependent decision-making task. Our approach enables us to assign computational roles to the different latent variables and provides a mechanistic model of the recorded dynamics, which can be used to perform in silico experiments like inactivations and provide testable predictions.
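For orientation, the sketch below simulates the simplest instance of this model class under my own assumptions: a rank-one recurrent network with connectivity J = m n^T / N, whose population dynamics collapse onto a single latent variable kappa = n . phi(x) / N that can be read out directly.

```python
# Rank-1 RNN: high-dimensional activity governed by one latent variable kappa.
import numpy as np

rng = np.random.default_rng(0)
N, T, dt = 1000, 2000, 0.05
m = rng.normal(0, 1, N)               # "output" connectivity vector
n = 2.0 * m + rng.normal(0, 1, N)     # "input-selection" vector, overlapping with m
J = np.outer(m, n) / N                # rank-one recurrent connectivity

x = rng.normal(0, 1, N)
kappa = []
for _ in range(T):
    x = x + dt * (-x + J @ np.tanh(x))
    kappa.append(n @ np.tanh(x) / N)  # the latent variable along which the dynamics evolve

print("final latent value kappa:", round(kappa[-1], 3))
```

Fitting such low-rank connectivity to recorded activity, as described above, yields latent variables like kappa whose computational roles can then be probed, for example by silencing the units that contribute most to a given rank-one component.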
Geometry of concept learning
Understanding the human ability to learn novel concepts from just a few sensory experiences is a fundamental problem in cognitive neuroscience. I will describe recent work with Ben Sorscher and Surya Ganguli (PNAS, October 2022) in which we propose a simple, biologically plausible, and mathematically tractable neural mechanism for few-shot learning of naturalistic concepts. We posit that the concepts that can be learned from few examples are defined by tightly circumscribed manifolds in the neural firing-rate space of higher-order sensory areas. Discrimination between novel concepts is performed by downstream neurons implementing a ‘prototype’ decision rule, in which a test example is classified according to the nearest prototype constructed from the few training examples. We show that prototype few-shot learning achieves high few-shot learning accuracy on natural visual concepts using both macaque inferotemporal cortex representations and deep neural network (DNN) models of these representations. We develop a mathematical theory that links few-shot learning to the geometric properties of the neural concept manifolds and demonstrate its agreement with our numerical simulations across different DNNs as well as different layers. Intriguingly, we observe striking mismatches between the geometry of manifolds in intermediate stages of the primate visual pathway and in trained DNNs. Finally, we show that linguistic descriptors of visual concepts can be used to discriminate images belonging to novel concepts, without any prior visual experience of these concepts (a task known as ‘zero-shot’ learning), indicating a remarkable alignment of manifold representations of concepts in visual and language modalities. I will discuss ongoing efforts to extend this work to other high-level cognitive tasks.
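The prototype decision rule itself is simple enough to state in a few lines. Below is a sketch with synthetic Gaussian "concept manifolds" standing in for IT or DNN feature vectors; in the actual study the same rule is applied to recorded and model representations.

```python
# Nearest-prototype few-shot classification on synthetic concept manifolds.
import numpy as np

rng = np.random.default_rng(0)
dim, k_shot, n_test = 512, 5, 200

# Two hypothetical concept manifolds: Gaussian clouds around different centers.
centers = rng.normal(0, 1, (2, dim))
train = np.stack([centers[c] + 0.8 * rng.normal(0, 1, (k_shot, dim)) for c in range(2)])
test = np.stack([centers[c] + 0.8 * rng.normal(0, 1, (n_test, dim)) for c in range(2)])

prototypes = train.mean(axis=1)          # one prototype per concept: the mean of k examples
correct = 0
for c in range(2):
    d = np.linalg.norm(test[c][:, None, :] - prototypes[None, :, :], axis=-1)
    correct += np.sum(d.argmin(axis=1) == c)    # assign each test point to its nearest prototype
print("few-shot accuracy:", correct / (2 * n_test))
```

The geometric theory mentioned in the abstract relates the accuracy of exactly this rule to properties of the two manifolds, such as their radii, dimensionality, and the separation between their centers.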
Analyzing artificial neural networks to understand the brain
In the first part of this talk I will present work showing that recurrent neural networks can replicate broad behavioral patterns associated with dynamic visual object recognition in humans. An analysis of these networks shows that different types of recurrence use different strategies to solve the object recognition problem. The similarities between artificial neural networks and the brain present another opportunity, beyond using them just as models of biological processing. In the second part of this talk, I will discuss—and solicit feedback on—a proposed research plan for testing a wide range of analysis tools frequently applied to neural data on artificial neural networks. I will present the motivation for this approach as well as the form the results could take and how this would benefit neuroscience.
Convex neural codes in recurrent networks and sensory systems
Neural activity in many sensory systems is organized on low-dimensional manifolds by means of convex receptive fields. Neural codes in these areas are constrained by this organization, as not every neural code is compatible with convex receptive fields. The same codes are also constrained by the structure of the underlying neural network. In my talk I will attempt to provide answers to the following natural questions: (i) How do recurrent circuits generate codes that are compatible with the convexity of receptive fields? (ii) How can we utilize the constraints imposed by convex receptive fields to understand the underlying stimulus space? To answer question (i), we describe the combinatorics of the steady states and fixed points of recurrent networks that satisfy Dale’s law. It turns out the combinatorics of the fixed points are completely determined by two distinct conditions: (a) the connectivity graph of the network and (b) a spectral condition on the synaptic matrix. We give a characterization of exactly which features of connectivity determine the combinatorics of the fixed points. We also find that a generic recurrent network that satisfies Dale's law outputs convex combinatorial codes. To address question (ii), I will describe methods based on ideas from topology and geometry that take advantage of the convex receptive field properties to infer the dimension of (non-linear) neural representations. I will illustrate the first method by inferring basic features of the neural representations in the mouse olfactory bulb.
Connecting performance benefits on visual tasks to neural mechanisms using convolutional neural networks
Behavioral studies have demonstrated that certain task features reliably enhance classification performance for challenging visual stimuli. These include extended image presentation time and the valid cueing of attention. Here, I will show how convolutional neural networks can be used as a model of the visual system that connects neural activity changes with such performance changes. Specifically, I will discuss how different anatomical forms of recurrence can account for better classification of noisy and degraded images with extended processing time. I will then show how experimentally-observed neural activity changes associated with feature attention lead to observed performance changes on detection tasks. I will also discuss the implications these results have for how we identify the neural mechanisms and architectures important for behavior.
Can a single neuron solve MNIST? Neural computation of machine learning tasks emerges from the interaction of dendritic properties
Physiological experiments have highlighted how the dendrites of biological neurons can nonlinearly process distributed synaptic inputs. However, it is unclear how qualitative aspects of a dendritic tree, such as its branched morphology, its repetition of presynaptic inputs, voltage-gated ion channels, electrical properties and complex synapses, determine neural computation beyond this apparent nonlinearity. While it has been speculated that the dendritic tree of a neuron can be seen as a multi-layer neural network and it has been shown that such an architecture could be computationally strong, we do not know if that computational strength is preserved under these qualitative biological constraints. Here we simulate multi-layer neural network models of dendritic computation with and without these constraints. We find that dendritic model performance on interesting machine learning tasks is not hurt by most of these constraints and may synergistically benefit from all of them combined. Our results suggest that single real dendritic trees may be able to learn a surprisingly broad range of tasks through the emergent capabilities afforded by their properties.
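A minimal sketch of the "dendritic tree as a multi-layer network" idea, assuming a feedforward network whose weights are masked so that each dendritic subunit integrates only the synapses on its own branch; the morphology and sizes are illustrative, not those of the simulations in the talk.

```python
import torch
import torch.nn as nn

class TreeMaskedLinear(nn.Module):
    """Linear layer whose weights are masked to respect a branched (tree) morphology."""
    def __init__(self, mask):
        super().__init__()
        self.mask = mask                       # (out, in) binary connectivity of the tree
        self.weight = nn.Parameter(torch.randn(mask.shape) * 0.1)
        self.bias = nn.Parameter(torch.zeros(mask.shape[0]))

    def forward(self, x):
        return torch.relu(x @ (self.weight * self.mask).T + self.bias)

# Illustrative morphology: 16 synaptic inputs -> 4 dendritic branches -> 1 soma.
# Each branch integrates only its own 4 synapses, enforced by a tree-structured mask.
branch_mask = torch.zeros(4, 16)
for b in range(4):
    branch_mask[b, b * 4:(b + 1) * 4] = 1.0
soma_mask = torch.ones(1, 4)                   # the soma pools all branches

dendritic_neuron = nn.Sequential(TreeMaskedLinear(branch_mask),
                                 TreeMaskedLinear(soma_mask))
out = dendritic_neuron(torch.randn(8, 16))     # batch of 8 input patterns
print(out.shape)                               # torch.Size([8, 1])
```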
Circuit solutions for programming actions
The hippocampus is one of the few regions in the adult mammalian brain that is endowed with life-long neurogenesis. Despite intense investigation, it remains unclear how newly-generated neurons may retain unique functions that contribute to modulating hippocampal information processing and cognition. In this talk, I will present some recent findings revealing how enhanced forms of plasticity in adult-born neurons underlie the way they become incorporated into pre-existing networks in response to experience.
Neural networks in the replica-mean-field limits
In this talk, we propose to decipher the activity of neural networks via a “multiply and conquer” approach. This approach considers limit networks made of infinitely many replicas with the same basic neural structure. The key point is that these so-called replica-mean-field networks are in fact simplified, tractable versions of neural networks that retain important features of the finite network structure of interest. The finite size of neuronal populations and synaptic interactions is a core determinant of neural dynamics, being responsible for non-zero correlation in the spiking activity and for finite transition rates between metastable neural states. Theoretically, we develop our replica framework by expanding on ideas from the theory of communication networks rather than from statistical physics to establish Poissonian mean-field limits for spiking networks. Computationally, we leverage our original replica approach to characterize the stationary spiking activity of various network models via reduction to tractable functional equations. We conclude by discussing perspectives about how to use our replica framework to probe nontrivial regimes of spiking correlations and transition rates between metastable neural states.
Bridging the gap between artificial models and cortical circuits
Artificial neural networks simplify complex biological circuits into tractable models for computational exploration and experimentation. However, the simplification of artificial models also undermines their applicability to real brain dynamics. Typical efforts to address this mismatch add complexity to increasingly unwieldy models. Here, we take a different approach: by reducing the complexity of a biological cortical culture, we aim to distil the essential factors of neuronal dynamics and plasticity. We leverage recent advances in growing neurons from human induced pluripotent stem cells (hiPSCs) to analyse ex vivo cortical cultures with only two distinct excitatory and inhibitory neuron populations. Over 6 weeks of development, we record from thousands of neurons using high-density microelectrode arrays (HD-MEAs) that allow access to individual neurons and the broader population dynamics. We compare these dynamics to two-population artificial networks of single-compartment neurons with random sparse connections and show that they produce similar dynamics. Specifically, our model captures the firing and bursting statistics of the cultures. Moreover, tightly integrating models and cultures allows us to evaluate the impact of changing architectures over weeks of development, with and without external stimuli. Broadly, the use of simplified cortical cultures enables us to apply the repertoire of theoretical neuroscience techniques established over the past decades on artificial network models. Our approach of deriving neural networks from human cells also allows us, for the first time, to directly compare the neural dynamics of disease and control conditions. We found that cultures derived, for example, from epilepsy patients tended to show increasingly more avalanches of synchronous activity over weeks of development, in contrast to the control cultures. Next, we will test possible interventions, in silico and in vitro, in a drive for personalised approaches to medical care. This work begins to bridge an important gap between theoretical and experimental neuroscience, advancing our understanding of mammalian neuron dynamics.
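A minimal sketch of the kind of two-population model described above: leaky integrate-and-fire neurons with random sparse connections, from which firing statistics can be read out. All parameter values are illustrative and not those fitted to the cultures.

```python
import numpy as np

rng = np.random.default_rng(1)
n_e, n_i, dt, T = 400, 100, 1e-3, 2.0              # E/I counts, time step (s), duration (s)
n = n_e + n_i
conn = (rng.random((n, n)) < 0.1).astype(float)    # random sparse connectivity (p = 0.1)
w = conn.copy()
w[:, :n_e] *= 0.3e-3                               # excitatory weights (V per presynaptic spike)
w[:, n_e:] *= -1.2e-3                              # inhibitory weights
tau, v_th, v_reset = 20e-3, 20e-3, 0.0             # membrane time constant (s), threshold, reset (V)
v, spiked, spikes = np.zeros(n), np.zeros(n), []
for _ in range(int(T / dt)):
    drive = rng.normal(1.2e-3, 0.5e-3, n)          # noisy external drive per step
    v += dt / tau * (-v) + drive + w @ spiked      # leak + drive + recurrent input
    spiked = (v > v_th).astype(float)
    v[spiked > 0] = v_reset
    spikes.append(spiked)
rates = np.array(spikes).mean(axis=0) / dt
print(f"mean E rate: {rates[:n_e].mean():.1f} Hz, mean I rate: {rates[n_e:].mean():.1f} Hz")
```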
A biologically plausible inhibitory plasticity rule for world-model learning in SNNs
Memory consolidation is the process by which recent experiences are assimilated into long-term memory. In animals, this process requires the offline replay in the hippocampus of sequences observed during online exploration. Recent experimental work has found that salient but task-irrelevant stimuli are systematically excluded from these replay epochs, suggesting that replay samples from an abstracted model of the world rather than from verbatim previous experiences. We find that this phenomenon can be explained parsimoniously and biologically plausibly by a Hebbian spike-timing-dependent plasticity rule at inhibitory synapses. Using spiking networks at three levels of abstraction (leaky integrate-and-fire, biophysically detailed, and abstract binary), we show that this rule enables efficient inference of a model of the structure of the world. While plasticity has previously been studied mainly at excitatory synapses, we find that plasticity at excitatory synapses alone is insufficient to accomplish this type of structural learning. We present theoretical results in a simplified model showing that, in the presence of Hebbian excitatory and inhibitory plasticity, the replayed sequences form a statistical estimator of a latent sequence, which converges asymptotically to the ground truth. Our work outlines a direct link between the synaptic and cognitive levels of memory consolidation, and highlights a conceptually distinct potential role for inhibition in computing with SNNs.
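For intuition, the sketch below implements a generic symmetric Hebbian plasticity rule at an inhibitory synapse (in the spirit of Vogels-style inhibitory STDP), in which near-coincident pre- and postsynaptic spikes potentiate inhibition; it is not the specific rule proposed in the talk, and all parameters are illustrative.

```python
import numpy as np

def inhibitory_stdp(pre_spikes, post_spikes, eta=1e-3, tau=20e-3, dt=1e-3, target_rate=5.0):
    """Symmetric Hebbian plasticity at an inhibitory synapse (illustrative sketch).

    Near-coincident pre/post spikes potentiate inhibition; every presynaptic
    spike also depresses it by an amount set by a target postsynaptic rate,
    so inhibition grows where it can predict (and cancel) excitation.
    """
    w, pre_trace, post_trace = 0.0, 0.0, 0.0
    alpha = 2.0 * target_rate * tau            # depression bias term
    for pre, post in zip(pre_spikes, post_spikes):
        pre_trace += -dt / tau * pre_trace + pre
        post_trace += -dt / tau * post_trace + post
        w += eta * (pre_trace * post + (post_trace - alpha) * pre)
        w = max(w, 0.0)                        # inhibitory weight magnitude stays non-negative
    return w

rng = np.random.default_rng(0)
pre = (rng.random(5000) < 0.01).astype(float)    # ~10 Hz Poisson-like trains at dt = 1 ms
post = (rng.random(5000) < 0.01).astype(float)
print(inhibitory_stdp(pre, post))
```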
Merging insights from artificial and biological neural networks for neuromorphic intelligence
Training Dynamic Spiking Neural Networks via Forward Propagation Through Time
With recent advances in learning algorithms, recurrent networks of spiking neurons are achieving performance competitive with standard recurrent neural networks. Still, these learning algorithms are limited to small networks of simple spiking neurons and modest-length temporal sequences, as they impose high memory requirements, have difficulty training complex neuron models, and are incompatible with online learning. Taking inspiration from the concept of Liquid Time-Constants (LTCs), we introduce a novel class of spiking neurons, the Liquid Time-Constant Spiking Neuron (LTC-SN), resulting in functionality similar to the gating operation in LSTMs. We integrate these neurons in SNNs that are trained with Forward Propagation Through Time (FPTT) and demonstrate that LTC-SNNs trained in this way outperform various SNNs trained with BPTT on long sequences, while enabling online learning and drastically reducing memory complexity. We show this for several classical benchmarks whose sequence length can easily be varied, such as the Add Task and the DVS-Gesture benchmark. We also show how FPTT-trained LTC-SNNs can be applied to large convolutional SNNs, where we demonstrate new state-of-the-art results for online learning in SNNs on a number of standard benchmarks (S-MNIST, R-MNIST, DVS-GESTURE), and also show that large feedforward SNNs can be trained successfully in an online manner to performance near (Fashion-MNIST, DVS-CIFAR10) or exceeding (PS-MNIST, R-MNIST) the state-of-the-art obtained with offline BPTT. Finally, the training and memory efficiency of FPTT enables us to directly train SNNs in an end-to-end manner at network sizes and complexities that were previously infeasible: we demonstrate this by training, in an end-to-end fashion, the first deep and performant spiking neural network for object localization and recognition. Taken together, our contributions enable, for the first time, the training of large-scale, complex spiking neural network architectures online and on long temporal sequences.
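A rough sketch of the gating idea behind liquid time constants in a spiking neuron: a learned sigmoid gate sets the membrane decay at each time step, analogous to an LSTM forget gate. The exact LTC-SN formulation and the FPTT training procedure are not reproduced here; everything below is illustrative.

```python
import torch
import torch.nn as nn

class LTCSpikingNeuron(nn.Module):
    """Leaky integrate-and-fire layer whose decay is set by an input-dependent gate.

    A sigmoid gate (cf. LSTM forget gates / liquid time constants) maps the current
    input and membrane state to a per-neuron, per-step decay factor. This is an
    illustrative sketch, not the exact LTC-SN of the talk.
    """
    def __init__(self, n_in, n_neurons, v_th=1.0):
        super().__init__()
        self.w_in = nn.Linear(n_in, n_neurons)
        self.w_tau = nn.Linear(n_in + n_neurons, n_neurons)
        self.v_th = v_th

    def forward(self, x_seq):
        v = torch.zeros(x_seq.shape[1], self.w_in.out_features)
        spikes = []
        for x in x_seq:                                                    # iterate over time steps
            decay = torch.sigmoid(self.w_tau(torch.cat([x, v], dim=-1)))  # dynamic "time constant"
            v = decay * v + self.w_in(x)
            s = (v >= self.v_th).float()               # surrogate gradients omitted for brevity
            v = v - s * self.v_th                      # soft reset after a spike
            spikes.append(s)
        return torch.stack(spikes)

layer = LTCSpikingNeuron(n_in=10, n_neurons=32)
out = layer(torch.randn(50, 4, 10))                    # (time, batch, features)
print(out.shape)                                       # torch.Size([50, 4, 32])
```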
Universal function approximation in balanced spiking networks through convex-concave boundary composition
The spike-threshold nonlinearity is a fundamental, yet enigmatic, component of biological computation — despite its role in many theories, it has evaded definitive characterisation. Indeed, much classic work has sidestepped spiking by smoothing over the spike threshold or by approximating spiking dynamics with firing-rate dynamics. Here, we take a novel perspective that captures the full potential of spike-based computation. Based on previous studies of the geometry of efficient spike-coding networks, we consider a population of neurons with low-rank connectivity, allowing us to cast each neuron's threshold as a boundary in a space of population modes, or latent variables. Each neuron divides this latent space into subthreshold and suprathreshold regions. We then demonstrate how a network of inhibitory (I) neurons forms a convex, attracting boundary in the latent coding space, and a network of excitatory (E) neurons forms a concave, repellent boundary. Finally, we show how the combination of the two yields stable dynamics at the crossing of the E and I boundaries, and can be mapped onto a constrained optimization problem. The resulting EI networks are balanced, inhibition-stabilized, and exhibit asynchronous irregular activity, thereby closely resembling cortical networks of the brain. Moreover, we demonstrate how such networks can be tuned to either suppress or amplify noise, and how the composition of inhibitory convex and excitatory concave boundaries can result in universal function approximation. Our work puts forth a new theory of biologically plausible computation in balanced spiking networks, and could serve as a novel framework for scalable and interpretable computation with spikes.
Spiking Deep Learning with SpikingJelly
Behavioral Timescale Synaptic Plasticity (BTSP) for biologically plausible credit assignment across multiple layers via top-down gating of dendritic plasticity
A central problem in biological learning is how information about the outcome of a decision or behavior can be used to reliably guide learning across distributed neural circuits while obeying biological constraints. This “credit assignment” problem is commonly solved in artificial neural networks through supervised gradient descent and the backpropagation algorithm. In contrast, biological learning is typically modelled using unsupervised Hebbian learning rules. While these rules only use local information to update synaptic weights, and are sometimes combined with weight constraints to reflect a diversity of excitatory (only positive weights) and inhibitory (only negative weights) cell types, they do not prescribe a clear mechanism for how to coordinate learning across multiple layers and propagate error information accurately across the network. In recent years, several groups have drawn inspiration from the known dendritic non-linearities of pyramidal neurons to propose new learning rules and network architectures that enable biologically plausible multi-layer learning by processing error information in segregated dendrites. Meanwhile, recent experimental results from the hippocampus have revealed a new form of plasticity—Behavioral Timescale Synaptic Plasticity (BTSP)—in which large dendritic depolarizations rapidly reshape synaptic weights and stimulus selectivity with as little as a single stimulus presentation (“one-shot learning”). Here we explore the implications of this new learning rule through a biologically plausible implementation in a rate neuron network. We demonstrate that regulation of dendritic spiking and BTSP by top-down feedback signals can effectively coordinate plasticity across multiple network layers in a simple pattern recognition task. By analyzing hidden feature representations and weight trajectories during learning, we show the differences between networks trained with standard backpropagation, Hebbian learning rules, and BTSP.
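The toy sketch below illustrates the general flavour of plateau-gated, one-shot plasticity: a slow eligibility trace of presynaptic activity is committed to the weights only when a top-down-gated dendritic plateau occurs. It is not the exact BTSP rule or network used in the study; the trace dynamics and parameters are invented for illustration.

```python
import numpy as np

def btsp_like_update(w, pre_activity, plateau, eta=0.5, tau_elig=1.0, dt=0.1):
    """One pass of plateau-gated plasticity over a trial (illustrative sketch).

    pre_activity: (T, n_in) presynaptic rates over the trial
    plateau:      (T,) binary top-down-gated dendritic plateau signal
    A slow eligibility trace of presynaptic activity is converted into a large,
    one-shot weight change only at time steps where a plateau occurs.
    """
    elig = np.zeros(w.shape)
    for x, p in zip(pre_activity, plateau):
        elig += dt / tau_elig * (x - elig)          # slow trace of presynaptic activity
        if p:                                        # plateau potential: commit the trace to the weights
            w += eta * (elig - w)                    # move weights toward the eligibility pattern
    return w

rng = np.random.default_rng(0)
w = np.zeros(20)
activity = rng.random((50, 20))                      # one trial, 50 time steps, 20 inputs
plateau = np.zeros(50); plateau[40] = 1              # a single top-down-gated plateau late in the trial
print(btsp_like_update(w, activity, plateau).round(2))
```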
Beyond Biologically Plausible Spiking Networks for Neuromorphic Computing
Biologically plausible spiking neural networks (SNNs) are an emerging architecture for deep learning tasks due to their energy efficiency when implemented on neuromorphic hardware. However, many of the biological features are at best irrelevant and at worst counterproductive when evaluated in the context of task performance and suitability for neuromorphic hardware. In this talk, I will present an alternative paradigm to design deep learning architectures with good task performance in real-world benchmarks while maintaining all the advantages of SNNs. We do this by focusing on two main features – event-based computation and activity sparsity. Starting from the performant gated recurrent unit (GRU) deep learning architecture, we modify it to make it event-based and activity-sparse. The resulting event-based GRU (EGRU) is extremely efficient for both training and inference. At the same time, it achieves performance close to conventional deep learning architectures in challenging tasks such as language modelling, gesture recognition and sequential MNIST.
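As a rough sketch of the activity-sparsity idea (not the actual EGRU equations), the cell below updates a standard GRU state but lets only units whose state exceeds a threshold emit a non-zero, event-like output; the threshold and sizes are arbitrary.

```python
import torch
import torch.nn as nn

class ThresholdedGRUCell(nn.Module):
    """GRU cell whose output is sparsified by a per-unit threshold (EGRU-style sketch).

    The internal state is updated as in a standard GRU, but only units whose state
    exceeds a threshold communicate a non-zero value downstream, so most activity
    (and hence computation, in a sparse implementation) is event-based.
    """
    def __init__(self, n_in, n_hidden, threshold=0.5):
        super().__init__()
        self.cell = nn.GRUCell(n_in, n_hidden)
        self.threshold = threshold

    def forward(self, x, h):
        h = self.cell(x, h)
        events = h * (h > self.threshold).float()   # hard gate; training would use a surrogate gradient
        return events, h

cell = ThresholdedGRUCell(8, 64)
h = torch.zeros(2, 64)
for x in torch.randn(10, 2, 8):                      # 10 time steps, batch of 2
    events, h = cell(x, h)
sparsity = (events == 0).float().mean().item()
print(f"fraction of silent units at the last step: {sparsity:.2f}")
```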
Why dendrites matter for biological and artificial circuits
Nonlinear computations in spiking neural networks through multiplicative synapses
The brain efficiently performs nonlinear computations through its intricate networks of spiking neurons, but how this is done remains elusive. While recurrent spiking networks implementing linear computations can be directly derived and easily understood (e.g., in the spike coding network (SCN) framework), the connectivity required for nonlinear computations can be harder to interpret, as it requires additional nonlinearities (e.g., dendritic or synaptic) weighted through supervised training. Here we extend the SCN framework to directly implement any polynomial dynamical system. This results in networks requiring multiplicative synapses, which we term multiplicative spike coding networks (mSCNs). We demonstrate how the required connectivity for several nonlinear dynamical systems can be directly derived and implemented in mSCNs, without training. We also show how to implement higher-order polynomials precisely with coupled networks that use only pair-wise multiplicative synapses, and provide expected numbers of connections for each synapse type. Overall, our work provides an alternative method for implementing nonlinear computations in spiking neural networks, while keeping all the attractive features of standard SCNs, such as robustness, irregular and sparse firing, and interpretable connectivity. Finally, we discuss the biological plausibility of mSCNs, and how the high accuracy and robustness of the approach may be of interest for neuromorphic computing.
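For context, the sketch below follows the standard spike-coding-network construction for a linear dynamical system dx/dt = Ax, the starting point that the mSCN extends to polynomial systems via multiplicative synapses; the derivation style is the usual SCN one, and all parameter values are illustrative.

```python
import numpy as np

# Spike-coding network (SCN) implementing a linear dynamical system dx/dt = A x + c(t).
# The polynomial case discussed in the talk additionally requires multiplicative synapses.
rng = np.random.default_rng(0)
K, N, lam, dt = 2, 40, 10.0, 1e-4                 # latent dims, neurons, readout leak (1/s), step (s)
A = np.array([[0.0, 6.0], [-6.0, 0.0]])           # target dynamics: a pure rotation (oscillator)
D = rng.normal(0.0, 0.1, (K, N))                  # decoding weights
thresh = np.sum(D**2, axis=0) / 2                 # spike thresholds
W_fast = D.T @ D                                  # fast connectivity: instantaneous resets
W_slow = D.T @ (A + lam * np.eye(K)) @ D          # slow connectivity implementing A
V, r = np.zeros(N), np.zeros(N)
for step in range(int(0.5 / dt)):
    c = np.array([50.0, 0.0]) if step * dt < 0.05 else np.zeros(K)   # brief input kick
    V += dt * (-lam * V + D.T @ c + W_slow @ r)   # membrane dynamics driven by filtered spikes
    i = np.argmax(V - thresh)                     # greedy rule: most suprathreshold neuron spikes
    s = np.zeros(N)
    if V[i] > thresh[i]:
        s[i] = 1.0
        V -= W_fast[:, i]                         # fast recurrent reset from the emitted spike
    r += dt * (-lam * r) + s                      # filtered spike trains
x_hat = D @ r                                     # network readout of the latent state x
print(x_hat)
```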
Memory-enriched computation and learning in spiking neural networks through Hebbian plasticity
Memory is a key component of biological neural systems that enables the retention of information over a huge range of temporal scales, from hundreds of milliseconds up to years. While Hebbian plasticity is believed to play a pivotal role in biological memory, it has so far been analyzed mostly in the context of pattern completion and unsupervised learning. Here, we propose that Hebbian plasticity is fundamental for computations in biological neural systems. We introduce a novel spiking neural network (SNN) architecture that is enriched by Hebbian synaptic plasticity. We experimentally show that our memory-equipped SNN model outperforms state-of-the-art deep learning mechanisms in a sequential pattern-memorization task, and demonstrates superior out-of-distribution generalization capabilities compared to these models. We further show that our model can be successfully applied to one-shot learning and classification of handwritten characters, improving over the state-of-the-art SNN model. We also demonstrate the capability of our model to learn associations for audio-to-image synthesis from spoken and handwritten digits. Our SNN model further presents a novel solution to a variety of cognitive question-answering tasks from a standard benchmark, achieving performance comparable to both memory-augmented ANN and SNN-based state-of-the-art solutions to this problem. Finally, we demonstrate that our model is able to learn from rewards on an episodic reinforcement learning task and attain a near-optimal strategy in a memory-based card game. Hence, our results show that Hebbian enrichment renders spiking neural networks surprisingly versatile in terms of their computational as well as learning capabilities. Since local Hebbian plasticity can easily be implemented in neuromorphic hardware, this also suggests that powerful cognitive neuromorphic systems can be built based on this principle.
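A minimal sketch of the general idea of enriching a network with a Hebbian memory: a weight matrix updated online by a local outer-product rule stores key-value associations that can later be recalled. This is not the specific SNN architecture of the talk; patterns and parameters are illustrative.

```python
import numpy as np

class HebbianMemory:
    """Associative memory updated by a local Hebbian outer-product rule.

    write() strengthens connections between co-active key and value units;
    read() retrieves the stored value pattern associated with a key.
    """
    def __init__(self, n_key, n_value, eta=0.5, decay=0.99):
        self.W = np.zeros((n_value, n_key))
        self.eta, self.decay = eta, decay

    def write(self, key, value):
        self.W = self.decay * self.W + self.eta * np.outer(value, key)   # local Hebbian update

    def read(self, key):
        return self.W @ key

rng = np.random.default_rng(0)
mem = HebbianMemory(n_key=100, n_value=50)
keys = (rng.random((3, 100)) < 0.1).astype(float)       # sparse binary patterns (e.g. spike counts)
values = (rng.random((3, 50)) < 0.1).astype(float)
for k, v in zip(keys, values):
    mem.write(k, v)
recalled = mem.read(keys[1])
for i, v in enumerate(values):
    print(i, round(float(recalled @ v), 2))              # largest overlap should be with values[1]
```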
Algorithm-Hardware Co-design for Efficient and Robust Spiking Neural Networks
Brian2CUDA: Generating Efficient CUDA Code for Spiking Neural Networks
Graphics processing units (GPUs) are widely available and have been used with great success to accelerate scientific computing in the last decade. These advances, however, are often not available to researchers interested in simulating spiking neural networks but lacking the technical knowledge to write the necessary low-level code. Writing low-level code is not necessary when using the popular Brian simulator, which provides a framework to generate efficient CPU code from high-level model definitions in Python. Here, we present Brian2CUDA, an open-source software that extends the Brian simulator with a GPU backend. Our implementation generates efficient code for the numerical integration of neuronal states and for the propagation of synaptic events on GPUs, making use of their massively parallel arithmetic capabilities. We benchmark the performance improvements of our software for several model types and find that it can accelerate simulations by up to three orders of magnitude compared to Brian’s CPU backend. Currently, Brian2CUDA is the only package that supports Brian’s full feature set on GPUs, including arbitrary neuron and synapse models, plasticity rules, and heterogeneous delays. When comparing its performance with Brian2GeNN, another GPU-based backend for the Brian simulator with fewer features, we find that Brian2CUDA gives comparable speedups, while typically being slower for small networks and faster for large ones. By combining the flexibility of the Brian simulator with the simulation speed of GPUs, Brian2CUDA enables researchers to efficiently simulate spiking neural networks with minimal effort and thereby makes the advancements of GPU computing available to a larger audience of neuroscientists.
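A minimal usage sketch, assuming Brian2CUDA is installed: as in its documentation, the GPU backend is selected by importing brian2cuda and switching Brian's code-generation device to "cuda_standalone", after which the model is defined exactly as for the CPU backend. The toy network below is illustrative only.

```python
from brian2 import NeuronGroup, Synapses, set_device, run, ms, mV
import brian2cuda                       # registers the "cuda_standalone" device with Brian

set_device("cuda_standalone")           # generate and run CUDA code instead of the C++/CPU backend

# A small leaky integrate-and-fire network, defined exactly as for the CPU backend
eqs = "dv/dt = -v / (10*ms) : volt (unless refractory)"
group = NeuronGroup(1000, eqs, threshold="v > 15*mV", reset="v = 0*mV",
                    refractory=5*ms, method="euler")
group.v = "rand() * 15*mV"
synapses = Synapses(group, group, on_pre="v += 0.5*mV")
synapses.connect(p=0.02)                # sparse random connectivity
run(1000*ms)                            # the whole simulation is compiled for and executed on the GPU
```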
Development of Interictal Networks: Implications for Epilepsy Progression and Cognition
Epilepsy is a common and disabling neurologic condition affecting adults and children that results from complex dysfunction of neural networks and is ineffectively treated with current therapies in up to one third of patients. This dysfunction can have especially severe consequences in the pediatric age group, where neurodevelopment may be irreversibly affected. Furthermore, although seizures are the most obvious manifestation of epilepsy, the cognitive and psychiatric dysfunction that often coexists in patients with this disorder has the potential to be equally disabling. Given these challenges, her research program aims to better understand how epileptic activity disrupts the proper development and function of neural networks, with the overall goal of identifying novel biomarkers and systems-level treatments for epileptic disorders and their comorbidities, especially those affecting children.
Biologically plausible learning with a two-compartment neuron model in recurrent neural networks
Bernstein Conference 2024
Knocking out co-active plasticity rules in neural networks reveals synapse type-specific contributions for learning and memory
Bernstein Conference 2024
The cost of behavioral flexibility: a modeling study of reversal learning using a spiking neural network
Bernstein Conference 2024
Critical organisation for complex temporal tasks in neural networks
Bernstein Conference 2024
Defining the Limits: Upper Bound of Non-Neurobiological Treatment Efficacy through Cognitive-Neural Network Alignment
Bernstein Conference 2024
Dendrites endow artificial neural networks with accurate, robust and parameter-efficient learning
Bernstein Conference 2024
Dynamical representations between biologically plausible and implausible task-trained neural networks
Bernstein Conference 2024
Emergence of Synfire Chains in Functional Multi-Layer Spiking Neural Networks
Bernstein Conference 2024
Efficient cortical spike train decoding for brain-machine interface implants with recurrent spiking neural networks
Bernstein Conference 2024
Enhancing learning through neuromodulation-aware spiking neural networks
Bernstein Conference 2024
Evolutionary algorithms support recurrent plasticity in spiking neural network models of neocortical task learning
Bernstein Conference 2024
Excitatory and inhibitory neurons exhibit distinct roles for task learning, temporal scaling, and working memory in recurrent spiking neural network models of neocortex
Bernstein Conference 2024
Experiment-based Models to Study Local Learning Rules for Spiking Neural Networks
Bernstein Conference 2024
A feedback control algorithm for online learning in Spiking Neural Networks and Neuromorphic devices
Bernstein Conference 2024
Generalizing deep neural network model captures the functional organization of feature selective retinal ganglion cell axonal boutons in the superior colliculus
Bernstein Conference 2024
A high-throughput single-cell stimulation platform to study plasticity in engineered neural networks in vitro
FENS Forum 2024
Identifying task-specific dynamics in recurrent neural networks using Dynamical Similarity Analysis
Bernstein Conference 2024
Inferring stochastic low-rank recurrent neural networks from neural data
Bernstein Conference 2024
Integrating Biological and Artificial Neural Networks for Solving Non-Linear Problems
Bernstein Conference 2024
Intracortical microstimulation in a spiking neural network model of the primary visual cortex
Bernstein Conference 2024
Parameter specification in spiking neural networks using simulation-based inference
Bernstein Conference 2024
Predicting V1 contextual modulation and neural tuning using a convolutional neural network
Bernstein Conference 2024
Rapid prototyping in spiking neural network modeling with NESTML and NEST Desktop
Bernstein Conference 2024
'Reusers' and 'Unlearners' display distinct effects of forgetting on reversal learning in neural networks
Bernstein Conference 2024
On The Role Of Temporal Hierarchy In Spiking Neural Networks
Bernstein Conference 2024
Seamless Deployment of Pre-trained Spiking Neural Networks onto SpiNNaker2
Bernstein Conference 2024
Shaping Low-Rank Recurrent Neural Networks with Biological Learning Rules
Bernstein Conference 2024
Short-Distance Connections Enhance Neural Network Dynamics
Bernstein Conference 2024
Smooth exact gradient descent learning in spiking neural networks
Bernstein Conference 2024
Unraveling perceptual biases: Insights from spiking recurrent neural networks
Bernstein Conference 2024
Using Dynamical Systems Theory to Improve Temporal Credit Assignment in Spiking Neural Networks
Bernstein Conference 2024
Attractor neural networks with metastable synapses
COSYNE 2022
A high-throughput pipeline for evaluating recurrent neural networks on multiple datasets
COSYNE 2022
Cross-Frequency Coupling Increases Memory Capacity in Oscillatory Neural Networks
COSYNE 2022
Deep neural network modeling of a visually-guided social behavior
COSYNE 2022
Emergence of time persistence in an interpretable data-driven neural network model
COSYNE 2022
Gain-mediated statistical adaptation in recurrent neural networks
COSYNE 2022
Intrinsic dimension of neural activity: comparing artificial and biological neural networks
Bernstein Conference 2024