Neural Network
neural network
Latest
Computational Mechanisms of Predictive Processing in Brains and Machines
Predictive processing offers a unifying view of neural computation, proposing that brains continuously anticipate sensory input and update internal models based on prediction errors. In this talk, I will present converging evidence for the computational mechanisms underlying this framework across human neuroscience and deep neural networks. I will begin with recent work showing that large-scale distributed prediction-error encoding in the human brain directly predicts how sensory representations reorganize through predictive learning. I will then turn to PredNet, a popular predictive coding inspired deep network that has been widely used to model real-world biological vision systems. Using dynamic stimuli generated with our Spatiotemporal Style Transfer algorithm, we demonstrate that PredNet relies primarily on low-level spatiotemporal structure and remains insensitive to high-level content, revealing limits in its generalization capacity. Finally, I will discuss new recurrent vision models that integrate top-down feedback connections with intrinsic neural variability, uncovering a dual mechanism for robust sensory coding in which neural variability decorrelates unit responses, while top-down feedback stabilizes network dynamics. Together, these results outline how prediction error signaling and top-down feedback pathways shape adaptive sensory processing in biological and artificial systems.
From Spiking Predictive Coding to Learning Abstract Object Representation
In a first part of the talk, I will present Predictive Coding Light (PCL), a novel unsupervised learning architecture for spiking neural networks. In contrast to conventional predictive coding approaches, which only transmit prediction errors to higher processing stages, PCL learns inhibitory lateral and top-down connectivity to suppress the most predictable spikes and passes a compressed representation of the input to higher processing stages. We show that PCL reproduces a range of biological findings and exhibits a favorable tradeoff between energy consumption and downstream classification performance on challenging benchmarks. A second part of the talk will feature our lab’s efforts to explain how infants and toddlers might learn abstract object representations without supervision. I will present deep learning models that exploit the temporal and multimodal structure of their sensory inputs to learn representations of individual objects, object categories, or abstract super-categories such as „kitchen object“ in a fully unsupervised fashion. These models offer a parsimonious account of how abstract semantic knowledge may be rooted in children's embodied first-person experiences.
Developmental and evolutionary perspectives on thalamic function
Brain organization and function is a complex topic. We are good at establishing correlates of perception and behavior across forebrain circuits, as well as manipulating activity in these circuits to affect behavior. However, we still lack good models for the large-scale organization and function of the forebrain. What are the contributions of the cortex, basal ganglia, and thalamus to behavior? In addressing these questions, we often ascribe function to each area as if it were an independent processing unit. However, we know from the anatomy that the cortex, basal ganglia, and thalamus, are massively interconnected in a large network. One way to generate insight into these questions is to consider the evolution and development of forebrain systems. In this talk, I will discuss the developmental and evolutionary (comparative anatomy) data on the thalamus, and how it fits within forebrain networks. I will address questions including, when did the thalamus appear in evolution, how is the thalamus organized across the vertebrate lineage, and how can the change in the organization of forebrain networks affect behavioral repertoires.
Neurobiological constraints on learning: bug or feature?
Understanding how brains learn requires bridging evidence across scales—from behaviour and neural circuits to cells, synapses, and molecules. In our work, we use computational modelling and data analysis to explore how the physical properties of neurons and neural circuits constrain learning. These include limits imposed by brain wiring, energy availability, molecular noise, and the 3D structure of dendritic spines. In this talk I will describe one such project testing if wiring motifs from fly brain connectomes can improve performance of reservoir computers, a type of recurrent neural network. The hope is that these insights into brain learning will lead to improved learning algorithms for artificial systems.
Functional Plasticity in the Language Network – evidence from Neuroimaging and Neurostimulation
Efficient cognition requires flexible interactions between distributed neural networks in the human brain. These networks adapt to challenges by flexibly recruiting different regions and connections. In this talk, I will discuss how we study functional network plasticity and reorganization with combined neurostimulation and neuroimaging across the adult life span. I will argue that short-term plasticity enables flexible adaptation to challenges, via functional reorganization. My key hypothesis is that disruption of higher-level cognitive functions such as language can be compensated for by the recruitment of domain-general networks in our brain. Examples from healthy young brains illustrate how neurostimulation can be used to temporarily interfere with efficient processing, probing short-term network plasticity at the systems level. Examples from people with dyslexia help to better understand network disorders in the language domain and outline the potential of facilitatory neurostimulation for treatment. I will also discuss examples from aging brains where plasticity helps to compensate for loss of function. Finally, examples from lesioned brains after stroke provide insight into the brain’s potential for long-term reorganization and recovery of function. Collectively, these results challenge the view of a modular organization of the human brain and argue for a flexible redistribution of function via systems plasticity.
Brain Emulation Challenge Workshop
Brain Emulation Challenge workshop will tackle cutting-edge topics such as ground-truthing for validation, leveraging artificial datasets generated from virtual brain tissue, and the transformative potential of virtual brain platforms, such as applied to the forthcoming Brain Emulation Challenge.
Memory formation in hippocampal microcircuit
The centre of memory is the medial temporal lobe (MTL) and especially the hippocampus. In our research, a more flexible brain-inspired computational microcircuit of the CA1 region of the mammalian hippocampus was upgraded and used to examine how information retrieval could be affected under different conditions. Six models (1-6) were created by modulating different excitatory and inhibitory pathways. The results showed that the increase in the strength of the feedforward excitation was the most effective way to recall memories. In other words, that allows the system to access stored memories more accurately.
Analyzing Network-Level Brain Processing and Plasticity Using Molecular Neuroimaging
Behavior and cognition depend on the integrated action of neural structures and populations distributed throughout the brain. We recently developed a set of molecular imaging tools that enable multiregional processing and plasticity in neural networks to be studied at a brain-wide scale in rodents and nonhuman primates. Here we will describe how a novel genetically encoded activity reporter enables information flow in virally labeled neural circuitry to be monitored by fMRI. Using the reporter to perform functional imaging of synaptically defined neural populations in the rat somatosensory system, we show how activity is transformed within brain regions to yield characteristics specific to distinct output projections. We also show how this approach enables regional activity to be modeled in terms of inputs, in a paradigm that we are extending to address circuit-level origins of functional specialization in marmoset brains. In the second part of the talk, we will discuss how another genetic tool for MRI enables systematic studies of the relationship between anatomical and functional connectivity in the mouse brain. We show that variations in physical and functional connectivity can be dissociated both across individual subjects and over experience. We also use the tool to examine brain-wide relationships between plasticity and activity during an opioid treatment. This work demonstrates the possibility of studying diverse brain-wide processing phenomena using molecular neuroimaging.
The Brain Prize winners' webinar
This webinar brings together three leaders in theoretical and computational neuroscience—Larry Abbott, Haim Sompolinsky, and Terry Sejnowski—to discuss how neural circuits generate fundamental aspects of the mind. Abbott illustrates mechanisms in electric fish that differentiate self-generated electric signals from external sensory cues, showing how predictive plasticity and two-stage signal cancellation mediate a sense of self. Sompolinsky explores attractor networks, revealing how discrete and continuous attractors can stabilize activity patterns, enable working memory, and incorporate chaotic dynamics underlying spontaneous behaviors. He further highlights the concept of object manifolds in high-level sensory representations and raises open questions on integrating connectomics with theoretical frameworks. Sejnowski bridges these motifs with modern artificial intelligence, demonstrating how large-scale neural networks capture language structures through distributed representations that parallel biological coding. Together, their presentations emphasize the synergy between empirical data, computational modeling, and connectomics in explaining the neural basis of cognition—offering insights into perception, memory, language, and the emergence of mind-like processes.
Sensory cognition
This webinar features presentations from SueYeon Chung (New York University) and Srinivas Turaga (HHMI Janelia Research Campus) on theoretical and computational approaches to sensory cognition. Chung introduced a “neural manifold” framework to capture how high-dimensional neural activity is structured into meaningful manifolds reflecting object representations. She demonstrated that manifold geometry—shaped by radius, dimensionality, and correlations—directly governs a population’s capacity for classifying or separating stimuli under nuisance variations. Applying these ideas as a data analysis tool, she showed how measuring object-manifold geometry can explain transformations along the ventral visual stream and suggested that manifold principles also yield better self-supervised neural network models resembling mammalian visual cortex. Turaga described simulating the entire fruit fly visual pathway using its connectome, modeling 64 key cell types in the optic lobe. His team’s systematic approach—combining sparse connectivity from electron microscopy with simple dynamical parameters—recapitulated known motion-selective responses and produced novel testable predictions. Together, these studies underscore the power of combining connectomic detail, task objectives, and geometric theories to unravel neural computations bridging from stimuli to cognitive functions.
Brain-Wide Compositionality and Learning Dynamics in Biological Agents
Biological agents continually reconcile the internal states of their brain circuits with incoming sensory and environmental evidence to evaluate when and how to act. The brains of biological agents, including animals and humans, exploit many evolutionary innovations, chiefly modularity—observable at the level of anatomically-defined brain regions, cortical layers, and cell types among others—that can be repurposed in a compositional manner to endow the animal with a highly flexible behavioral repertoire. Accordingly, their behaviors show their own modularity, yet such behavioral modules seldom correspond directly to traditional notions of modularity in brains. It remains unclear how to link neural and behavioral modularity in a compositional manner. We propose a comprehensive framework—compositional modes—to identify overarching compositionality spanning specialized submodules, such as brain regions. Our framework directly links the behavioral repertoire with distributed patterns of population activity, brain-wide, at multiple concurrent spatial and temporal scales. Using whole-brain recordings of zebrafish brains, we introduce an unsupervised pipeline based on neural network models, constrained by biological data, to reveal highly conserved compositional modes across individuals despite the naturalistic (spontaneous or task-independent) nature of their behaviors. These modes provided a scaffolding for other modes that account for the idiosyncratic behavior of each fish. We then demonstrate experimentally that compositional modes can be manipulated in a consistent manner by behavioral and pharmacological perturbations. Our results demonstrate that even natural behavior in different individuals can be decomposed and understood using a relatively small number of neurobehavioral modules—the compositional modes—and elucidate a compositional neural basis of behavior. This approach aligns with recent progress in understanding how reasoning capabilities and internal representational structures develop over the course of learning or training, offering insights into the modularity and flexibility in artificial and biological agents.
Use case determines the validity of neural systems comparisons
Deep learning provides new data-driven tools to relate neural activity to perception and cognition, aiding scientists in developing theories of neural computation that increasingly resemble biological systems both at the level of behavior and of neural activity. But what in a deep neural network should correspond to what in a biological system? This question is addressed implicitly in the use of comparison measures that relate specific neural or behavioral dimensions via a particular functional form. However, distinct comparison methodologies can give conflicting results in recovering even a known ground-truth model in an idealized setting, leaving open the question of what to conclude from the outcome of a systems comparison using any given methodology. Here, we develop a framework to make explicit and quantitative the effect of both hypothesis-driven aspects—such as details of the architecture of a deep neural network—as well as methodological choices in a systems comparison setting. We demonstrate via the learning dynamics of deep neural networks that, while the role of the comparison methodology is often de-emphasized relative to hypothesis-driven aspects, this choice can impact and even invert the conclusions to be drawn from a comparison between neural systems. We provide evidence that the right way to adjudicate a comparison depends on the use case—the scientific hypothesis under investigation—which could range from identifying single-neuron or circuit-level correspondences to capturing generalizability to new stimulus properties
Probing neural population dynamics with recurrent neural networks
Large-scale recordings of neural activity are providing new opportunities to study network-level dynamics with unprecedented detail. However, the sheer volume of data and its dynamical complexity are major barriers to uncovering and interpreting these dynamics. I will present latent factor analysis via dynamical systems, a sequential autoencoding approach that enables inference of dynamics from neuronal population spiking activity on single trials and millisecond timescales. I will also discuss recent adaptations of the method to uncover dynamics from neural activity recorded via 2P Calcium imaging. Finally, time permitting, I will mention recent efforts to improve the interpretability of deep-learning based dynamical systems models.
Learning representations of specifics and generalities over time
There is a fundamental tension between storing discrete traces of individual experiences, which allows recall of particular moments in our past without interference, and extracting regularities across these experiences, which supports generalization and prediction in similar situations in the future. One influential proposal for how the brain resolves this tension is that it separates the processes anatomically into Complementary Learning Systems, with the hippocampus rapidly encoding individual episodes and the neocortex slowly extracting regularities over days, months, and years. But this does not explain our ability to learn and generalize from new regularities in our environment quickly, often within minutes. We have put forward a neural network model of the hippocampus that suggests that the hippocampus itself may contain complementary learning systems, with one pathway specializing in the rapid learning of regularities and a separate pathway handling the region’s classic episodic memory functions. This proposal has broad implications for how we learn and represent novel information of specific and generalized types, which we test across statistical learning, inference, and category learning paradigms. We also explore how this system interacts with slower-learning neocortical memory systems, with empirical and modeling investigations into how the hippocampus shapes neocortical representations during sleep. Together, the work helps us understand how structured information in our environment is initially encoded and how it then transforms over time.
Maintaining Plasticity in Neural Networks
Nonstationarity presents a variety of challenges for machine learning systems. One surprising pathology which can arise in nonstationary learning problems is plasticity loss, whereby making progress on new learning objectives becomes more difficult as training progresses. Networks which are unable to adapt in response to changes in their environment experience plateaus or even declines in performance in highly non-stationary domains such as reinforcement learning, where the learner must quickly adapt to new information even after hundreds of millions of optimization steps. The loss of plasticity manifests in a cluster of related empirical phenomena which have been identified by a number of recent works, including the primacy bias, implicit under-parameterization, rank collapse, and capacity loss. While this phenomenon is widely observed, it is still not fully understood. This talk will present exciting recent results which shed light on the mechanisms driving the loss of plasticity in a variety of learning problems and survey methods to maintain network plasticity in non-stationary tasks, with a particular focus on deep reinforcement learning.
Learning produces a hippocampal cognitive map in the form of an orthogonalized state machine
Cognitive maps confer animals with flexible intelligence by representing spatial, temporal, and abstract relationships that can be used to shape thought, planning, and behavior. Cognitive maps have been observed in the hippocampus, but their algorithmic form and the processes by which they are learned remain obscure. Here, we employed large-scale, longitudinal two-photon calcium imaging to record activity from thousands of neurons in the CA1 region of the hippocampus while mice learned to efficiently collect rewards from two subtly different versions of linear tracks in virtual reality. The results provide a detailed view of the formation of a cognitive map in the hippocampus. Throughout learning, both the animal behavior and hippocampal neural activity progressed through multiple intermediate stages, gradually revealing improved task representation that mirrored improved behavioral efficiency. The learning process led to progressive decorrelations in initially similar hippocampal neural activity within and across tracks, ultimately resulting in orthogonalized representations resembling a state machine capturing the inherent struture of the task. We show that a Hidden Markov Model (HMM) and a biologically plausible recurrent neural network trained using Hebbian learning can both capture core aspects of the learning dynamics and the orthogonalized representational structure in neural activity. In contrast, we show that gradient-based learning of sequence models such as Long Short-Term Memory networks (LSTMs) and Transformers do not naturally produce such orthogonalized representations. We further demonstrate that mice exhibited adaptive behavior in novel task settings, with neural activity reflecting flexible deployment of the state machine. These findings shed light on the mathematical form of cognitive maps, the learning rules that sculpt them, and the algorithms that promote adaptive behavior in animals. The work thus charts a course toward a deeper understanding of biological intelligence and offers insights toward developing more robust learning algorithms in artificial intelligence.
Reimagining the neuron as a controller: A novel model for Neuroscience and AI
We build upon and expand the efficient coding and predictive information models of neurons, presenting a novel perspective that neurons not only predict but also actively influence their future inputs through their outputs. We introduce the concept of neurons as feedback controllers of their environments, a role traditionally considered computationally demanding, particularly when the dynamical system characterizing the environment is unknown. By harnessing a novel data-driven control framework, we illustrate the feasibility of biological neurons functioning as effective feedback controllers. This innovative approach enables us to coherently explain various experimental findings that previously seemed unrelated. Our research has profound implications, potentially revolutionizing the modeling of neuronal circuits and paving the way for the creation of alternative, biologically inspired artificial neural networks.
A recurrent network model of planning predicts hippocampal replay and human behavior
When interacting with complex environments, humans can rapidly adapt their behavior to changes in task or context. To facilitate this adaptation, we often spend substantial periods of time contemplating possible futures before acting. For such planning to be rational, the benefits of planning to future behavior must at least compensate for the time spent thinking. Here we capture these features of human behavior by developing a neural network model where not only actions, but also planning, are controlled by prefrontal cortex. This model consists of a meta-reinforcement learning agent augmented with the ability to plan by sampling imagined action sequences drawn from its own policy, which we refer to as `rollouts'. Our results demonstrate that this agent learns to plan when planning is beneficial, explaining the empirical variability in human thinking times. Additionally, the patterns of policy rollouts employed by the artificial agent closely resemble patterns of rodent hippocampal replays recently recorded in a spatial navigation task, in terms of both their spatial statistics and their relationship to subsequent behavior. Our work provides a new theory of how the brain could implement planning through prefrontal-hippocampal interactions, where hippocampal replays are triggered by -- and in turn adaptively affect -- prefrontal dynamics.
Loss shaping enhances exact gradient learning with EventProp in Spiking Neural Networks
In vivo direct imaging of neuronal activity at high temporospatial resolution
Advanced noninvasive neuroimaging methods provide valuable information on the brain function, but they have obvious pros and cons in terms of temporal and spatial resolution. Functional magnetic resonance imaging (fMRI) using blood-oxygenation-level-dependent (BOLD) effect provides good spatial resolution in the order of millimeters, but has a poor temporal resolution in the order of seconds due to slow hemodynamic responses to neuronal activation, providing indirect information on neuronal activity. In contrast, electroencephalography (EEG) and magnetoencephalography (MEG) provide excellent temporal resolution in the millisecond range, but spatial information is limited to centimeter scales. Therefore, there has been a longstanding demand for noninvasive brain imaging methods capable of detecting neuronal activity at both high temporal and spatial resolution. In this talk, I will introduce a novel approach that enables Direct Imaging of Neuronal Activity (DIANA) using MRI that can dynamically image neuronal spiking activity in milliseconds precision, achieved by data acquisition scheme of rapid 2D line scan synchronized with periodically applied functional stimuli. DIANA was demonstrated through in vivo mouse brain imaging on a 9.4T animal scanner during electrical whisker-pad stimulation. DIANA with milliseconds temporal resolution had high correlations with neuronal spike activities, which could also be applied in capturing the sequential propagation of neuronal activity along the thalamocortical pathway of brain networks. In terms of the contrast mechanism, DIANA was almost unaffected by hemodynamic responses, but was subject to changes in membrane potential-associated tissue relaxation times such as T2 relaxation time. DIANA is expected to break new ground in brain science by providing an in-depth understanding of the hierarchical functional organization of the brain, including the spatiotemporal dynamics of neural networks.
A recurrent network model of planning explains hippocampal replay and human behavior
When interacting with complex environments, humans can rapidly adapt their behavior to changes in task or context. To facilitate this adaptation, we often spend substantial periods of time contemplating possible futures before acting. For such planning to be rational, the benefits of planning to future behavior must at least compensate for the time spent thinking. Here we capture these features of human behavior by developing a neural network model where not only actions, but also planning, are controlled by prefrontal cortex. This model consists of a meta-reinforcement learning agent augmented with the ability to plan by sampling imagined action sequences drawn from its own policy, which we refer to as 'rollouts'. Our results demonstrate that this agent learns to plan when planning is beneficial, explaining the empirical variability in human thinking times. Additionally, the patterns of policy rollouts employed by the artificial agent closely resemble patterns of rodent hippocampal replays recently recorded in a spatial navigation task, in terms of both their spatial statistics and their relationship to subsequent behavior. Our work provides a new theory of how the brain could implement planning through prefrontal-hippocampal interactions, where hippocampal replays are triggered by - and in turn adaptively affect - prefrontal dynamics.
Feedback control in the nervous system: from cells and circuits to behaviour
The nervous system is fundamentally a closed loop control device: the output of actions continually influences the internal state and subsequent actions. This is true at the single cell and even the molecular level, where “actions” take the form of signals that are fed back to achieve a variety of functions, including homeostasis, excitability and various kinds of multistability that allow switching and storage of memory. It is also true at the behavioural level, where an animal’s motor actions directly influence sensory input on short timescales, and higher level information about goals and intended actions are continually updated on the basis of current and past actions. Studying the brain in a closed loop setting requires a multidisciplinary approach, leveraging engineering and theory as well as advances in measuring and manipulating the nervous system. I will describe our recent attempts to achieve this fusion of approaches at multiple levels in the nervous system, from synaptic signalling to closed loop brain machine interfaces.
Quasicriticality and the quest for a framework of neuronal dynamics
Critical phenomena abound in nature, from forest fires and earthquakes to avalanches in sand and neuronal activity. Since the 2003 publication by Beggs & Plenz on neuronal avalanches, a growing body of work suggests that the brain homeostatically regulates itself to operate near a critical point where information processing is optimal. At this critical point, incoming activity is neither amplified (supercritical) nor damped (subcritical), but approximately preserved as it passes through neural networks. Departures from the critical point have been associated with conditions of poor neurological health like epilepsy, Alzheimer's disease, and depression. One complication that arises from this picture is that the critical point assumes no external input. But, biological neural networks are constantly bombarded by external input. How is then the brain able to homeostatically adapt near the critical point? We’ll see that the theory of quasicriticality, an organizing principle for brain dynamics, can account for this paradoxical situation. As external stimuli drive the cortex, quasicriticality predicts a departure from criticality while maintaining optimal properties for information transmission. We’ll see that simulations and experimental data confirm these predictions and describe new ones that could be tested soon. More importantly, we will see how this organizing principle could help in the search for biomarkers that could soon be tested in clinical studies.
Signatures of criticality in efficient coding networks
The critical brain hypothesis states that the brain can benefit from operating close to a second-order phase transition. While it has been shown that several computational aspects of sensory information processing (e.g., sensitivity to input) are optimal in this regime, it is still unclear whether these computational benefits of criticality can be leveraged by neural systems performing behaviorally relevant computations. To address this question, we investigate signatures of criticality in networks optimized to perform efficient encoding. We consider a network of leaky integrate-and-fire neurons with synaptic transmission delays and input noise. Previously, it was shown that the performance of such networks varies non-monotonically with the noise amplitude. Interestingly, we find that in the vicinity of the optimal noise level for efficient coding, the network dynamics exhibits signatures of criticality, namely, the distribution of avalanche sizes follows a power law. When the noise amplitude is too low or too high for efficient coding, the network appears either super-critical or sub-critical, respectively. This result suggests that two influential, and previously disparate theories of neural processing optimization—efficient coding, and criticality—may be intimately related
The centrality of population-level factors to network computation is demonstrated by a versatile approach for training spiking networks
Neural activity is often described in terms of population-level factors extracted from the responses of many neurons. Factors provide a lower-dimensional description with the aim of shedding light on network computations. Yet, mechanistically, computations are performed not by continuously valued factors but by interactions among neurons that spike discretely and variably. Models provide a means of bridging these levels of description. We developed a general method for training model networks of spiking neurons by leveraging factors extracted from either data or firing-rate-based networks. In addition to providing a useful model-building framework, this formalism illustrates how reliable and continuously valued factors can arise from seemingly stochastic spiking. Our framework establishes procedures for embedding this property in network models with different levels of realism. The relationship between spikes and factors in such networks provides a foundation for interpreting (and subtly redefining) commonly used quantities such as firing rates.
Learning through the eyes and ears of a child
Young children have sophisticated representations of their visual and linguistic environment. Where do these representations come from? How much knowledge arises through generic learning mechanisms applied to sensory data, and how much requires more substantive (possibly innate) inductive biases? We examine these questions by training neural networks solely on longitudinal data collected from a single child (Sullivan et al., 2020), consisting of egocentric video and audio streams. Our principal findings are as follows: 1) Based on visual only training, neural networks can acquire high-level visual features that are broadly useful across categorization and segmentation tasks. 2) Based on language only training, networks can acquire meaningful clusters of words and sentence-level syntactic sensitivity. 3) Based on paired visual and language training, networks can acquire word-referent mappings from tens of noisy examples and align their multi-modal conceptual systems. Taken together, our results show how sophisticated visual and linguistic representations can arise through data-driven learning applied to one child’s first-person experience.
Assigning credit through the "other” connectome
Learning in neural networks requires assigning the right values to thousands to trillions or more of individual connections, so that the network as a whole produces the desired behavior. Neuroscientists have gained insights into this “credit assignment” problem through decades of experimental, modeling, and theoretical studies. This has suggested key roles for synaptic eligibility traces and top-down feedback signals, among other factors. Here we study the potential contribution of another type of signaling that is being revealed in greater and greater fidelity by ongoing molecular and genomics studies. This is the set of modulatory pathways local to a given circuit, which form an intriguing second type of connectome overlayed on top of synaptic connectivity. We will share ongoing modeling and theoretical work that explores the possible roles of this local modulatory connectome in network learning.
The strongly recurrent regime of cortical networks
Modern electrophysiological recordings simultaneously capture single-unit spiking activities of hundreds of neurons. These neurons exhibit highly complex coordination patterns. Where does this complexity stem from? One candidate is the ubiquitous heterogeneity in connectivity of local neural circuits. Studying neural network dynamics in the linearized regime and using tools from statistical field theory of disordered systems, we derive relations between structure and dynamics that are readily applicable to subsampled recordings of neural circuits: Measuring the statistics of pairwise covariances allows us to infer statistical properties of the underlying connectivity. Applying our results to spontaneous activity of macaque motor cortex, we find that the underlying network operates in a strongly recurrent regime. In this regime, network connectivity is highly heterogeneous, as quantified by a large radius of bulk connectivity eigenvalues. Being close to the point of linear instability, this dynamical regime predicts a rich correlation structure, a large dynamical repertoire, long-range interaction patterns, relatively low dimensionality and a sensitive control of neuronal coordination. These predictions are verified in analyses of spontaneous activity of macaque motor cortex and mouse visual cortex. Finally, we show that even microscopic features of connectivity, such as connection motifs, systematically scale up to determine the global organization of activity in neural circuits.
Are place cells just memory cells? Probably yes
Neurons in the rodent hippocampus appear to encode the position of the animal in physical space during movement. Individual ``place cells'' fire in restricted sub-regions of an environment, a feature often taken as evidence that the hippocampus encodes a map of space that subserves navigation. But these same neurons exhibit complex responses to many other variables that defy explanation by position alone, and the hippocampus is known to be more broadly critical for memory formation. Here we elaborate and test a theory of hippocampal coding which produces place cells as a general consequence of efficient memory coding. We constructed neural networks that actively exploit the correlations between memories in order to learn compressed representations of experience. Place cells readily emerged in the trained model, due to the correlations in sensory input between experiences at nearby locations. Notably, these properties were highly sensitive to the compressibility of the sensory environment, with place field size and population coding level in dynamic opposition to optimally encode the correlations between experiences. The effects of learning were also strongly biphasic: nearby locations are represented more similarly following training, while locations with intermediate similarity become increasingly decorrelated, both distance-dependent effects that scaled with the compressibility of the input features. Using virtual reality and 2-photon functional calcium imaging in head-fixed mice, we recorded the simultaneous activity of thousands of hippocampal neurons during virtual exploration to test these predictions. Varying the compressibility of sensory information in the environment produced systematic changes in place cell properties that reflected the changing input statistics, consistent with the theory. We similarly identified representational plasticity during learning, which produced a distance-dependent exchange between compression and pattern separation. These results motivate a more domain-general interpretation of hippocampal computation, one that is naturally compatible with earlier theories on the circuit's importance for episodic memory formation. Work done in collaboration with James Priestley, Lorenzo Posani, Marcus Benna, Attila Losonczy.
Learning to see stuff
Humans are very good at visually recognizing materials and inferring their properties. Without touching surfaces, we can usually tell what they would feel like, and we enjoy vivid visual intuitions about how they typically behave. This is impressive because the retinal image that the visual system receives as input is the result of complex interactions between many physical processes. Somehow the brain has to disentangle these different factors. I will present some recent work in which we show that an unsupervised neural network trained on images of surfaces spontaneously learns to disentangle reflectance, lighting and shape. However, the disentanglement is not perfect, and we find that as a result the network not only predicts the broad successes of human gloss perception, but also the specific pattern of errors that humans exhibit on an image-by-image basis. I will argue this has important implications for thinking about appearance and vision more broadly.
Understanding Machine Learning via Exactly Solvable Statistical Physics Models
The affinity between statistical physics and machine learning has a long history. I will describe the main lines of this long-lasting friendship in the context of current theoretical challenges and open questions about deep learning. Theoretical physics often proceeds in terms of solvable synthetic models, I will describe the related line of work on solvable models of simple feed-forward neural networks. I will highlight a path forward to capture the subtle interplay between the structure of the data, the architecture of the network, and the optimization algorithms commonly used for learning.
Spatially-embedded recurrent neural networks reveal widespread links between structural and functional neuroscience findings
Brain networks exist within the confines of resource limitations. As a result, a brain network must overcome metabolic costs of growing and sustaining the network within its physical space, while simultaneously implementing its required information processing. To observe the effect of these processes, we introduce the spatially-embedded recurrent neural network (seRNN). seRNNs learn basic task-related inferences while existing within a 3D Euclidean space, where the communication of constituent neurons is constrained by a sparse connectome. We find that seRNNs, similar to primate cerebral cortices, naturally converge on solving inferences using modular small-world networks, in which functionally similar units spatially configure themselves to utilize an energetically-efficient mixed-selective code. As all these features emerge in unison, seRNNs reveal how many common structural and functional brain motifs are strongly intertwined and can be attributed to basic biological optimization processes. seRNNs can serve as model systems to bridge between structural and functional research communities to move neuroscientific understanding forward.
Meta-learning functional plasticity rules in neural networks
Synaptic plasticity is known to be a key player in the brain’s life-long learning abilities. However, due to experimental limitations, the nature of the local changes at individual synapses and their link with emerging network-level computations remain unclear. I will present a numerical, meta-learning approach to deduce plasticity rules from either neuronal activity data and/or prior knowledge about the network's computation. I will first show how to recover known rules, given a human-designed loss function in rate networks, or directly from data, using an adversarial approach. Then I will present how to scale-up this approach to recurrent spiking networks using simulation-based inference.
Extracting computational mechanisms from neural data using low-rank RNNs
An influential theory in systems neuroscience suggests that brain function can be understood through low-dimensional dynamics [Vyas et al 2020]. However, a challenge in this framework is that a single computational task may involve a range of dynamic processes. To understand which processes are at play in the brain, it is important to use data on neural activity to constrain models. In this study, we present a method for extracting low-dimensional dynamics from data using low-rank recurrent neural networks (lrRNNs), a highly expressive and understandable type of model [Mastrogiuseppe & Ostojic 2018, Dubreuil, Valente et al. 2022]. We first test our approach using synthetic data created from full-rank RNNs that have been trained on various brain tasks. We find that lrRNNs fitted to neural activity allow us to identify the collective computational processes and make new predictions for inactivations in the original RNNs. We then apply our method to data recorded from the prefrontal cortex of primates during a context-dependent decision-making task. Our approach enables us to assign computational roles to the different latent variables and provides a mechanistic model of the recorded dynamics, which can be used to perform in silico experiments like inactivations and provide testable predictions.
Geometry of concept learning
Understanding Human ability to learn novel concepts from just a few sensory experiences is a fundamental problem in cognitive neuroscience. I will describe a recent work with Ben Sorcher and Surya Ganguli (PNAS, October 2022) in which we propose a simple, biologically plausible, and mathematically tractable neural mechanism for few-shot learning of naturalistic concepts. We posit that the concepts that can be learned from few examples are defined by tightly circumscribed manifolds in the neural firing-rate space of higher-order sensory areas. Discrimination between novel concepts is performed by downstream neurons implementing ‘prototype’ decision rule, in which a test example is classified according to the nearest prototype constructed from the few training examples. We show that prototype few-shot learning achieves high few-shot learning accuracy on natural visual concepts using both macaque inferotemporal cortex representations and deep neural network (DNN) models of these representations. We develop a mathematical theory that links few-shot learning to the geometric properties of the neural concept manifolds and demonstrate its agreement with our numerical simulations across different DNNs as well as different layers. Intriguingly, we observe striking mismatches between the geometry of manifolds in intermediate stages of the primate visual pathway and in trained DNNs. Finally, we show that linguistic descriptors of visual concepts can be used to discriminate images belonging to novel concepts, without any prior visual experience of these concepts (a task known as ‘zero-shot’ learning), indicated a remarkable alignment of manifold representations of concepts in visual and language modalities. I will discuss ongoing effort to extend this work to other high level cognitive tasks.
Analyzing artificial neural networks to understand the brain
In the first part of this talk I will present work showing that recurrent neural networks can replicate broad behavioral patterns associated with dynamic visual object recognition in humans. An analysis of these networks shows that different types of recurrence use different strategies to solve the object recognition problem. The similarities between artificial neural networks and the brain presents another opportunity, beyond using them just as models of biological processing. In the second part of this talk, I will discuss—and solicit feedback on—a proposed research plan for testing a wide range of analysis tools frequently applied to neural data on artificial neural networks. I will present the motivation for this approach as well as the form the results could take and how this would benefit neuroscience.
Convex neural codes in recurrent networks and sensory systems
Neural activity in many sensory systems is organized on low-dimensional manifolds by means of convex receptive fields. Neural codes in these areas are constrained by this organization, as not every neural code is compatible with convex receptive fields. The same codes are also constrained by the structure of the underlying neural network. In my talk I will attempt to provide answers to the following natural questions: (i) How do recurrent circuits generate codes that are compatible with the convexity of receptive fields? (ii) How can we utilize the constraints imposed by the convex receptive field to understand the underlying stimulus space. To answer question (i), we describe the combinatorics of the steady states and fixed points of recurrent networks that satisfy the Dale’s law. It turns out the combinatorics of the fixed points are completely determined by two distinct conditions: (a) the connectivity graph of the network and (b) a spectral condition on the synaptic matrix. We give a characterization of exactly which features of connectivity determine the combinatorics of the fixed points. We also find that a generic recurrent network that satisfies Dale's law outputs convex combinatorial codes. To address question (ii), I will describe methods based on ideas from topology and geometry that take advantage of the convex receptive field properties to infer the dimension of (non-linear) neural representations. I will illustrate the first method by inferring basic features of the neural representations in the mouse olfactory bulb.
Can a single neuron solve MNIST? Neural computation of machine learning tasks emerges from the interaction of dendritic properties
Physiological experiments have highlighted how the dendrites of biological neurons can nonlinearly process distributed synaptic inputs. However, it is unclear how qualitative aspects of a dendritic tree, such as its branched morphology, its repetition of presynaptic inputs, voltage-gated ion channels, electrical properties and complex synapses, determine neural computation beyond this apparent nonlinearity. While it has been speculated that the dendritic tree of a neuron can be seen as a multi-layer neural network and it has been shown that such an architecture could be computationally strong, we do not know if that computational strength is preserved under these qualitative biological constraints. Here we simulate multi-layer neural network models of dendritic computation with and without these constraints. We find that dendritic model performance on interesting machine learning tasks is not hurt by most of these constraints and may synergistically benefit from all of them combined. Our results suggest that single real dendritic trees may be able to learn a surprisingly broad range of tasks through the emergent capabilities afforded by their properties.
Connecting performance benefits on visual tasks to neural mechanisms using convolutional neural networks
Behavioral studies have demonstrated that certain task features reliably enhance classification performance for challenging visual stimuli. These include extended image presentation time and the valid cueing of attention. Here, I will show how convolutional neural networks can be used as a model of the visual system that connects neural activity changes with such performance changes. Specifically, I will discuss how different anatomical forms of recurrence can account for better classification of noisy and degraded images with extended processing time. I will then show how experimentally-observed neural activity changes associated with feature attention lead to observed performance changes on detection tasks. I will also discuss the implications these results have for how we identify the neural mechanisms and architectures important for behavior.
Circuit solutions for programming actions
The hippocampus is one of the few regions in the adult mammalian brain which is endowed with life-long neurogenesis. Despite intense investigation, it remains unclear how neurons newly-generated may retain unique functions that contribute to modulate hippocampal information processing and cognition. In this talk, I will present some recent findings revealing how enhanced forms of plasticity in adult-born neurons underlie the way they become incorporated into pre-existing networks in response to experience.
Neural networks in the replica-mean field limits
In this talk, we propose to decipher the activity of neural networks via a “multiply and conquer” approach. This approach considers limit networks made of infinitely many replicas with the same basic neural structure. The key point is that these so-called replica-mean-field networks are in fact simplified, tractable versions of neural networks that retain important features of the finite network structure of interest. The finite size of neuronal populations and synaptic interactions is a core determinant of neural dynamics, being responsible for non-zero correlation in the spiking activity and for finite transition rates between metastable neural states. Theoretically, we develop our replica framework by expanding on ideas from the theory of communication networks rather than from statistical physics to establish Poissonian mean-field limits for spiking networks. Computationally, we leverage our original replica approach to characterize the stationary spiking activity of various network models via reduction to tractable functional equations. We conclude by discussing perspectives about how to use our replica framework to probe nontrivial regimes of spiking correlations and transition rates between metastable neural states.
Bridging the gap between artificial models and cortical circuits
Artificial neural networks simplify complex biological circuits into tractable models for computational exploration and experimentation. However, the simplification of artificial models also undermines their applicability to real brain dynamics. Typical efforts to address this mismatch add complexity to increasingly unwieldy models. Here, we take a different approach; by reducing the complexity of a biological cortical culture, we aim to distil the essential factors of neuronal dynamics and plasticity. We leverage recent advances in growing neurons from human induced pluripotent stem cells (hiPSCs) to analyse ex vivo cortical cultures with only two distinct excitatory and inhibitory neuron populations. Over 6 weeks of development, we record from thousands of neurons using high-density microelectrode arrays (HD-MEAs) that allow access to individual neurons and the broader population dynamics. We compare these dynamics to two-population artificial networks of single-compartment neurons with random sparse connections and show that they produce similar dynamics. Specifically, our model captures the firing and bursting statistics of the cultures. Moreover, tightly integrating models and cultures allows us to evaluate the impact of changing architectures over weeks of development, with and without external stimuli. Broadly, the use of simplified cortical cultures enables us to use the repertoire of theoretical neuroscience techniques established over the past decades on artificial network models. Our approach of deriving neural networks from human cells also allows us, for the first time, to directly compare neural dynamics of disease and control. We found that cultures e.g. from epilepsy patients tended to have increasingly more avalanches of synchronous activity over weeks of development, in contrast to the control cultures. Next, we will test possible interventions, in silico and in vitro, in a drive for personalised approaches to medical care. This work starts bridging an important theoretical-experimental neuroscience gap for advancing our understanding of mammalian neuron dynamics.
A biologically plausible inhibitory plasticity rule for world-model learning in SNNs
Memory consolidation is the process by which recent experiences are assimilated into long-term memory. In animals, this process requires the offline replay of sequences observed during online exploration in the hippocampus. Recent experimental work has found that salient but task-irrelevant stimuli are systematically excluded from these replay epochs, suggesting that replay samples from an abstracted model of the world, rather than verbatim previous experiences. We find that this phenomenon can be explained parsimoniously and biologically plausibly by a Hebbian spike time-dependent plasticity rule at inhibitory synapses. Using spiking networks at three levels of abstraction–leaky integrate-and-fire, biophysically detailed, and abstract binary–we show that this rule enables efficient inference of a model of the structure of the world. While plasticity has previously mainly been studied at excitatory synapses, we find that plasticity at excitatory synapses alone is insufficient to accomplish this type of structural learning. We present theoretical results in a simplified model showing that in the presence of Hebbian excitatory and inhibitory plasticity, the replayed sequences form a statistical estimator of a latent sequence, which converges asymptotically to the ground truth. Our work outlines a direct link between the synaptic and cognitive levels of memory consolidation, and highlights a potential conceptually distinct role for inhibition in computing with SNNs.
Merging insights from artificial and biological neural networks for neuromorphic intelligence
Training Dynamic Spiking Neural Network via Forward Propagation Through Time
With recent advances in learning algorithms, recurrent networks of spiking neurons are achieving performance competitive with standard recurrent neural networks. Still, these learning algorithms are limited to small networks of simple spiking neurons and modest-length temporal sequences, as they impose high memory requirements, have difficulty training complex neuron models, and are incompatible with online learning.Taking inspiration from the concept of Liquid Time-Constant (LTCs), we introduce a novel class of spiking neurons, the Liquid Time-Constant Spiking Neuron (LTC-SN), resulting in functionality similar to the gating operation in LSTMs. We integrate these neurons in SNNs that are trained with FPTT and demonstrate that thus trained LTC-SNNs outperform various SNNs trained with BPTT on long sequences while enabling online learning and drastically reducing memory complexity. We show this for several classical benchmarks that can easily be varied in sequence length, like the Add Task and the DVS-gesture benchmark. We also show how FPTT-trained LTC-SNNs can be applied to large convolutional SNNs, where we demonstrate novel state-of-the-art for online learning in SNNs on a number of standard benchmarks (S-MNIST, R-MNIST, DVS-GESTURE) and also show that large feedforward SNNs can be trained successfully in an online manner to near (Fashion-MNIST, DVS-CIFAR10) or exceeding (PS-MNIST, R-MNIST) state-of-the-art performance as obtained with offline BPTT. Finally, the training and memory efficiency of FPTT enables us to directly train SNNs in an end-to-end manner at network sizes and complexity that was previously infeasible: we demonstrate this by training in an end-to-end fashion the first deep and performant spiking neural network for object localization and recognition. Taken together, we out contribution enable for the first time training large-scale complex spiking neural network architectures online and on long temporal sequences.
Universal function approximation in balanced spiking networks through convex-concave boundary composition
The spike-threshold nonlinearity is a fundamental, yet enigmatic, component of biological computation — despite its role in many theories, it has evaded definitive characterisation. Indeed, much classic work has attempted to limit the focus on spiking by smoothing over the spike threshold or by approximating spiking dynamics with firing-rate dynamics. Here, we take a novel perspective that captures the full potential of spike-based computation. Based on previous studies of the geometry of efficient spike-coding networks, we consider a population of neurons with low-rank connectivity, allowing us to cast each neuron’s threshold as a boundary in a space of population modes, or latent variables. Each neuron divides this latent space into subthreshold and suprathreshold areas. We then demonstrate how a network of inhibitory (I) neurons forms a convex, attracting boundary in the latent coding space, and a network of excitatory (E) neurons forms a concave, repellant boundary. Finally, we show how the combination of the two yields stable dynamics at the crossing of the E and I boundaries, and can be mapped onto a constrained optimization problem. The resultant EI networks are balanced, inhibition-stabilized, and exhibit asynchronous irregular activity, thereby closely resembling cortical networks of the brain. Moreover, we demonstrate how such networks can be tuned to either suppress or amplify noise, and how the composition of inhibitory convex and excitatory concave boundaries can result in universal function approximation. Our work puts forth a new theory of biologically-plausible computation in balanced spiking networks, and could serve as a novel framework for scalable and interpretable computation with spikes.
Spiking Deep Learning with SpikingJelly
Behavioral Timescale Synaptic Plasticity (BTSP) for biologically plausible credit assignment across multiple layers via top-down gating of dendritic plasticity
A central problem in biological learning is how information about the outcome of a decision or behavior can be used to reliably guide learning across distributed neural circuits while obeying biological constraints. This “credit assignment” problem is commonly solved in artificial neural networks through supervised gradient descent and the backpropagation algorithm. In contrast, biological learning is typically modelled using unsupervised Hebbian learning rules. While these rules only use local information to update synaptic weights, and are sometimes combined with weight constraints to reflect a diversity of excitatory (only positive weights) and inhibitory (only negative weights) cell types, they do not prescribe a clear mechanism for how to coordinate learning across multiple layers and propagate error information accurately across the network. In recent years, several groups have drawn inspiration from the known dendritic non-linearities of pyramidal neurons to propose new learning rules and network architectures that enable biologically plausible multi-layer learning by processing error information in segregated dendrites. Meanwhile, recent experimental results from the hippocampus have revealed a new form of plasticity—Behavioral Timescale Synaptic Plasticity (BTSP)—in which large dendritic depolarizations rapidly reshape synaptic weights and stimulus selectivity with as little as a single stimulus presentation (“one-shot learning”). Here we explore the implications of this new learning rule through a biologically plausible implementation in a rate neuron network. We demonstrate that regulation of dendritic spiking and BTSP by top-down feedback signals can effectively coordinate plasticity across multiple network layers in a simple pattern recognition task. By analyzing hidden feature representations and weight trajectories during learning, we show the differences between networks trained with standard backpropagation, Hebbian learning rules, and BTSP.
Beyond Biologically Plausible Spiking Networks for Neuromorphic Computing
Biologically plausible spiking neural networks (SNNs) are an emerging architecture for deep learning tasks due to their energy efficiency when implemented on neuromorphic hardware. However, many of the biological features are at best irrelevant and at worst counterproductive when evaluated in the context of task performance and suitability for neuromorphic hardware. In this talk, I will present an alternative paradigm to design deep learning architectures with good task performance in real-world benchmarks while maintaining all the advantages of SNNs. We do this by focusing on two main features – event-based computation and activity sparsity. Starting from the performant gated recurrent unit (GRU) deep learning architecture, we modify it to make it event-based and activity-sparse. The resulting event-based GRU (EGRU) is extremely efficient for both training and inference. At the same time, it achieves performance close to conventional deep learning architectures in challenging tasks such as language modelling, gesture recognition and sequential MNIST.
Why dendrites matter for biological and artificial circuits
neural network coverage
50 items