Neural Network Models
neural network models
Prof. Dr. Friedemann Pulvermüller
Simulation studies with neural network models of language and cognition, Preparation, implementation and evaluation of neurocognitive experiments on language and cognition (ECoG, EEG, fMRI), Tractography analyzes and use of their results for optimizing neural models
Prof. Dr. Jakob Macke
We have several openings for Postdoctoral Researchers to work at the intersection of Machine Learning and Computational Neuroscience funded by the ERC Consolidator Grant “DeepCoMechTome: Using deep learning to understand computations in neural circuits with Connectome-constrained Mechanistic Models“. The goal of DeepCoMechTome is to develop simulation-based machine learning tools that will make it possible to build neural network models that are both biologically realistic and computationally powerful.
Maxime Carrière
The ERC Advanced Grant “Material Constraints Enabling Human Cognition (MatCo)” at the Freie Universität Berlin aims to build network models of the human brain that mimic neurocognitive processes involved in language, communication and cognition. A main strategy is to use neural network models constrained by neuroanatomical and neurophysiological features of the human brain in order to explain aspects of human cognition. To this end, neural network simulations are performed and evaluated in neurophysiological and neurometabolic experiments. This neurocomputational and experimental research targets novel explanations of human language and cognition on the basis of neurobiological principles. In the MatCo project, 3 positions are currently available: 1 full time position for a Scientific Researcher at the postdoctoral level Fixed-term (until 30.9.2025), Salary Scale 13 TV-L FU ID: WiMi_MatCo100_08-2022, 2 part time positions (65%) for Scientific Researchers at the predoctoral level Fixed-term (until 30.9.2025), Salary Scale 13 TV-L FU ID: WiMi_MatCo65_08-2022
Stefano Panzeri
The postdoctoral positions are focused on investigating how networks of neurons in the cerebral cortex encode, process, and transmit information to generate behaviors such as sensation and decision-making. The research involves developing and using information-theoretic and machine learning methods to study population coding, as well as neural network models to individuate mechanisms for neural information processing and generation of functions. The laboratory has extensive international collaborations and offers a well-funded research environment with opportunities for advanced training and personal scientific growth.
Kenji Doya
Multiple open research positions at the Neural Computation Unit at OIST, including: 1. Theory and experimental investigation of Bayesian inference and reinforcement learning by the cortex, basal ganglia, and neuromodulator systems. 2. Data-driven construction of neural network models and large-scale simulation. 3. Application of wearable devices for monitoring mind and body state to support healthy life. 4. Flexible and robust reinforcement learning and meta-learning algorithms. 5. Development of smartphone robots for multi-agent learning and embodied evolution.
Sensory cognition
This webinar features presentations from SueYeon Chung (New York University) and Srinivas Turaga (HHMI Janelia Research Campus) on theoretical and computational approaches to sensory cognition. Chung introduced a “neural manifold” framework to capture how high-dimensional neural activity is structured into meaningful manifolds reflecting object representations. She demonstrated that manifold geometry—shaped by radius, dimensionality, and correlations—directly governs a population’s capacity for classifying or separating stimuli under nuisance variations. Applying these ideas as a data analysis tool, she showed how measuring object-manifold geometry can explain transformations along the ventral visual stream and suggested that manifold principles also yield better self-supervised neural network models resembling mammalian visual cortex. Turaga described simulating the entire fruit fly visual pathway using its connectome, modeling 64 key cell types in the optic lobe. His team’s systematic approach—combining sparse connectivity from electron microscopy with simple dynamical parameters—recapitulated known motion-selective responses and produced novel testable predictions. Together, these studies underscore the power of combining connectomic detail, task objectives, and geometric theories to unravel neural computations bridging from stimuli to cognitive functions.
Brain-Wide Compositionality and Learning Dynamics in Biological Agents
Biological agents continually reconcile the internal states of their brain circuits with incoming sensory and environmental evidence to evaluate when and how to act. The brains of biological agents, including animals and humans, exploit many evolutionary innovations, chiefly modularity—observable at the level of anatomically-defined brain regions, cortical layers, and cell types among others—that can be repurposed in a compositional manner to endow the animal with a highly flexible behavioral repertoire. Accordingly, their behaviors show their own modularity, yet such behavioral modules seldom correspond directly to traditional notions of modularity in brains. It remains unclear how to link neural and behavioral modularity in a compositional manner. We propose a comprehensive framework—compositional modes—to identify overarching compositionality spanning specialized submodules, such as brain regions. Our framework directly links the behavioral repertoire with distributed patterns of population activity, brain-wide, at multiple concurrent spatial and temporal scales. Using whole-brain recordings of zebrafish brains, we introduce an unsupervised pipeline based on neural network models, constrained by biological data, to reveal highly conserved compositional modes across individuals despite the naturalistic (spontaneous or task-independent) nature of their behaviors. These modes provided a scaffolding for other modes that account for the idiosyncratic behavior of each fish. We then demonstrate experimentally that compositional modes can be manipulated in a consistent manner by behavioral and pharmacological perturbations. Our results demonstrate that even natural behavior in different individuals can be decomposed and understood using a relatively small number of neurobehavioral modules—the compositional modes—and elucidate a compositional neural basis of behavior. This approach aligns with recent progress in understanding how reasoning capabilities and internal representational structures develop over the course of learning or training, offering insights into the modularity and flexibility in artificial and biological agents.
Can a single neuron solve MNIST? Neural computation of machine learning tasks emerges from the interaction of dendritic properties
Physiological experiments have highlighted how the dendrites of biological neurons can nonlinearly process distributed synaptic inputs. However, it is unclear how qualitative aspects of a dendritic tree, such as its branched morphology, its repetition of presynaptic inputs, voltage-gated ion channels, electrical properties and complex synapses, determine neural computation beyond this apparent nonlinearity. While it has been speculated that the dendritic tree of a neuron can be seen as a multi-layer neural network and it has been shown that such an architecture could be computationally strong, we do not know if that computational strength is preserved under these qualitative biological constraints. Here we simulate multi-layer neural network models of dendritic computation with and without these constraints. We find that dendritic model performance on interesting machine learning tasks is not hurt by most of these constraints and may synergistically benefit from all of them combined. Our results suggest that single real dendritic trees may be able to learn a surprisingly broad range of tasks through the emergent capabilities afforded by their properties.
A Game Theoretical Framework for Quantifying Causes in Neural Networks
Which nodes in a brain network causally influence one another, and how do such interactions utilize the underlying structural connectivity? One of the fundamental goals of neuroscience is to pinpoint such causal relations. Conventionally, these relationships are established by manipulating a node while tracking changes in another node. A causal role is then assigned to the first node if this intervention led to a significant change in the state of the tracked node. In this presentation, I use a series of intuitive thought experiments to demonstrate the methodological shortcomings of the current ‘causation via manipulation’ framework. Namely, a node might causally influence another node, but how much and through which mechanistic interactions? Therefore, establishing a causal relationship, however reliable, does not provide the proper causal understanding of the system, because there often exists a wide range of causal influences that require to be adequately decomposed. To do so, I introduce a game-theoretical framework called Multi-perturbation Shapley value Analysis (MSA). Then, I present our work in which we employed MSA on an Echo State Network (ESN), quantified how much its nodes were influencing each other, and compared these measures with the underlying synaptic strength. We found that: 1. Even though the network itself was sparse, every node could causally influence other nodes. In this case, a mere elucidation of causal relationships did not provide any useful information. 2. Additionally, the full knowledge of the structural connectome did not provide a complete causal picture of the system either, since nodes frequently influenced each other indirectly, that is, via other intermediate nodes. Our results show that just elucidating causal contributions in complex networks such as the brain is not sufficient to draw mechanistic conclusions. Moreover, quantifying causal interactions requires a systematic and extensive manipulation framework. The framework put forward here benefits from employing neural network models, and in turn, provides explainability for them.
What does the primary visual cortex tell us about object recognition?
Object recognition relies on the complex visual representations in cortical areas at the top of the ventral stream hierarchy. While these are thought to be derived from low-level stages of visual processing, this has not been shown, yet. Here, I describe the results of two projects exploring the contributions of primary visual cortex (V1) processing to object recognition using artificial neural networks (ANNs). First, we developed hundreds of ANN-based V1 models and evaluated how their single neurons approximate those in the macaque V1. We found that, for some models, single neurons in intermediate layers are similar to their biological counterparts, and that the distributions of their response properties approximately match those in V1. Furthermore, we observed that models that better matched macaque V1 were also more aligned with human behavior, suggesting that object recognition is derived from low-level. Motivated by these results, we then studied how an ANN’s robustness to image perturbations relates to its ability to predict V1 responses. Despite their high performance in object recognition tasks, ANNs can be fooled by imperceptibly small, explicitly crafted perturbations. We observed that ANNs that better predicted V1 neuronal activity were also more robust to adversarial attacks. Inspired by this, we developed VOneNets, a new class of hybrid ANN vision models. Each VOneNet contains a fixed neural network front-end that simulates primate V1 followed by a neural network back-end adapted from current computer vision models. After training, VOneNets were substantially more robust, outperforming state-of-the-art methods on a set of perturbations. While current neural network architectures are arguably brain-inspired, these results demonstrate that more precisely mimicking just one stage of the primate visual system leads to new gains in computer vision applications and results in better models of the primate ventral stream and object recognition behavior.
NMC4 Keynote: Formation and update of sensory priors in working memory and perceptual decision making tasks
The world around us is complex, but at the same time full of meaningful regularities. We can detect, learn and exploit these regularities automatically in an unsupervised manner i.e. without any direct instruction or explicit reward. For example, we effortlessly estimate the average tallness of people in a room, or the boundaries between words in a language. These regularities and prior knowledge, once learned, can affect the way we acquire and interpret new information to build and update our internal model of the world for future decision-making processes. Despite the ubiquity of passively learning from the structured information in the environment, the mechanisms that support learning from real-world experience are largely unknown. By combing sophisticated cognitive tasks in human and rats, neuronal measurements and perturbations in rat and network modelling, we aim to build a multi-level description of how sensory history is utilised in inferring regularities in temporally extended tasks. In this talk, I will specifically focus on a comparative rat and human model, in combination with neural network models to study how past sensory experiences are utilized to impact working memory and decision making behaviours.
NMC4 Short Talk: Hypothesis-neutral response-optimized models of higher-order visual cortex reveal strong semantic selectivity
Modeling neural responses to naturalistic stimuli has been instrumental in advancing our understanding of the visual system. Dominant computational modeling efforts in this direction have been deeply rooted in preconceived hypotheses. In contrast, hypothesis-neutral computational methodologies with minimal apriorism which bring neuroscience data directly to bear on the model development process are likely to be much more flexible and effective in modeling and understanding tuning properties throughout the visual system. In this study, we develop a hypothesis-neutral approach and characterize response selectivity in the human visual cortex exhaustively and systematically via response-optimized deep neural network models. First, we leverage the unprecedented scale and quality of the recently released Natural Scenes Dataset to constrain parametrized neural models of higher-order visual systems and achieve novel predictive precision, in some cases, significantly outperforming the predictive success of state-of-the-art task-optimized models. Next, we ask what kinds of functional properties emerge spontaneously in these response-optimized models? We examine trained networks through structural ( feature visualizations) as well as functional analysis (feature verbalizations) by running `virtual' fMRI experiments on large-scale probe datasets. Strikingly, despite no category-level supervision, since the models are solely optimized for brain response prediction from scratch, the units in the networks after optimization act as detectors for semantic concepts like `faces' or `words', thereby providing one of the strongest evidences for categorical selectivity in these visual areas. The observed selectivity in model neurons raises another question: are the category-selective units simply functioning as detectors for their preferred category or are they a by-product of a non-category-specific visual processing mechanism? To investigate this, we create selective deprivations in the visual diet of these response-optimized networks and study semantic selectivity in the resulting `deprived' networks, thereby also shedding light on the role of specific visual experiences in shaping neuronal tuning. Together with this new class of data-driven models and novel model interpretability techniques, our study illustrates that DNN models of visual cortex need not be conceived as obscure models with limited explanatory power, rather as powerful, unifying tools for probing the nature of representations and computations in the brain.
Neural network models of binocular depth perception
Our visual experience of living in a three-dimensional world is created from the information contained in the two-dimensional images projected into our eyes. The overlapping visual fields of the two eyes mean that their images are highly correlated, and that the small differences that are present represent an important cue to depth. Binocular neurons encode this information in a way that both maximises efficiency and optimises disparity tuning for the depth structures that are found in our natural environment. Neural network models provide a clear account of how these binocular neurons encode the local binocular disparity in images. These models can be expanded to multi-layer models that are sensitive to salient features of scenes, such as the orientations and discontinuities between surfaces. These deep neural network models have also shown the importance of binocular disparity for the segmentation of images into separate objects, in addition to the estimation of distance. These results demonstrate the usefulness of machine learning approaches as a tool for understanding biological vision.
Towards a neurally mechanistic understanding of visual cognition
I am interested in developing a neurally mechanistic understanding of how primate brains represent the world through its visual system and how such representations enable a remarkable set of intelligent behaviors. In this talk, I will primarily highlight aspects of my current research that focuses on dissecting the brain circuits that support core object recognition behavior (primates’ ability to categorize objects within hundreds of milliseconds) in non-human primates. On the one hand, my work empirically examines how well computational models of the primate ventral visual pathways embed knowledge of the visual brain function (e.g., Bashivan*, Kar*, DiCarlo, Science, 2019). On the other hand, my work has led to various functional and architectural insights that help improve such brain models. For instance, we have exposed the necessity of recurrent computations in primate core object recognition (Kar et al., Nature Neuroscience, 2019), one that is strikingly missing from most feedforward artificial neural network models. Specifically, we have observed that the primate ventral stream requires fast recurrent processing via ventrolateral PFC for robust core object recognition (Kar and DiCarlo, Neuron, 2021). In addition, I have been currently developing various chemogenetic strategies to causally target specific bidirectional neural circuits in the macaque brain during multiple object recognition tasks to further probe their relevance during this behavior. I plan to transform these data and insights into tangible progress in neuroscience via my collaboration with various computational groups and building improved brain models of object recognition. I hope to end the talk with a brief glimpse of some of my planned future work!
How Brain Circuits Function in Health and Disease: Understanding Brain-wide Current Flow
Dr. Rajan and her lab design neural network models based on experimental data, and reverse-engineer them to figure out how brain circuits function in health and disease. They recently developed a powerful framework for tracing neural paths across multiple brain regions— called Current-Based Decomposition (CURBD). This new approach enables the computation of excitatory and inhibitory input currents that drive a given neuron, aiding in the discovery of how entire populations of neurons behave across multiple interacting brain regions. Dr. Rajan’s team has applied this method to studying the neural underpinnings of behavior. As an example, when CURBD was applied to data gathered from an animal model often used to study depression- and anxiety-like behaviors (i.e., learned helplessness) the underlying biology driving adaptive and maladaptive behaviors in the face of stress was revealed. With this framework Dr. Rajan's team probes for mechanisms at work across brain regions that support both healthy and disease states-- as well as identify key divergences from multiple different nervous systems, including zebrafish, mice, non-human primates, and humans.
Inferring brain-wide interactions using data-constrained recurrent neural network models
Behavior arises from the coordinated activity of numerous distinct brain regions. Modern experimental tools allow access to neural populations brain-wide, yet understanding such large-scale datasets necessitates scalable computational models to extract meaningful features of inter-region communication. In this talk, I will introduce Current-Based Decomposition (CURBD), an approach for inferring multi-region interactions using data-constrained recurrent neural network models. I will first show that CURBD accurately isolates inter-region currents in simulated networks with known dynamics. I will then apply CURBD to understand the brain-wide flow of information leading to behavioral state transitions in larval zebrafish. These examples will establish CURBD as a flexible, scalable framework to infer brain-wide interactions that are inaccessible from experimental measurements alone.
Untangling brain wide current flow using neural network models
Rajanlab designs neural network models constrained by experimental data, and reverse engineers them to figure out how brain circuits function in health and disease. Recently, we have been developing a powerful new theory-based framework for “in-vivo tract tracing” from multi-regional neural activity collected experimentally. We call this framework CURrent-Based Decomposition (CURBD). CURBD employs recurrent neural networks (RNNs) directly constrained, from the outset, by time series measurements acquired experimentally, such as Ca2+ imaging or electrophysiological data. Once trained, these data-constrained RNNs let us infer matrices quantifying the interactions between all pairs of modeled units. Such model-derived “directed interaction matrices” can then be used to separately compute excitatory and inhibitory input currents that drive a given neuron from all other neurons. Therefore different current sources can be de-mixed – either within the same region or from other regions, potentially brain-wide – which collectively give rise to the population dynamics observed experimentally. Source de-mixed currents obtained through CURBD allow an unprecedented view into multi-region mechanisms inaccessible from measurements alone. We have applied this method successfully to several types of neural data from our experimental collaborators, e.g., zebrafish (Deisseroth lab, Stanford), mice (Harvey lab, Harvard), monkeys (Rudebeck lab, Sinai), and humans (Rutishauser lab, Cedars Sinai), where we have discovered both directed interactions brain wide and inter-area currents during different types of behaviors. With this powerful framework based on data-constrained multi-region RNNs and CURrent Based Decomposition (CURBD), we ask if there are conserved multi-region mechanisms across different species, as well as identify key divergences.
Neural network models – analysis of their spontaneous activity and their response to single-neuron stimulation
Inferring brain-wide current flow using data-constrained neural network models
Rajanlab designs neural network models constrained by experimental data, and reverse engineers them to figure out how brain circuits function in health and disease. Recently, we have been developing a powerful new theory-based framework for “in-vivo tract tracing” from multi-regional neural activity collected experimentally. We call this framework CURrent-Based Decomposition (CURBD). CURBD employs recurrent neural networks (RNNs) directly constrained, from the outset, by time series measurements acquired experimentally, such as Ca2+ imaging or electrophysiological data. Once trained, these data-constrained RNNs let us infer matrices quantifying the interactions between all pairs of modeled units. Such model-derived “directed interaction matrices” can then be used to separately compute excitatory and inhibitory input currents that drive a given neuron from all other neurons. Therefore different current sources can be de-mixed – either within the same region or from other regions, potentially brain-wide – which collectively give rise to the population dynamics observed experimentally. Source de-mixed currents obtained through CURBD allow an unprecedented view into multi-region mechanisms inaccessible from measurements alone. We have applied this method successfully to several types of neural data from our experimental collaborators, e.g., zebrafish (Deisseroth lab, Stanford), mice (Harvey lab, Harvard), monkeys (Rudebeck lab, Sinai), and humans (Rutishauser lab, Cedars Sinai), where we have discovered both directed interactions brain wide and inter-area currents during different types of behaviors. With this framework based on data-constrained multi-region RNNs and CURrent Based Decomposition (CURBD), we can ask if there are conserved multi-region mechanisms across different species, as well as identify key divergences.
Neural and computational principles of the processing of dynamic faces and bodies
Body motion is a fundamental signal of social communication. This includes facial as well as full-body movements. Combining advanced methods from computer animation with motion capture in humans and monkeys, we synthesized highly-realistic monkey avatar models. Our face avatar is perceived by monkeys as almost equivalent to a real animal, and does not induce an ‘uncanny valley effect’, unlike all other previously used avatar models in studies with monkeys. Applying machine-learning methods for the control of motion style, we were able to investigate how species-specific shape and dynamic cues influence the perception of human and monkey facial expressions. Human observers showed very fast learning of monkey expressions, and a perceptual encoding of expression dynamics that was largely independent of facial shape. This result is in line with the fact that facial shape evolved faster than the neuromuscular control in primate phylogenesis. At the same time, it challenges popular neural network models of the recognition of dynamic faces that assume a joint encoding of facial shape and dynamics. We propose an alternative physiologically-inspired neural model that realizes such an orthogonal encoding of facial shape and expression from video sequences. As second example, we investigated the perception of social interactions from abstract stimuli, similar to the ones by Heider & Simmel (1944), and also from more realistic stimuli. We developed and validated a new generative model for the synthesis of such social interaction, which is based on a modification of human navigation model. We demonstrate that the recognition of such stimuli, including the perception of agency, can be accounted for by a relatively elementary physiologically-inspired hierarchical neural recognition model, that does not require the assumption of sophisticated inference mechanisms, as postulated by some cognitive theories of social recognition. Summarizing, this suggests that essential phenomena in social cognition might be accounted for by a small set of simple neural principles that can be easily implemented by cortical circuits. The developed technologies for stimulus control form the basis of electrophysiological studies that can verify specific neural circuits, as the ones proposed by our theoretical models.
Geometry of Neural Computation Unifies Working Memory and Planning
Cognitive tasks typically require the integration of working memory, contextual processing, and planning to be carried out in close coordination. However, these computations are typically studied within neuroscience as independent modular processes in the brain. In this talk I will present an alternative view, that neural representations of mappings between expected stimuli and contingent goal actions can unify working memory and planning computations. We term these stored maps contingency representations. We developed a "conditional delayed logic" task capable of disambiguating the types of representations used during performance of delay tasks. Human behaviour in this task is consistent with the contingency representation, and not with traditional sensory models of working memory. In task-optimized artificial recurrent neural network models, we investigated the representational geometry and dynamical circuit mechanisms supporting contingency-based computation, and show how contingency representation explains salient observations of neuronal tuning properties in prefrontal cortex. Finally, our theory generates novel and falsifiable predictions for single-unit and population neural recordings.
Evolutionary algorithms support recurrent plasticity in spiking neural network models of neocortical task learning
Bernstein Conference 2024
Excitatory and inhibitory neurons exhibit distinct roles for task learning, temporal scaling, and working memory in recurrent spiking neural network models of neocortex.
Bernstein Conference 2024
Identifying and adaptively perturbing compact deep neural network models of visual cortex
COSYNE 2022
Identifying and adaptively perturbing compact deep neural network models of visual cortex
COSYNE 2022
Automated identification of data-consistent spiking neural network models
COSYNE 2023
A novel deep neural network models two streams of visual processing from retina to cortex
COSYNE 2023
Spiking neural network models of developmental frequency acceleration in the mouse prefrontal cortex
FENS Forum 2024