Artificial Neural Network
Error Consistency between Humans and Machines as a function of presentation duration
Within the last decade, Deep Artificial Neural Networks (DNNs) have emerged as powerful computer vision systems that match or exceed human performance on many benchmark tasks such as image classification. But whether current DNNs are suitable computational models of the human visual system remains an open question: While DNNs have proven to be capable of predicting neural activations in primate visual cortex, psychophysical experiments have shown behavioral differences between DNNs and human subjects, as quantified by error consistency. Error consistency is typically measured by briefly presenting natural or corrupted images to human subjects and asking them to perform an n-way classification task under time pressure. But for how long should stimuli ideally be presented to guarantee a fair comparison with DNNs? Here we investigate the influence of presentation time on error consistency, to test the hypothesis that higher-level processing drives behavioral differences. We systematically vary presentation times of backward-masked stimuli from 8.3ms to 266ms and measure human performance and reaction times on natural, lowpass-filtered and noisy images. Our experiment constitutes a fine-grained analysis of human image classification under both image corruptions and time pressure, showing that even drastically time-constrained humans who are exposed to the stimuli for only two frames, i.e. 16.6ms, can still solve our 8-way classification task with success rates way above chance. We also find that human-to-human error consistency is already stable at 16.6ms.
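Error consistency is commonly computed as Cohen's kappa over trial-by-trial correctness: the observed fraction of trials on which two observers are both right or both wrong, corrected for the agreement expected by chance given their accuracies. The sketch below assumes that definition. The function name and the sample responses are illustrative and are not taken from the study.

```python
def error_consistency(correct_a, correct_b):
    """Cohen's kappa over trial-wise correctness of two observers."""
    if len(correct_a) != len(correct_b) or not correct_a:
        raise ValueError("need two equally long, non-empty trial lists")
    n = len(correct_a)
    # Observed consistency: trials where both are right or both are wrong.
    c_obs = sum(a == b for a, b in zip(correct_a, correct_b)) / n
    # Consistency expected from independent observers with these accuracies.
    p_a = sum(correct_a) / n
    p_b = sum(correct_b) / n
    c_exp = p_a * p_b + (1 - p_a) * (1 - p_b)
    if c_exp == 1.0:
        return float("nan")  # kappa is undefined when chance agreement is total
    return (c_obs - c_exp) / (1 - c_exp)


# Hypothetical responses on the same eight trials (True = correct).
human = [True, True, False, True, False, False, True, True]
model = [True, False, False, True, True, False, True, True]
kappa = error_consistency(human, model)
```

A kappa of 0 means the two observers overlap in their errors no more than chance predicts, and a kappa of 1 means they err on exactly the same trials.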
Reimagining the neuron as a controller: A novel model for Neuroscience and AI
We build upon and expand the efficient coding and predictive information models of neurons, presenting a novel perspective that neurons not only predict but also actively influence their future inputs through their outputs. We introduce the concept of neurons as feedback controllers of their environments, a role traditionally considered computationally demanding, particularly when the dynamical system characterizing the environment is unknown. By harnessing a novel data-driven control framework, we illustrate the feasibility of biological neurons functioning as effective feedback controllers. This innovative approach enables us to coherently explain various experimental findings that previously seemed unrelated. Our research has profound implications, potentially revolutionizing the modeling of neuronal circuits and paving the way for the creation of alternative, biologically inspired artificial neural networks.
The centrality of population-level factors to network computation is demonstrated by a versatile approach for training spiking networks
Neural activity is often described in terms of population-level factors extracted from the responses of many neurons. Factors provide a lower-dimensional description with the aim of shedding light on network computations. Yet, mechanistically, computations are performed not by continuously valued factors but by interactions among neurons that spike discretely and variably. Models provide a means of bridging these levels of description. We developed a general method for training model networks of spiking neurons by leveraging factors extracted from either data or firing-rate-based networks. In addition to providing a useful model-building framework, this formalism illustrates how reliable and continuously valued factors can arise from seemingly stochastic spiking. Our framework establishes procedures for embedding this property in network models with different levels of realism. The relationship between spikes and factors in such networks provides a foundation for interpreting (and subtly redefining) commonly used quantities such as firing rates.
Analyzing artificial neural networks to understand the brain
In the first part of this talk I will present work showing that recurrent neural networks can replicate broad behavioral patterns associated with dynamic visual object recognition in humans. An analysis of these networks shows that different types of recurrence use different strategies to solve the object recognition problem. The similarities between artificial neural networks and the brain present another opportunity, beyond using them just as models of biological processing. In the second part of this talk, I will discuss, and solicit feedback on, a proposed research plan for testing a wide range of analysis tools frequently applied to neural data on artificial neural networks. I will present the motivation for this approach as well as the form the results could take and how this would benefit neuroscience.
Bridging the gap between artificial models and cortical circuits
Artificial neural networks simplify complex biological circuits into tractable models for computational exploration and experimentation. However, the simplification of artificial models also undermines their applicability to real brain dynamics. Typical efforts to address this mismatch add complexity to increasingly unwieldy models. Here, we take a different approach: by reducing the complexity of a biological cortical culture, we aim to distil the essential factors of neuronal dynamics and plasticity. We leverage recent advances in growing neurons from human induced pluripotent stem cells (hiPSCs) to analyse ex vivo cortical cultures with only two distinct excitatory and inhibitory neuron populations. Over 6 weeks of development, we record from thousands of neurons using high-density microelectrode arrays (HD-MEAs) that allow access to individual neurons and the broader population dynamics. We compare these dynamics to two-population artificial networks of single-compartment neurons with random sparse connections and show that they produce similar dynamics. Specifically, our model captures the firing and bursting statistics of the cultures. Moreover, tightly integrating models and cultures allows us to evaluate the impact of changing architectures over weeks of development, with and without external stimuli. Broadly, the use of simplified cortical cultures enables us to use the repertoire of theoretical neuroscience techniques established over the past decades on artificial network models. Our approach of deriving neural networks from human cells also allows us, for the first time, to directly compare neural dynamics of disease and control. We found that cultures derived from epilepsy patients, for example, tended to show progressively more avalanches of synchronous activity over weeks of development, in contrast to the control cultures. Next, we will test possible interventions, in silico and in vitro, in a drive for personalised approaches to medical care.
This work starts bridging an important theoretical-experimental neuroscience gap for advancing our understanding of mammalian neuron dynamics.
Behavioral Timescale Synaptic Plasticity (BTSP) for biologically plausible credit assignment across multiple layers via top-down gating of dendritic plasticity
A central problem in biological learning is how information about the outcome of a decision or behavior can be used to reliably guide learning across distributed neural circuits while obeying biological constraints. This “credit assignment” problem is commonly solved in artificial neural networks through supervised gradient descent and the backpropagation algorithm. In contrast, biological learning is typically modelled using unsupervised Hebbian learning rules. While these rules only use local information to update synaptic weights, and are sometimes combined with weight constraints to reflect a diversity of excitatory (only positive weights) and inhibitory (only negative weights) cell types, they do not prescribe a clear mechanism for how to coordinate learning across multiple layers and propagate error information accurately across the network. In recent years, several groups have drawn inspiration from the known dendritic non-linearities of pyramidal neurons to propose new learning rules and network architectures that enable biologically plausible multi-layer learning by processing error information in segregated dendrites. Meanwhile, recent experimental results from the hippocampus have revealed a new form of plasticity—Behavioral Timescale Synaptic Plasticity (BTSP)—in which large dendritic depolarizations rapidly reshape synaptic weights and stimulus selectivity with as little as a single stimulus presentation (“one-shot learning”). Here we explore the implications of this new learning rule through a biologically plausible implementation in a rate neuron network. We demonstrate that regulation of dendritic spiking and BTSP by top-down feedback signals can effectively coordinate plasticity across multiple network layers in a simple pattern recognition task. By analyzing hidden feature representations and weight trajectories during learning, we show the differences between networks trained with standard backpropagation, Hebbian learning rules, and BTSP.
Towards multi-system network models for cognitive neuroscience
Artificial neural networks can be useful for studying brain functions. In cognitive neuroscience, recurrent neural networks are often used to model cognitive functions. I will first offer my opinion on what is missing in the classical use of recurrent neural networks. Then I will discuss two lines of ongoing efforts in our group to move beyond the classical recurrent neural networks by studying multi-system neural networks (the talk will focus on two-system networks). These are networks that combine modules for several neural systems, such as the visual, auditory, prefrontal, and hippocampal systems. I will showcase how multi-system networks can potentially be constrained by experimental data in fundamental ways and at scale.
Flexible multitask computation in recurrent networks utilizes shared dynamical motifs
Flexible computation is a hallmark of intelligent behavior. Yet, little is known about how neural networks contextually reconfigure for different computations. Humans are able to perform a new task without extensive training, presumably through the composition of elementary processes that were previously learned. Cognitive scientists have long hypothesized the possibility of a compositional neural code, where complex neural computations are made up of constituent components; however, the neural substrate underlying this structure remains elusive in biological and artificial neural networks. Here we identified an algorithmic neural substrate for compositional computation through the study of multitasking artificial recurrent neural networks. Dynamical systems analyses of networks revealed learned computational strategies that mirrored the modular subtask structure of the task-set used for training. Dynamical motifs such as attractors, decision boundaries and rotations were reused across different task computations. For example, tasks that required memory of a continuous circular variable repurposed the same ring attractor. We show that dynamical motifs are implemented by clusters of units and are reused across different contexts, allowing for flexibility and generalization of previously learned computation. Lesioning these clusters resulted in modular effects on network performance: a lesion that destroyed one dynamical motif only minimally perturbed the structure of other dynamical motifs. Finally, modular dynamical motifs could be reconfigured for fast transfer learning. After slow initial learning of dynamical motifs, a subsequent faster stage of learning reconfigured motifs to perform novel tasks. This work contributes to a more fundamental understanding of compositional computation underlying flexible general intelligence in neural systems. 
We present a conceptual framework that establishes dynamical motifs as a fundamental unit of computation, intermediate between the neuron and the network. As more whole brain imaging studies record neural activity from multiple specialized systems simultaneously, the framework of dynamical motifs will guide questions about specialization and generalization across brain regions.
Time as a continuous dimension in natural and artificial networks
Neural representations of time are central to our understanding of the world around us. I review cognitive, neurophysiological and theoretical work that converges on three simple ideas. First, the time of past events is remembered via populations of neurons with a continuum of functional time constants. Second, these time constants evenly tile the log time axis. This results in a neural Weber-Fechner scale for time which can support behavioral Weber-Fechner laws and characteristic behavioral effects in memory experiments. Third, these populations appear as dual pairs: one type of population contains cells that change firing rate monotonically over time, and a second type contains cells with circumscribed temporal receptive fields. These ideas can be used to build artificial neural networks that have novel properties. Of particular interest, a convolutional neural network built using these principles can generalize to arbitrary rescaling of its inputs. That is, after learning to perform a classification task on a time series presented at one speed, it successfully classifies stimuli presented slowed down or sped up. This result illustrates the point that this confluence of ideas originating in cognitive psychology and measured in the mammalian brain could have wide-reaching impacts on AI research.
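The idea that time constants evenly tile the log time axis can be illustrated with a geometric series: neighbouring time constants share a fixed ratio, so they are equally spaced in log time. The sketch below is a toy illustration only. It assumes simple exponentially decaying cells for the monotonic population, and the function names and parameter values are hypothetical.

```python
import math


def log_spaced_time_constants(tau_min, tau_max, n):
    """Return n time constants evenly spaced on a logarithmic axis."""
    if n < 2 or not 0 < tau_min < tau_max:
        raise ValueError("need n >= 2 and 0 < tau_min < tau_max")
    ratio = (tau_max / tau_min) ** (1.0 / (n - 1))
    return [tau_min * ratio ** i for i in range(n)]


def exponential_trace(tau, elapsed):
    """Activity of a leaky cell 'elapsed' seconds after a unit impulse."""
    return math.exp(-elapsed / tau)


taus = log_spaced_time_constants(0.1, 100.0, 7)
ratio = taus[1] / taus[0]

# Stretching elapsed time by the common ratio shifts the activity pattern
# by one cell instead of changing its shape.
pattern_at_t = [exponential_trace(tau, 1.0) for tau in taus[:-1]]
pattern_at_scaled_t = [exponential_trace(tau, 1.0 * ratio) for tau in taus[1:]]
```

In this toy setting, rescaling time amounts to a translation along the cell axis, which is the kind of structure a convolution over that axis can exploit.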
Network science and network medicine: New strategies for understanding and treating the biological basis of mental ill-health
The last twenty years have witnessed extraordinarily rapid progress in basic neuroscience, including breakthrough technologies such as optogenetics, and the collection of unprecedented amounts of neuroimaging, genetic and other data relevant to neuroscience and mental health. However, the translation of this progress into improved understanding of brain function and dysfunction has been comparatively slow. As a result, the development of therapeutics for mental health has stagnated too. One central challenge has been to extract meaning from these large, complex, multivariate datasets, which requires a shift towards systems-level mathematical and computational approaches. A second challenge has been reconciling different scales of investigation, from genes and molecules to cells, circuits, tissue, whole-brain, and ultimately behaviour. In this talk I will describe several strands of work using mathematical, statistical, and bioinformatic methods to bridge these gaps. Topics will include: using artificial neural networks to link the organization of large-scale brain connectivity to cognitive function; using multivariate statistical methods to link disease-related changes in brain networks to the underlying biological processes; and using network-based approaches to move from genetic insights towards drug discovery. Finally, I will discuss how simple organisms such as C. elegans can serve to inspire, test, and validate new methods and insights in network neuroscience.
Analogical Reasoning with Neuro-Symbolic AI
Knowledge discovery with computers requires a huge amount of search. Analogical reasoning is effective for efficient knowledge discovery. Therefore, we proposed analogical reasoning systems based on first-order predicate logic using Neuro-Symbolic AI. Neuro-Symbolic AI is a combination of Symbolic AI and artificial neural networks and has features that are easy for human interpretation and robust against data ambiguity and errors. We have implemented analogical reasoning systems by Neuro-symbolic AI models with word embedding which can represent similarity between words. Using the proposed systems, we efficiently extracted unknown rules from knowledge bases described in Prolog. The proposed method is the first case of analogical reasoning based on the first-order predicate logic using deep learning.
What does the primary visual cortex tell us about object recognition?
Object recognition relies on the complex visual representations in cortical areas at the top of the ventral stream hierarchy. While these are thought to be derived from low-level stages of visual processing, this has not yet been shown. Here, I describe the results of two projects exploring the contributions of primary visual cortex (V1) processing to object recognition using artificial neural networks (ANNs). First, we developed hundreds of ANN-based V1 models and evaluated how their single neurons approximate those in the macaque V1. We found that, for some models, single neurons in intermediate layers are similar to their biological counterparts, and that the distributions of their response properties approximately match those in V1. Furthermore, we observed that models that better matched macaque V1 were also more aligned with human behavior, suggesting that object recognition is derived from low-level visual processing. Motivated by these results, we then studied how an ANN’s robustness to image perturbations relates to its ability to predict V1 responses. Despite their high performance in object recognition tasks, ANNs can be fooled by imperceptibly small, explicitly crafted perturbations. We observed that ANNs that better predicted V1 neuronal activity were also more robust to adversarial attacks. Inspired by this, we developed VOneNets, a new class of hybrid ANN vision models. Each VOneNet contains a fixed neural network front-end that simulates primate V1 followed by a neural network back-end adapted from current computer vision models. After training, VOneNets were substantially more robust, outperforming state-of-the-art methods on a set of perturbations. While current neural network architectures are arguably brain-inspired, these results demonstrate that more precisely mimicking just one stage of the primate visual system leads to new gains in computer vision applications and results in better models of the primate ventral stream and object recognition behavior.
NMC4 Short Talk: Rank similarity filters for computationally-efficient machine learning on high dimensional data
Real world datasets commonly contain nonlinearly separable classes, requiring nonlinear classifiers. However, these classifiers are less computationally efficient than their linear counterparts. This inefficiency wastes energy, resources and time. We were inspired by the efficiency of the brain to create a novel type of computationally efficient Artificial Neural Network (ANN) called Rank Similarity Filters. They can be used to both transform and classify nonlinearly separable datasets with many datapoints and dimensions. The weights of the filters are set using the rank orders of features in a datapoint, or optionally the 'confusion' adjusted ranks between features (determined from their distributions in the dataset). The activation strength of a filter determines its similarity to other points in the dataset, a measure based on cosine similarity. The activation of many Rank Similarity Filters transforms samples into a new nonlinear space suitable for linear classification (Rank Similarity Transform (RST)). We additionally used this method to create the nonlinear Rank Similarity Classifier (RSC), which is a fast and accurate multiclass classifier, and the nonlinear Rank Similarity Probabilistic Classifier (RSPC), which is an extension to the multilabel case. We evaluated the classifiers on multiple datasets and RSC is competitive with existing classifiers but with superior computational efficiency. Code for RST, RSC and RSPC is open source and was written in Python using the popular scikit-learn framework to make it easily accessible (https://github.com/KatharineShapcott/rank-similarity). In future extensions the algorithm can be applied to hardware suitable for the parallelization of an ANN (GPU) and a Spiking Neural Network (neuromorphic computing) with corresponding performance gains. This makes Rank Similarity Filters a promising biologically inspired solution to the problem of efficient analysis of nonlinearly separable data.
Norse: A library for gradient-based learning in Spiking Neural Networks
Norse aims to exploit the advantages of bio-inspired neural components, which are sparse and event-driven - a fundamental difference from artificial neural networks. Norse expands PyTorch with primitives for bio-inspired neural components, bringing you two advantages: a modern and proven infrastructure based on PyTorch and deep learning-compatible spiking neural network components.
Event-based Backpropagation for Exact Gradients in Spiking Neural Networks
Gradient-based optimization powered by the backpropagation algorithm proved to be the pivotal method in the training of non-spiking artificial neural networks. At the same time, spiking neural networks hold the promise for efficient processing of real-world sensory data by communicating using discrete events in continuous time. We derive the backpropagation algorithm for a recurrent network of spiking (leaky integrate-and-fire) neurons with hard thresholds and show that the backward dynamics amount to an event-based backpropagation of errors through time. Our derivation uses the jump conditions for partial derivatives at state discontinuities found by applying the implicit function theorem, allowing us to avoid approximations or substitutions. We find that the gradient exists and is finite almost everywhere in weight space, up to the null set where a membrane potential is precisely tangent to the threshold. Our presented algorithm, EventProp, computes the exact gradient with respect to a general loss function based on spike times and membrane potentials. Crucially, the algorithm allows for an event-based communication scheme in the backward phase, retaining the potential advantages of temporal sparsity afforded by spiking neural networks. We demonstrate the optimization of spiking networks using gradients computed via EventProp and the Yin-Yang and MNIST datasets with either a spike time-based or voltage-based loss function and report competitive performance. Our work supports the rigorous study of gradient-based optimization in spiking neural networks as well as the development of event-based neuromorphic architectures for the efficient training of spiking neural networks. While we consider the leaky integrate-and-fire model in this work, our methodology generalises to any neuron model defined as a hybrid dynamical system.
Optimal initialization strategies for Deep Spiking Neural Networks
Recent advances in neuromorphic hardware and Surrogate Gradient (SG) learning highlight the potential of Spiking Neural Networks (SNNs) for energy-efficient signal processing and learning. As in Artificial Neural Networks (ANNs), training performance in SNNs strongly depends on the initialization of synaptic and neuronal parameters. While there are established methods of initializing deep ANNs for high performance, effective strategies for optimal SNN initialization are lacking. Here, we address this gap and propose flexible data-dependent initialization strategies for SNNs.
On the implicit bias of SGD in deep learning
Tali's work emphasized the tradeoff between compression and information preservation. In this talk I will explore this theme in the context of deep learning. Artificial neural networks have recently revolutionized the field of machine learning. However, we still do not have sufficient theoretical understanding of how such models can be successfully learned. Two specific questions in this context are: how can neural nets be learned despite the non-convexity of the learning problem, and how can they generalize well despite often having more parameters than training data. I will describe our recent work showing that gradient-descent optimization indeed leads to 'simpler' models, where simplicity is captured by lower weight norm and in some cases clustering of weight vectors. We demonstrate this for several teacher and student architectures, including learning linear teachers with ReLU networks, learning boolean functions and learning convolutional pattern detection architectures.
Introducing YAPiC: An Open Source tool for biologists to perform complex image segmentation with deep learning
Robust detection of biological structures such as neuronal dendrites in brightfield micrographs, tumor tissue in histological slides, or pathological brain regions in MRI scans is a fundamental task in bio-image analysis. Detection of those structures requires complex decision making, which is often impossible with current image analysis software and is therefore typically performed by humans in a tedious and time-consuming manual procedure. Supervised pixel classification based on Deep Convolutional Neural Networks (DNNs) is currently emerging as the most promising technique to solve such complex region detection tasks. Here, a self-learning artificial neural network is trained with a small set of manually annotated images to eventually identify the trained structures from large image data sets in a fully automated way. While supervised pixel classification based on faster machine learning algorithms like Random Forests is nowadays part of the standard toolbox of bio-image analysts (e.g. Ilastik), the currently emerging tools based on deep learning are still rarely used. The community also has little experience of how much training data must be collected to obtain a reasonable prediction result with deep-learning-based approaches. Our software YAPiC (Yet Another Pixel Classifier) provides an easy-to-use Python and command-line interface and is designed purely for intuitive pixel classification of multidimensional images with DNNs. With the aim of integrating well into the current open source ecosystem, YAPiC utilizes the Ilastik user interface in combination with a high performance GPU server for model training and prediction. Numerous research groups at our institute have already successfully applied YAPiC for a variety of tasks. From our experience, a surprisingly small amount of sparse label data is needed to train a sufficiently working classifier for typical bioimaging applications.
Not least because of this, YAPiC has become the "standard weapon" for our core facility to detect objects in hard-to-segment images. We would like to present some use cases like cell classification in high content screening, tissue detection in histological slides, quantification of neural outgrowth in phase contrast time series, or actin filament detection in transmission electron microscopy.
Towards a neurally mechanistic understanding of visual cognition
I am interested in developing a neurally mechanistic understanding of how primate brains represent the world through their visual systems and how such representations enable a remarkable set of intelligent behaviors. In this talk, I will primarily highlight aspects of my current research that focus on dissecting the brain circuits that support core object recognition behavior (primates’ ability to categorize objects within hundreds of milliseconds) in non-human primates. On the one hand, my work empirically examines how well computational models of the primate ventral visual pathways embed knowledge of the visual brain function (e.g., Bashivan*, Kar*, DiCarlo, Science, 2019). On the other hand, my work has led to various functional and architectural insights that help improve such brain models. For instance, we have exposed the necessity of recurrent computations in primate core object recognition (Kar et al., Nature Neuroscience, 2019), one that is strikingly missing from most feedforward artificial neural network models. Specifically, we have observed that the primate ventral stream requires fast recurrent processing via ventrolateral PFC for robust core object recognition (Kar and DiCarlo, Neuron, 2021). In addition, I am currently developing various chemogenetic strategies to causally target specific bidirectional neural circuits in the macaque brain during multiple object recognition tasks to further probe their relevance during this behavior. I plan to transform these data and insights into tangible progress in neuroscience via my collaboration with various computational groups and building improved brain models of object recognition. I hope to end the talk with a brief glimpse of some of my planned future work!
Artificial neural networks do not adequately mimic whatever is going on in the real brain
One may think that Deep Learning technology works in ways that are similar to the human brain. This is not really true. Our best AI technology still does not mimic the brain sufficiently well to be a match in intelligence. I will describe seven ways in which our minds work that are diametrically opposite to those of Deep Learning technology.
Do deep learning latent spaces resemble human brain representations?
In recent years, artificial neural networks have demonstrated human-like or super-human performance in many tasks including image or speech recognition, natural language processing (NLP), playing Go, chess, poker and video-games. One remarkable feature of the resulting models is that they can develop very intuitive latent representations of their inputs. In these latent spaces, simple linear operations tend to give meaningful results, as in the well-known analogy QUEEN-WOMAN+MAN=KING. We postulate that human brain representations share essential properties with these deep learning latent spaces. To verify this, we test whether artificial latent spaces can serve as a good model for decoding brain activity. We report improvements over state-of-the-art performance for reconstructing seen and imagined face images from fMRI brain activation patterns, using the latent space of a GAN (Generative Adversarial Network) model coupled with a Variational AutoEncoder (VAE). With another GAN model (BigBiGAN), we can decode and reconstruct natural scenes of any category from the corresponding brain activity. Our results suggest that deep learning can produce high-level representations approaching those found in the human brain. Finally, I will discuss whether these deep learning latent spaces could be relevant to the study of consciousness.
A function approximation perspective on neural representations
Activity patterns of neural populations in natural and artificial neural networks constitute representations of data. The nature of these representations and how they are learned are key questions in neuroscience and deep learning. In this talk, I will describe my group's efforts in building a theory of representations as feature maps leading to sample efficient function approximation. Kernel methods are at the heart of these developments. I will present applications to deep learning and neuronal data.
A computational explanation for domain specificity in the human brain
Many regions of the human brain conduct highly specific functions, such as recognizing faces, understanding language, and thinking about other people’s thoughts. Why might this domain specific organization be a good design strategy for brains, and what is the origin of domain specificity in the first place? In this talk, I will present recent work testing whether the segregation of face and object perception in human brains emerges naturally from an optimization for both tasks. We trained artificial neural networks on face and object recognition, and found that networks were able to perform both tasks well by spontaneously segregating them into distinct pathways. Critically, networks neither had prior knowledge nor any inductive bias about the tasks. Furthermore, networks optimized on tasks which apparently do not develop specialization in the human brain, such as food or cars, and object categorization showed less task segregation. These results suggest that functional segregation can spontaneously emerge without a task-specific bias, and that the domain-specific organization of the cortex may reflect a computational optimization for the real-world tasks humans solve.
Back-propagation in spiking neural networks
Back-propagation is a powerful supervised learning algorithm in artificial neural networks, because it solves the credit assignment problem (essentially: what should the hidden layers do?). This algorithm has led to the deep learning revolution. Unfortunately, back-propagation cannot be used directly in spiking neural networks (SNNs): it requires differentiable activation functions, whereas spikes are all-or-none events which cause discontinuities. Here we present two strategies to overcome this problem. The first is to use a so-called 'surrogate gradient', that is, to approximate the derivative of the threshold function with the derivative of a sigmoid. We will present some applications of this method to time series processing (audio, internet traffic, EEG). The second concerns a specific class of SNNs, which process static inputs using latency coding with at most one spike per neuron. Using approximations, we derived a latency-based back-propagation rule for this kind of network, called S4NN, and applied it to image classification.
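The surrogate-gradient idea described above can be sketched as follows. This is a minimal illustration, not the speaker's implementation: the forward pass uses the hard threshold, and the backward pass substitutes the derivative of a sigmoid. The function names, the threshold of 1.0 and the steepness `beta` are hypothetical choices.

```python
import math


def sigmoid(x):
    """Logistic function."""
    return 1.0 / (1.0 + math.exp(-x))


def spike(v, threshold=1.0):
    """Forward pass: all-or-none spike (Heaviside step of v - threshold)."""
    return 1.0 if v >= threshold else 0.0


def surrogate_grad(v, threshold=1.0, beta=5.0):
    """Backward pass: derivative of sigmoid(beta * (v - threshold)) w.r.t. v.

    Used in place of the true derivative of the step function, which is
    zero everywhere except at the threshold, where it is undefined.
    """
    s = sigmoid(beta * (v - threshold))
    return beta * s * (1.0 - s)


# The surrogate is largest at the threshold and decays smoothly away from it,
# so neurons close to firing receive the strongest learning signal.
grads = [surrogate_grad(v) for v in (0.0, 0.9, 1.0, 1.1, 2.0)]
```

In an automatic-differentiation framework the same effect is usually obtained by defining a custom gradient for the spike function; the steepness `beta` trades off how closely the surrogate follows the true step against how far the gradient spreads.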
On temporal coding in spiking neural networks with alpha synaptic function
The timing of individual neuronal spikes is essential for biological brains to make fast responses to sensory stimuli. However, conventional artificial neural networks lack the intrinsic temporal coding ability present in biological networks. We propose a spiking neural network model that encodes information in the relative timing of individual neuron spikes. In classification tasks, the output of the network is indicated by the first neuron to spike in the output layer. This temporal coding scheme allows the supervised training of the network with backpropagation, using locally exact derivatives of the postsynaptic spike times with respect to presynaptic spike times. The network operates using a biologically-plausible alpha synaptic transfer function. Additionally, we use trainable synchronisation pulses that provide bias, add flexibility during training and exploit the decay part of the alpha function. We show that such networks can be trained successfully on noisy Boolean logic tasks and on the MNIST dataset encoded in time. The results show that the spiking neural network outperforms comparable spiking models on MNIST and achieves similar quality to fully connected conventional networks with the same architecture. We also find that the spiking network spontaneously discovers two operating regimes, mirroring the accuracy-speed trade-off observed in human decision-making: a slow regime, where a decision is taken after all hidden neurons have spiked and the accuracy is very high, and a fast regime, where a decision is taken very fast but the accuracy is lower. These results demonstrate the computational power of spiking networks with biological characteristics that encode information in the timing of individual neurons. By studying temporal coding in spiking networks, we aim to create building blocks towards energy-efficient and more complex biologically-inspired neural architectures.
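To make the two ingredients named above concrete, the sketch below shows an alpha-shaped synaptic kernel and a first-to-spike readout. The kernel uses one common normalisation, (t / tau) * exp(1 - t / tau), which peaks at 1 when t = tau; the exact form and parameters in the work described may differ, and the names `alpha_kernel` and `first_to_spike` are hypothetical.

```python
import math


def alpha_kernel(t, tau=1.0):
    """Alpha-shaped postsynaptic response at time t after a presynaptic spike.

    Rises from 0, peaks at 1 when t == tau, then decays. Zero before the spike.
    """
    if t < 0.0:
        return 0.0
    return (t / tau) * math.exp(1.0 - t / tau)


def first_to_spike(spike_times):
    """Return the index of the output neuron that spikes earliest.

    Neurons that never spike can be given a time of float('inf').
    """
    return min(range(len(spike_times)), key=lambda i: spike_times[i])


# Output neuron 1 spikes first, so class 1 is the network's decision.
predicted_class = first_to_spike([3.2, 1.5, float("inf")])
```

Because the decision depends only on which output neuron fires first, the readout can stop as soon as one output spike occurs, which is what makes a fast, lower-accuracy regime possible.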
Geometry of Neural Computation Unifies Working Memory and Planning
Cognitive tasks typically require the integration of working memory, contextual processing, and planning to be carried out in close coordination. However, these computations are typically studied within neuroscience as independent modular processes in the brain. In this talk I will present an alternative view, that neural representations of mappings between expected stimuli and contingent goal actions can unify working memory and planning computations. We term these stored maps contingency representations. We developed a "conditional delayed logic" task capable of disambiguating the types of representations used during performance of delay tasks. Human behaviour in this task is consistent with the contingency representation, and not with traditional sensory models of working memory. In task-optimized artificial recurrent neural network models, we investigated the representational geometry and dynamical circuit mechanisms supporting contingency-based computation, and show how contingency representation explains salient observations of neuronal tuning properties in prefrontal cortex. Finally, our theory generates novel and falsifiable predictions for single-unit and population neural recordings.
Rational thoughts in neural codes
First, we describe a new method for inferring the mental model of an animal performing a natural task. We use probabilistic methods to compute the most likely mental model based on an animal’s sensory observations and actions. This also reveals dynamic beliefs that would be optimal according to the animal’s internal model, and thus provides a practical notion of “rational thoughts.” Second, we construct a neural coding framework by which these rational thoughts, their computational dynamics, and actions can be identified within the manifold of neural activity. We illustrate the value of this approach by training an artificial neural network to perform a generalization of a widely used foraging task. We analyze the network’s behaviour to find rational thoughts, and successfully recover the neural properties that implemented those thoughts, providing a way of interpreting the complex neural dynamics of the artificial brain. Joint work with Zhengwei Wu, Minhae Kwon, Saurabh Daptardar, and Paul Schrater.
Dendrites endow artificial neural networks with accurate, robust and parameter-efficient learning
Bernstein Conference 2024
Integrating Biological and Artificial Neural Networks for Solving Non-Linear Problems
Bernstein Conference 2024
Model metamers complement existing benchmarks of biological and artificial neural network alignment
COSYNE 2023
Pre-training artificial neural networks with spontaneous retinal activity improves image prediction
COSYNE 2023
Reinforcement learning at multiple timescales in biological and artificial neural networks
COSYNE 2023
Inter-individual Variability in Primate Inferior Temporal Cortex Representations: Insights from Macaque Neural Responses and Artificial Neural Networks
COSYNE 2025
Mapping social perception to social behavior using artificial neural networks
COSYNE 2025
Probing Motion-Form Interactions in the Macaque Inferior Temporal Cortex and Artificial Neural Networks for Complex Scene Understanding
COSYNE 2025
Review of applications of graph theory and network neuroscience in the development of artificial neural networks
Neuromatch 5