Learning Algorithm
learning algorithm
Neurobiological constraints on learning: bug or feature?
Understanding how brains learn requires bridging evidence across scales—from behaviour and neural circuits to cells, synapses, and molecules. In our work, we use computational modelling and data analysis to explore how the physical properties of neurons and neural circuits constrain learning. These include limits imposed by brain wiring, energy availability, molecular noise, and the 3D structure of dendritic spines. In this talk I will describe one such project testing if wiring motifs from fly brain connectomes can improve performance of reservoir computers, a type of recurrent neural network. The hope is that these insights into brain learning will lead to improved learning algorithms for artificial systems.
Localisation of Seizure Onset Zone in Epilepsy Using Time Series Analysis of Intracranial Data
There are over 30 million people with drug-resistant epilepsy worldwide. When neuroimaging and non-invasive neural recordings fail to localise seizure onset zones (SOZ), intracranial recordings become the best chance for localisation and seizure-freedom in those patients. However, intracranial neural activities remain hard to visually discriminate across recording channels, which limits the success of intracranial visual investigations. In this presentation, I present methods which quantify intracranial neural time series and combine them with explainable machine learning algorithms to localise the SOZ in the epileptic brain. I present the potentials and limitations of our methods in the localisation of SOZ in epilepsy providing insights for future research in this area.
Learning produces a hippocampal cognitive map in the form of an orthogonalized state machine
Cognitive maps confer animals with flexible intelligence by representing spatial, temporal, and abstract relationships that can be used to shape thought, planning, and behavior. Cognitive maps have been observed in the hippocampus, but their algorithmic form and the processes by which they are learned remain obscure. Here, we employed large-scale, longitudinal two-photon calcium imaging to record activity from thousands of neurons in the CA1 region of the hippocampus while mice learned to efficiently collect rewards from two subtly different versions of linear tracks in virtual reality. The results provide a detailed view of the formation of a cognitive map in the hippocampus. Throughout learning, both the animal behavior and hippocampal neural activity progressed through multiple intermediate stages, gradually revealing improved task representation that mirrored improved behavioral efficiency. The learning process led to progressive decorrelations in initially similar hippocampal neural activity within and across tracks, ultimately resulting in orthogonalized representations resembling a state machine capturing the inherent struture of the task. We show that a Hidden Markov Model (HMM) and a biologically plausible recurrent neural network trained using Hebbian learning can both capture core aspects of the learning dynamics and the orthogonalized representational structure in neural activity. In contrast, we show that gradient-based learning of sequence models such as Long Short-Term Memory networks (LSTMs) and Transformers do not naturally produce such orthogonalized representations. We further demonstrate that mice exhibited adaptive behavior in novel task settings, with neural activity reflecting flexible deployment of the state machine. These findings shed light on the mathematical form of cognitive maps, the learning rules that sculpt them, and the algorithms that promote adaptive behavior in animals. The work thus charts a course toward a deeper understanding of biological intelligence and offers insights toward developing more robust learning algorithms in artificial intelligence.
Richly structured reward predictions in dopaminergic learning circuits
Theories from reinforcement learning have been highly influential for interpreting neural activity in the biological circuits critical for animal and human learning. Central among these is the identification of phasic activity in dopamine neurons as a reward prediction error signal that drives learning in basal ganglia and prefrontal circuits. However, recent findings suggest that dopaminergic prediction error signals have access to complex, structured reward predictions and are sensitive to more properties of outcomes than learning theories with simple scalar value predictions might suggest. Here, I will present recent work in which we probed the identity-specific structure of reward prediction errors in an odor-guided choice task and found evidence for multiple predictive “threads” that segregate reward predictions, and reward prediction errors, according to the specific sensory features of anticipated outcomes. Our results point to an expanded class of neural reinforcement learning algorithms in which biological agents learn rich associative structure from their environment and leverage it to build reward predictions that include information about the specific, and perhaps idiosyncratic, features of available outcomes, using these to guide behavior in even quite simple reward learning tasks.
How AI is advancing Clinical Neuropsychology and Cognitive Neuroscience
This talk aims to highlight the immense potential of Artificial Intelligence (AI) in advancing the field of psychology and cognitive neuroscience. Through the integration of machine learning algorithms, big data analytics, and neuroimaging techniques, AI has the potential to revolutionize the way we study human cognition and brain characteristics. In this talk, I will highlight our latest scientific advancements in utilizing AI to gain deeper insights into variations in cognitive performance across the lifespan and along the continuum from healthy to pathological functioning. The presentation will showcase cutting-edge examples of AI-driven applications, such as deep learning for automated scoring of neuropsychological tests, natural language processing to characeterize semantic coherence of patients with psychosis, and other application to diagnose and treat psychiatric and neurological disorders. Furthermore, the talk will address the challenges and ethical considerations associated with using AI in psychological research, such as data privacy, bias, and interpretability. Finally, the talk will discuss future directions and opportunities for further advancements in this dynamic field.
Off-policy learning in the basal ganglia
I will discuss work with Jack Lindsey modeling reinforcement learning for action selection in the basal ganglia. I will argue that the presence of multiple brain regions, in addition to the basal ganglia, that contribute to motor control motivates the need for an off-policy basal ganglia learning algorithm. I will then describe a biological implementation of such an algorithm that predicts tuning of dopamine neurons to a quantity we call "action surprise," in addition to reward prediction error. In the same model, an implementation of learning from a motor efference copy also predicts a novel solution to the problem of multiplexing feedforward and efference-related striatal activity. The solution exploits the difference between D1 and D2-expressing medium spiny neurons and leads to predictions about striatal dynamics.
Maths, AI and Neuroscience Meeting Stockholm
To understand brain function and develop artificial general intelligence it has become abundantly clear that there should be a close interaction among Neuroscience, machine learning and mathematics. There is a general hope that understanding the brain function will provide us with more powerful machine learning algorithms. On the other hand advances in machine learning are now providing the much needed tools to not only analyse brain activity data but also to design better experiments to expose brain function. Both neuroscience and machine learning explicitly or implicitly deal with high dimensional data and systems. Mathematics can provide powerful new tools to understand and quantify the dynamics of biological and artificial systems as they generate behavior that may be perceived as intelligent.
Training Dynamic Spiking Neural Network via Forward Propagation Through Time
With recent advances in learning algorithms, recurrent networks of spiking neurons are achieving performance competitive with standard recurrent neural networks. Still, these learning algorithms are limited to small networks of simple spiking neurons and modest-length temporal sequences, as they impose high memory requirements, have difficulty training complex neuron models, and are incompatible with online learning.Taking inspiration from the concept of Liquid Time-Constant (LTCs), we introduce a novel class of spiking neurons, the Liquid Time-Constant Spiking Neuron (LTC-SN), resulting in functionality similar to the gating operation in LSTMs. We integrate these neurons in SNNs that are trained with FPTT and demonstrate that thus trained LTC-SNNs outperform various SNNs trained with BPTT on long sequences while enabling online learning and drastically reducing memory complexity. We show this for several classical benchmarks that can easily be varied in sequence length, like the Add Task and the DVS-gesture benchmark. We also show how FPTT-trained LTC-SNNs can be applied to large convolutional SNNs, where we demonstrate novel state-of-the-art for online learning in SNNs on a number of standard benchmarks (S-MNIST, R-MNIST, DVS-GESTURE) and also show that large feedforward SNNs can be trained successfully in an online manner to near (Fashion-MNIST, DVS-CIFAR10) or exceeding (PS-MNIST, R-MNIST) state-of-the-art performance as obtained with offline BPTT. Finally, the training and memory efficiency of FPTT enables us to directly train SNNs in an end-to-end manner at network sizes and complexity that was previously infeasible: we demonstrate this by training in an end-to-end fashion the first deep and performant spiking neural network for object localization and recognition. Taken together, we out contribution enable for the first time training large-scale complex spiking neural network architectures online and on long temporal sequences.
Maths, AI and Neuroscience meeting
To understand brain function and develop artificial general intelligence it has become abundantly clear that there should be a close interaction among Neuroscience, machine learning and mathematics. There is a general hope that understanding the brain function will provide us with more powerful machine learning algorithms. On the other hand advances in machine learning are now providing the much needed tools to not only analyse brain activity data but also to design better experiments to expose brain function. Both neuroscience and machine learning explicitly or implicitly deal with high dimensional data and systems. Mathematics can provide powerful new tools to understand and quantify the dynamics of biological and artificial systems as they generate behavior that may be perceived as intelligent. In this meeting we bring together experts from Mathematics, Artificial Intelligence and Neuroscience for a three day long hybrid meeting. We will have talks on mathematical tools in particular Topology to understand high dimensional data, explainable AI, how AI can help neuroscience and to what extent the brain may be using algorithms similar to the ones used in modern machine learning. Finally we will wrap up with a discussion on some aspects of neural hardware that may not have been considered in machine learning.
Spiking Neural networks as Universal Function Approximators - SNUFA 2021
Like last year this online workshop brings together researchers in the field to present their work and discuss ways of translating these findings into a better understanding of neural circuits. Topics include artificial and biologically plausible learning algorithms and the dissection of trained spiking circuits toward understanding neural processing. We have a manageable number of talks with ample time for discussions. This year’s executive committee comprises Chiara Bartolozzi, Sander Bohté, Dan Goodman, and Friedemann Zenke.
Introducing YAPiC: An Open Source tool for biologists to perform complex image segmentation with deep learning
Robust detection of biological structures such as neuronal dendrites in brightfield micrographs, tumor tissue in histological slides, or pathological brain regions in MRI scans is a fundamental task in bio-image analysis. Detection of those structures requests complex decision making which is often impossible with current image analysis software, and therefore typically executed by humans in a tedious and time-consuming manual procedure. Supervised pixel classification based on Deep Convolutional Neural Networks (DNNs) is currently emerging as the most promising technique to solve such complex region detection tasks. Here, a self-learning artificial neural network is trained with a small set of manually annotated images to eventually identify the trained structures from large image data sets in a fully automated way. While supervised pixel classification based on faster machine learning algorithms like Random Forests are nowadays part of the standard toolbox of bio-image analysts (e.g. Ilastik), the currently emerging tools based on deep learning are still rarely used. There is also not much experience in the community how much training data has to be collected, to obtain a reasonable prediction result with deep learning based approaches. Our software YAPiC (Yet Another Pixel Classifier) provides an easy-to-use Python- and command line interface and is purely designed for intuitive pixel classification of multidimensional images with DNNs. With the aim to integrate well in the current open source ecosystem, YAPiC utilizes the Ilastik user interface in combination with a high performance GPU server for model training and prediction. Numerous research groups at our institute have already successfully applied YAPiC for a variety of tasks. From our experience, a surprisingly low amount of sparse label data is needed to train a sufficiently working classifier for typical bioimaging applications. Not least because of this, YAPiC has become the "standard weapon” for our core facility to detect objects in hard-to-segement images. We would like to present some use cases like cell classification in high content screening, tissue detection in histological slides, quantification of neural outgrowth in phase contrast time series, or actin filament detection in transmission electron microscopy.
Zero-shot visual reasoning with probabilistic analogical mapping
There has been a recent surge of interest in the question of whether and how deep learning algorithms might be capable of abstract reasoning, much of which has centered around datasets based on Raven’s Progressive Matrices (RPM), a visual analogy problem set commonly employed to assess fluid intelligence. This has led to the development of algorithms that are capable of solving RPM-like problems directly from pixel-level inputs. However, these algorithms require extensive direct training on analogy problems, and typically generalize poorly to novel problem types. This is in stark contrast to human reasoners, who are capable of solving RPM and other analogy problems zero-shot — that is, with no direct training on those problems. Indeed, it’s this capacity for zero-shot reasoning about novel problem types, i.e. fluid intelligence, that RPM was originally designed to measure. I will present some results from our recent efforts to model this capacity for zero-shot reasoning, based on an extension of a recently proposed approach to analogical mapping we refer to as Probabilistic Analogical Mapping (PAM). Our RPM model uses deep learning to extract attributed graph representations from pixel-level inputs, and then performs alignment of objects between source and target analogs using gradient descent to optimize a graph-matching objective. This extended version of PAM features a number of new capabilities that underscore the flexibility of the overall approach, including 1) the capacity to discover solutions that emphasize either object similarity or relation similarity, based on the demands of a given problem, 2) the ability to extract a schema representing the overall abstract pattern that characterizes a problem, and 3) the ability to directly infer the answer to a problem, rather than relying on a set of possible answer choices. This work suggests that PAM is a promising framework for modeling human zero-shot reasoning.
A theory for Hebbian learning in recurrent E-I networks
The Stabilized Supralinear Network is a model of recurrently connected excitatory (E) and inhibitory (I) neurons with a supralinear input-output relation. It can explain cortical computations such as response normalization and inhibitory stabilization. However, the network's connectivity is designed by hand, based on experimental measurements. How the recurrent synaptic weights can be learned from the sensory input statistics in a biologically plausible way is unknown. Earlier theoretical work on plasticity focused on single neurons and the balance of excitation and inhibition but did not consider the simultaneous plasticity of recurrent synapses and the formation of receptive fields. Here we present a recurrent E-I network model where all synaptic connections are simultaneously plastic, and E neurons self-stabilize by recruiting co-tuned inhibition. Motivated by experimental results, we employ a local Hebbian plasticity rule with multiplicative normalization for E and I synapses. We develop a theoretical framework that explains how plasticity enables inhibition balanced excitatory receptive fields that match experimental results. We show analytically that sufficiently strong inhibition allows neurons' receptive fields to decorrelate and distribute themselves across the stimulus space. For strong recurrent excitation, the network becomes stabilized by inhibition, which prevents unconstrained self-excitation. In this regime, external inputs integrate sublinearly. As in the Stabilized Supralinear Network, this results in response normalization and winner-takes-all dynamics: when two competing stimuli are presented, the network response is dominated by the stronger stimulus while the weaker stimulus is suppressed. In summary, we present a biologically plausible theoretical framework to model plasticity in fully plastic recurrent E-I networks. While the connectivity is derived from the sensory input statistics, the circuit performs meaningful computations. Our work provides a mathematical framework of plasticity in recurrent networks, which has previously only been studied numerically and can serve as the basis for a new generation of brain-inspired unsupervised machine learning algorithms.
How dendrites help solve biological and machine learning problems
Dendrites are thin processes that extend from the cell body of neurons, the main computing units of the brain. The role of dendrites in complex brain functions has been investigated for several decades, yet their direct involvement in key behaviors such as for example sensory perception has only recently been established. In my presentation I will discuss how computational modelling has helped us illuminate dendritic function. I will present the main findings of a number of projects in lab dealing with dendritic nonlinearities in excitatory and inhibitory and their consequences on neuronal tuning and memory formation, the role of dendrites in solving nonlinear problems in human neurons and recent efforts to advance machine learning algorithms by adopting dendritic features.
Choice engineering and the modeling of operant learning
Organisms modify their behavior in response to its consequences, a phenomenon referred to as operant learning. Contemporary modeling of this learning behavior is based on reinforcement learning algorithms. I will discuss some of the challenges that these models face, and proposed a new approach to model-selection that is based on testing their ability to engineer behavior. Finally, I will present the results of The Choice Engineering Competition – an academic competition that compared the efficacies of qualitative and quantitative models of operant learning in shaping behavior.
Data-driven Artificial Social Intelligence: From Social Appropriateness to Fairness
Designing artificially intelligent systems and interfaces with socio-emotional skills is a challenging task. Progress in industry and developments in academia provide us a positive outlook, however, the artificial social and emotional intelligence of the current technology is still limited. My lab’s research has been pushing the state of the art in a wide spectrum of research topics in this area, including the design and creation of new datasets; novel feature representations and learning algorithms for sensing and understanding human nonverbal behaviours in solo, dyadic and group settings; designing longitudinal human-robot interaction studies for wellbeing; and investigating how to mitigate the bias that creeps into these systems. In this talk, I will present some of my research team’s explorations in these areas including social appropriateness of robot actions, virtual reality based cognitive training with affective adaptation, and bias and fairness in data-driven emotionally intelligent systems.
Machine Learning as a tool for positive impact : case studies from climate change
Climate change is one of our generation's greatest challenges, with increasingly severe consequences on global ecosystems and populations. Machine Learning has the potential to address many important challenges in climate change, from both mitigation (reducing its extent) and adaptation (preparing for unavoidable consequences) aspects. To present the extent of these opportunities, I will describe some of the projects that I am involved in, spanning from generative model to computer vision and natural language processing. There are many opportunities for fundamental innovation in this field, advancing the state-of-the-art in Machine Learning while ensuring that this fundamental progress translates into positive real-world impact.
Global visual salience of competing stimuli
Current computational models of visual salience accurately predict the distribution of fixations on isolated visual stimuli. It is not known, however, whether the global salience of a stimulus, that is its effectiveness in the competition for attention with other stimuli, is a function of the local salience or an independent measure. Further, do task and familiarity with the competing images influence eye movements? In this talk, I will present the analysis of a computational model of the global salience of natural images. We trained a machine learning algorithm to learn the direction of the first saccade of participants who freely observed pairs of images. The pairs balanced the combinations of new and already seen images, as well as task and task-free trials. The coefficients of the model provided a reliable measure of the likelihood of each image to attract the first fixation when seen next to another image, that is their global salience. For example, images of close-up faces and images containing humans were consistently looked first and were assigned higher global salience. Interestingly, we found that global salience cannot be explained by the feature-driven local salience of images, the influence of task and familiarity was rather small and we reproduced the previously reported left-sided bias. This computational model of global salience allows to analyse multiple other aspects of human visual perception of competing stimuli. In the talk, I will also present our latest results from analysing the saccadic reaction time as a function of the global salience of the pair of images.
An inference perspective on meta-learning
While meta-learning algorithms are often viewed as algorithms that learn to learn, an alternative viewpoint frames meta-learning as inferring a hidden task variable from experience consisting of observations and rewards. From this perspective, learning to learn is learning to infer. This viewpoint can be useful in solving problems in meta-RL, which I’ll demonstrate through two examples: (1) enabling off-policy meta-learning, and (2) performing efficient meta-RL from image observations. I’ll also discuss how this perspective leads to an algorithm for few-shot image segmentation.
Fast and deep neuromorphic learning with time-to-first-spike coding
Engineered pattern-recognition systems strive for short time-to-solution and low energy-to-solution characteristics. This represents one of the main driving forces behind the development of neuromorphic devices. For both them and their biological archetypes, this corresponds to using as few spikes as early as possible. The concept of few and early spikes is used as the founding principle in the time-to-first-spike coding scheme. Within this framework, we have developed a spike-timing-based learning algorithm, which we used to train neuronal networks on the mixed-signal neuromorphic platform BrainScaleS-2. We derive, from first principles, error-backpropagation-based learning in networks of leaky integrate-and-fire (LIF) neurons relying only on spike times, for specific configurations of neuronal and synaptic time constants. We explicitly examine applicability to neuromorphic substrates by studying the effects of reduced weight precision and range, as well as of parameter noise. We demonstrate the feasibility of our approach on continuous and discrete data spaces, both in software simulations and on BrainScaleS-2. This narrows the gap between previous models of first-spike-time learning and biological neuronal dynamics and paves the way for fast and energy-efficient neuromorphic applications.
Back-propagation in spiking neural networks
Back-propagation is a powerful supervised learning algorithm in artificial neural networks, because it solves the credit assignment problem (essentially: what should the hidden layers do?). This algorithm has led to the deep learning revolution. But unfortunately, back-propagation cannot be used directly in spiking neural networks (SNN). Indeed, it requires differentiable activation functions, whereas spikes are all-or-none events which cause discontinuities. Here we present two strategies to overcome this problem. The first one is to use a so-called 'surrogate gradient', that is to approximate the derivative of the threshold function with the derivative of a sigmoid. We will present some applications of this method for time series processing (audio, internet traffic, EEG). The second one concerns a specific class of SNNs, which process static inputs using latency coding with at most one spike per neuron. Using approximations, we derived a latency-based back-propagation rule for this sort of networks, called S4NN, and applied it to image classification.
Synthesizing Machine Intelligence in Neuromorphic Computers with Differentiable Programming
The potential of machine learning and deep learning to advance artificial intelligence is driving a quest to build dedicated computers, such as neuromorphic hardware that emulate the biological processes of the brain. While the hardware technologies already exist, their application to real-world tasks is hindered by the lack of suitable programming methods. Advances at the interface of neural computation and machine learning showed that key aspects of deep learning models and tools can be transferred to biologically plausible neural circuits. Building on these advances, I will show that differentiable programming can address many challenges of programming spiking neural networks for solving real-world tasks, and help devise novel continual and local learning algorithms. In turn, these new algorithms pave the road towards systematically synthesizing machine intelligence in neuromorphic hardware without detailed knowledge of the hardware circuits.
E-prop: A biologically inspired paradigm for learning in recurrent networks of spiking neurons
Transformative advances in deep learning, such as deep reinforcement learning, usually rely on gradient-based learning methods such as backpropagation through time (BPTT) as a core learning algorithm. However, BPTT is not argued to be biologically plausible, since it requires to a propagate gradients backwards in time and across neurons. Here, we propose e-prop, a novel gradient-based learning method with local and online weight update rules for recurrent neural networks, and in particular recurrent spiking neural networks (RSNNs). As a result, e-prop has the potential to provide a substantial fraction of the power of deep learning to RSNNs. In this presentation, we will motivate e-prop from the perspective of recent insights in neuroscience and show how these have to be combined to form an algorithm for online gradient descent. The mathematical results will be supported by empirical evidence in supervised and reinforcement learning tasks. We will also discuss how limitations that are inherited from gradient-based learning methods, such as sample-efficiency, can be addressed by considering an evolution-like optimization that enhances learning on particular task families. The emerging learning architecture can be used to learn tasks by a single demonstration, hence enabling one-shot learning.
Workshop on "Spiking neural networks as universal function approximators: Learning algorithms and applications
This is a two-day workshop. Sign up and see titles and abstracts on website.
Understanding machine learning via exactly solvable statistical physics models
The affinity between statistical physics and machine learning has long history, this is reflected even in the machine learning terminology that is in part adopted from physics. I will describe the main lines of this long-lasting friendship in the context of current theoretical challenges and open questions about deep learning. Theoretical physics often proceeds in terms of solvable synthetic models, I will describe the related line of work on solvable models of simple feed-forward neural networks. I will highlight a path forward to capture the subtle interplay between the structure of the data, the architecture of the network, and the learning algorithm.
Decoding of Chemical Information from Populations of Olfactory Neurons
Information is represented in the brain by the coordinated activity of populations of neurons. Recent large-scale neural recording methods in combination with machine learning algorithms are helping understand how sensory processing and cognition emerge from neural population activity. This talk will explore the most popular machine learning methods used to gather meaningful low-dimensional representations from higher-dimensional neural recordings. To illustrate the potential of these approaches, Pedro will present his research in which chemical information is decoded from the olfactory system of the mouse for technological applications. Pedro and co-researchers have successfully extracted odor identity and concentration from olfactory receptor neuron low-dimensional activity trajectories. They have further developed a novel method to identify a shared latent space that allowed decoding of odor information across animals.
Identifying cortical learning algorithms using Brain-Machine Interfaces
Bernstein Conference 2024