Backpropagation
backpropagation
Behavioral Timescale Synaptic Plasticity (BTSP) for biologically plausible credit assignment across multiple layers via top-down gating of dendritic plasticity
A central problem in biological learning is how information about the outcome of a decision or behavior can be used to reliably guide learning across distributed neural circuits while obeying biological constraints. This “credit assignment” problem is commonly solved in artificial neural networks through supervised gradient descent and the backpropagation algorithm. In contrast, biological learning is typically modelled using unsupervised Hebbian learning rules. While these rules only use local information to update synaptic weights, and are sometimes combined with weight constraints to reflect a diversity of excitatory (only positive weights) and inhibitory (only negative weights) cell types, they do not prescribe a clear mechanism for how to coordinate learning across multiple layers and propagate error information accurately across the network. In recent years, several groups have drawn inspiration from the known dendritic non-linearities of pyramidal neurons to propose new learning rules and network architectures that enable biologically plausible multi-layer learning by processing error information in segregated dendrites. Meanwhile, recent experimental results from the hippocampus have revealed a new form of plasticity—Behavioral Timescale Synaptic Plasticity (BTSP)—in which large dendritic depolarizations rapidly reshape synaptic weights and stimulus selectivity with as little as a single stimulus presentation (“one-shot learning”). Here we explore the implications of this new learning rule through a biologically plausible implementation in a rate neuron network. We demonstrate that regulation of dendritic spiking and BTSP by top-down feedback signals can effectively coordinate plasticity across multiple network layers in a simple pattern recognition task. By analyzing hidden feature representations and weight trajectories during learning, we show the differences between networks trained with standard backpropagation, Hebbian learning rules, and BTSP.
Event-based Backpropagation for Exact Gradients in Spiking Neural Networks
Gradient-based optimization powered by the backpropagation algorithm proved to be the pivotal method in the training of non-spiking artificial neural networks. At the same time, spiking neural networks hold the promise for efficient processing of real-world sensory data by communicating using discrete events in continuous time. We derive the backpropagation algorithm for a recurrent network of spiking (leaky integrate-and-fire) neurons with hard thresholds and show that the backward dynamics amount to an event-based backpropagation of errors through time. Our derivation uses the jump conditions for partial derivatives at state discontinuities found by applying the implicit function theorem, allowing us to avoid approximations or substitutions. We find that the gradient exists and is finite almost everywhere in weight space, up to the null set where a membrane potential is precisely tangent to the threshold. Our presented algorithm, EventProp, computes the exact gradient with respect to a general loss function based on spike times and membrane potentials. Crucially, the algorithm allows for an event-based communication scheme in the backward phase, retaining the potential advantages of temporal sparsity afforded by spiking neural networks. We demonstrate the optimization of spiking networks using gradients computed via EventProp and the Yin-Yang and MNIST datasets with either a spike time-based or voltage-based loss function and report competitive performance. Our work supports the rigorous study of gradient-based optimization in spiking neural networks as well as the development of event-based neuromorphic architectures for the efficient training of spiking neural networks. While we consider the leaky integrate-and-fire model in this work, our methodology generalises to any neuron model defined as a hybrid dynamical system.
Data-driven reduction of dendritic morphologies with preserved dendro-somatic responses
There is little consensus on the level of spatial complexity at which dendrites operate. On the one hand, emergent evidence indicates that synapses cluster at micrometer spatial scales. On the other hand, most modelling and network studies ignore dendrites altogether. This dichotomy raises an urgent question: what is the smallest relevant spatial scale for understanding dendritic computation? We have developed a method to construct compartmental models at any level of spatial complexity. Through carefully chosen parameter fits, solvable in the least-squares sense, we obtain accurate reduced compartmental models. Thus, we are able to systematically construct passive as well as active dendrite models at varying degrees of spatial complexity. We evaluate which elements of the dendritic computational repertoire are captured by these models. We show that many canonical elements of the dendritic computational repertoire can be reproduced with few compartments. For instance, for a model to behave as a two-layer network, it is sufficient to fit a reduced model at the soma and at locations at the dendritic tips. In the basal dendrites of an L2/3 pyramidal model, we reproduce the backpropagation of somatic action potentials (APs) with a single dendritic compartment at the tip. Further, we obtain the well-known Ca-spike coincidence detection mechanism in L5 Pyramidal cells with as few as eleven compartments, the requirement being that their spacing along the apical trunk supports AP backpropagation. We also investigate whether afferent spatial connectivity motifs admit simplification by ablating targeted branches and grouping affected synapses onto the next proximal dendrite. We find that voltage in the remaining branches is reproduced if temporal conductance fluctuations stay below a limit that depends on the average difference in input resistance between the ablated branches and the next proximal dendrite. Consequently, when the average conductance load on distal synapses is constant, the dendritic tree can be simplified while appropriately decreasing synaptic weights. When the conductance level fluctuates strongly, for instance through a-priori unpredictable fluctuations in NMDA activation, a constant weight rescale factor cannot be found, and the dendrite cannot be simplified. We have created an open source Python toolbox (NEAT - https://neatdend.readthedocs.io/en/latest/) that automatises the simplification process. A NEST implementation of the reduced models, currently under construction, will enable the simulation of few-compartment models in large-scale networks, thus bridging the gap between cellular and network level neuroscience.
Fast and deep neuromorphic learning with time-to-first-spike coding
Engineered pattern-recognition systems strive for short time-to-solution and low energy-to-solution characteristics. This represents one of the main driving forces behind the development of neuromorphic devices. For both them and their biological archetypes, this corresponds to using as few spikes as early as possible. The concept of few and early spikes is used as the founding principle in the time-to-first-spike coding scheme. Within this framework, we have developed a spike-timing-based learning algorithm, which we used to train neuronal networks on the mixed-signal neuromorphic platform BrainScaleS-2. We derive, from first principles, error-backpropagation-based learning in networks of leaky integrate-and-fire (LIF) neurons relying only on spike times, for specific configurations of neuronal and synaptic time constants. We explicitly examine applicability to neuromorphic substrates by studying the effects of reduced weight precision and range, as well as of parameter noise. We demonstrate the feasibility of our approach on continuous and discrete data spaces, both in software simulations and on BrainScaleS-2. This narrows the gap between previous models of first-spike-time learning and biological neuronal dynamics and paves the way for fast and energy-efficient neuromorphic applications.
E-prop: A biologically inspired paradigm for learning in recurrent networks of spiking neurons
Transformative advances in deep learning, such as deep reinforcement learning, usually rely on gradient-based learning methods such as backpropagation through time (BPTT) as a core learning algorithm. However, BPTT is not argued to be biologically plausible, since it requires to a propagate gradients backwards in time and across neurons. Here, we propose e-prop, a novel gradient-based learning method with local and online weight update rules for recurrent neural networks, and in particular recurrent spiking neural networks (RSNNs). As a result, e-prop has the potential to provide a substantial fraction of the power of deep learning to RSNNs. In this presentation, we will motivate e-prop from the perspective of recent insights in neuroscience and show how these have to be combined to form an algorithm for online gradient descent. The mathematical results will be supported by empirical evidence in supervised and reinforcement learning tasks. We will also discuss how limitations that are inherited from gradient-based learning methods, such as sample-efficiency, can be addressed by considering an evolution-like optimization that enhances learning on particular task families. The emerging learning architecture can be used to learn tasks by a single demonstration, hence enabling one-shot learning.
On temporal coding in spiking neural networks with alpha synaptic function
The timing of individual neuronal spikes is essential for biological brains to make fast responses to sensory stimuli. However, conventional artificial neural networks lack the intrinsic temporal coding ability present in biological networks. We propose a spiking neural network model that encodes information in the relative timing of individual neuron spikes. In classification tasks, the output of the network is indicated by the first neuron to spike in the output layer. This temporal coding scheme allows the supervised training of the network with backpropagation, using locally exact derivatives of the postsynaptic spike times with respect to presynaptic spike times. The network operates using a biologically-plausible alpha synaptic transfer function. Additionally, we use trainable synchronisation pulses that provide bias, add flexibility during training and exploit the decay part of the alpha function. We show that such networks can be trained successfully on noisy Boolean logic tasks and on the MNIST dataset encoded in time. The results show that the spiking neural network outperforms comparable spiking models on MNIST and achieves similar quality to fully connected conventional networks with the same architecture. We also find that the spiking network spontaneously discovers two operating regimes, mirroring the accuracy-speed trade-off observed in human decision-making: a slow regime, where a decision is taken after all hidden neurons have spiked and the accuracy is very high, and a fast regime, where a decision is taken very fast but the accuracy is lower. These results demonstrate the computational power of spiking networks with biological characteristics that encode information in the timing of individual neurons. By studying temporal coding in spiking networks, we aim to create building blocks towards energy-efficient and more complex biologically-inspired neural architectures.
Biological evidence that the cortex does not implement backpropagation
Bernstein Conference 2024
Backpropagation through space, time and the brain
COSYNE 2025
Target learning rather than backpropagation explains learning in the mammalian neocortex
COSYNE 2025