Loss Function
loss function
Meta-learning functional plasticity rules in neural networks
Synaptic plasticity is known to be a key player in the brain’s life-long learning abilities. However, due to experimental limitations, the nature of the local changes at individual synapses and their link with emerging network-level computations remain unclear. I will present a numerical, meta-learning approach to deduce plasticity rules from either neuronal activity data and/or prior knowledge about the network's computation. I will first show how to recover known rules, given a human-designed loss function in rate networks, or directly from data, using an adversarial approach. Then I will present how to scale-up this approach to recurrent spiking networks using simulation-based inference.
No Free Lunch from Deep Learning in Neuroscience: A Case Study through Models of the Entorhinal-Hippocampal Circuit
Research in Neuroscience, as in many scientific disciplines, is undergoing a renaissance based on deep learning. Unique to Neuroscience, deep learning models can be used not only as a tool but interpreted as models of the brain. The central claims of recent deep learning-based models of brain circuits are that they shed light on fundamental functions being optimized or make novel predictions about neural phenomena. We show, through the case-study of grid cells in the entorhinal-hippocampal circuit, that one may get neither. We rigorously examine the claims of deep learning models of grid cells using large-scale hyperparameter sweeps and theory-driven experimentation, and demonstrate that the results of such models are more strongly driven by particular, non-fundamental, and post-hoc implementation choices than fundamental truths about neural circuits or the loss function(s) they might optimize. We discuss why these models cannot be expected to produce accurate models of the brain without the addition of substantial amounts of inductive bias, an informal No Free Lunch result for Neuroscience.
Trading Off Performance and Energy in Spiking Networks
Many engineered and biological systems must trade off performance and energy use, and the brain is no exception. While there are theories on how activity levels are controlled in biological networks through feedback control (homeostasis), it is not clear what the effects on population coding are, and therefore how performance and energy can be traded off. In this talk we will consider this tradeoff in auto-encoding networks, in which there is a clear definition of performance (the coding loss). We first show how SNNs follow a characteristic trade-off curve between activity levels and coding loss, but that standard networks need to be retrained to achieve different tradeoff points. We next formalize this tradeoff with a joint loss function incorporating coding loss (performance) and activity loss (energy use). From this loss we derive a class of spiking networks which coordinates its spiking to minimize both the activity and coding losses -- and as a result can dynamically adjust its coding precision and energy use. The network utilizes several known activity control mechanisms for this --- threshold adaptation and feedback inhibition --- and elucidates their potential function within neural circuits. Using geometric intuition, we demonstrate how these mechanisms regulate coding precision, and thereby performance. Lastly, we consider how these insights could be transferred to trained SNNs. Overall, this work addresses a key energy-coding trade-off which is often overlooked in network studies, expands on our understanding of homeostasis in biological SNNs, as well as provides a clear framework for considering performance and energy use in artificial SNNs.
Event-based Backpropagation for Exact Gradients in Spiking Neural Networks
Gradient-based optimization powered by the backpropagation algorithm proved to be the pivotal method in the training of non-spiking artificial neural networks. At the same time, spiking neural networks hold the promise for efficient processing of real-world sensory data by communicating using discrete events in continuous time. We derive the backpropagation algorithm for a recurrent network of spiking (leaky integrate-and-fire) neurons with hard thresholds and show that the backward dynamics amount to an event-based backpropagation of errors through time. Our derivation uses the jump conditions for partial derivatives at state discontinuities found by applying the implicit function theorem, allowing us to avoid approximations or substitutions. We find that the gradient exists and is finite almost everywhere in weight space, up to the null set where a membrane potential is precisely tangent to the threshold. Our presented algorithm, EventProp, computes the exact gradient with respect to a general loss function based on spike times and membrane potentials. Crucially, the algorithm allows for an event-based communication scheme in the backward phase, retaining the potential advantages of temporal sparsity afforded by spiking neural networks. We demonstrate the optimization of spiking networks using gradients computed via EventProp and the Yin-Yang and MNIST datasets with either a spike time-based or voltage-based loss function and report competitive performance. Our work supports the rigorous study of gradient-based optimization in spiking neural networks as well as the development of event-based neuromorphic architectures for the efficient training of spiking neural networks. While we consider the leaky integrate-and-fire model in this work, our methodology generalises to any neuron model defined as a hybrid dynamical system.
Logical Neural Networks
The work to be presented in this talk proposes a novel framework seamlessly providing key properties of both neural nets (learning) and symbolic logic (knowledge and reasoning). Every neuron has a meaning as a component of a formula in a weighted real-valued logic, yielding a highly interpretable disentangled representation. Inference is omnidirectional rather than focused on predefined target variables, and corresponds to logical reasoning, including classical first-order logic theorem proving as a special case. The model is end-to-end differentiable, and learning minimizes a novel loss function capturing logical contradiction, yielding resilience to inconsistent knowledge. It also enables the open-world assumption by maintaining bounds on truth values which can have probabilistic semantics, yielding resilience to incomplete knowledge.