Sparsity
Spatially-embedded recurrent neural networks reveal widespread links between structural and functional neuroscience findings
Brain networks exist within the confines of resource limitations. As a result, a brain network must overcome the metabolic costs of growing and sustaining the network within its physical space, while simultaneously implementing its required information processing. To observe the effect of these processes, we introduce the spatially-embedded recurrent neural network (seRNN). seRNNs learn basic task-related inferences while existing within a 3D Euclidean space, where the communication of constituent neurons is constrained by a sparse connectome. We find that seRNNs, similar to primate cerebral cortices, naturally converge on solving inferences using modular small-world networks, in which functionally similar units spatially configure themselves to utilize an energetically-efficient mixed-selective code. As all these features emerge in unison, seRNNs reveal how many common structural and functional brain motifs are strongly intertwined and can be attributed to basic biological optimization processes. seRNNs can serve as model systems that bridge the structural and functional research communities and move neuroscientific understanding forward.
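The spatial constraint at the heart of the seRNN can be illustrated with a distance-weighted sparsity penalty added to the task loss during training. This is a minimal NumPy sketch of the distance-weighted L1 idea only; the unit positions, the `wiring_cost` name, and the penalty strength are illustrative assumptions (the published seRNN regularizer also incorporates network communicability).

```python
import numpy as np

rng = np.random.default_rng(4)

n = 20
# Hypothetical fixed positions of the n recurrent units in 3D Euclidean space.
pos = rng.random((n, 3))
# Pairwise Euclidean distances between all units.
dist = np.linalg.norm(pos[:, None, :] - pos[None, :, :], axis=-1)

def wiring_cost(W, dist, strength=1e-3):
    """Spatial regularizer: an L1 penalty on recurrent weights scaled by the
    Euclidean distance each connection spans, so long-range connections cost
    more and training is pushed toward a sparse, spatially local connectome."""
    return strength * np.sum(np.abs(W) * dist)

W = rng.standard_normal((n, n))
cost = wiring_cost(W, dist)  # added to the task loss at each training step
```

Because the penalty grows with both weight magnitude and wire length, gradient descent prunes long-range connections first, which is one route to the modular, small-world structure described above.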
Beyond Biologically Plausible Spiking Networks for Neuromorphic Computing
Biologically plausible spiking neural networks (SNNs) are an emerging architecture for deep learning tasks due to their energy efficiency when implemented on neuromorphic hardware. However, many of the biological features are at best irrelevant and at worst counterproductive when evaluated in the context of task performance and suitability for neuromorphic hardware. In this talk, I will present an alternative paradigm to design deep learning architectures with good task performance in real-world benchmarks while maintaining all the advantages of SNNs. We do this by focusing on two main features – event-based computation and activity sparsity. Starting from the performant gated recurrent unit (GRU) deep learning architecture, we modify it to make it event-based and activity-sparse. The resulting event-based GRU (EGRU) is extremely efficient for both training and inference. At the same time, it achieves performance close to conventional deep learning architectures in challenging tasks such as language modelling, gesture recognition and sequential MNIST.
General-purpose event-based architectures for deep learning
Biologically plausible spiking neural networks (SNNs) are an emerging architecture for deep learning tasks due to their energy efficiency when implemented on neuromorphic hardware. However, many of the biological features are at best irrelevant and at worst counterproductive when evaluated in the context of task performance and suitability for neuromorphic hardware. In this talk, I will present an alternative paradigm to design deep learning architectures with good task performance in real-world benchmarks while maintaining all the advantages of SNNs. We do this by focusing on two main features – event-based computation and activity sparsity. Starting from the performant gated recurrent unit (GRU) deep learning architecture, we modify it to make it event-based and activity-sparse. The resulting event-based GRU (EGRU) is extremely efficient for both training and inference. At the same time, it achieves performance close to conventional deep learning architectures in challenging tasks such as language modelling, gesture recognition and sequential MNIST.
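The two ingredients named in these abstracts, event-based computation and activity sparsity, can be sketched on top of a GRU cell in a few lines of NumPy. This is an illustrative forward pass only, not the published EGRU (which, among other things, handles training through the discontinuity); the class name, threshold value, and initialisation scheme are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class EventGRUCell:
    """Toy GRU cell with an activity-sparsity gate: units whose internal
    state stays below a threshold emit exactly zero, so downstream matrix
    multiplies only need to touch the active, event-emitting units."""

    def __init__(self, n_in, n_hid, threshold=0.5):
        s = 1.0 / np.sqrt(n_hid)
        self.Wz = rng.uniform(-s, s, (n_hid, n_in + n_hid))  # update gate
        self.Wr = rng.uniform(-s, s, (n_hid, n_in + n_hid))  # reset gate
        self.Wh = rng.uniform(-s, s, (n_hid, n_in + n_hid))  # candidate state
        self.threshold = threshold

    def step(self, x, c, y):
        # c: dense internal state; y: sparse, thresholded output fed back.
        xy = np.concatenate([x, y])
        z = sigmoid(self.Wz @ xy)
        r = sigmoid(self.Wr @ xy)
        h = np.tanh(self.Wh @ np.concatenate([x, r * y]))
        c = (1.0 - z) * c + z * h
        active = c > self.threshold          # the "event" condition
        y = np.where(active, c, 0.0)         # inactive units emit nothing
        return c, y, active

cell = EventGRUCell(n_in=4, n_hid=16)
c, y = np.zeros(16), np.zeros(16)
for _ in range(5):
    c, y, active = cell.step(rng.standard_normal(4), c, y)
```

The efficiency argument rests on the last two lines of `step`: only the units flagged in `active` need to communicate their state, both in inference and in the backward pass.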
Neural Circuit Mechanisms of Pattern Separation in the Dentate Gyrus
The ability to discriminate different sensory patterns by disentangling their neural representations is an important property of neural networks. While a variety of learning rules are known to be highly effective at fine-tuning synapses to achieve this, less is known about how different cell types in the brain can facilitate this process by providing architectural priors that bias the network towards sparse, selective, and discriminable representations. We studied this by simulating a neuronal network modelled on the dentate gyrus—an area characterised by sparse activity associated with pattern separation in spatial memory tasks. To test the contribution of different cell types to these functions, we presented the model with a wide dynamic range of input patterns and systematically added or removed different circuit elements. We found that recruiting feedback inhibition indirectly, via recurrent excitatory neurons, was particularly helpful in disentangling patterns, and that simple alignment principles for excitatory and inhibitory connections are a highly effective strategy.
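The sparse, selective coding described above can be illustrated with a toy expansion-plus-inhibition model: a random projection (many "granule cells" reading a smaller input) followed by a k-winners-take-all nonlinearity as a crude stand-in for feedback inhibition. The sizes, the `kwta` helper, and the noise level are illustrative assumptions, not the circuit model from the talk.

```python
import numpy as np

rng = np.random.default_rng(2)

def kwta(x, k):
    """Keep only the k largest entries (a sparse 'granule cell' code);
    a crude stand-in for feedback inhibition fixing the activity level."""
    out = np.zeros_like(x)
    idx = np.argsort(x)[-k:]
    out[idx] = x[idx]
    return out

n_in, n_gc, k = 50, 500, 25               # ~5% of granule cells active
W = rng.standard_normal((n_gc, n_in))     # random expansion into the 'DG'

def corr(a, b):
    """Pearson correlation; lower correlation = better-separated patterns."""
    return np.corrcoef(a, b)[0, 1]

base = rng.random(n_in)
noisy = base + 0.2 * rng.standard_normal(n_in)   # overlapping input pair

r_in = corr(base, noisy)
r_out = corr(kwta(W @ base, k), kwta(W @ noisy, k))
```

With a sparse enough code (small `k` relative to `n_gc`), the output correlation typically falls below the input correlation, i.e. the overlapping pair is separated; removing the inhibition stand-in (`k = n_gc`) removes that effect.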
Why Some Intelligent Agents are Conscious
In this talk I will present an account of how an agent designed or evolved to be intelligent may come to enjoy subjective experiences. First, the agent is stipulated to be capable of (meta)representing subjective ‘qualitative’ sensory information, in the sense that it can easily assess how exactly similar a sensory signal is to all other possible sensory signals. This information is subjective in the sense that it concerns how the different stimuli can be distinguished by the agent itself, rather than how physically similar they are. For this to happen, sensory coding needs to satisfy sparsity and smoothness constraints, which are known to facilitate metacognition and generalization. Second, this qualitative information can under some specific circumstances acquire an ‘assertoric force’. This happens when a certain self-monitoring mechanism decides that the qualitative information reliably tracks the current state of the world, and informs a general symbolic reasoning system of this fact. I will argue that having subjective conscious experiences amounts to nothing more than qualitative sensory information acquiring an assertoric status within one’s belief system. When this happens, the perceptual content presents itself as reflecting the state of the world right now, in ways that seem undeniably rational to the agent. At the same time, without effort, the agent also knows what the perceptual content is like, in terms of how subjectively similar it is to all other possible percepts. I will discuss the computational benefits of this architecture, of which consciousness might have arisen as a byproduct.
Event-based Backpropagation for Exact Gradients in Spiking Neural Networks
Gradient-based optimization powered by the backpropagation algorithm proved to be the pivotal method in the training of non-spiking artificial neural networks. At the same time, spiking neural networks hold the promise for efficient processing of real-world sensory data by communicating using discrete events in continuous time. We derive the backpropagation algorithm for a recurrent network of spiking (leaky integrate-and-fire) neurons with hard thresholds and show that the backward dynamics amount to an event-based backpropagation of errors through time. Our derivation uses the jump conditions for partial derivatives at state discontinuities found by applying the implicit function theorem, allowing us to avoid approximations or substitutions. We find that the gradient exists and is finite almost everywhere in weight space, up to the null set where a membrane potential is precisely tangent to the threshold. Our presented algorithm, EventProp, computes the exact gradient with respect to a general loss function based on spike times and membrane potentials. Crucially, the algorithm allows for an event-based communication scheme in the backward phase, retaining the potential advantages of temporal sparsity afforded by spiking neural networks. We demonstrate the optimization of spiking networks using gradients computed via EventProp and the Yin-Yang and MNIST datasets with either a spike time-based or voltage-based loss function and report competitive performance. Our work supports the rigorous study of gradient-based optimization in spiking neural networks as well as the development of event-based neuromorphic architectures for the efficient training of spiking neural networks. While we consider the leaky integrate-and-fire model in this work, our methodology generalises to any neuron model defined as a hybrid dynamical system.
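The implicit-function-theorem step named in the abstract can be made explicit for a single threshold crossing. With membrane potential \(V(t, w)\), firing threshold \(\vartheta\), and spike time \(t^*\) defined by the crossing condition, standard implicit differentiation gives the jump relation (written here in generic form, consistent with the tangency caveat above):

```latex
% Spike time t^* defined implicitly by the threshold crossing:
%   V(t^*, w) = \vartheta
% Differentiating both sides with respect to a weight w:
\frac{\mathrm{d}t^*}{\mathrm{d}w}
  = - \left. \frac{\partial V / \partial w}{\partial V / \partial t} \right|_{t = t^*}
```

The derivative is finite whenever \(\partial V / \partial t \neq 0\) at the crossing, i.e. whenever the membrane potential crosses rather than grazes the threshold; this is exactly the null set in weight space excluded in the abstract.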
Becoming what you smell: adaptive sensing in the olfactory system
I will argue that the circuit architecture of the early olfactory system provides an adaptive, efficient mechanism for compressing the vast space of odor mixtures into the responses of a small number of sensors. In this view, the olfactory sensory repertoire employs a disordered code to compress a high dimensional olfactory space into a low dimensional receptor response space while preserving distance relations between odors. The resulting representation is dynamically adapted to efficiently encode the changing environment of volatile molecules. I will show that this adaptive combinatorial code can be efficiently decoded by systematically eliminating candidate odorants that bind to silent receptors. The resulting algorithm for 'estimation by elimination' can be implemented by a neural network that is remarkably similar to the early olfactory pathway in the brain. Finally, I will discuss how diffuse feedback from the central brain to the bulb, followed by unstructured projections back to the cortex, can produce the convergence and divergence of the cortical representation of odors presented in shared or different contexts. Our theory predicts a relation between the diversity of olfactory receptors and the sparsity of their responses that matches animals from flies to humans. It also predicts specific deficits in olfactory behavior that should result from optogenetic manipulation of the olfactory bulb and cortex, and in some disease states.
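The "estimation by elimination" decoder is simple enough to sketch directly: given a binary receptor–odorant binding matrix, any candidate odorant that binds to a silent receptor cannot be in the presented mixture. The matrix, its sparsity, and the sizes below are illustrative assumptions, not parameters from the talk.

```python
import numpy as np

rng = np.random.default_rng(1)

n_receptors, n_odorants = 30, 100
# Hypothetical sparse binary binding matrix: A[i, j] = 1 if odorant j
# activates receptor i (a stand-in for the disordered combinatorial code).
A = (rng.random((n_receptors, n_odorants)) < 0.1).astype(int)

def estimate_by_elimination(A, response):
    """Return the candidate odorants consistent with a binary receptor
    response: every odorant that binds to a silent receptor is eliminated."""
    silent = np.flatnonzero(response == 0)
    binds_silent = A[silent].any(axis=0)
    return np.flatnonzero(~binds_silent)

# Present a mixture of a few odorants and decode it from the responses.
true_mix = [3, 42, 77]
response = (A[:, true_mix].sum(axis=1) > 0).astype(int)
candidates = estimate_by_elimination(A, response)
```

The true mixture components are never eliminated, since by construction they bind no silent receptor; with enough receptors relative to mixture size, the surviving candidate set shrinks toward the true mixture.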
Multitask performance in humans and deep neural networks
Humans and other primates exhibit rich and versatile behaviour, switching nimbly between tasks as the environmental context requires. I will discuss the neural coding patterns that make this possible in humans and deep networks. First, using deep network simulations, I will characterise two distinct solutions to task acquisition (“lazy” and “rich” learning) which trade off learning speed for robustness, and depend on the initial weight scale and network sparsity. I will chart the predictions of these two schemes for a context-dependent decision-making task, showing that the rich solution is to project task representations onto orthogonal planes in a low-dimensional embedding space. Using behavioural testing and functional neuroimaging in humans, we observe BOLD signals in human prefrontal cortex whose dimensionality and neural geometry are consistent with the rich learning regime. Next, I will discuss the problem of continual learning, showing that behaviourally, humans (unlike vanilla neural networks) learn more effectively when conditions are blocked than interleaved. I will show how this counterintuitive pattern of behaviour can be recreated in neural networks by assuming that information is normalised and temporally clustered (via Hebbian learning) alongside supervised training. Together, this work offers a picture of how humans learn to partition knowledge in the service of structured behaviour, and offers a roadmap for building neural networks that adopt similar principles in the service of multitask learning. This is work with Andrew Saxe, Timo Flesch, David Nagy, and others.
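The geometry of the "rich" solution, task representations projected onto orthogonal planes, can be sketched in a few lines: because the planes are orthogonal, a linear readout built for one task is blind to activity generated by the other. The embedding dimension and the use of a QR factorisation to construct the planes are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(3)

# Hypothetical low-dimensional embedding space; each context ("task")
# gets its own 2D plane, and the two planes are mutually orthogonal.
d = 8
basis = np.linalg.qr(rng.standard_normal((d, 4)))[0]  # 4 orthonormal columns
plane_a, plane_b = basis[:, :2], basis[:, 2:]

def embed(stim, plane):
    """Represent a 2D stimulus on a task-specific plane in the d-dim space."""
    return plane @ stim

stim = rng.standard_normal(2)
rep_a = embed(stim, plane_a)          # activity evoked under task A

# A linear readout living in task B's plane sees none of task A's activity.
readout_b = plane_b @ rng.standard_normal(2)
crosstalk = readout_b @ rep_a         # orthogonality makes this (numerically) zero
```

This zero cross-talk is the geometric reason an orthogonal-planes code supports context-dependent decisions without interference between tasks.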