Supervised Learning
I-Chun Lin, PhD
The Gatsby Computational Neuroscience Unit is a leading research centre focused on theoretical neuroscience and machine learning. We study (un)supervised and reinforcement learning in brains and machines; inference, coding and neural dynamics; Bayesian and kernel methods, and deep learning; with applications to the analysis of perceptual processing and cognition, neural data, signal and image processing, machine vision, network data and nonparametric hypothesis testing. The Unit provides a unique opportunity for a critical mass of theoreticians to interact closely with one another and with researchers at the Sainsbury Wellcome Centre for Neural Circuits and Behaviour (SWC), the Centre for Computational Statistics and Machine Learning (CSML) and related UCL departments such as Computer Science; Statistical Science; Artificial Intelligence; the ELLIS Unit at UCL; Neuroscience; and the nearby Alan Turing and Francis Crick Institutes. Our PhD programme provides a rigorous preparation for a research career. Students complete a 4-year PhD in either machine learning or theoretical/computational neuroscience, with minor emphasis in the complementary field. Courses in the first year provide a comprehensive introduction to both fields and systems neuroscience. Students are encouraged to work and interact closely with SWC/CSML researchers to take advantage of this uniquely multidisciplinary research environment.
From Spiking Predictive Coding to Learning Abstract Object Representation
In the first part of the talk, I will present Predictive Coding Light (PCL), a novel unsupervised learning architecture for spiking neural networks. In contrast to conventional predictive coding approaches, which only transmit prediction errors to higher processing stages, PCL learns inhibitory lateral and top-down connectivity to suppress the most predictable spikes and passes a compressed representation of the input to higher processing stages. We show that PCL reproduces a range of biological findings and exhibits a favorable tradeoff between energy consumption and downstream classification performance on challenging benchmarks. The second part of the talk will feature our lab’s efforts to explain how infants and toddlers might learn abstract object representations without supervision. I will present deep learning models that exploit the temporal and multimodal structure of their sensory inputs to learn representations of individual objects, object categories, or abstract super-categories such as "kitchen object" in a fully unsupervised fashion. These models offer a parsimonious account of how abstract semantic knowledge may be rooted in children's embodied first-person experiences.
“Development and application of gaze control models for active perception”
Gaze shifts in humans serve to direct high-resolution vision provided by the fovea towards areas in the environment. Gaze can be considered a proxy for attention or an indicator of the relative importance of different parts of the environment. In this talk, we discuss the development of generative models of human gaze in response to visual input. We discuss how such models can be learned, both using supervised learning and using implicit feedback as an agent interacts with the environment, the latter being more plausible in biological agents. We also discuss two ways such models can be used. First, they can be used to improve the performance of artificial autonomous systems, in applications such as autonomous navigation. Second, because these models are contingent on the human’s task, goals, and/or state in the context of the environment, observations of gaze can be used to infer information about user intent. This information can be used to improve human-machine and human-robot interaction, by making interfaces more anticipative. We discuss example applications in gaze-typing, robotic tele-operation and human-robot interaction.
Comparing supervised learning dynamics: Deep neural networks match human data efficiency but show a generalisation lag
Recent research has seen many behavioral comparisons between humans and deep neural networks (DNNs) in the domain of image classification. Often, comparison studies focus on the end-result of the learning process by measuring and comparing the similarities in the representations of object categories once they have been formed. However, the process of how these representations emerge—that is, the behavioral changes and intermediate stages observed during the acquisition—is less often directly and empirically compared. In this talk, I'm going to report a detailed investigation of the learning dynamics in human observers and various classic and state-of-the-art DNNs. We develop a constrained supervised learning environment to align learning-relevant conditions such as starting point, input modality, available input data and the feedback provided. Across the whole learning process we evaluate and compare how well learned representations can be generalized to previously unseen test data. Comparisons across the entire learning process indicate that DNNs demonstrate a level of data efficiency comparable to human learners, challenging some prevailing assumptions in the field. However, our results also reveal representational differences: while DNNs' learning is characterized by a pronounced generalisation lag, humans appear to immediately acquire generalizable representations without a preliminary phase of learning training set-specific information that is only later transferred to novel data.
Trends in NeuroAI - SwiFT: Swin 4D fMRI Transformer
Trends in NeuroAI is a reading group hosted by the MedARC Neuroimaging & AI lab (https://medarc.ai/fmri). Title: SwiFT: Swin 4D fMRI Transformer Abstract: Modeling spatiotemporal brain dynamics from high-dimensional data, such as functional Magnetic Resonance Imaging (fMRI), is a formidable task in neuroscience. Existing approaches for fMRI analysis utilize hand-crafted features, but the process of feature extraction risks losing essential information in fMRI scans. To address this challenge, we present SwiFT (Swin 4D fMRI Transformer), a Swin Transformer architecture that can learn brain dynamics directly from fMRI volumes in a memory and computation-efficient manner. SwiFT achieves this by implementing a 4D window multi-head self-attention mechanism and absolute positional embeddings. We evaluate SwiFT using multiple large-scale resting-state fMRI datasets, including the Human Connectome Project (HCP), Adolescent Brain Cognitive Development (ABCD), and UK Biobank (UKB) datasets, to predict sex, age, and cognitive intelligence. Our experimental outcomes reveal that SwiFT consistently outperforms recent state-of-the-art models. Furthermore, by leveraging its end-to-end learning capability, we show that contrastive loss-based self-supervised pre-training of SwiFT can enhance performance on downstream tasks. Additionally, we employ an explainable AI method to identify the brain regions associated with sex classification. To our knowledge, SwiFT is the first Swin Transformer architecture to process dimensional spatiotemporal brain functional data in an end-to-end fashion. Our work holds substantial potential in facilitating scalable learning of functional brain imaging in neuroscience research by reducing the hurdles associated with applying Transformer models to high-dimensional fMRI. Speaker: Junbeom Kwon is a research associate working in Prof. Jiook Cha’s lab at Seoul National University. Paper link: https://arxiv.org/abs/2307.05916
BrainLM Journal Club
Connor Lane will lead a journal club on the recent BrainLM preprint, a foundation model for fMRI trained using self-supervised masked autoencoder training. Preprint: https://www.biorxiv.org/content/10.1101/2023.09.12.557460v1 Tweeprint: https://twitter.com/david_van_dijk/status/1702336882301112631?t=Q2-U92-BpJUBh9C35iUbUA&s=19
Learning to see stuff
Humans are very good at visually recognizing materials and inferring their properties. Without touching surfaces, we can usually tell what they would feel like, and we enjoy vivid visual intuitions about how they typically behave. This is impressive because the retinal image that the visual system receives as input is the result of complex interactions between many physical processes. Somehow the brain has to disentangle these different factors. I will present some recent work in which we show that an unsupervised neural network trained on images of surfaces spontaneously learns to disentangle reflectance, lighting and shape. However, the disentanglement is not perfect, and we find that as a result the network not only predicts the broad successes of human gloss perception, but also the specific pattern of errors that humans exhibit on an image-by-image basis. I will argue this has important implications for thinking about appearance and vision more broadly.
Memory-enriched computation and learning in spiking neural networks through Hebbian plasticity
Memory is a key component of biological neural systems that enables the retention of information over a huge range of temporal scales, ranging from hundreds of milliseconds up to years. While Hebbian plasticity is believed to play a pivotal role in biological memory, it has so far been analyzed mostly in the context of pattern completion and unsupervised learning. Here, we propose that Hebbian plasticity is fundamental for computations in biological neural systems. We introduce a novel spiking neural network (SNN) architecture that is enriched by Hebbian synaptic plasticity. We experimentally show that our memory-equipped SNN model outperforms state-of-the-art deep learning mechanisms in a sequential pattern-memorization task and demonstrates superior out-of-distribution generalization capabilities compared to these models. We further show that our model can be successfully applied to one-shot learning and classification of handwritten characters, improving over the state-of-the-art SNN model. We also demonstrate the capability of our model to learn associations for audio-to-image synthesis from spoken and handwritten digits. Our SNN model further presents a novel solution to a variety of cognitive question answering tasks from a standard benchmark, achieving comparable performance to both memory-augmented ANN and SNN-based state-of-the-art solutions to this problem. Finally, we demonstrate that our model is able to learn from rewards on an episodic reinforcement learning task and attain a near-optimal strategy on a memory-based card game. Hence, our results show that Hebbian enrichment renders spiking neural networks surprisingly versatile in terms of their computational as well as learning capabilities. Since local Hebbian plasticity can easily be implemented in neuromorphic hardware, this also suggests that powerful cognitive neuromorphic systems can be built based on this principle.
Mouse visual cortex as a limited resource system that self-learns an ecologically-general representation
Studies of the mouse visual system have revealed a variety of visual brain areas in a roughly hierarchical arrangement, together with a multitude of behavioral capacities, ranging from stimulus-reward associations, to goal-directed navigation, and object-centric discriminations. However, an overall understanding of the mouse’s visual cortex organization, and how this organization supports visual behaviors, remains elusive. Here, we take a computational approach to help address these questions, providing a high-fidelity quantitative model of mouse visual cortex. By analyzing factors contributing to model fidelity, we identified key principles underlying the organization of mouse visual cortex. Structurally, we find that comparatively low-resolution and shallow structure were both important for model correctness. Functionally, we find that models trained with task-agnostic, unsupervised objective functions based on the concept of contrastive embeddings were substantially better than models trained with supervised objectives. Finally, the unsupervised objective builds a general-purpose visual representation that enables the system to achieve better transfer on out-of-distribution visual scene understanding and reward-based navigation tasks. Our results suggest that mouse visual cortex is a low-resolution, shallow network that makes best use of the mouse’s limited resources to create a light-weight, general-purpose visual system – in contrast to the deep, high-resolution, and more task-specific visual system of primates.
Learning static and dynamic mappings with local self-supervised plasticity
Animals exhibit remarkable learning capabilities with little direct supervision. Likewise, self-supervised learning is an emergent paradigm in artificial intelligence, closing the performance gap to supervised learning. In the context of biology, self-supervised learning corresponds to a setting where one sense or specific stimulus may serve as a supervisory signal for another. After learning, the latter can be used to predict the former. On the implementation level, it has been demonstrated that such predictive learning can occur at the single neuron level, in compartmentalized neurons that separate and associate information from different streams. We demonstrate the power of such self-supervised learning over unsupervised (Hebb-like) learning rules, which depend heavily on stimulus statistics, in two examples: First, in the context of animal navigation where predictive learning can associate internal self-motion information always available to the animal with external visual landmark information, leading to accurate path-integration in the dark. We focus on the well-characterized fly head direction system and show that our setting learns a connectivity strikingly similar to the one reported in experiments. The mature network is a quasi-continuous attractor and reproduces key experiments in which optogenetic stimulation controls the internal representation of heading, and where the network remaps to integrate with different gains. Second, we show that incorporating global gating by reward prediction errors allows the same setting to learn conditioning at the neuronal level with mixed selectivity. At its core, conditioning entails associating a neural activity pattern induced by an unconditioned stimulus (US) with the pattern arising in response to a conditioned stimulus (CS). Solving the generic problem of pattern-to-pattern associations naturally leads to emergent cognitive phenomena like blocking, overshadowing, saliency effects, extinction, interstimulus interval effects etc. Surprisingly, we find that the same network offers a reductionist mechanism for causal inference by resolving the post hoc, ergo propter hoc fallacy.
Hebbian Plasticity Supports Predictive Self-Supervised Learning of Disentangled Representations
Discriminating distinct objects and concepts from sensory stimuli is essential for survival. Our brains accomplish this feat by forming meaningful internal representations in deep sensory networks with plastic synaptic connections. Experience-dependent plasticity presumably exploits temporal contingencies between sensory inputs to build these internal representations. However, the precise mechanisms underlying plasticity remain elusive. We derive a local synaptic plasticity model inspired by self-supervised machine learning techniques that shares a deep conceptual connection to Bienenstock-Cooper-Munro (BCM) theory and is consistent with experimentally observed plasticity rules. We show that our plasticity model yields disentangled object representations in deep neural networks without the need for supervision and implausible negative examples. In response to altered visual experience, our model qualitatively captures neuronal selectivity changes observed in the monkey inferotemporal cortex in vivo. Our work suggests a plausible learning rule to drive learning in sensory networks while making concrete testable predictions.
Turning spikes to space: The storage capacity of tempotrons with plastic synaptic dynamics
Neurons in the brain communicate through action potentials (spikes) that are transmitted through chemical synapses. Throughout the last decades, the question of how networks of spiking neurons represent and process information has remained an important challenge. Some progress has resulted from a recent family of supervised learning rules (tempotrons) for models of spiking neurons. However, these studies have viewed synaptic transmission as static and characterized synaptic efficacies as scalar quantities that change only on slow time scales of learning across trials but remain fixed on the fast time scales of information processing within a trial. By contrast, signal transduction at chemical synapses in the brain results from complex molecular interactions between multiple biochemical processes whose dynamics result in substantial short-term plasticity of most connections. Here we study the computational capabilities of spiking neurons whose synapses are dynamic and plastic, such that each individual synapse can learn its own dynamics. We derive tempotron learning rules for current-based leaky-integrate-and-fire neurons with different types of dynamic synapses. Introducing ordinal synapses whose efficacies depend only on the order of input spikes, we establish an upper capacity bound for spiking neurons with dynamic synapses. We compare this bound to independent synapses, static synapses and to the well-established phenomenological Tsodyks-Markram model. We show that synaptic dynamics in principle allow the storage capacity of spiking neurons to scale with the number of input spikes and that this increase in capacity can be traded for greater robustness to input noise, such as spike time jitter. Our work highlights the feasibility of a novel computational paradigm for spiking neural circuits with plastic synaptic dynamics: Rather than being determined by the fixed number of afferents, the dimensionality of a neuron's decision space can be scaled flexibly through the number of input spikes emitted by its input layer.
Finding needles in the neural haystack: unsupervised analyses of noisy data
In modern neuroscience, we often want to extract information from recordings of many neurons in the brain. Unfortunately, the activity of individual neurons is very noisy, making it difficult to relate to cognition and behavior. Thankfully, we can use the correlations across time and neurons to denoise the data we record. In particular, using recent advances in machine learning, we can build models which harness this structure in the data to extract more interpretable signals. In this talk, we present two such methods as well as examples of how they can help us gain further insights into the neural underpinnings of behavior.
STDP and the transfer of rhythmic signals in the brain
Rhythmic activity in the brain has been reported in relation to a wide range of cognitive processes. Changes in the rhythmic activity have been related to pathological states. These observations raise the question of the origin of these rhythms: can the mechanisms that generate these rhythms and allow the rhythmic signal to propagate be acquired via a process of learning? In my talk I will focus on spike-timing-dependent plasticity (STDP) and examine under what conditions this unsupervised learning rule can facilitate the propagation of rhythmic activity downstream in the central nervous system. Next, I will apply the theory of STDP to the whisker system and demonstrate how STDP can shape the distribution of preferred phases of firing in a downstream population. Interestingly, in both these cases STDP dynamics does not relax to a fixed-point solution; rather, the synaptic weights remain dynamic. Nevertheless, STDP allows for the system to retain its functionality in the face of continuous remodeling of the entire synaptic population.
Back-propagation in spiking neural networks
Back-propagation is a powerful supervised learning algorithm in artificial neural networks, because it solves the credit assignment problem (essentially: what should the hidden layers do?). This algorithm has led to the deep learning revolution. But unfortunately, back-propagation cannot be used directly in spiking neural networks (SNNs). Indeed, it requires differentiable activation functions, whereas spikes are all-or-none events which cause discontinuities. Here we present two strategies to overcome this problem. The first one is to use a so-called 'surrogate gradient', that is, to approximate the derivative of the threshold function with the derivative of a sigmoid. We will present some applications of this method for time series processing (audio, internet traffic, EEG). The second one concerns a specific class of SNNs, which process static inputs using latency coding with at most one spike per neuron. Using approximations, we derived a latency-based back-propagation rule for this class of networks, called S4NN, and applied it to image classification.
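The surrogate-gradient idea mentioned in this abstract can be illustrated with a minimal sketch. This is not the speaker's implementation: the function names (spike, surrogate_grad, grad_step), the sigmoid steepness beta, the threshold of 1.0, the squared-error loss, and the toy data are all assumptions chosen for illustration. The forward pass uses the true all-or-none threshold, while the backward pass substitutes the derivative of a sigmoid centred on the threshold.

```python
import math

def spike(v, threshold=1.0):
    """Forward pass: all-or-none step function (1 if v reaches threshold)."""
    return 1.0 if v >= threshold else 0.0

def surrogate_grad(v, threshold=1.0, beta=5.0):
    """Backward pass: derivative of sigmoid(beta * (v - threshold)),
    used in place of the step function's derivative (zero almost everywhere).
    beta is a hypothetical steepness parameter."""
    s = 1.0 / (1.0 + math.exp(-beta * (v - threshold)))
    return beta * s * (1.0 - s)

def grad_step(w, inputs, targets, lr=0.1):
    """One gradient step on half squared error for a single spiking unit."""
    grad = [0.0] * len(w)
    for x, t in zip(inputs, targets):
        v = sum(wi * xi for wi, xi in zip(w, x))  # membrane potential
        err = spike(v) - t                         # forward uses the true step
        g = err * surrogate_grad(v)                # backward uses the surrogate
        for i, xi in enumerate(x):
            grad[i] += g * xi / len(inputs)
    return [wi - lr * gi for wi, gi in zip(w, grad)]

# Toy data (assumed): the unit should spike for the first input only.
w_new = grad_step([0.0, 0.0], [[2.0, 0.0], [0.0, 2.0]], [1.0, 0.0])
print(w_new)
```

With the exact derivative of the step function, the weight update here would be zero because the unit is silent and the step is flat below threshold. The surrogate yields a small positive update to the first weight, nudging the unit towards firing for the first input.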
Deep reinforcement learning and its neuroscientific implications
The last few years have seen some dramatic developments in artificial intelligence research. What implications might these have for neuroscience? Investigations of this question have, to date, focused largely on deep neural networks trained using supervised learning, in tasks such as image classification. However, there is another area of recent AI work which has so far received less attention from neuroscientists, but which may have more profound neuroscientific implications: Deep reinforcement learning. Deep RL offers a rich framework for studying the interplay among learning, representation and decision-making, offering to the brain sciences a new set of research tools and a wide range of novel hypotheses. I’ll provide a high level introduction to deep RL, discuss some recent neuroscience-oriented investigations from my group at DeepMind, and survey some wider implications for research on brain and behavior.
Machine reasoning in histopathologic image analysis
Deep learning is an emerging computational approach inspired by the human brain’s neural connectivity that has transformed machine-based image analysis. By using histopathology as a model of an expert-level pattern recognition exercise, we explore the ability for humans to teach machines to learn and mimic image-recognition and decision making. Moreover, these models also allow exploration into the ability for computers to independently learn salient histological patterns and complex ontological relationships that parallel biological and expert knowledge without the need for explicit direction or supervision. Deciphering the overlap between human and unsupervised machine reasoning may aid in eliminating biases and improving automation and accountability for artificial intelligence-assisted vision tasks and decision-making.
Self-supervised learning in neocortical layers: how the present teaches the past
COSYNE 2022
Supervised learning and interpretation of plasticity rules in spiking neural networks
COSYNE 2022
Back to the present: self-supervised learning in neocortical microcircuits
COSYNE 2023
Pose estimation made better, easier, and faster with video semi-supervised learning on the cloud
COSYNE 2023
Self-timed self-supervised learning
COSYNE 2023
Contrastive-Equivariant self-supervised learning improves alignment with primate visual area IT
COSYNE 2025
Robust unsupervised learning of spike patterns with optimal transport theory
COSYNE 2025
Serotonergic activity in the dorsal raphe nucleus through the lens of unsupervised learning
COSYNE 2025
Self-supervised learning of spiking motifs in neurobiological data
FENS Forum 2024
Self-supervised learning using Geometric Assessment-driven Topological Smoothing (GATS) for neuron tracing and Active Learning Environment (NeuroTrALE)
FENS Forum 2024