Object
object representations
From Spiking Predictive Coding to Learning Abstract Object Representation
In the first part of the talk, I will present Predictive Coding Light (PCL), a novel unsupervised learning architecture for spiking neural networks. In contrast to conventional predictive coding approaches, which only transmit prediction errors to higher processing stages, PCL learns inhibitory lateral and top-down connectivity to suppress the most predictable spikes and passes a compressed representation of the input to higher processing stages. We show that PCL reproduces a range of biological findings and exhibits a favorable tradeoff between energy consumption and downstream classification performance on challenging benchmarks. The second part of the talk will feature our lab’s efforts to explain how infants and toddlers might learn abstract object representations without supervision. I will present deep learning models that exploit the temporal and multimodal structure of their sensory inputs to learn representations of individual objects, object categories, or abstract super-categories such as “kitchen object” in a fully unsupervised fashion. These models offer a parsimonious account of how abstract semantic knowledge may be rooted in children’s embodied first-person experiences.
Sensory cognition
This webinar features presentations from SueYeon Chung (New York University) and Srinivas Turaga (HHMI Janelia Research Campus) on theoretical and computational approaches to sensory cognition. Chung introduced a “neural manifold” framework to capture how high-dimensional neural activity is structured into meaningful manifolds reflecting object representations. She demonstrated that manifold geometry (shaped by radius, dimensionality, and correlations) directly governs a population’s capacity for classifying or separating stimuli under nuisance variations. Applying these ideas as a data analysis tool, she showed how measuring object-manifold geometry can explain transformations along the ventral visual stream and suggested that manifold principles also yield better self-supervised neural network models resembling mammalian visual cortex. Turaga described simulating the entire fruit fly visual pathway using its connectome, modeling 64 key cell types in the optic lobe. His team’s systematic approach, which combines sparse connectivity from electron microscopy with simple dynamical parameters, recapitulated known motion-selective responses and produced novel testable predictions. Together, these studies underscore the power of combining connectomic detail, task objectives, and geometric theories to unravel neural computations bridging from stimuli to cognitive functions.
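To make the notions of manifold "radius" and "dimensionality" concrete, the sketch below computes two simple geometric proxies for a single object manifold, i.e. the set of population responses to one object under nuisance variations. This is an illustration only: the function name `manifold_geometry` and both proxies are assumptions chosen for simplicity (spread relative to centroid norm, and the participation ratio of the covariance spectrum), and the formal manifold capacity theory defines radius and dimension differently.

```python
import math

def manifold_geometry(points):
    """Return (relative_radius, participation_ratio) for one object manifold.

    points: list of equal-length lists; each inner list is the response of
    N neurons to one variation of the same object (hypothetical data layout).
    """
    n = len(points)
    dim = len(points[0])

    # Manifold centroid and centered responses
    centroid = [sum(p[j] for p in points) / n for j in range(dim)]
    centered = [[p[j] - centroid[j] for j in range(dim)] for p in points]

    # Covariance matrix of the within-manifold variability
    cov = [[sum(c[i] * c[j] for c in centered) / n for j in range(dim)]
           for i in range(dim)]

    trace = sum(cov[i][i] for i in range(dim))                 # sum of eigenvalues
    trace_sq = sum(cov[i][j] ** 2                               # sum of squared eigenvalues
                   for i in range(dim) for j in range(dim))

    # Participation ratio: (sum lambda)^2 / sum lambda^2, an effective dimension
    participation_ratio = trace ** 2 / trace_sq if trace_sq > 0 else 0.0

    # RMS spread around the centroid, relative to the centroid's norm
    centroid_norm = math.sqrt(sum(c * c for c in centroid))
    relative_radius = (math.sqrt(trace) / centroid_norm
                       if centroid_norm > 0 else float("inf"))
    return relative_radius, participation_ratio
```

Under these proxies, a manifold that varies along a single axis has a participation ratio of 1, while one that varies equally along two axes has a participation ratio of 2. Intuitively, smaller and lower-dimensional manifolds are easier to separate.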
Hebbian Plasticity Supports Predictive Self-Supervised Learning of Disentangled Representations
Discriminating distinct objects and concepts from sensory stimuli is essential for survival. Our brains accomplish this feat by forming meaningful internal representations in deep sensory networks with plastic synaptic connections. Experience-dependent plasticity presumably exploits temporal contingencies between sensory inputs to build these internal representations. However, the precise mechanisms underlying plasticity remain elusive. We derive a local synaptic plasticity model inspired by self-supervised machine learning techniques that shares a deep conceptual connection to Bienenstock-Cooper-Munro (BCM) theory and is consistent with experimentally observed plasticity rules. We show that our plasticity model yields disentangled object representations in deep neural networks without the need for supervision or implausible negative examples. In response to altered visual experience, our model qualitatively captures neuronal selectivity changes observed in the monkey inferotemporal cortex in vivo. Our work suggests a plausible learning rule to drive learning in sensory networks while making concrete testable predictions.
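For readers unfamiliar with BCM theory, the sketch below shows one update step of the classic BCM rule for a single linear neuron. It is not the plasticity model derived in the talk, only the textbook rule it is said to relate to. The function name `bcm_step` and the parameters `eta` (learning rate) and `tau` (threshold update rate) are illustrative assumptions.

```python
def bcm_step(w, x, theta, eta=0.01, tau=0.1):
    """One step of the classic BCM rule (illustrative sketch).

    w:     list of synaptic weights
    x:     list of presynaptic activities (same length as w)
    theta: current sliding modification threshold
    Returns (new_weights, new_theta, y).
    """
    # Postsynaptic activity of a linear neuron
    y = sum(wi * xi for wi, xi in zip(w, x))

    # Hebbian term x * y, gated by (y - theta):
    # potentiation when y exceeds the threshold, depression when below it
    w_new = [wi + eta * xi * y * (y - theta) for wi, xi in zip(w, x)]

    # The threshold slides toward a running average of y**2,
    # which stabilizes the otherwise runaway Hebbian growth
    theta_new = theta + tau * (y * y - theta)
    return w_new, theta_new, y
```

The update is local: each weight change depends only on its own presynaptic input, the postsynaptic activity, and a threshold computed from the neuron's own activity history. With `w = [0.5, 0.5]` and `x = [1.0, 0.0]`, the active synapse is depressed when `theta = 1.0` (activity below threshold) and potentiated when `theta = 0.1` (activity above threshold), while the inactive synapse is unchanged in both cases.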
Smart perception?: Gestalt grouping, perceptual averaging, and memory capacity
It seems we see the world in full detail. However, the eye is not a camera, nor is the brain a computer. Severe metabolic constraints render us unable to encode more than a fraction of the information available in each glance. Instead, our illusion of stable and complete perception is accomplished by parsimonious representation relying on the natural order inherent in the surrounding environment. I will begin by discussing previous behavioral work from our lab demonstrating one such strategy, by which the visual system represents average properties of Gestalt-grouped sets of individual objects, warping individual object representations toward the Gestalt-defined mean. I will then discuss ongoing work that combines a behavioral index of averaging Gestalt-grouped information, established in our previous work, with an ERP index of VSTM capacity (the CDA) to measure whether the Gestalt-grouping and perceptual averaging strategy acts to boost memory capacity above the classic “four-item” limit. Finally, I will outline our pre-registered study to determine whether this perceptual strategy is indeed engaged in a “smart” manner under normal circumstances, or compromises fidelity for capacity by perceptually averaging in trials with only four items that could otherwise be individually represented.
The contribution of the dorsal visual pathway to perception and action
The human visual system enables us to recognize objects (e.g., this is a cup) and act upon them (e.g., grasp the cup) with astonishing ease and accuracy. For decades, it was widely accepted that these different functions rely on two separate cortical pathways. The ventral occipitotemporal pathway subserves object recognition, while the dorsal occipitoparietal pathway promotes visually guided actions. In my talk, I will discuss recent evidence from a series of neuropsychological, developmental and neuroimaging studies that aimed to explore the nature of object representations in the dorsal pathway. The results from these studies highlight the plausible role of the dorsal pathway in object perception and reveal an interplay between shape representations derived by the two pathways. Together, these findings challenge the binary distinction between the two pathways and are consistent with the view that object recognition is not the sole product of ventral pathway computations, but instead relies on a distributed network of regions.