Generalization
Computational Mechanisms of Predictive Processing in Brains and Machines
Predictive processing offers a unifying view of neural computation, proposing that brains continuously anticipate sensory input and update internal models based on prediction errors. In this talk, I will present converging evidence for the computational mechanisms underlying this framework across human neuroscience and deep neural networks. I will begin with recent work showing that large-scale distributed prediction-error encoding in the human brain directly predicts how sensory representations reorganize through predictive learning. I will then turn to PredNet, a popular predictive coding inspired deep network that has been widely used to model real-world biological vision systems. Using dynamic stimuli generated with our Spatiotemporal Style Transfer algorithm, we demonstrate that PredNet relies primarily on low-level spatiotemporal structure and remains insensitive to high-level content, revealing limits in its generalization capacity. Finally, I will discuss new recurrent vision models that integrate top-down feedback connections with intrinsic neural variability, uncovering a dual mechanism for robust sensory coding in which neural variability decorrelates unit responses, while top-down feedback stabilizes network dynamics. Together, these results outline how prediction error signaling and top-down feedback pathways shape adaptive sensory processing in biological and artificial systems.
Alessio Del Bue
The Italian Institute of Technology (IIT) and the University of Genoa are offering 4 PhD scholarships on Computational Vision, Automatic Recognition, and Learning. Research and training activities will be jointly conducted between the DITEN Department of the University of Genoa and IIT infrastructures in Genoa, at the PAVIS - Pattern Analysis and Computer Vision Research line. The PhD program will focus on various research topics, including 3D scene understanding, multi-modal learning, self-supervised and unsupervised deep learning, generative models for human and scene generation, novel graph operators for learning on large-scale and temporal data, and domain adaptation and generalization.
Justus Piater
The Intelligent and Interactive Systems lab uses machine learning to enhance the flexibility, robustness, generalization and explainability of robots and vision systems, focusing on methods for learning about structure, function, and other concepts that describe the world in actionable ways. Three University-Assistant Positions involve minor teaching duties with negotiable research topics within the lab's scope. One Project Position involves the integration of robotic perception and execution mechanisms for task-oriented object manipulation in everyday environments, with a focus on affordance-driven object part segmentation and object manipulation using reinforcement learning.
Learning representations of specifics and generalities over time
There is a fundamental tension between storing discrete traces of individual experiences, which allows recall of particular moments in our past without interference, and extracting regularities across these experiences, which supports generalization and prediction in similar situations in the future. One influential proposal for how the brain resolves this tension is that it separates the processes anatomically into Complementary Learning Systems, with the hippocampus rapidly encoding individual episodes and the neocortex slowly extracting regularities over days, months, and years. But this does not explain our ability to learn and generalize from new regularities in our environment quickly, often within minutes. We have put forward a neural network model of the hippocampus that suggests that the hippocampus itself may contain complementary learning systems, with one pathway specializing in the rapid learning of regularities and a separate pathway handling the region’s classic episodic memory functions. This proposal has broad implications for how we learn and represent novel information of specific and generalized types, which we test across statistical learning, inference, and category learning paradigms. We also explore how this system interacts with slower-learning neocortical memory systems, with empirical and modeling investigations into how the hippocampus shapes neocortical representations during sleep. Together, the work helps us understand how structured information in our environment is initially encoded and how it then transforms over time.
Relations and Predictions in Brains and Machines
Humans and animals learn and plan with flexibility and efficiency well beyond that of modern Machine Learning methods. This is hypothesized to owe in part to the ability of animals to build structured representations of their environments, and modulate these representations to rapidly adapt to new settings. In the first part of this talk, I will discuss theoretical work describing how learned representations in hippocampus enable rapid adaptation to new goals by learning predictive representations, while entorhinal cortex compresses these predictive representations with spectral methods that support smooth generalization among related states. I will also cover recent work extending this account, in which we show how the predictive model can be adapted to the probabilistic setting to describe a broader array of generalization results in humans and animals, and how entorhinal representations can be modulated to support sample generation optimized for different behavioral states. In the second part of the talk, I will overview some of the ways in which we have combined many of the same mathematical concepts with state-of-the-art deep learning methods to improve efficiency and performance in machine learning applications like physical simulation, relational reasoning, and design.
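To make the idea of a compressed predictive representation concrete, the sketch below uses the successor representation, one common formalization of a predictive representation. Treating it as the model the abstract refers to is an assumption, and the ring-shaped environment, the discount value, and all function names are illustrative only. It builds the representation for a random walk and keeps only its leading eigenvectors, a simple spectral compression.

```python
import numpy as np

def ring_transition_matrix(n_states):
    """Random walk on a ring: step left or right with probability 0.5 each."""
    T = np.zeros((n_states, n_states))
    for s in range(n_states):
        T[s, (s - 1) % n_states] = 0.5
        T[s, (s + 1) % n_states] = 0.5
    return T

def successor_representation(T, gamma):
    """Expected discounted future state occupancy: M = (I - gamma * T)^-1."""
    n = T.shape[0]
    return np.linalg.inv(np.eye(n) - gamma * T)

def low_rank_approximation(M, k):
    """Rebuild a symmetric M from its k largest-eigenvalue eigenvectors."""
    eigvals, eigvecs = np.linalg.eigh(M)      # eigh returns ascending order
    top = np.argsort(eigvals)[::-1][:k]       # indices of the k largest
    return eigvecs[:, top] @ np.diag(eigvals[top]) @ eigvecs[:, top].T

T = ring_transition_matrix(20)
M = successor_representation(T, gamma=0.9)
for k in (3, 9, 20):
    error = np.linalg.norm(M - low_rank_approximation(M, k))
    print(f"rank {k:2d} reconstruction error: {error:.4f}")
```

On this ring the leading eigenvectors are smooth periodic functions over states, so a truncated basis generalizes smoothly between neighboring states, which is the intuition behind the spectral account.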
Analogical Reasoning and Generalization for Interactive Task Learning in Physical Machines
Humans are natural teachers; learning through instruction is one of the most fundamental ways that we learn. Interactive Task Learning (ITL) is an emerging research agenda that studies the design of complex intelligent robots that can acquire new knowledge through natural human teacher-robot learner interactions. ITL methods are particularly useful for designing intelligent robots whose behavior can be adapted by humans collaborating with them. In this talk, I will summarize our recent findings on the structure that human instruction naturally has and motivate an intelligent system design that can exploit this structure. The system, AILEEN, is being developed using the Common Model of Cognition. Architectures that implement the Common Model of Cognition (Soar, ACT-R, and Sigma) have a prominent place in research on cognitive modeling as well as on designing complex intelligent agents. However, they miss a critical piece of intelligent behavior: analogical reasoning and generalization. I will introduce a new memory, concept memory, that integrates with a Common Model of Cognition architecture and supports ITL.
Memory-enriched computation and learning in spiking neural networks through Hebbian plasticity
Memory is a key component of biological neural systems that enables the retention of information over a huge range of temporal scales, ranging from hundreds of milliseconds up to years. While Hebbian plasticity is believed to play a pivotal role in biological memory, it has so far been analyzed mostly in the context of pattern completion and unsupervised learning. Here, we propose that Hebbian plasticity is fundamental for computations in biological neural systems. We introduce a novel spiking neural network (SNN) architecture that is enriched by Hebbian synaptic plasticity. We experimentally show that our memory-equipped SNN model outperforms state-of-the-art deep learning mechanisms in a sequential pattern-memorization task and demonstrates superior out-of-distribution generalization capabilities compared to these models. We further show that our model can be successfully applied to one-shot learning and classification of handwritten characters, improving over the state-of-the-art SNN model. We also demonstrate the capability of our model to learn associations for audio-to-image synthesis from spoken and handwritten digits. Our SNN model further presents a novel solution to a variety of cognitive question answering tasks from a standard benchmark, achieving performance comparable to both memory-augmented ANN and SNN-based state-of-the-art solutions to this problem. Finally, we demonstrate that our model is able to learn from rewards on an episodic reinforcement learning task and attain a near-optimal strategy on a memory-based card game. Hence, our results show that Hebbian enrichment renders spiking neural networks surprisingly versatile in terms of their computational as well as learning capabilities. Since local Hebbian plasticity can easily be implemented in neuromorphic hardware, this also suggests that powerful cognitive neuromorphic systems can be built based on this principle.
Flexible multitask computation in recurrent networks utilizes shared dynamical motifs
Flexible computation is a hallmark of intelligent behavior. Yet, little is known about how neural networks contextually reconfigure for different computations. Humans are able to perform a new task without extensive training, presumably through the composition of elementary processes that were previously learned. Cognitive scientists have long hypothesized the possibility of a compositional neural code, where complex neural computations are made up of constituent components; however, the neural substrate underlying this structure remains elusive in biological and artificial neural networks. Here we identified an algorithmic neural substrate for compositional computation through the study of multitasking artificial recurrent neural networks. Dynamical systems analyses of networks revealed learned computational strategies that mirrored the modular subtask structure of the task-set used for training. Dynamical motifs such as attractors, decision boundaries and rotations were reused across different task computations. For example, tasks that required memory of a continuous circular variable repurposed the same ring attractor. We show that dynamical motifs are implemented by clusters of units and are reused across different contexts, allowing for flexibility and generalization of previously learned computation. Lesioning these clusters resulted in modular effects on network performance: a lesion that destroyed one dynamical motif only minimally perturbed the structure of other dynamical motifs. Finally, modular dynamical motifs could be reconfigured for fast transfer learning. After slow initial learning of dynamical motifs, a subsequent faster stage of learning reconfigured motifs to perform novel tasks. This work contributes to a more fundamental understanding of compositional computation underlying flexible general intelligence in neural systems. 
We present a conceptual framework that establishes dynamical motifs as a fundamental unit of computation, intermediate between the neuron and the network. As more whole brain imaging studies record neural activity from multiple specialized systems simultaneously, the framework of dynamical motifs will guide questions about specialization and generalization across brain regions.
Neuroscience of socioeconomic status and poverty: Is it actionable?
SES neuroscience, using imaging and other methods, has revealed generalizations of interest for population neuroscience and the study of individual differences. But beyond its scientific interest, SES is a topic of societal importance. Does neuroscience offer any useful insights for promoting socioeconomic justice and reducing the harms of poverty? In this talk I will use research from my own lab and others’ to argue that SES neuroscience has the potential to contribute to policy in this area, although its application is premature at present. I will also attempt to forecast the ways in which practical solutions to the problems of poverty may emerge from SES neuroscience. Bio: Martha Farah has conducted groundbreaking research on face and object recognition, visual attention, mental imagery, and semantic memory and - in more recent times - has been at the forefront of interdisciplinary research into neuroscience and society. This deals with topics such as using fMRI for lie detection, ethics of cognitive enhancement, and effects of social deprivation on brain development.
Emergence of homochirality in large molecular systems
The question of the origin of homochirality of living matter, or the dominance of one handedness for all molecules of life across the entire biosphere, is a long-standing puzzle in the research on the Origin of Life. In the fifties, Frank proposed a mechanism to explain homochirality based on the properties of a simple autocatalytic network containing only a few chemical species. Following this work, chemists struggled to find experimental realizations of this model, possibly due to a lack of proper methods to identify autocatalysis [1]. In any case, a model based on a few chemical species seems rather limited, because prebiotic earth is likely to have consisted of complex ‘soups’ of chemicals. To include this aspect of the problem, we recently proposed a mechanism based on certain features of large out-of-equilibrium chemical networks [2]. We showed that a phase transition towards a homochiral state is likely to occur as the number of chiral species in the system becomes large or as the amount of free energy injected into the system increases. Through an analysis of large chemical databases, we showed that there is no need for very large molecules for chiral species to dominate over achiral ones; it already happens when molecules contain about 10 heavy atoms. We also analyzed the various conventions used to measure chirality and discussed the relative chiral signs adopted by different groups of molecules [3]. We then proposed a generalization of Frank’s model for large chemical networks, which we characterized using random matrix theory. This analysis includes sparse networks, suggesting that the emergence of homochirality is a robust and generic transition. References: [1] A. Blokhuis, D. Lacoste, and P. Nghe, PNAS (2020), 117, 25230. [2] G. Laurent, D. Lacoste, and P. Gaspard, PNAS (2021) 118 (3) e2012741118. [3] G. Laurent, D. Lacoste, and P. Gaspard, Proc. R. Soc. A 478:20210590 (2022).
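For readers unfamiliar with Frank's mechanism, the sketch below integrates a textbook two-species form of it: each enantiomer catalyzes its own production and is destroyed by the other. It is not the generalized large-network model described in the abstract. The rate constants, the constant-feed assumption, the initial concentrations, and the function names are all illustrative assumptions.

```python
def simulate_frank(l0, d0, growth=1.0, antagonism=1.0, dt=0.001, steps=12000):
    """Forward-Euler integration of a simplified Frank model.

    dL/dt = L * (growth - antagonism * D)
    dD/dt = D * (growth - antagonism * L)
    The achiral feedstock is assumed constant and folded into `growth`.
    """
    l, d = l0, d0
    for _ in range(steps):
        dl = l * (growth - antagonism * d)
        dd = d * (growth - antagonism * l)
        l, d = l + dt * dl, d + dt * dd
    return l, d

def enantiomeric_excess(l, d):
    """Ranges from -1 (pure D) to +1 (pure L); 0 means racemic."""
    return (l - d) / (l + d)

# A 1% initial imbalance in favor of L is amplified over time.
l_final, d_final = simulate_frank(0.0101, 0.0100)
print("initial ee:", enantiomeric_excess(0.0101, 0.0100))
print("final ee:  ", enantiomeric_excess(l_final, d_final))
```

In this form the difference L - D grows exponentially while mutual antagonism suppresses the minority species, so a small initial imbalance is driven towards a nearly homochiral state, whereas an exactly racemic start stays racemic.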
Parametric control of flexible timing through low-dimensional neural manifolds
Biological brains possess an exceptional ability to infer relevant behavioral responses to a wide range of stimuli from only a few examples. This capacity to generalize beyond the training set has proven particularly challenging to realize in artificial systems. How neural processes enable this capacity to extrapolate to novel stimuli is a fundamental open question. A prominent but underexplored hypothesis suggests that generalization is facilitated by a low-dimensional organization of collective neural activity, yet evidence for the underlying neural mechanisms remains wanting. Combining network modeling, theory and neural data analysis, we tested this hypothesis in the framework of flexible timing tasks, which rely on the interplay between inputs and recurrent dynamics. We first trained recurrent neural networks on a set of timing tasks while minimizing the dimensionality of neural activity by imposing low-rank constraints on the connectivity, and compared the performance and generalization capabilities with networks trained without any constraint. We then examined the trained networks, characterized the dynamical mechanisms underlying the computations, and verified their predictions in neural recordings. Our key finding is that low-dimensional dynamics strongly increase the ability to extrapolate to inputs outside of the range used in training. Critically, this capacity to generalize relies on controlling the low-dimensional dynamics by a parametric contextual input. We found that this parametric control of extrapolation was based on a mechanism where tonic inputs modulate the dynamics along non-linear manifolds in activity space while preserving their geometry. Comparisons with neural recordings in the dorsomedial frontal cortex of macaque monkeys performing flexible timing tasks confirmed the geometric and dynamical signatures of this mechanism.
Altogether, our results tie together a number of previous experimental findings and suggest that the low-dimensional organization of neural dynamics plays a central role in generalizable behaviors.
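The link between low-rank connectivity and low-dimensional activity can be illustrated with a small, untrained example. The sketch below assumes a rank-one network with rate dynamics dx/dt = -x + (m n^T / N) tanh(x) + u * I and random vectors; it is not the trained model from the talk, and every name and parameter here is hypothetical. Starting from rest, the activity can only move along the connectivity vector m and the input vector I, whatever the value of the tonic input u.

```python
import numpy as np

def simulate_rank_one_rnn(m, n, input_vector, tonic_input, dt=0.1, steps=200):
    """Euler-integrate a rate network with rank-one connectivity m n^T / N."""
    n_units = len(m)
    x = np.zeros(n_units)                            # start from rest
    for _ in range(steps):
        recurrent = m * (n @ np.tanh(x)) / n_units   # always parallel to m
        x = x + dt * (-x + recurrent + tonic_input * input_vector)
    return x

def residual_outside_span(x, basis_vectors):
    """Norm of the part of x that the basis vectors cannot explain."""
    B = np.stack(basis_vectors, axis=1)
    coeffs, *_ = np.linalg.lstsq(B, x, rcond=None)
    return float(np.linalg.norm(x - B @ coeffs))

rng = np.random.default_rng(0)
m, n, input_vector = (rng.standard_normal(100) for _ in range(3))
for u in (0.25, 0.5, 1.0):
    x = simulate_rank_one_rnn(m, n, input_vector, tonic_input=u)
    print(u, residual_outside_span(x, [m, input_vector]))
```

The residual stays at numerical zero for every tonic input, so in this toy setting changing u moves the state within the same two-dimensional subspace of the 100-dimensional activity space.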
Why Some Intelligent Agents are Conscious
In this talk I will present an account of how an agent designed or evolved to be intelligent may come to enjoy subjective experiences. First, the agent is stipulated to be capable of (meta)representing subjective ‘qualitative’ sensory information, in the sense that it can easily assess exactly how similar a sensory signal is to all other possible sensory signals. This information is subjective in the sense that it concerns how the different stimuli can be distinguished by the agent itself, rather than how physically similar they are. For this to happen, sensory coding needs to satisfy sparsity and smoothness constraints, which are known to facilitate metacognition and generalization. Second, this qualitative information can under some specific circumstances acquire an ‘assertoric force’. This happens when a certain self-monitoring mechanism decides that the qualitative information reliably tracks the current state of the world, and informs a general symbolic reasoning system of this fact. I will argue that having subjective conscious experiences amounts to nothing more than having qualitative sensory information acquire an assertoric status within one’s belief system. When this happens, the perceptual content presents itself as reflecting the state of the world right now, in ways that seem undeniably rational to the agent. At the same time, without effort, the agent also knows what the perceptual content is like, in terms of how subjectively similar it is to all other possible percepts. I will discuss the computational benefits of this architecture, for which consciousness might have arisen as a byproduct.
Novel word generalization in comparison designs: How do young children align stimuli when they learn object nouns and relational nouns?
It is well established that the opportunity to compare learning stimuli in a novel word learning/extension task elicits a larger number of conceptually relevant generalizations than standard no-comparison conditions. I will present results suggesting that the effectiveness of comparison depends on factors such as semantic distance, number of training items, dimension distinctiveness and interactions with age. I will address these issues in the case of familiar and unfamiliar object nouns and relational nouns. The alignment strategies followed by children during learning and at test (i.e., when learning items are compared and how children reach a solution) will be described with eye-tracking data. We will also assess the extent to which children’s performance in these tasks is associated with executive functions (inhibition and flexibility) and world knowledge. Finally, we will consider these issues in children with cognitive deficits (intellectual deficiency, DLD).
Abstraction doesn't happen all at once (despite what some models of concept learning suggest)
In the past few years, there has been growing evidence that the basic ability for relational generalization starts in early infancy, with 3-month-olds seeming to learn relational abstractions with little training. Further, work with toddlers seems to suggest that relational generalizations are no more difficult than those based on objects, and that toddlers can readily consider both simultaneously. Likewise, causal learning research with adults suggests that people infer causal relationships at multiple levels of abstraction simultaneously as they learn about novel causal systems. These findings all appear counter to theories of concept learning that posit that when concepts are first learned, they tend to be concrete (tied to specific contexts and features) and that abstraction proceeds incrementally as learners encounter more examples. The current talk will not question the veracity of any of these findings but will present several others from my and others’ research on relational learning that suggest that when the perceptual or conceptual content becomes more complex, patterns of incremental abstraction re-emerge. Further, the specific contexts and task parameters that support or hinder abstraction reveal the underlying cognitive processes. I will then consider whether the models that posit simultaneous, immediate learning at multiple levels of abstraction can accommodate these more complex patterns.
On the implicit bias of SGD in deep learning
Tali's work emphasized the tradeoff between compression and information preservation. In this talk I will explore this theme in the context of deep learning. Artificial neural networks have recently revolutionized the field of machine learning. However, we still do not have sufficient theoretical understanding of how such models can be successfully learned. Two specific questions in this context are: how can neural nets be learned despite the non-convexity of the learning problem, and how can they generalize well despite often having more parameters than training data. I will describe our recent work showing that gradient-descent optimization indeed leads to 'simpler' models, where simplicity is captured by lower weight norm and in some cases clustering of weight vectors. We demonstrate this for several teacher and student architectures, including learning linear teachers with ReLU networks, learning boolean functions and learning convolutional pattern detection architectures.
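The simplest setting in which gradient descent provably favors low-norm solutions is underdetermined linear regression, and the sketch below illustrates it. This is a standard textbook case chosen for illustration, not the ReLU or convolutional teacher-student results described in the abstract; the data, learning rate, and function names are assumptions. Starting from zero weights, gradient descent on the squared loss converges to the interpolating solution with the smallest Euclidean norm.

```python
def gradient_descent_least_squares(X, y, lr=0.1, steps=500):
    """Minimize 0.5 * ||Xw - y||^2 by plain gradient descent from w = 0."""
    n_features = len(X[0])
    w = [0.0] * n_features
    for _ in range(steps):
        # Residual of each training example under the current weights.
        residuals = [
            sum(wj * xj for wj, xj in zip(w, row)) - target
            for row, target in zip(X, y)
        ]
        # Gradient is X^T (Xw - y).
        grad = [
            sum(r * row[j] for r, row in zip(residuals, X))
            for j in range(n_features)
        ]
        w = [wj - lr * gj for wj, gj in zip(w, grad)]
    return w

def squared_norm(w):
    return sum(wj * wj for wj in w)

# Two examples, three parameters: infinitely many weight vectors fit exactly.
X = [[1.0, 0.0, 1.0], [0.0, 1.0, 1.0]]
y = [1.0, 2.0]
w_gd = gradient_descent_least_squares(X, y)
w_other = [1.0, 2.0, 0.0]   # another exact fit, with a larger norm
print("gradient descent:", w_gd, squared_norm(w_gd))
print("alternative fit: ", w_other, squared_norm(w_other))
```

Because every update is a combination of the training inputs, the weights never leave the span of the data, which is why the minimum-norm interpolator is selected without any explicit regularizer.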
Towards a Theory of Human Visual Reasoning
Many tasks that are easy for humans are difficult for machines. In particular, while humans excel at tasks that require generalising across problems, machine systems notably struggle. One such task that has received a good amount of attention is the Synthetic Visual Reasoning Test (SVRT). The SVRT consists of a range of problems where simple visual stimuli must be categorised into one of two categories based on an unknown rule that must be induced. Conventional machine learning approaches perform well only when trained to categorise based on a single rule and are unable to generalise to tasks with any additional rules without extensive additional training. Multiple theories of higher-level cognition posit that humans solve such tasks using structured relational representations. Specifically, people learn rules based on structured representations that generalise to novel instances quickly and easily. We believe it is possible to model this approach in a single system which learns all the required relational representations from scratch and performs tasks such as SVRT in a single run. Here, we present a system which expands the DORA/LISA architecture and augments the existing model with two novel components, namely a) visual reasoning based on the established theories of recognition by components; b) the process of learning complex relational representations by synthesis (in addition to learning by analysis). The proposed augmented model matches human behaviour on SVRT problems. Moreover, the proposed system stands as perhaps a more realistic account of human cognition, wherein rather than using tools that have proven successful in the machine learning field to inform psychological theorising, we use established psychological theories to inform the development of a machine system.
Children's relational noun generalization strategies
A common result is that comparison settings (i.e., several stimuli introduced simultaneously) favor conceptualization and generalization. However, little is still known about the solving strategies children use to compare and generalize novel words. Understanding the temporal dynamics of children’s solving strategies may help assess which processes underlie generalization. We tested children in noun and relational noun generalization tasks and collected eye-tracking data. To analyze and interpret the data, we followed predictions made by existing models of analogical reasoning and generalization. The data reveal clear patterns of exploration in which participants compare learning items before searching for a solution. Analyses of the beginning of trials show that early comparisons favor generalization and that errors may be caused by a lack of early comparison. Children then pursue their search in different ways according to the task. In this presentation I will present the generalization strategies revealed by eye tracking, compare the strategies from both tasks and evaluate them against existing models.
Beyond the binding problem: From basic affordances to symbolic thought
Human cognitive abilities seem qualitatively different from the cognitive abilities of other primates, a difference Penn, Holyoak, and Povinelli (2008) attribute to role-based relational reasoning—inferences and generalizations based on the relational roles to which objects (and other relations) are bound, rather than just the features of the objects themselves. Role-based relational reasoning depends on the ability to dynamically bind arguments to relational roles. But dynamic binding cannot be sufficient for relational thinking: Some non-human animals solve the dynamic binding problem, at least in some domains; and many non-human species generalize affordances to completely novel objects and scenes, a kind of universal generalization that likely depends on dynamic binding. If they can solve the dynamic binding problem, then why can they not reason about relations? What are they missing? I will present simulations with the LISA model of analogical reasoning (Hummel & Holyoak, 1997, 2003) suggesting that the missing pieces are multi-role integration (the capacity to combine multiple role bindings into complete relations) and structure mapping (the capacity to map different systems of role bindings onto one another). When LISA is deprived of either of these capacities, it can still generalize affordances universally, but it cannot reason symbolically; granted both abilities, LISA enjoys the full power of relational (symbolic) thought. I speculate that one reason it may have taken relational reasoning so long to evolve is that it required evolution to solve both problems simultaneously, since neither multi-role integration nor structure mapping appears to confer any adaptive advantage over simple role binding on its own.
Analogical encodings and recodings
This talk will focus on the idea that the kind of similarity driving analogical retrieval is determined by the kind of features encoded regarding the source and the target cue situations. Emphasis will be put on educational perspectives in order to show the influence of world semantics on learners’ problem representations and solving strategies, as well as the difficulties arising from semantic incongruence between representations and strategies. Special attention will be given to the recoding of semantically incongruent representations, a crucial step that learners struggle with, in order to illustrate a promising path for going beyond informal strategies.
Dopaminergic modulation of synaptic plasticity in learning and psychiatric disorders
Transient changes in dopamine activity in response to reward and punishment have been known to regulate reward-related learning. However, the cellular basis for detecting transient dopamine signaling has long been unclear. Using two-photon microscopy and optogenetics, I have shown that transient increases and decreases of dopamine modulate plasticity of dopamine D1 and D2 receptor-expressing cells in the nucleus accumbens, respectively. At the behavioral level, I found that these D1 and D2 cells cooperatively tune learning through generalization and discrimination learning. Interestingly, disturbance of dopamine signaling impaired D2 cell plasticity and discrimination learning, which was analogous to the salience misattribution seen in subjects with schizophrenia.
Transforming task representations
Humans can adapt to a novel task on their first try. By contrast, artificial intelligence systems often require immense amounts of data to adapt. In this talk, I will discuss my recent work (https://www.pnas.org/content/117/52/32970) on creating deep learning systems that can adapt on their first try by exploiting relationships between tasks. Specifically, the approach is based on transforming a representation for a known task to produce a representation for the novel task, by inferring and then using a higher-order function that captures a relationship between the tasks. This approach can be interpreted as a type of analogical reasoning. I will show that task transformation can allow systems to adapt to novel tasks on their first try in domains ranging from card games, to mathematical objects, to image classification and reinforcement learning. I will discuss the analogical interpretation of this approach, an analogy between levels of abstraction within the model architecture that I refer to as homoiconicity, and what this work might suggest about using deep-learning models to infer analogies more generally.
Understanding how a hippocampal inhibitory microcircuit contributes to memory consolidation and generalization
Generalizing theories of cerebellum-like learning
Since the theories of Marr, Ito, and Albus, the cerebellum has provided an attractive well-characterized model system to investigate biological mechanisms of learning. In recent years, theories have been developed that provide a normative account for many features of the anatomy and function of cerebellar cortex and cerebellum-like systems, including the distribution of parallel fiber-Purkinje cell synaptic weights, the expansion in neuron number of the granule cell layer and their synaptic in-degree, and sparse coding by granule cells. Typically, these theories focus on the learning of random mappings between uncorrelated inputs and binary outputs, an assumption that may be reasonable for certain forms of associative conditioning but is also quite far from accounting for the important role the cerebellum plays in the control of smooth movements. I will discuss in-progress work with Marjorie Xie, Samuel Muscinelli, and Kameron Decker Harris generalizing these learning theories to correlated inputs and general classes of smooth input-output mappings. Our studies build on earlier work in theoretical neuroscience as well as recent advances in the kernel theory of wide neural networks. They illuminate the role of pre-expansion structures in processing input stimuli and the significance of sparse granule cell activity. If there is time, I will also discuss preliminary work with Jack Lindsey extending these theories beyond cerebellum-like structures to recurrent networks.
Cross Domain Generalisation in Humans and Machines
Recent advances in deep learning have produced models that far outstrip human performance in a number of domains. However, where machine learning approaches still fall far short of human-level performance is in the capacity to transfer knowledge across domains. While a human learner will happily apply knowledge acquired in one domain (e.g., mathematics) to a different domain (e.g., cooking; a vinaigrette is really just a ratio between edible fat and acid), machine learning models still struggle profoundly at such tasks. I will present a case that human intelligence might be (at least partially) usefully characterised by our ability to transfer knowledge widely, and a framework that we have developed for learning representations that support such transfer. The model is compared to current machine learning approaches.
Surprising generalizations in the neural implementation of Hebrew and English word reading
Generalization guided exploration
How do people learn in real-world environments where the space of possible actions can be vast or even infinite? The study of human learning has made rapid progress in past decades, from discovering the neural substrate of reward prediction errors, to building AI capable of mastering the game of Go. Yet this line of research has primarily focused on learning through repeated interactions with the same stimuli. How are humans able to rapidly adapt to novel situations and learn from such sparse examples? I propose a theory of how generalization guides human learning, by making predictions about which unobserved options are most promising to explore. Inspired by Roger Shepard’s law of generalization, I show how a Bayesian function learning model provides a mechanism for generalizing limited experiences to a wide set of novel possibilities, based on the simple principle that similar actions produce similar outcomes. This model of generalization generates predictions about the expected reward and underlying uncertainty of unexplored options, where both are vital components in how people actively explore the world. This model allows us to explain developmental differences in the explorative behavior of children, and suggests a general principle of learning across spatial, conceptual, and structured domains.
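As a rough illustration of the idea in this abstract, the sketch below uses Gaussian process regression as one possible Bayesian function learning model, combined with an upper-confidence-bound choice rule. It is a minimal sketch under stated assumptions, not the speaker's implementation: the radial basis function kernel, the zero-mean prior, the `length_scale`, `noise`, and `beta` parameters, and the toy rewards are all illustrative choices.

```python
import numpy as np

def rbf_kernel(a, b, length_scale=1.0):
    """Similarity between options: nearby options get values close to 1."""
    diff = a[:, None] - b[None, :]
    return np.exp(-0.5 * (diff / length_scale) ** 2)

def gp_posterior(x_obs, y_obs, x_all, length_scale=1.0, noise=1e-3):
    """Posterior mean and standard deviation of reward at every option."""
    k_oo = rbf_kernel(x_obs, x_obs, length_scale) + noise * np.eye(len(x_obs))
    k_ao = rbf_kernel(x_all, x_obs, length_scale)
    mean = k_ao @ np.linalg.solve(k_oo, y_obs)
    # The prior variance is 1 for this kernel; observations reduce it nearby.
    var = 1.0 - np.sum(k_ao * np.linalg.solve(k_oo, k_ao.T).T, axis=1)
    return mean, np.sqrt(np.clip(var, 0.0, None))

def ucb_choice(mean, sd, beta=1.0):
    """Pick the option with the best mix of expected reward and uncertainty."""
    return int(np.argmax(mean + beta * sd))

# Toy example (assumed values): ten options on a line, two of them observed.
options = np.arange(10, dtype=float)
x_obs = np.array([2.0, 7.0])
y_obs = np.array([1.0, -1.0])

mean, sd = gp_posterior(x_obs, y_obs, options)
choice = ucb_choice(mean, sd)
```

Because similar options are assumed to yield similar outcomes, the single good outcome at option 2 raises the expected reward of its unobserved neighbours, while the uncertainty term still favours options that have not been tried.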
Making neural nets simple enough to succeed at universal relational generalization
Traditional brain-style (connectionist) approaches basically hit a wall when it comes to relational cognition. As an alternative to the well-known approaches of structured connectionism and deep learning, I present an engine for relational pattern recognition based on minimalist reinterpretations of first principles of connectionism. Results of computational experiments will be discussed on problems testing relational learning and universal generalization.
The geometry of abstraction in hippocampus and pre-frontal cortex
The curse of dimensionality plagues models of reinforcement learning and decision-making. The process of abstraction solves this by constructing abstract variables describing features shared by different specific instances, reducing dimensionality and enabling generalization in novel situations. Here we characterized neural representations in monkeys performing a task where a hidden variable described the temporal statistics of stimulus-response-outcome mappings. Abstraction was defined operationally using the generalization performance of neural decoders across task conditions not used for training. This type of generalization requires a particular geometric format of neural representations. Neural ensembles in dorsolateral pre-frontal cortex, anterior cingulate cortex and hippocampus, and in simulated neural networks, simultaneously represented multiple hidden and explicit variables in a format reflecting abstraction. Task events engaging cognitive operations modulated this format. These findings elucidate how the brain and artificial systems represent abstract variables, variables critical for generalization that in turn confers cognitive flexibility.
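The operational definition above (a decoder trained on some task conditions and tested on others) can be sketched on synthetic data. This is a minimal sketch, not the authors' analysis pipeline: the simulated population, the idealized orthogonal coding axes, the noise level, and the mean-difference decoder are all assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n_units, n_trials = 50, 100

# Assumed "abstract" geometry: each binary variable shifts activity along
# its own axis, and the two axes are made orthogonal for simplicity.
axis_a = rng.normal(size=n_units)
axis_b = rng.normal(size=n_units)
axis_b -= (axis_b @ axis_a) / (axis_a @ axis_a) * axis_a

def simulate(a, b):
    """Noisy population responses for one condition (a and b are 0 or 1)."""
    centre = a * axis_a + b * axis_b
    return centre + 0.5 * rng.normal(size=(n_trials, n_units))

def fit_mean_difference_decoder(x0, x1):
    """Linear decoder: project onto the class-mean difference, split at the midpoint."""
    w = x1.mean(axis=0) - x0.mean(axis=0)
    threshold = 0.5 * (x0.mean(axis=0) + x1.mean(axis=0)) @ w
    return w, threshold

def accuracy(w, threshold, x, label):
    """Fraction of trials in x that the decoder assigns to the given label."""
    predicted = (x @ w > threshold).astype(int)
    return float(np.mean(predicted == label))

# Train to decode variable A using only the conditions where B = 0 ...
w, thr = fit_mean_difference_decoder(simulate(0, 0), simulate(1, 0))

# ... then test on the held-out conditions where B = 1.
held_out_accuracy = 0.5 * (
    accuracy(w, thr, simulate(0, 1), 0) + accuracy(w, thr, simulate(1, 1), 1)
)
```

High accuracy on the held-out conditions is what this sketch treats as the signature of an abstract format. If variable A were instead coded along a different direction for each value of B, the same decoder would be expected to generalize poorly.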
Abstraction and Analogy in Natural and Artificial Intelligence
In 1955, John McCarthy and colleagues proposed an AI summer research project with the following aim: “An attempt will be made to find how to make machines use language, form abstractions and concepts, solve kinds of problems now reserved for humans, and improve themselves.” More than six decades later, all of these research topics remain open and actively investigated in the AI community. While AI has made dramatic progress over the last decade in areas such as vision, natural language processing, and robotics, current AI systems still almost entirely lack the ability to form humanlike concepts and abstractions. Some cognitive scientists have proposed that analogy-making is a central mechanism for conceptual abstraction and understanding in humans. Douglas Hofstadter called analogy-making “the core of cognition”, and Hofstadter and co-author Emmanuel Sander noted, “Without concepts there can be no thought, and without analogies there can be no concepts.” In this talk I will reflect on the role played by analogy-making at all levels of intelligence, and on prospects for developing AI systems with humanlike abilities for abstraction and analogy.
Brain dynamics underlying memory for continuous natural events
The world confronts our senses with a continuous stream of rapidly changing information. Yet, we experience life as a series of episodes or events, and in memory these pieces seem to become even further organized. How do we recall and give structure to this complex information? Recent studies have begun to examine these questions using naturalistic stimuli and behavior: subjects view audiovisual movies and then freely recount aloud their memories of the events. We find brain activity patterns that are unique to individual episodes, and which reappear during verbal recollection; robust generalization of these patterns across people; and memory effects driven by the structure of links between events in a narrative. These findings construct a picture of how we comprehend and recall real-world events that unfold continuously across time.
Neural coding in the auditory cortex - Emergent Scientists Seminar Series
Dr Jennifer Lawlor
Title: Tracking changes in complex auditory scenes along the cortical pathway
Complex acoustic environments, such as a busy street, are characterised by their ever-changing dynamics. Despite their complexity, listeners can readily tease apart relevant changes from irrelevant variations. This requires continuously tracking the appropriate sensory evidence while discarding noisy acoustic variations. Despite the apparent simplicity of this perceptual phenomenon, the neural basis of the extraction of relevant information in complex continuous streams for goal-directed behavior is currently not well understood. As a minimalistic model for change detection in complex auditory environments, we designed broad-range tone clouds whose first-order statistics change at a random time. Subjects (humans or ferrets) were trained to detect these changes. They were faced with the dual task of estimating the baseline statistics and detecting a potential change in those statistics at any moment. To characterize the extraction and encoding of relevant sensory information along the cortical hierarchy, we first recorded the brain electrical activity of human subjects engaged in this task using electroencephalography. Human performance and reaction times improved with longer pre-change exposure, consistent with improved estimation of baseline statistics. Change-locked and decision-related EEG responses were found in a centro-parietal scalp location, whose slope depended on change size, consistent with sensory evidence accumulation. To further this investigation, we performed a series of electrophysiological recordings in the primary auditory cortex (A1), secondary auditory cortex (PEG) and frontal cortex (FC) of the fully trained behaving ferret. A1 neurons exhibited strong onset responses and change-related discharges specific to neuronal tuning. The PEG population showed reduced onset-related responses, but more categorical change-related modulations. Finally, a subset of FC neurons (dlPFC/premotor) presented a generalized response to all change-related events only during behavior. We show using a Generalized Linear Model (GLM) that the same subpopulation in FC encodes sensory and decision signals, suggesting that FC neurons could convert sensory evidence into a perceptual decision. Altogether, these area-specific responses suggest a behavior-dependent mechanism of sensory extraction and generalization of task-relevant events.
Aleksandar Ivanov
Title: How does the auditory system adapt to different environments: A song of echoes and adaptation
Is Rule Learning Like Analogy?
Humans’ ability to perceive and abstract relational structure is fundamental to our learning. It allows us to acquire knowledge all the way from linguistic grammar to spatial knowledge to social structures. How does a learner begin to perceive structure in the world? Why do we sometimes fail to see structural commonalities across events? To begin to answer these questions, I attempt to bridge two large, yet somewhat separate research traditions in understanding humans’ structural abstraction: rule learning (Marcus et al., 1999) and analogical learning (Gentner, 1989). On the one hand, rule learning research has shown that humans, as early as 7 months of age, have a domain-general ability to abstract structure with ease from limited experience. On the other hand, work on analogical learning has shown robust constraints in structural abstraction: young learners prefer object similarity over relational similarity. To understand this seeming paradox between ease and difficulty, we conducted a series of studies using the classic rule learning paradigm (Marcus et al., 1999) but with an analogical (object vs. relation) twist. Adults were presented with 2-minute sentences or events (syllables or shapes) containing a rule. At test, they had to choose between rule abstraction and object matches (the same syllable or shape they saw before). Surprisingly, while in the absence of object matches adults were perfectly capable of abstracting the rule, their ability to do so declined sharply when object matches were present. Our initial results suggest that rule learning ability may be subject to the usual constraints and signatures of analogical learning: a preference for object similarity can dampen rule generalization. Human abstraction is thus also concrete at the same time.
Thinking Fast and Slow in AlphaZero and the Brain
In his bestseller 'Thinking, Fast and Slow', Daniel Kahneman popularized the idea that there are two fundamentally different processes of thought: a 'System 1' process that is unconscious and instinctive, and a 'System 2' process that is deliberative and requires conscious attention. There is a growing recognition that machine learning is mostly stuck at the 'System 1' level of cognition, and that moving to 'System 2' methods is key to solving long-standing challenges such as out-of-distribution generalization. In this talk, AlphaZero will be used as a case study of the power of combining 'System 1' and 'System 2' processes. The similarities and differences between AlphaZero and human learning will be explored, and lessons will be drawn for the future of machine learning.
Analogy in Cognitive Architecture
Cognitive architectures are attempts to build larger-scale models of minds. This talk will explore how structure-mapping models of analogical matching, retrieval, and generalization are used in the Companion cognitive architecture. Examples will include modeling conceptual change, learning by reading, and analogical Q/A training.
The geometry of abstraction in artificial and biological neural networks
The curse of dimensionality plagues models of reinforcement learning and decision-making. The process of abstraction solves this by constructing abstract variables describing features shared by different specific instances, reducing dimensionality and enabling generalization in novel situations. We characterized neural representations in monkeys performing a task where a hidden variable described the temporal statistics of stimulus-response-outcome mappings. Abstraction was defined operationally using the generalization performance of neural decoders across task conditions not used for training. This type of generalization requires a particular geometric format of neural representations. Neural ensembles in dorsolateral pre-frontal cortex, anterior cingulate cortex and hippocampus, and in simulated neural networks, simultaneously represented multiple hidden and explicit variables in a format reflecting abstraction. Task events engaging cognitive operations modulated this format. These findings elucidate how the brain and artificial systems represent abstract variables, variables critical for generalization that in turn confers cognitive flexibility.
Rational thoughts in neural codes
First, we describe a new method for inferring the mental model of an animal performing a natural task. We use probabilistic methods to compute the most likely mental model based on an animal’s sensory observations and actions. This also reveals dynamic beliefs that would be optimal according to the animal’s internal model, and thus provides a practical notion of “rational thoughts.” Second, we construct a neural coding framework by which these rational thoughts, their computational dynamics, and actions can be identified within the manifold of neural activity. We illustrate the value of this approach by training an artificial neural network to perform a generalization of a widely used foraging task. We analyze the network’s behaviour to find rational thoughts, and successfully recover the neural properties that implemented those thoughts, providing a way of interpreting the complex neural dynamics of the artificial brain. Joint work with Zhengwei Wu, Minhae Kwon, Saurabh Daptardar, and Paul Schrater.
Do better object recognition models improve the generalization gap in neural predictivity?
COSYNE 2022
Beyond accuracy: robustness and generalization properties of biologically plausible learning rules
COSYNE 2022
Hippocampal spatio-temporal cognitive maps adaptively guide reward generalization
COSYNE 2022
Sensory feedback can drive adaptation in motor cortex and facilitate generalization
COSYNE 2022
Abstract structure and generalization in sensorimotor networks configured with semantic-based instruction embeddings
COSYNE 2023
Generalization from one exemplar in mice and neurons
COSYNE 2023
Hippocampal CA2 modulates its geometry to solve the memory-generalization tradeoff for social memory
COSYNE 2023
Statistical learning yields generalization and naturalistic behaviors in transitive inference
COSYNE 2023
Distributed engrams enable parallel memory generalization and discrimination across brain regions
COSYNE 2025
The geometry and role of sequential activity in sensory processing and perceptual generalization
COSYNE 2025
Harnessing cortical space for generalization in a spiking neural network of working memory
COSYNE 2025
Probing the dynamics of neural representations that support generalization under continual learning
COSYNE 2025
The role of mixed selectivity and representation learning for compositional generalization
COSYNE 2025
The Role of Neural Variability in Supporting Few-shot Generalization in Cortex
COSYNE 2025
How sequential curricula enhance visual learning generalization: The role of subspace dimensionality
COSYNE 2025
Topology-aware, unbiased grid coding for rapid task generalization
COSYNE 2025
Cognitive factors of susceptibility to contextual fear generalization
FENS Forum 2024
Criticality and generalization in hippocampal subregions reflect relationship predicted by the free-energy principle
FENS Forum 2024
No evidence of thalamic contribution to seizure generalization
FENS Forum 2024
Fruit fraction discrimination and concept generalization in an Asian elephant (Elephas maximus)
FENS Forum 2024
Memory generalization and overgeneralization in sleep
FENS Forum 2024
Network-level disruptions in vulnerable individuals contribute to enhanced fear generalization in a rodent model of PTSD
FENS Forum 2024
A neural circuit connects aversive memory generalization to depression-like behaviors
FENS Forum 2024
Neuronal determinants of contextual fear memory generalization: From normal to pathological fear
FENS Forum 2024
Sex-specific effects in fear memory generalization in IL-6 knockout mice
FENS Forum 2024