Concept Learning
Implications of Vector-space models of Relational Concepts
Vector-space models are frequently used to compare similarity and dimensionality among entity concepts. What happens when we apply these models to relational concepts? What is the evidence that such models do apply to relational concepts? If we use such a model, one implication is that maximizing surface-feature variation should improve relational concept learning. For example, in STEM instruction, the effectiveness of teaching by analogy is often limited by students’ focus on superficial features of the source and target exemplars. However, in contrast to the prediction of the vector-space computational model, the strategy of progressive alignment (moving from perceptually similar targets to perceptually different ones) has been suggested to address this issue (Gentner & Hoyos, 2017), and human behavioral evidence has shown benefits of progressive alignment. Here I will present preliminary data that support the computational approach. Participants were explicitly instructed to match stimuli based on relations while the perceptual similarity of the stimuli was varied parametrically. We found that lower perceptual similarity reduced the accuracy of relational matching. This finding demonstrates that perceptual similarity may interfere with relational judgements, but it also hints at why progressive alignment may be effective. These are preliminary, exploratory data, and I hope to receive feedback on the framework and to start a group discussion on the utility of vector-space models for relational concepts in general.
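As a rough illustration of the kind of model at issue (a toy sketch, not the specific model or stimuli from this talk), a vector-space account represents each exemplar as a point in a feature space and compares concepts with a geometric similarity measure such as cosine similarity. The hypothetical vectors below simply show how shared surface features can dominate the similarity score even when the relational structure is matched.

```python
# Toy vector-space similarity sketch (illustrative values, not real stimuli).
import numpy as np

def cosine_similarity(a, b):
    """Cosine similarity between two concept vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Hypothetical encoding: the first two dimensions carry surface (perceptual)
# features, the last two carry relational structure.
source      = np.array([1.0, 0.9, 0.2, 0.1])  # source exemplar
near_target = np.array([0.9, 1.0, 0.2, 0.1])  # perceptually similar, same relation
far_target  = np.array([0.1, 0.2, 0.2, 0.1])  # perceptually dissimilar, same relation

print(cosine_similarity(source, near_target))  # ~0.99: surface overlap dominates
print(cosine_similarity(source, far_target))   # ~0.77: lower despite matched relations
```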
Geometry of concept learning
Understanding the human ability to learn novel concepts from just a few sensory experiences is a fundamental problem in cognitive neuroscience. I will describe recent work with Ben Sorscher and Surya Ganguli (PNAS, October 2022) in which we propose a simple, biologically plausible, and mathematically tractable neural mechanism for few-shot learning of naturalistic concepts. We posit that the concepts that can be learned from few examples are defined by tightly circumscribed manifolds in the neural firing-rate space of higher-order sensory areas. Discrimination between novel concepts is performed by downstream neurons implementing a ‘prototype’ decision rule, in which a test example is classified according to the nearest prototype constructed from the few training examples. We show that prototype few-shot learning achieves high accuracy on natural visual concepts using both macaque inferotemporal cortex representations and deep neural network (DNN) models of these representations. We develop a mathematical theory that links few-shot learning performance to the geometric properties of the neural concept manifolds and demonstrate its agreement with our numerical simulations across different DNNs as well as different layers. Intriguingly, we observe striking mismatches between the geometry of manifolds in intermediate stages of the primate visual pathway and in trained DNNs. Finally, we show that linguistic descriptors of visual concepts can be used to discriminate images belonging to novel concepts, without any prior visual experience of these concepts (a task known as ‘zero-shot’ learning), indicating a remarkable alignment of manifold representations of concepts in the visual and language modalities. I will discuss ongoing efforts to extend this work to other high-level cognitive tasks.
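As a rough sketch of the prototype decision rule described above (the toy Gaussian "manifolds", dimensionality, and variable names are assumptions for illustration, not the authors' data or code), each novel concept is summarized by the mean of its few training examples, and a test item is assigned to the concept with the nearest prototype.

```python
# Minimal prototype few-shot classifier (illustrative sketch).
import numpy as np

def prototype_few_shot(train_a, train_b, test):
    """Classify each test vector by its nearest class prototype.

    train_a, train_b: (k, d) arrays of k training examples per concept.
    test:             (m, d) array of test examples.
    Returns an array of labels (0 = concept A, 1 = concept B).
    """
    proto_a = train_a.mean(axis=0)        # prototype = mean of the few examples
    proto_b = train_b.mean(axis=0)
    dist_a = np.linalg.norm(test - proto_a, axis=1)
    dist_b = np.linalg.norm(test - proto_b, axis=1)
    return (dist_b < dist_a).astype(int)  # nearer prototype wins

# Toy usage: 5-shot learning of two Gaussian concept clouds in 50 dimensions.
rng = np.random.default_rng(0)
d, k = 50, 5
center_a, center_b = rng.normal(size=d), rng.normal(size=d)
train_a = center_a + 0.5 * rng.normal(size=(k, d))
train_b = center_b + 0.5 * rng.normal(size=(k, d))
test = np.vstack([center_a + 0.5 * rng.normal(size=(20, d)),
                  center_b + 0.5 * rng.normal(size=(20, d))])
labels = prototype_few_shot(train_a, train_b, test)
accuracy = np.mean(labels == np.array([0] * 20 + [1] * 20))
print(f"few-shot accuracy: {accuracy:.2f}")
```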
A multi-level account of hippocampal function in concept learning from behavior to neurons
A complete neuroscience requires multi-level theories that address phenomena ranging from higher-level cognitive behaviors to activities within a cell. Unfortunately, we don't have cognitive models of behavior whose components can be decomposed into the neural dynamics that give rise to that behavior, leaving an explanatory gap. Here, we decompose SUSTAIN, a clustering model of concept learning, into neuron-like units (SUSTAIN-d, for decomposed). Instead of abstract constructs (clusters), SUSTAIN-d has a pool of neuron-like units. With millions of units, a key challenge is how to bridge from abstract constructs such as clusters to neurons whilst retaining high-level behavior. How does the brain coordinate neural activity during learning? Inspired by algorithms that capture flocking behavior in birds, we introduce a neural flocking learning rule that coordinates units so that they collectively form higher-level mental constructs ("virtual clusters") and neural representations (concept, place, and grid cell-like assemblies), paralleling recurrent hippocampal activity. The decomposed model shows how brain-scale neural populations coordinate to form assemblies encoding concept and spatial representations, and why many neurons are required for robust performance. Our account provides a multi-level explanation for how cognition and symbol-like representations are supported by coordinated neural assemblies formed through learning.
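For readers unfamiliar with SUSTAIN-style clustering, the sketch below illustrates only the abstract cluster construct that SUSTAIN-d decomposes: a new cluster is recruited when no existing cluster matches the current stimulus well enough, and the winning cluster otherwise moves toward the stimulus. This is a deliberately simplified illustration with hypothetical threshold and learning-rate values, not the SUSTAIN-d model or its neural flocking rule.

```python
# Toy cluster-recruitment sketch in the spirit of SUSTAIN-style models
# (illustrative only; not the authors' SUSTAIN-d implementation).
import numpy as np

def update_clusters(clusters, stimulus, threshold=0.8, lr=0.1):
    """Assign a stimulus to its best-matching cluster or recruit a new one.

    clusters:  list of 1-D numpy arrays (cluster centers).
    stimulus:  1-D numpy array (current stimulus).
    threshold: minimum activation needed to avoid recruiting a new cluster.
    lr:        learning rate for moving the winning cluster toward the stimulus.
    """
    if clusters:
        activations = [np.exp(-np.linalg.norm(stimulus - c)) for c in clusters]
        best = int(np.argmax(activations))
        if activations[best] >= threshold:
            clusters[best] = clusters[best] + lr * (stimulus - clusters[best])
            return clusters, best
    clusters.append(stimulus.copy())  # no good match: recruit a new cluster
    return clusters, len(clusters) - 1

# Toy usage: the third, dissimilar stimulus triggers recruitment of a second cluster.
clusters = []
for s in [np.array([0.0, 0.0]), np.array([0.1, 0.1]), np.array([3.0, 3.0])]:
    clusters, winner = update_clusters(clusters, s)
print(len(clusters))  # prints 2
```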
Abstraction doesn't happen all at once (despite what some models of concept learning suggest)
In the past few years, there has been growing evidence that the basic ability for relational generalization starts in early infancy, with 3-month-olds seeming to learn relational abstractions with little training. Further, work with toddlers seems to suggest that relational generalizations are no more difficult than those based on objects, and that toddlers can readily consider both simultaneously. Likewise, causal learning research with adults suggests that people infer causal relationships at multiple levels of abstraction simultaneously as they learn about novel causal systems. These findings all appear to run counter to theories of concept learning which posit that when concepts are first learned they tend to be concrete (tied to specific contexts and features), and that abstraction proceeds incrementally as learners encounter more examples. The current talk will not question the veracity of any of these findings, but will present several others from my own and others’ research on relational learning suggesting that when the perceptual or conceptual content becomes more complex, patterns of incremental abstraction re-emerge. Further, the specific contexts and task parameters that support or hinder abstraction reveal the underlying cognitive processes. I will then consider whether models that posit simultaneous, immediate learning at multiple levels of abstraction can accommodate these more complex patterns.