Learners
Comparing supervised learning dynamics: Deep neural networks match human data efficiency but show a generalisation lag
Recent research has seen many behavioral comparisons between humans and deep neural networks (DNNs) in the domain of image classification. Comparison studies often focus on the end result of the learning process, measuring and comparing the similarity of object-category representations once they have been formed. How these representations emerge, that is, the behavioral changes and intermediate stages observed during acquisition, is less often compared directly and empirically. In this talk, I will report a detailed investigation of the learning dynamics of human observers and various classic and state-of-the-art DNNs. We develop a constrained supervised learning environment to align learning-relevant conditions such as starting point, input modality, available input data, and the feedback provided. Across the whole learning process, we evaluate and compare how well learned representations generalize to previously unseen test data. Comparisons across the entire learning process indicate that DNNs demonstrate a level of data efficiency comparable to human learners, challenging some prevailing assumptions in the field. However, our results also reveal representational differences: whereas DNNs' learning is characterized by a pronounced generalisation lag, humans appear to acquire generalizable representations immediately, without a preliminary phase of learning training-set-specific information that is only later transferred to novel data.
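As a rough illustration of the kind of measurement involved, the sketch below evaluates a learner on both the training set and held-out items after every epoch; a generalisation lag would appear as the training curve reaching a criterion many epochs before the test curve. The synthetic data, off-the-shelf linear classifier, and accuracy criterion are illustrative assumptions, not the study's actual models or stimuli.

```python
# Minimal sketch: track train vs. test accuracy across learning and
# quantify a generalisation lag as the epoch gap between the two
# curves reaching a (hypothetical) accuracy criterion.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import SGDClassifier
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=2000, n_features=50, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

clf = SGDClassifier(loss="log_loss", random_state=0)
classes = np.unique(y_train)
criterion = 0.9                       # hypothetical accuracy criterion
first_epoch = {"train": None, "test": None}

for epoch in range(1, 51):
    clf.partial_fit(X_train, y_train, classes=classes)
    for name, (Xs, ys) in {"train": (X_train, y_train),
                           "test": (X_test, y_test)}.items():
        if first_epoch[name] is None and clf.score(Xs, ys) >= criterion:
            first_epoch[name] = epoch

lag = None
if first_epoch["train"] and first_epoch["test"]:
    lag = first_epoch["test"] - first_epoch["train"]
print(f"criterion reached: {first_epoch}, generalisation lag: {lag} epochs")
```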
Learning by Analogy in Mathematics
Analogies between old and new concepts are common during classroom instruction. While previous studies of transfer focus on how features of initial learning guide later transfer to new problem solving, less is known about how best to support analogical transfer from previous learning while children are engaged in new learning episodes. Such research may have important implications for teaching and learning in mathematics, which often includes analogies between old and new information. Some existing research advocates supporting learners in making explicit connections between old and new information within an analogy. In this talk, I will present evidence that instructors can instead invite implicit analogical reasoning through warm-up activities designed to activate relevant prior knowledge. Warm-up activities "close the transfer space" between old and new learning without additional direct instruction.
AI-assisted language learning: Assessing learners who memorize and reason by analogy
Vocabulary learning applications like Duolingo have millions of users around the world, yet they are based on very simple heuristics for choosing the teaching material presented to users. In this presentation, we will discuss the possibility of developing more advanced artificial teachers based on models of the learner's inner characteristics. For teaching vocabulary, understanding how the learner memorizes is enough. When it comes to picking grammar exercises, it becomes essential to assess how the learner reasons, in particular by analogy. This second application will illustrate how analogical and case-based reasoning can be employed in an alternative way in education: not as the teaching algorithm, but as part of the learner model.
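To make the vocabulary case concrete, here is a minimal sketch of a learner memory model driving item selection. The exponential forgetting curve, half-life values, and item names are illustrative assumptions, not the system discussed in the talk.

```python
# Minimal sketch: model the learner's memory with an assumed
# exponential forgetting curve, then teach the item the learner is
# most likely to have forgotten.
import math

def recall_probability(hours_since_review: float, half_life_hours: float) -> float:
    """P(recall) under an assumed exponential forgetting curve."""
    return 2.0 ** (-hours_since_review / half_life_hours)

# Hypothetical learner state: (word, hours since last review, half-life).
items = [("gato", 30.0, 24.0), ("perro", 5.0, 24.0), ("pájaro", 50.0, 100.0)]

def next_item(items):
    # Pick the item with the lowest predicted recall probability.
    return min(items, key=lambda it: recall_probability(it[1], it[2]))

word, dt, h = next_item(items)
print(f"review '{word}' next: P(recall) = {recall_probability(dt, h):.2f}")
```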
Learning from others, helping others learn: Cognitive foundations of distinctively human social learning
Learning does not occur in isolation. From parent-child interactions to formal classroom environments, humans explore, learn, and communicate in rich, diverse social contexts. Rather than simply observing and copying their conspecifics, humans engage in a range of epistemic practices that actively recruit those around them. What makes human social learning so distinctive, powerful, and smart? In this talk, I will present a series of studies that reveal the remarkably sophisticated inferential abilities that young children show not only in how they learn from others but also in how they help others learn. Children interact with others as learners and as teachers to learn and communicate about the world, about others, and even about the self. The results collectively paint a picture of human social learning that is far more than copying and imitation: It is active, bidirectional, and cooperative. I will end by discussing ongoing work that extends this picture beyond what we typically call “social learning”, with implications for building better machines that learn from and interact with humans.
The Limits of Causal Reasoning in Human and Machine Learning
A key purpose of causal reasoning by individuals and by collectives is to enhance action, to give humans yet more control over their environment. As a result, causal reasoning serves as the infrastructure of both thought and discourse. Humans represent causal systems accurately in some ways, but also show some systematic biases (we tend to neglect causal pathways other than the one we are thinking about). Even when accurate, people’s understanding of causal systems tends to be superficial; we depend on our communities for most of our causal knowledge and reasoning. Nevertheless, we are better causal reasoners than machines. Modern machine learners do not come close to matching human abilities.
Abstraction doesn't happen all at once (despite what some models of concept learning suggest)
In the past few years, growing evidence has suggested that the basic ability for relational generalization starts in early infancy, with 3-month-olds seeming to learn relational abstractions with little training. Further, work with toddlers seems to suggest that relational generalizations are no more difficult than those based on objects, and that toddlers can readily consider both simultaneously. Likewise, causal learning research with adults suggests that people infer causal relationships at multiple levels of abstraction simultaneously as they learn about novel causal systems. These findings all appear to run counter to theories of concept learning which posit that concepts tend to be concrete (tied to specific contexts and features) when first learned, with abstraction proceeding incrementally as learners encounter more examples. The current talk will not question the veracity of any of these findings, but will present several others from my and others' research on relational learning suggesting that, when the perceptual or conceptual content becomes more complex, patterns of incremental abstraction re-emerge. Further, the specific contexts and task parameters that support or hinder abstraction reveal the underlying cognitive processes. I will then consider whether models that posit simultaneous, immediate learning at multiple levels of abstraction can accommodate these more complex patterns.
Achieving Abstraction: Early Competence & the Role of the Learning Context
Children's emerging ability to acquire and apply relational same-different concepts is often cited as a defining feature of human cognition, providing the foundation for abstract thought. Yet young learners often struggle to set aside irrelevant surface features and attend to structural similarity instead. I will argue that young children have, and retain, genuine relational concepts from a young age, but tend to neglect abstract similarity due to a learned bias to attend to objects and their properties. Critically, this account predicts that differences in the structure of children's environmental input should lead to differences in the type of hypotheses they privilege and apply. I will review empirical support for this proposal that has (1) evaluated the robustness of early competence in relational reasoning, (2) identified cross-cultural differences in relational and object bias, and (3) provided evidence that contextual factors play a causal role in relational reasoning. Together, these studies suggest that the development of abstract thought may be more malleable and context-sensitive than initially believed.
Analogical encodings and recodings
This talk will focus on the idea that the kind of similarity driving analogical retrieval is determined by the kind of features encoded for the source and the target cue situations. Emphasis will be placed on educational perspectives, showing the influence of world semantics on learners' problem representations and solving strategies, as well as the difficulties arising from semantic incongruence between representations and strategies. Special attention will be given to the recoding of semantically incongruent representations, a crucial step that learners struggle with, as a promising path for moving beyond informal strategies.
Bridging brain and cognition: A multilayer network analysis of brain structural covariance and general intelligence in a developmental sample of struggling learners
Network analytic methods that are ubiquitous in other areas, such as systems neuroscience, have recently been used to test network theories in psychology, including intelligence research. The network or mutualism theory of intelligence proposes that the statistical associations among cognitive abilities (e.g. specific abilities such as vocabulary or memory) stem from causal relations among them throughout development. In this study, we used network models (specifically LASSO) of cognitive abilities and brain structural covariance (grey and white matter) to simultaneously model brain-behavior relationships essential for general intelligence in a large (behavioral, N=805; cortical volume, N=246; fractional anisotropy, N=165), developmental (ages 5-18) cohort of struggling learners (CALM). We found that mostly positive, small partial correlations pervade both our cognitive and neural networks. Moreover, calculating node centrality (absolute strength and bridge strength) and using two separate community detection algorithms (Walktrap and Clique Percolation), we found convergent evidence that subsets of both cognitive and neural nodes play an intermediary role between brain and behavior. We discuss implications and possible avenues for future studies.
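For readers unfamiliar with this style of analysis, the following sketch estimates a LASSO-regularized partial-correlation network on simulated data and computes the two centrality measures mentioned above. The variables, sample size, and cognitive/neural split are invented for illustration; this is not the CALM pipeline itself.

```python
# Minimal sketch: sparse partial-correlation network via graphical
# lasso, plus strength and bridge-strength centrality.
import numpy as np
from sklearn.covariance import GraphicalLassoCV

rng = np.random.default_rng(0)
n_cognitive, n_neural = 5, 5
# Stand-in data with a shared latent factor, so the network shows the
# mostly positive, small partial correlations described above.
latent = rng.standard_normal((300, 1))
X = 0.5 * latent + rng.standard_normal((300, n_cognitive + n_neural))

model = GraphicalLassoCV().fit(X)
P = model.precision_
d = np.sqrt(np.diag(P))
partial_corr = -P / np.outer(d, d)      # precision -> partial correlations
np.fill_diagonal(partial_corr, 0.0)

# Strength: summed absolute edge weights per node.
strength = np.abs(partial_corr).sum(axis=1)

# Bridge strength: summed |edge weights| to nodes in the *other*
# community (here, an illustrative cognitive vs. neural split).
community = np.array([0] * n_cognitive + [1] * n_neural)
other = community[:, None] != community[None, :]
bridge_strength = (np.abs(partial_corr) * other).sum(axis=1)

print("strength:", strength.round(2))
print("bridge strength:", bridge_strength.round(2))
```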
Recurrent network dynamics lead to interference in sequential learning
Learning in real life is often sequential: A learner first learns task A, then task B. If the tasks are related, the learner may adapt the previously learned representation instead of generating a new one from scratch. Adaptation may ease learning task B but may also decrease performance on task A. Such interference has been observed in both experimental and machine learning studies. In the latter case, it is mediated by correlations between the weight updates for the different tasks. In typical applications, like image classification with feed-forward networks, these correlated weight updates can be traced back to input correlations. For many neuroscience tasks, however, networks need to not only transform the input but also generate substantial internal dynamics. Here we illuminate the role of internal dynamics in interference in recurrent neural networks (RNNs). We analyze RNNs trained sequentially on neuroscience tasks with gradient descent and observe forgetting even for orthogonal tasks. We find that the degree of interference changes systematically with task properties, especially with the emphasis on input-driven over autonomously generated dynamics. To better understand our numerical observations, we thoroughly analyze a simple model of working memory: For task A, a network is presented with an input pattern and trained to generate a fixed point aligned with this pattern. For task B, the network has to memorize a second, orthogonal pattern. Adapting an existing representation corresponds to a rotation of the fixed point in phase space, as opposed to the emergence of a new one. We show that the two modes of learning, rotation vs. new formation, are directly linked to recurrent vs. input-driven dynamics. We make this notion precise in a further simplified, analytically tractable model, where learning is restricted to a 2x2 matrix. In our analysis of trained RNNs, we also make the surprising observation that, across different tasks, larger random initial connectivity reduces interference. Analyzing the fixed-point task reveals the underlying mechanism: The random connectivity strongly accelerates the learning mode of new formation, and has less effect on rotation. New formation thus wins the race to zero loss, and interference is reduced. Altogether, our work offers a new perspective on sequential learning in recurrent networks, and the emphasis on internally generated dynamics allows us to take the history of individual learners into account.
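The fixed-point working-memory setup lends itself to a compact re-implementation. The sketch below is a simplified stand-in (network size, rollout length, input duration, and training details are assumptions, not the authors' exact protocol): an RNN is trained so its activity settles on a state aligned with pattern A, is then trained on an orthogonal pattern B, and the growth of the loss on A quantifies interference.

```python
# Minimal sketch of sequential training on two fixed-point memory
# tasks, measuring interference as forgetting of task A.
import torch

torch.manual_seed(0)
N, T, g = 64, 20, 0.5                  # units, rollout steps, init scale
J = torch.nn.Parameter(g * torch.randn(N, N) / N ** 0.5)
opt = torch.optim.Adam([J], lr=1e-2)

A = torch.zeros(N); A[0] = 1.0         # two orthogonal input patterns
B = torch.zeros(N); B[1] = 1.0

def loss_on(pattern):
    h = torch.zeros(N)
    for t in range(T):
        inp = pattern if t < 5 else torch.zeros(N)  # brief input, then memory
        h = torch.tanh(J @ h + inp)
    return ((h - 0.5 * pattern) ** 2).sum()         # end state aligned with pattern

def train_on(pattern, steps=300):
    for _ in range(steps):
        opt.zero_grad()
        loss_on(pattern).backward()
        opt.step()

train_on(A)                            # task A
loss_A_before = loss_on(A).item()
train_on(B)                            # then task B, sequentially
loss_A_after = loss_on(A).item()
print(f"loss on task A: {loss_A_before:.4f} -> {loss_A_after:.4f}")
```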
Context and Comparison During Open-Ended Induction
A key component of humans' striking creativity in solving problems is our ability to construct novel descriptions to help us characterize novel categories. Bongard problems, which challenge the solver to come up with a rule distinguishing visual scenes that fall into two categories, provide an elegant test of this ability. Bongard problems are challenging for both human and machine category learners because only a handful of example scenes are presented for each category, and they often require the open-ended creation of new descriptions. A new sub-type of Bongard problem, Physical Bongard Problems (PBPs), is introduced, which requires solvers to perceive and predict the physical spatial dynamics implicit in the depicted scenes. The PATHS (Perceiving And Testing Hypotheses on Structures) computational model, which can solve many PBPs, is presented and compared to human performance on the same problems. PATHS and humans are similarly affected by the ordering of scenes within a PBP, with spatially and temporally juxtaposed scenes promoting category learning when they are similar and belong to different categories, or dissimilar and belong to the same category. The core theoretical commitments of PATHS, which we believe also exemplify human open-ended category learning, are (a) the continual perception of new scene descriptions over the course of category learning; (b) the context-dependent nature of that perceptual process, in which the scenes establish the context for one another; (c) hypothesis construction by combining descriptions into logical expressions; and (d) bi-directional interactions between perceiving new aspects of scenes and constructing hypotheses for the rule that distinguishes the categories.
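Commitment (c), constructing hypotheses by combining descriptions into logical expressions and testing them against the two sets of scenes, can be illustrated with a toy search. The predicates and scenes below are invented, and this is emphatically not the PATHS model itself.

```python
# Toy illustration: search over conjunctions of scene predicates for a
# rule that holds for all left scenes and no right scenes.
from itertools import combinations

# Hypothetical scene descriptions: each scene is the set of predicates
# a perceptual front end might produce for it.
left_scenes = [{"circle", "moves_left"}, {"square", "moves_left"}]
right_scenes = [{"circle", "moves_right"}, {"triangle", "moves_right"}]

predicates = set().union(*left_scenes, *right_scenes)

def separates(rule):
    """A rule (conjunction of predicates) solves the problem if it
    holds for every left scene and for no right scene."""
    return (all(rule <= s for s in left_scenes)
            and not any(rule <= s for s in right_scenes))

# Try conjunctions of increasing size, mimicking incremental
# hypothesis construction.
for size in (1, 2):
    for combo in combinations(sorted(predicates), size):
        if separates(set(combo)):
            print("rule found:", " AND ".join(combo))
```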
European University for Brain and Technology Virtual Opening
The European University for Brain and Technology, NeurotechEU, is opening its doors on the 16th of December. From health & healthcare to learning & education, neuroscience has a key role in addressing some of the most pressing challenges that we face in Europe today. Whether the challenge is translating fundamental research to advance the state of the art in the prevention, diagnosis, or treatment of brain disorders, or explaining the complex interactions between the brain, individuals, and their environments to design novel practices in cities, schools, hospitals, or companies, brain research is already providing solutions for society at large. There has never been a branch of study as inter- and multidisciplinary as neuroscience: from the humanities, social sciences, and law to the natural sciences, engineering, and mathematics, all traditional disciplines in modern universities have an interest in brain and behaviour as a subject matter. Neuroscience holds great promise as an applied science, providing brain-centred or brain-inspired solutions that could benefit society and kindle a new economy in Europe. NeurotechEU aims to be the backbone of this new vision by bringing together eight leading universities, 250+ partner research institutions, companies, societal stakeholders, cities, and non-governmental organizations to shape education and training for all segments of society and in all regions of Europe. We will educate students across all levels (bachelor's, master's, doctoral, as well as life-long learners), train the next generation of multidisciplinary scientists, scholars, and graduates, and provide them with direct access to cutting-edge infrastructure for fundamental, translational, and applied research to help Europe address this unmet challenge.
Infant Relational Learning - Interactions with Visual and Linguistic Factors
Humans are incredible learners, a talent supported by our ability to detect and transfer relational similarities between items and events. Spotting these common relations despite perceptual differences is challenging, yet there’s evidence that this ability begins early, with infants as young as 3 months discriminating same and different (Anderson et al., 2018; Ferry et al., 2015). How? To understand the underlying mechanisms, I examine how learning outcomes in the first year correspond with changes in input and in infant age. I discuss the commonalities in this process with that seen in older children and adults, as well as differences due to interactions with other maturing processes like language and visual attention.
Childhood as a solution to explore-exploit tensions
I argue that the evolution of our life history, with its distinctively long, protected human childhood, allows for an early period of broad hypothesis search and exploration before the demands of goal-directed exploitation set in. This cognitive profile is also found in other animals and is associated with early behaviours such as neophilia and play. I relate this developmental pattern to computational ideas about explore-exploit trade-offs, search, and sampling, and to neuroscience findings. I also present several lines of new empirical evidence suggesting that young human learners are highly exploratory, both in their search for external information and in their search through hypothesis spaces. In fact, they are sometimes more exploratory than older learners and adults.
What can we further learn from the brain for artificial intelligence?
Deep learning is a prime example of how brain-inspired computing can benefit the development of artificial intelligence. But what else can we learn from the brain to bring AI and robotics to the next level? Energy efficiency and data efficiency are major features of the brain and human cognition that today's deep learning has yet to deliver. The brain can be seen as a multi-agent system of heterogeneous learners using different representations and algorithms. The flexible use of reactive, model-free control and model-based "mental simulation" appears to be the basis for the computational and data efficiency of the brain. How the brain efficiently acquires and flexibly combines prediction and control modules is a major open problem in neuroscience, and its solution should help the development of more flexible and autonomous AI and robotics.
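The contrast between reactive, model-free control and model-based mental simulation can be made concrete with a toy reinforcement-learning sketch; the two-state environment and all parameters below are invented for illustration.

```python
# Minimal sketch: a model-free Q-learning update versus model-based
# planning (value iteration on a learned model) in a toy 2-state MDP.
import numpy as np

n_states, n_actions, gamma, alpha = 2, 2, 0.9, 0.1

# Invented environment: action 0 stays (no reward), action 1 switches
# states, and switching into state 1 pays reward 1.
def step(s, a):
    s2 = s if a == 0 else 1 - s
    return s2, float(a == 1 and s2 == 1)

rng = np.random.default_rng(0)
Q = np.zeros((n_states, n_actions))                 # model-free value table
counts = np.zeros((n_states, n_actions, n_states))  # learned transition model
R = np.zeros((n_states, n_actions))                 # learned reward model

s = 0
for _ in range(2000):
    a = rng.integers(n_actions)                     # exploratory policy
    s2, r = step(s, a)
    # Model-free: temporal-difference update, no environment model.
    Q[s, a] += alpha * (r + gamma * Q[s2].max() - Q[s, a])
    # Model-based: accumulate a world model for later simulation.
    counts[s, a, s2] += 1
    R[s, a] += alpha * (r - R[s, a])
    s = s2

# "Mental simulation": value iteration on the learned model.
T = counts / counts.sum(axis=2, keepdims=True)
V = np.zeros(n_states)
for _ in range(100):
    V = (R + gamma * T @ V).max(axis=1)

print("model-free Q:\n", Q.round(2), "\nmodel-based V:", V.round(2))
```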
Is Rule Learning Like Analogy?
Humans' ability to perceive and abstract relational structure is fundamental to our learning. It allows us to acquire knowledge ranging from linguistic grammar to spatial knowledge to social structures. How does a learner begin to perceive structure in the world? Why do we sometimes fail to see structural commonalities across events? To begin to answer these questions, I attempt to bridge two large, yet somewhat separate, research traditions in the study of human structural abstraction: rule learning (Marcus et al., 1999) and analogical learning (Gentner, 1989). On the one hand, rule learning research has shown a domain-general human ability, present as early as 7 months, to abstract structure with ease from limited experience. On the other hand, analogical learning research has shown robust constraints on structural abstraction: young learners prefer object similarity over relational similarity. To understand this seeming paradox between ease and difficulty, we conducted a series of studies using the classic rule learning paradigm (Marcus et al., 1999) but with an analogical (object vs. relation) twist. Adults were presented with two minutes of sentences or events (syllables or shapes) containing a rule. At test, they had to choose between rule abstractions and object matches, i.e., the same syllable or shape they saw before. Surprisingly, while in the absence of object matches adults were perfectly capable of abstracting the rule, their ability to do so declined sharply when object matches were present. Our initial results suggest that rule learning may be subject to the usual constraints and signatures of analogical learning: a preference for object similarity can dampen rule generalization. Human abstraction, it seems, is concrete at the same time.
Networks thinking themselves
Human learners acquire not only disconnected bits of information, but complex interconnected networks of relational knowledge. The capacity for such learning naturally depends on the architecture of the knowledge network itself, and also on the architecture of the computational unit – the brain – that encodes and processes the information. Here, I will discuss emerging work assessing network constraints on the learnability of relational knowledge, and the neural correlates of that learning.
'Reusers' and 'Unlearners' display distinct effects of forgetting on reversal learning in neural networks
Bernstein Conference 2024
Relationships between trace elements and cognitive behaviors in different strains of rats: Sprague Dawley: swift learners, yet sensitive souls
FENS Forum 2024