ePoster

Continual learning using dendritic modulations on view-invariant feedforward weights

Viet Anh Khoa Tran, Emre Neftci, Willem Wybo
Bernstein Conference 2024 (2024)
Goethe University, Frankfurt, Germany


Abstract

The brain is remarkably adept at learning from a continuous stream of data without significantly forgetting previously learnt skills. Conventional machine learning models struggle with continual learning, as weight updates that optimize the current task interfere with previously learnt tasks. A simple remedy to catastrophic forgetting is to freeze a network pretrained on a set of base tasks and to train task-specific readouts on this shared trunk. However, this assumes that representations in the frozen network are separable under new tasks, often leading to sub-par performance. To continually learn on novel task data, previous methods suggest weight consolidation, which preserves the weights most impactful for the performance of previous tasks, and memory-based approaches, where the network is allowed to revisit a subset of images from previous tasks. For biological networks, prior work showed that dendritic top-down modulations provide a powerful mechanism to solve complex tasks while the initial feedforward weights solely extract generic view-invariant features (A). This view aligns with the ‘neural collapse’ phenomenon from supervised machine learning, as the optimal solution for such algorithms is to become invariant to task-irrelevant features that may nonetheless be relevant for other tasks (B). Instead, we posit that feature extraction can be learned solely by optimizing the network to attract representations of smoothly moving visual stimuli, akin to contrastive self-supervised learning (SSL) methods (C). We propose a continual learner that optimizes the feedforward weights towards view-invariant representations while training task-specific modulations in a supervised manner towards separable class clusters, in a standard task-incremental setting (C). We show that this simple approach avoids catastrophic forgetting of class clusters, as opposed to training the whole network in a supervised manner, while also outperforming (1) task-specific readouts without modulations and (2) frozen feedforward weights (D). This suggests that (1) top-down modulations are necessary and sufficient to shift the representations towards separable clusters, and that (2) the SSL objective learns novel features from the newly presented objects while maintaining features relevant to previous tasks, without requiring specific synaptic consolidation mechanisms.
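A minimal sketch of how such a model could be set up, assuming a PyTorch-style layer with shared feedforward weights and a multiplicative per-task dendritic gain: the SSL objective attracts representations of temporally adjacent views and updates only the shared weights, while a supervised loss (on a task-specific readout, omitted here) would update only the modulations. The class names, the multiplicative form of the modulation, and the cosine-similarity attraction loss are illustrative assumptions, not the authors' exact implementation; a full SSL objective would typically also include a term that prevents representational collapse.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class ModulatedLayer(nn.Module):
    """Shared feedforward weights with task-specific dendritic modulations (sketch)."""

    def __init__(self, d_in, d_out, n_tasks):
        super().__init__()
        self.ff = nn.Linear(d_in, d_out)                       # shared feedforward weights, SSL-trained
        self.gain = nn.Parameter(torch.zeros(n_tasks, d_out))  # per-task dendritic modulations (assumed multiplicative)

    def forward(self, x, task=None):
        h = self.ff(x)
        if task is not None:                  # task-conditioned (supervised) pass
            h = h * (1.0 + self.gain[task])
        return torch.relu(h)


def attract_loss(z_a, z_b):
    """Pull together representations of two temporally adjacent views of a stimulus."""
    return -F.cosine_similarity(z_a, z_b, dim=-1).mean()


# Self-supervised step: only the shared feedforward weights receive gradients,
# since the modulations are not used on this (task-unconditioned) path.
layer = ModulatedLayer(d_in=128, d_out=64, n_tasks=5)
x_t, x_t1 = torch.randn(32, 128), torch.randn(32, 128)   # two views of the same stimuli
loss = attract_loss(layer(x_t), layer(x_t1))
loss.backward()
```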

Unique ID: bernstein-24/continual-learning-using-dendritic-0d046f2e