Topic spotlight
TopicWorld Wide

image classification

Discover seminars, jobs, and research tagged with image classification across World Wide.
8 curated items7 Seminars1 ePoster
Updated about 1 year ago
8 items · image classification
8 results
SeminarPsychology

Comparing supervised learning dynamics: Deep neural networks match human data efficiency but show a generalisation lag

Lukas Huber
University of Bern
Sep 22, 2024

Recent research has seen many behavioral comparisons between humans and deep neural networks (DNNs) in the domain of image classification. Often, comparison studies focus on the end-result of the learning process by measuring and comparing the similarities in the representations of object categories once they have been formed. However, the process of how these representations emerge—that is, the behavioral changes and intermediate stages observed during the acquisition—is less often directly and empirically compared. In this talk, I'm going to report a detailed investigation of the learning dynamics in human observers and various classic and state-of-the-art DNNs. We develop a constrained supervised learning environment to align learning-relevant conditions such as starting point, input modality, available input data and the feedback provided. Across the whole learning process we evaluate and compare how well learned representations can be generalized to previously unseen test data. Comparisons across the entire learning process indicate that DNNs demonstrate a level of data efficiency comparable to human learners, challenging some prevailing assumptions in the field. However, our results also reveal representational differences: while DNNs' learning is characterized by a pronounced generalisation lag, humans appear to immediately acquire generalizable representations without a preliminary phase of learning training set-specific information that is only later transferred to novel data.

SeminarPsychology

Error Consistency between Humans and Machines as a function of presentation duration

Thomas Klein
Eberhard Karls Universität Tübingen
Jun 30, 2024

Within the last decade, Deep Artificial Neural Networks (DNNs) have emerged as powerful computer vision systems that match or exceed human performance on many benchmark tasks such as image classification. But whether current DNNs are suitable computational models of the human visual system remains an open question: While DNNs have proven to be capable of predicting neural activations in primate visual cortex, psychophysical experiments have shown behavioral differences between DNNs and human subjects, as quantified by error consistency. Error consistency is typically measured by briefly presenting natural or corrupted images to human subjects and asking them to perform an n-way classification task under time pressure. But for how long should stimuli ideally be presented to guarantee a fair comparison with DNNs? Here we investigate the influence of presentation time on error consistency, to test the hypothesis that higher-level processing drives behavioral differences. We systematically vary presentation times of backward-masked stimuli from 8.3ms to 266ms and measure human performance and reaction times on natural, lowpass-filtered and noisy images. Our experiment constitutes a fine-grained analysis of human image classification under both image corruptions and time pressure, showing that even drastically time-constrained humans who are exposed to the stimuli for only two frames, i.e. 16.6ms, can still solve our 8-way classification task with success rates way above chance. We also find that human-to-human error consistency is already stable at 16.6ms.

SeminarNeuroscienceRecording

NMC4 Short Talk: Image embeddings informed by natural language improve predictions and understanding of human higher-level visual cortex

Aria Wang
Carnegie Mellon University
Nov 30, 2021

To better understand human scene understanding, we extracted features from images using CLIP, a neural network model of visual concept trained with supervision from natural language. We then constructed voxelwise encoding models to explain whole brain responses arising from viewing natural images from the Natural Scenes Dataset (NSD) - a large-scale fMRI dataset collected at 7T. Our results reveal that CLIP, as compared to convolution based image classification models such as ResNet or AlexNet, as well as language models such as BERT, gives rise to representations that enable better prediction performance - up to a 0.86 correlation with test data and an r-square of 0.75 - in higher-level visual cortex in humans. Moreover, CLIP representations explain distinctly unique variance in these higher-level visual areas as compared to models trained with only images or text. Control experiments show that the improvement in prediction observed with CLIP is not due to architectural differences (transformer vs. convolution) or to the encoding of image captions per se (vs. single object labels). Together our results indicate that CLIP and, more generally, multimodal models trained jointly on images and text, may serve as better candidate models of representation in human higher-level visual cortex. The bridge between language and vision provided by jointly trained models such as CLIP also opens up new and more semantically-rich ways of interpreting the visual brain.

SeminarNeuroscienceRecording

Recurrent network dynamics lead to interference in sequential learning

Friedrich Schuessler
Barak lab, Technion, Haifa, Israel
Apr 28, 2021

Learning in real life is often sequential: A learner first learns task A, then task B. If the tasks are related, the learner may adapt the previously learned representation instead of generating a new one from scratch. Adaptation may ease learning task B but may also decrease the performance on task A. Such interference has been observed in experimental and machine learning studies. In the latter case, it is mediated by correlations between weight updates for the different tasks. In typical applications, like image classification with feed-forward networks, these correlated weight updates can be traced back to input correlations. For many neuroscience tasks, however, networks need to not only transform the input, but also generate substantial internal dynamics. Here we illuminate the role of internal dynamics for interference in recurrent neural networks (RNNs). We analyze RNNs trained sequentially on neuroscience tasks with gradient descent and observe forgetting even for orthogonal tasks. We find that the degree of interference changes systematically with tasks properties, especially with emphasis on input-driven over autonomously generated dynamics. To better understand our numerical observations, we thoroughly analyze a simple model of working memory: For task A, a network is presented with an input pattern and trained to generate a fixed point aligned with this pattern. For task B, the network has to memorize a second, orthogonal pattern. Adapting an existing representation corresponds to the rotation of the fixed point in phase space, as opposed to the emergence of a new one. We show that the two modes of learning – rotation vs. new formation – are directly linked to recurrent vs. input-driven dynamics. We make this notion precise in a further simplified, analytically tractable model, where learning is restricted to a 2x2 matrix. In our analysis of trained RNNs, we also make the surprising observation that, across different tasks, larger random initial connectivity reduces interference. Analyzing the fixed point task reveals the underlying mechanism: The random connectivity strongly accelerates the learning mode of new formation, and has less effect on rotation. The prior thus wins the race to zero loss, and interference is reduced. Altogether, our work offers a new perspective on sequential learning in recurrent networks, and the emphasis on internally generated dynamics allows us to take the history of individual learners into account.

SeminarNeuroscience

Deep reinforcement learning and its neuroscientific implications

Matt Botvinick
DeepMind
Jul 17, 2020

The last few years have seen some dramatic developments in artificial intelligence research. What implications might these have for neuroscience? Investigations of this question have, to date, focused largely on deep neural networks trained using supervised learning, in tasks such as image classification. However, there is another area of recent AI work which has so far received less attention from neuroscientists, but which may have more profound neuroscientific implications: Deep reinforcement learning. Deep RL offers a rich framework for studying the interplay among learning, representation and decision-making, offering to the brain sciences a new set of research tools and a wide range of novel hypotheses. I’ll provide a high level introduction to deep RL, discuss some recent neuroscience-oriented investigations from my group at DeepMind, and survey some wider implications for research on brain and behavior.

ePoster

Saccade Mechanisms for Image Classification, Object Detection and Tracking

Zachary Daniels

Neuromatch 5