Building System Models of Brain-Like Visual Intelligence with Brain-Score
Research in the brain and cognitive sciences attempts to uncover the neural mechanisms underlying intelligent behavior in domains such as vision. Due to the complexity of brain processing, studies have necessarily started with a narrow scope of experimental investigation and computational modeling. I argue that it is time for our field to take the next step: build system models that capture a range of visual intelligence behaviors along with the underlying neural mechanisms. To make progress on system models, we propose integrative benchmarking – integrating experimental results from many laboratories into suites of benchmarks that guide and constrain those models at multiple stages and scales. We showcase this approach by developing Brain-Score benchmark suites for neural (spike rates) and behavioral experiments in the primate visual ventral stream. By systematically evaluating a wide variety of model candidates, we not only identify models beginning to match a range of brain data (~50% explained variance), but also discover that models’ brain scores are predicted by their object categorization performance (up to 70% ImageNet accuracy). Using the integrative benchmarks, we develop improved state-of-the-art system models that more closely match shallow recurrent neuroanatomy and early visual processing, better predict primate temporal processing, are more robust, and require fewer supervised synaptic updates. Taken together, these integrative benchmarks and system models are first steps to modeling the complexities of brain processing in an entire domain of intelligence.
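The core operation behind a neural benchmark of this kind is scoring how much variance in recorded spike rates a model's activations can explain. Below is a minimal illustrative sketch (not the actual Brain-Score implementation, which uses cross-validated PLS regression and noise ceilings): it fits a hypothetical ridge mapping from model features to neurons on held-in stimuli and reports median explained variance on held-out stimuli.

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import train_test_split

def neural_benchmark_score(model_activations, neural_responses, seed=0):
    """Toy benchmark score: median held-out explained variance across neurons.

    model_activations: (n_stimuli, n_features) model responses to images.
    neural_responses:  (n_stimuli, n_neurons) trial-averaged spike rates.
    """
    X_tr, X_te, y_tr, y_te = train_test_split(
        model_activations, neural_responses, test_size=0.25, random_state=seed)
    reg = Ridge(alpha=1.0).fit(X_tr, y_tr)          # linear map: features -> neurons
    pred = reg.predict(X_te)
    # per-neuron explained variance: 1 - residual variance / total variance
    ev = 1 - ((y_te - pred) ** 2).sum(0) / ((y_te - y_te.mean(0)) ** 2).sum(0)
    return float(np.median(ev))

# synthetic demo: "neurons" are noisy linear combinations of model features,
# so a well-matched model should score close to 1
rng = np.random.default_rng(0)
acts = rng.normal(size=(200, 50))
neurons = acts @ rng.normal(size=(50, 30)) + 0.1 * rng.normal(size=(200, 30))
score = neural_benchmark_score(acts, neurons)
```

In practice such a score is computed per benchmark (per recorded brain region and dataset) and the suite aggregates across benchmarks, which is what allows many laboratories' experiments to jointly constrain one model.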
What does the primary visual cortex tell us about object recognition?
Object recognition relies on the complex visual representations in cortical areas at the top of the ventral stream hierarchy. While these are thought to be derived from low-level stages of visual processing, this has not yet been directly shown. Here, I describe the results of two projects exploring the contributions of primary visual cortex (V1) processing to object recognition using artificial neural networks (ANNs). First, we developed hundreds of ANN-based V1 models and evaluated how well their single neurons approximate those in macaque V1. We found that, for some models, single neurons in intermediate layers are similar to their biological counterparts, and that the distributions of their response properties approximately match those in V1. Furthermore, we observed that models that better matched macaque V1 were also more aligned with human behavior, suggesting that object recognition builds on low-level visual processing. Motivated by these results, we then studied how an ANN’s robustness to image perturbations relates to its ability to predict V1 responses. Despite their high performance in object recognition tasks, ANNs can be fooled by imperceptibly small, explicitly crafted perturbations. We observed that ANNs that better predicted V1 neuronal activity were also more robust to adversarial attacks. Inspired by this, we developed VOneNets, a new class of hybrid ANN vision models. Each VOneNet contains a fixed neural network front-end that simulates primate V1, followed by a neural network back-end adapted from current computer vision models. After training, VOneNets were substantially more robust, outperforming state-of-the-art methods on a set of image perturbations. While current neural network architectures are arguably brain-inspired, these results demonstrate that more precisely mimicking just one stage of the primate visual system leads to new gains in computer vision applications and results in better models of the primate ventral stream and object recognition behavior.
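The key architectural idea, a fixed V1-like front-end whose parameters are set by neurophysiology rather than learned, can be sketched with a bank of Gabor filters followed by rectification. This is a simplified toy version for illustration (the actual VOneBlock also includes complex cells and stochastic spiking noise); the function names and parameter values here are hypothetical, not from the published model.

```python
import numpy as np
from scipy.signal import convolve2d

def gabor_filter(size, theta, freq, sigma):
    """A single Gabor filter, the canonical model of a V1 simple-cell receptive field."""
    ax = np.arange(size) - size // 2
    x, y = np.meshgrid(ax, ax)
    xr = x * np.cos(theta) + y * np.sin(theta)       # rotate coordinates
    yr = -x * np.sin(theta) + y * np.cos(theta)
    return np.exp(-(xr**2 + yr**2) / (2 * sigma**2)) * np.cos(2 * np.pi * freq * xr)

def v1_front_end(image, n_orientations=4, freq=0.2, sigma=2.0, size=9):
    """Fixed (untrained) V1-like stage: oriented Gabor filtering + rectification.

    Returns (n_orientations, H, W) feature maps; a trainable CNN back-end
    would consume these instead of raw pixels.
    """
    maps = []
    for k in range(n_orientations):
        g = gabor_filter(size, theta=k * np.pi / n_orientations,
                         freq=freq, sigma=sigma)
        resp = convolve2d(image, g, mode='same')
        maps.append(np.maximum(resp, 0))             # simple-cell rectification
    return np.stack(maps)

# demo: a vertical edge mainly excites the vertically tuned channel
img = np.zeros((32, 32))
img[:, 16] = 1.0
maps = v1_front_end(img)
```

Because the front-end is fixed, gradient-based adversarial perturbations cannot exploit its weights, which is one intuition for why such a stage improves robustness.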
NMC4 Short Talk: Directly interfacing brain and deep networks exposes non-hierarchical visual processing
A recent approach to understanding the mammalian visual system is to show correspondence between the sequential stages of processing in the ventral stream and layers in a deep convolutional neural network (DCNN), providing evidence that visual information is processed hierarchically, with successive stages containing ever higher-level information. However, correspondence is usually defined as shared variance between brain region and model layer. We propose that task-relevant variance is a stricter test: if a DCNN layer corresponds to a brain region, then substituting the model’s activity with brain activity should successfully drive the model’s object recognition decision. Using this approach on three datasets (human fMRI and macaque neuron firing rates), we found that, in contrast to the hierarchical view, all ventral stream regions corresponded best to later model layers. That is, all regions contain high-level information about object category. We hypothesised that this is due to recurrent connections propagating high-level visual information from later regions back to early regions, in contrast to the exclusively feed-forward connectivity of DCNNs. Using task-relevant correspondence with a late DCNN layer as a tracer, we used Granger causal modelling to show that late-DCNN correspondence in IT drives correspondence in V4. Our analysis suggests, effectively, that no ventral stream region can be appropriately characterised as ‘early’ beyond 70 ms after stimulus presentation, challenging hierarchical models. More broadly, we ask what it means for a model component and brain region to correspond: beyond quantifying shared variance, we must consider the functional role in the computation. We also demonstrate that using a DCNN to decode high-level conceptual information from the ventral stream produces a general mapping from brain to model activation space, which generalises to novel classes held out from the training data.
This suggests future possibilities for brain-machine interface with high-level conceptual information, beyond current designs that interface with the sensorimotor periphery.
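The substitution test described above can be made concrete with a toy simulation: learn a mapping from brain activity into a model layer's activation space on some stimuli, then inject the mapped brain activity in place of the model's own activations on held-out stimuli and check whether the model's readout still produces the correct decisions. All names and sizes below are illustrative, not from the study.

```python
import numpy as np

rng = np.random.default_rng(1)
n_features, n_classes, n_voxels, n_stimuli = 20, 5, 40, 100

# toy "late layer" readout: decision = argmax over class scores
W_readout = rng.normal(size=(n_classes, n_features))

# model layer activations and the decisions they drive, per stimulus
layer_acts = rng.normal(size=(n_stimuli, n_features))
labels = (layer_acts @ W_readout.T).argmax(1)

# simulated brain activity: a noisy linear image of the layer's code
B = rng.normal(size=(n_voxels, n_features))
brain = layer_acts @ B.T + 0.05 * rng.normal(size=(n_stimuli, n_voxels))

# fit the brain -> layer mapping on half the stimuli...
M, *_ = np.linalg.lstsq(brain[:50], layer_acts[:50], rcond=None)

# ...then substitute mapped brain activity for the model's activations
substituted = brain[50:] @ M
decisions = (substituted @ W_readout.T).argmax(1)
accuracy = (decisions == labels[50:]).mean()   # well above chance (0.2) here
```

If a region truly corresponds to the chosen layer, substitution accuracy stays high on held-out stimuli; the finding above is that this works best with late layers for every ventral stream region, not just IT.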
Towards a neurally mechanistic understanding of visual cognition
I am interested in developing a neurally mechanistic understanding of how primate brains represent the world through their visual systems and how such representations enable a remarkable set of intelligent behaviors. In this talk, I will primarily highlight aspects of my current research, which focuses on dissecting the brain circuits that support core object recognition behavior (primates’ ability to categorize objects within hundreds of milliseconds) in non-human primates. On the one hand, my work empirically examines how well computational models of the primate ventral visual pathways embed knowledge of visual brain function (e.g., Bashivan*, Kar*, DiCarlo, Science, 2019). On the other hand, my work has led to various functional and architectural insights that help improve such brain models. For instance, we have exposed the necessity of recurrent computations in primate core object recognition (Kar et al., Nature Neuroscience, 2019), a component that is strikingly missing from most feedforward artificial neural network models. Specifically, we have observed that the primate ventral stream requires fast recurrent processing via ventrolateral PFC for robust core object recognition (Kar and DiCarlo, Neuron, 2021). In addition, I am currently developing various chemogenetic strategies to causally target specific bidirectional neural circuits in the macaque brain during multiple object recognition tasks to further probe their relevance during this behavior. I plan to transform these data and insights into tangible progress in neuroscience via collaborations with various computational groups and by building improved brain models of object recognition. I hope to end the talk with a brief glimpse of some of my planned future work!