ePoster

Predictive processing of natural images by V1 firing rates revealed by self-supervised deep neural networks

Cem Uran, Alina Peter, Andreea Lazar, William Barnes, Johanna Klon-Lipok, Katharine A Shapcott, Rasmus Roese, Pascal Fries, Wolf Singer, Martin Vinck
COSYNE 2022 (2022)
Lisbon, Portugal


Abstract

The responses of neurons in primary visual cortex (V1) depend strongly on the spatial context in which stimuli are embedded. This context-dependence has been theorized to reflect efficient and/or predictive coding of natural scenes. Critical tests of these theories in a generic form for natural images are currently lacking, because it is unclear how predictions, and measures of predictability, should be operationalized for natural scenes. Furthermore, given the hierarchical processing in the primate visual system, it is unclear whether there is just one type or level of predictability. Here, we trained neural networks that learn both linear and non-linear natural scene statistics (i.e. priors) across a very large number of images in a self-supervised manner. We hypothesized that this training would lead the networks to develop an internal model similar to the one the primate visual system uses to generate predictions. These natural scene statistics span low-level (pixel structure) to high-level (object information) features. Biological neurons, with encoding properties shaped by natural scene priors, could encode sensory predictions or prediction errors for this broad spectrum of features. We therefore derived measures to assess predictability in natural images in order to investigate the contextual modulation of firing rates, and distinguished between lower- and higher-order image features using convolutional neural networks for object recognition. We performed parallel recordings from V1 of awake macaques viewing natural scenes of different sizes. Surprisingly, we found that firing rates were only weakly modulated by structural (pixel-wise) predictability and image compressibility. Instead, the main factor determining a decrease in V1 firing rates was the contextual predictability of higher-level features of the stimulus falling into the receptive field (RF). These higher-order features correlated strongly with human perceptual similarity judgements and with image salience. Our model provides improved prediction of surround modulation compared to state-of-the-art models based on Gabor filters. Our findings suggest that V1 neurons encode higher-order mismatch signals about features that are relevant for object recognition.
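
The abstract does not spell out how the predictability measures were computed; the Python sketch below is a rough, hedged illustration of how such quantities could in principle be defined: a toy context-based fill-in of the RF region, scored either in pixel space (structural predictability) or in the feature space of an object-recognition CNN (higher-order predictability). The blur-based predictor and the torchvision VGG16 feature extractor are illustrative assumptions, not the self-supervised networks used in the study.

```python
# Minimal sketch (not the authors' pipeline) of pixel-level vs. feature-level
# predictability of an RF region, given its spatial context.
import torch
import torch.nn.functional as F
from torchvision import models


def predict_rf_from_context(image, rf_mask):
    """Fill the masked RF region with locally averaged surround pixels.
    A real implementation would inpaint the RF content with a network
    trained self-supervised on natural images."""
    blurred = F.avg_pool2d(image, kernel_size=15, stride=1, padding=7)
    return image * (1.0 - rf_mask) + blurred * rf_mask


def structural_predictability(image, prediction, rf_mask):
    """Low-level predictability: negative pixel-wise MSE inside the RF."""
    err = ((prediction - image) ** 2 * rf_mask).sum() / rf_mask.sum()
    return -err.item()


def feature_predictability(image, prediction, feature_net):
    """Higher-order predictability: cosine similarity of deep CNN features
    between the true image and the context-predicted image."""
    with torch.no_grad():
        f_true = feature_net(image).flatten(1)
        f_pred = feature_net(prediction).flatten(1)
    return F.cosine_similarity(f_true, f_pred, dim=1).item()


# Example usage on a random stand-in image with a central 32x32 "RF".
image = torch.rand(1, 3, 224, 224)
rf_mask = torch.zeros(1, 1, 224, 224)
rf_mask[..., 96:128, 96:128] = 1.0

# ImageNet-pretrained VGG16 up to an intermediate conv block (assumption).
vgg = models.vgg16(weights=models.VGG16_Weights.IMAGENET1K_V1).eval()
feature_net = vgg.features[:16]

prediction = predict_rf_from_context(image, rf_mask)
print("structural predictability:", structural_predictability(image, prediction, rf_mask))
print("feature predictability:   ", feature_predictability(image, prediction, feature_net))
```

The conceptual point, rather than the particular predictor, is the separation of the two mismatch signals: the same context-based prediction can be close to the true RF content in pixel space yet far from it in the feature space of an object-recognition network, or vice versa.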

Unique ID: cosyne-22/predictive-processing-natural-images-f4ec73a5