ePoster

Human-like Behavior and Neural Representations Emerge in a Goal-driven Model of Overt Visual Search for Natural Objects

Motahareh Pourrahimi, Irina Rish, Pouya Bashivan
Bernstein Conference 2024 (2024)
Goethe University, Frankfurt, Germany

Abstract

Visual search, the act of finding a target among multiple visually presented items, is a key paradigm in studying visual attention. While much is known about the brain networks underlying visual search, our understanding of the neural computations driving this behavior is limited, which makes simulating such behavior in silico challenging. To address this gap, we developed an image-computable, general-architecture visual search model (GVSM). We trained an artificial neural network to perform naturalistic visual search [Fig. A-B] using the same stimulus configuration as human psychophysics experiments [1], while incorporating eccentricity-dependent visual acuity and a dynamic internal representation that allows information to be integrated across fixations. GVSM consists of a retinal module approximating the biological eccentricity-dependent visual acuity, a convolutional neural network simulating the ventral visual pathway [2-3], and a recurrent neural network (RNN) that takes on the role of the fronto-parietal network in guiding fixations. After training, GVSM showed strong generalization of its search performance to novel object categories while exhibiting high behavioral consistency with human subjects [Fig. C-D], without being trained on eye-tracking data. Further analysis of the RNN's population activity revealed a retinocentric representation of the priority map [Fig. E-F], akin to those described in macaques [4-8], that persisted in time and was updated with each saccade, alongside an encoding of the cued object category in a separate subspace. In principle, the priority at different locations could be encoded in separate subspaces regardless of their physical location (a discontinuous representation); alternatively, the priority of locations closer together in the visual field could be encoded in more aligned subspaces of the RNN's hidden state space (a continuous representation) [Fig. G]. The identified retinocentric priority map was encoded in a continuous, topographic representation [Fig. I]: the generalization accuracy of a priority decoder trained at one location and tested at another decreased as the distance between the two locations in the visual field increased [Fig. J]. Altogether, we present a neurally plausible, image-computable, goal-driven model of visual search that replicates human behavior and neural signatures, providing a useful tool for proposing and testing hypotheses about the neural computations underlying visual search.
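
Illustrative code sketches

The abstract names the three stages of GVSM (a retinal module with eccentricity-dependent acuity, a ventral-stream CNN, and a fronto-parietal RNN that guides fixations) but not their implementation. The PyTorch sketch below shows one minimal way such a model could be wired up; the foveation scheme, layer sizes, cue embedding, and fixation readout are assumptions made for illustration, not the authors' actual GVSM.

# Minimal illustrative sketch (PyTorch) of a three-stage visual search model:
# a retinal module approximating eccentricity-dependent acuity, a CNN standing
# in for the ventral visual pathway, and an RNN standing in for the
# fronto-parietal network guiding fixations. All details are assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F

class RetinalModule(nn.Module):
    """Crude eccentricity-dependent acuity: blend toward a blurred image
    with increasing distance from the current fixation (a foveation stand-in)."""
    def forward(self, image, fixation):
        # image: (B, 3, H, W); fixation: (B, 2) in [-1, 1] image coordinates
        B, _, H, W = image.shape
        ys = torch.linspace(-1, 1, H, device=image.device)
        xs = torch.linspace(-1, 1, W, device=image.device)
        yy, xx = torch.meshgrid(ys, xs, indexing="ij")
        ecc = torch.sqrt((yy - fixation[:, 1, None, None]) ** 2 +
                         (xx - fixation[:, 0, None, None]) ** 2)
        blurred = F.avg_pool2d(image, 9, stride=1, padding=4)
        w = torch.clamp(ecc, 0, 1).unsqueeze(1)          # per-pixel blend weight
        return (1 - w) * image + w * blurred             # sharp fovea, blurry periphery

class GVSM(nn.Module):
    def __init__(self, hidden=512, n_categories=100):
        super().__init__()
        self.retina = RetinalModule()
        self.ventral = nn.Sequential(                    # stand-in ventral-stream CNN
            nn.Conv2d(3, 32, 5, stride=2), nn.ReLU(),
            nn.Conv2d(32, 64, 5, stride=2), nn.ReLU(),
            nn.AdaptiveAvgPool2d(4), nn.Flatten(),
        )
        self.cue_embed = nn.Embedding(n_categories, 128)  # cued target category
        self.rnn = nn.GRUCell(64 * 16 + 128, hidden)      # fronto-parietal stand-in
        self.fixation_head = nn.Linear(hidden, 2)         # next fixation (x, y)

    def forward(self, image, cue, n_fixations=6):
        B = image.shape[0]
        h = image.new_zeros(B, self.rnn.hidden_size)
        fixation = image.new_zeros(B, 2)                  # start at image center
        fixations = []
        for _ in range(n_fixations):
            glimpse = self.retina(image, fixation)        # foveate at current fixation
            feats = self.ventral(glimpse)
            h = self.rnn(torch.cat([feats, self.cue_embed(cue)], dim=1), h)
            fixation = torch.tanh(self.fixation_head(h))  # predicted saccade target
            fixations.append(fixation)
        return torch.stack(fixations, dim=1), h

In this sketch the model is unrolled for a fixed number of fixations, each glimpse is foveated at the previously predicted fixation, and the RNN hidden state h is the kind of population activity on which the priority-map analyses described above would be run.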
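The cross-location generalization result [Fig. J] can likewise be sketched as an analysis recipe: train a linear decoder to read out priority at one visual-field location from RNN hidden states, test it at another location, and relate its accuracy to the distance between the two locations. The decoder choice (logistic regression) and variable names below are assumptions for illustration, not the authors' exact analysis.

# Illustrative sketch of cross-location generalization of priority decoders
# (NumPy + scikit-learn). Inputs and decoder choice are assumptions.
import numpy as np
from sklearn.linear_model import LogisticRegression

def cross_location_generalization(hidden, priority, locations):
    """hidden:    (n_trials, n_units)  RNN hidden states
       priority:  (n_trials, n_locs)   binary priority label per location
       locations: (n_locs, 2)          visual-field coordinates of each location
       Returns an array of (distance, accuracy) pairs over ordered location pairs."""
    n_locs = priority.shape[1]
    results = []
    for i in range(n_locs):
        clf = LogisticRegression(max_iter=1000).fit(hidden, priority[:, i])
        for j in range(n_locs):
            if i == j:
                continue
            acc = clf.score(hidden, priority[:, j])        # test on a different location
            dist = np.linalg.norm(locations[i] - locations[j])
            results.append((dist, acc))
    return np.array(results)

In practice the decoders would be evaluated on held-out trials; the logic of the comparison is that under a continuous, topographic code the generalization accuracy should fall off smoothly with inter-location distance, whereas under a discontinuous code it should sit near chance for all location pairs.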

Unique ID: bernstein-24/human-like-behavior-neural-representations-ada6d601