Authors & Affiliations
Rene Larisch, Fred Hamker
Abstract
Detecting a specific object in a visual scene is a task that the human visual system performs every day. Such
target-specific object detection requires filtering out all unwanted objects during processing, so that only the
target object is detected. For this purpose, attention-related mechanisms
along the visual pathway have been proposed to allow focusing on the desired object. Beuth and Hamker
(2015) [1], using a biologically inspired recurrent model simulating lower and higher layers of the visual cortex,
the frontal eye field, and the prefrontal cortex, showed how the representation of visual information in higher
cortical areas is modulated by two types of attention: 1) Feature-based attention enhances neural activity with
respect to the represented features of the target stimulus. 2) Spatial attention enhances neural responses to
stimuli within the attended area or inhibits the surrounding area [1]. While the combined effect of these two
attention mechanisms can explain several experimental findings (such as surround suppression or biased competition
[1]), the limited flexibility of the Hebbian-like learning rule used in the model allows it to find the desired object in a
target-specific object detection task only when the objects are presented on a black background [2].
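As a rough illustration of the two attention types described above, the following minimal sketch applies both as multiplicative gains to a feature map. It is not the recurrent model of [1]: the function name `apply_attention`, the Gaussian attention window, and the parameters `gain` and `sigma` are assumptions chosen for illustration, and the suppressive component of spatial attention is omitted.

```python
import numpy as np

def apply_attention(features, target_template, spatial_center, sigma=1.5, gain=2.0):
    """Modulate a (channels, height, width) feature map with two gain fields.

    Hypothetical helper for illustration only, not the model from [1].
    """
    n_channels, height, width = features.shape

    # Feature-based attention: channels that match the target's features
    # (template value near 1) are amplified everywhere in the visual field.
    feature_gain = 1.0 + gain * np.asarray(target_template, dtype=float)

    # Spatial attention: all channels are amplified near the attended location,
    # here modeled as a Gaussian window (an assumption).
    ys, xs = np.mgrid[0:height, 0:width]
    cy, cx = spatial_center
    window = np.exp(-((ys - cy) ** 2 + (xs - cx) ** 2) / (2.0 * sigma ** 2))
    spatial_gain = 1.0 + gain * window

    # Broadcast both gains over the (channels, height, width) map.
    return features * feature_gain[:, None, None] * spatial_gain[None, :, :]

# Toy input: two feature channels with uniform activity on a 5x5 grid.
features = np.ones((2, 5, 5))
# Channel 0 encodes the target's features, channel 1 does not.
out = apply_attention(features, target_template=[1.0, 0.0], spatial_center=(2, 2))
```

In this toy setting, the response is largest where both gains coincide, that is, in the target-matching channel at the attended location.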
In recent years, deep neural networks optimized with the backpropagation algorithm have demonstrated high
performance in object recognition [3] and detection [4], making them a suitable alternative for modeling the lower
visual areas. To ensure a good representation for the higher visual areas, we use an autoencoder trained only to
reconstruct single objects presented with different backgrounds.
Our results show that the autoencoder provides useful visual processing for the lower visual areas of the attention
model. By combining a deep neural network with a biologically motivated model of attention, we demonstrate the
performance of the combined model on a target-specific object detection task with multi-object ensembles on
unknown backgrounds.