Authors & Affiliations
Zeyuan Ye, Ralf Wessel, Tom Franken
Abstract
To make sense of visual scenes, the brain must segment objects from background. This is thought to be facilitated by border ownership (BOS) neurons in the primate visual cortex. These neurons encode whether a local border in a visual scene is part of an object on one or the other side of the border. It is unclear how these signals emerge in neural networks without a teaching signal of what is foreground and what is background. Here we report that brain-like BOS units emerge in PredNet, a deep artificial neural network that is not trained to segment objects but simply to predict future frames of natural videos. As in neurophysiology studies, we examine the responses of PredNet units to scenes with square objects, and find that a significant number of these units respond differently to an identical contrast border in their receptive field depending on which side of the border belongs to a foreground object, independent of local contrast. Moreover, we find that BOS units in PredNet share several other properties with BOS neurons in the brain: (1) they are tolerant to changes in square size, position, and orientation; (2) they are modulated by isolated fragments in a way that is consistent with their preference for BOS; (3) their BOS signals persist after the cues for BOS are removed. Finally, we find that ablating BOS units affects prediction accuracy more than ablating the same number of units that are not selective for BOS. Overall, we find that BOS units with brain-like properties emerge in an artificial neural network to support prediction of complex videos. Our findings suggest that BOS neurons, which have always been assumed to segment scenes (a classical ventral stream operation that is typically studied under static conditions), may be especially useful for predicting complex dynamic visual input.
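To illustrate what it means for a unit to respond to border ownership "independent of local contrast", the following is a minimal sketch, not the analysis used in this work. It assumes a hypothetical 2 x 2 stimulus design (object on one side of the border or the other, crossed with the two contrast polarities of the border) and a simple normalized difference index; the function names, the array layout, and the index itself are illustrative assumptions.

```python
import numpy as np


def _normalized_difference(a, b):
    """Return (a - b) / (a + b), or 0.0 when both responses are zero."""
    total = a + b
    if total == 0:
        return 0.0
    return float((a - b) / total)


def bos_index(responses):
    """Side-of-object selectivity, averaged over contrast polarity.

    responses: 2 x 2 array of non-negative mean responses, indexed as
    [object side, contrast polarity]. This layout is an assumption made
    for illustration.
    """
    r = np.asarray(responses, dtype=float)
    if r.shape != (2, 2):
        raise ValueError("expected a 2 x 2 array: [object side, contrast polarity]")
    side_mean = r.mean(axis=1)  # average over the two contrast polarities
    return _normalized_difference(side_mean[0], side_mean[1])


def contrast_index(responses):
    """Contrast-polarity selectivity, averaged over object side."""
    r = np.asarray(responses, dtype=float)
    if r.shape != (2, 2):
        raise ValueError("expected a 2 x 2 array: [object side, contrast polarity]")
    polarity_mean = r.mean(axis=0)  # average over the two object sides
    return _normalized_difference(polarity_mean[0], polarity_mean[1])


# A hypothetical unit that prefers the object on side 0, regardless of contrast.
bos_like_unit = [[3.0, 3.0],
                 [1.0, 1.0]]

# A hypothetical unit that only follows local contrast polarity.
contrast_only_unit = [[3.0, 1.0],
                      [3.0, 1.0]]

print(bos_index(bos_like_unit), contrast_index(bos_like_unit))
print(bos_index(contrast_only_unit), contrast_index(contrast_only_unit))
```

Under these assumptions, a BOS-like unit yields a nonzero side index with a zero contrast index, while a unit driven only by local contrast shows the opposite pattern. Averaging over both polarities before comparing sides is what separates ownership selectivity from contrast selectivity in this sketch.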