Authors & Affiliations
Rahul Ramesh, Anthony Bisulco, Ronald DiTullio, Linran Wei, Vijay Balasubramanian, Kostas Daniilidis, Pratik Chaudhari
Abstract
Natural images contain statistical redundancies that produce long-range spatial correlations and regularities such as edges, contours and textures (Simoncelli, 2003; van der Schaaf and van Hateren, 1996; Hermundstad et al., 2014). The efficient coding hypothesis suggests that the brain optimizes sensory representation by removing such redundancy in input stimuli (Barlow et al., 1961). However, critics argue that an efficient code that simply encodes the inputs may not necessarily be useful for the tasks performed by an organism (Simoncelli and Olshausen, 2001; Barlow, 2001), i.e., neural systems should be adapted not just to the statistics of their inputs but also to the nature of their tasks (DiCarlo and Cox, 2007).
We show that many perception tasks, including visual tasks such as recognition, semantic segmentation, optical flow and monocular depth prediction, and auditory tasks such as vocalization discrimination, are highly redundant functions of their inputs. Images projected into the spectral, Fourier or wavelet domains can be used to solve these tasks remarkably well, regardless of whether we use the top subspace where the data varies the most, an intermediate subspace with moderate variability, or the bottom subspace where the data varies the least. Even a random subspace works non-trivially well. We show that this phenomenon occurs because different input subspaces carry redundant information relevant to the task.
We argue that natural tasks exhibit such redundancies because they are the only ones that agents with bounded resources can readily learn. Animals learning a new task start with maladapted feature spaces, which they refine to improve performance through learning, either during an individual's lifetime or by selection over generations. Throughout this process of improvement, the task must remain doable, suggesting that the evidence required to perform it must be spread across any set of features.
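The subspace comparison described above can be illustrated with a minimal sketch. It is not the authors' experimental code and uses no real images: the data is synthetic, built on the assumption that every cosine (DCT) coefficient carries the same amount of label evidence while its variance decays with frequency. The names `dct_basis`, `make_data` and `band_accuracy`, the signal strength, the band boundaries and the least-squares classifier are all hypothetical choices made for illustration.

```python
import numpy as np

def dct_basis(d):
    """Orthonormal DCT-II matrix; row k is the k-th cosine basis vector."""
    n = np.arange(d)
    basis = np.cos(np.pi * (2 * n[None, :] + 1) * n[:, None] / (2 * d))
    basis *= np.sqrt(2.0 / d)
    basis[0] /= np.sqrt(2.0)
    return basis

def make_data(n, basis, rng, signal=0.5):
    """Synthetic inputs (an assumption, not real data): each coefficient has
    the same signal-to-noise ratio, but its scale decays with frequency."""
    d = basis.shape[0]
    y = rng.choice([-1.0, 1.0], size=n)
    scale = 1.0 / np.sqrt(np.arange(1, d + 1))
    coeffs = scale * (signal * y[:, None] + rng.standard_normal((n, d)))
    return coeffs @ basis, y  # map coefficients back to input space

def band_accuracy(x_train, y_train, x_test, y_test, directions):
    """Project onto the columns of `directions`, fit a least-squares
    linear classifier with a bias term, and return test accuracy."""
    def features(x):
        z = x @ directions
        return np.hstack([z, np.ones((len(z), 1))])
    w, *_ = np.linalg.lstsq(features(x_train), y_train, rcond=None)
    return float(np.mean(np.sign(features(x_test) @ w) == y_test))

rng = np.random.default_rng(0)
d, k = 64, 16
basis = dct_basis(d)
x_train, y_train = make_data(2000, basis, rng)
x_test, y_test = make_data(2000, basis, rng)

# Four k-dimensional subspaces: highest-variance band, a middle band,
# lowest-variance band, and a random orthonormal subspace.
subspaces = {
    "low frequencies": basis[:k].T,
    "middle frequencies": basis[24:24 + k].T,
    "high frequencies": basis[-k:].T,
    "random": np.linalg.qr(rng.standard_normal((d, k)))[0],
}
results = {
    name: band_accuracy(x_train, y_train, x_test, y_test, v)
    for name, v in subspaces.items()
}
for name, acc in results.items():
    print(f"{name:>20s}: {acc:.3f}")
```

Because the evidence is spread evenly by construction, each band should classify well above chance even though the bands differ greatly in variance. The sketch only shows the evaluation protocol; whether real tasks behave this way is the empirical question the abstract addresses.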