Visual
visual representation
Mapping the Brain’s Visual Representations Using Deep Learning
The Effects of Negative Emotions on Mental Representation of Faces
Face detection is an initial step of many social interactions. It involves a comparison between a visual input and a mental representation of faces built from previous experience. Whilst emotional state has been found to affect the way humans attend to faces, little research has explored the effects of emotions on the mental representation of faces. Here, we examined how state anxiety and state depression modulate the geometric properties of the mental representations used in face detection, and compared the emotional expression of those representations. To this end, we used an adaptation of the reverse correlation technique, inspired by Gosselin and Schyns’ (2003) ‘Superstitious Approach’, to construct visual representations of observers’ mental representations of faces and to relate these to their mental states. In two sessions, on separate days, participants were presented with ‘colourful’ noise stimuli and asked to detect faces, which they were told were present. Based on the noise fragments that were identified as faces, we reconstructed the pictorial mental representation utilised by each participant in each session. We found a significant correlation between the size of the mental representation of faces and participants’ level of depression. Our findings provide preliminary insight into the way emotions affect the expected appearance of faces. To further understand whether the facial expressions of participants’ mental representations reflect their emotional state, we are conducting a validation study with a group of naïve observers who are asked to classify the reconstructed face images by emotion. In this way, we assess whether the faces communicate participants’ emotional states to others.
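The reconstruction step in this abstract can be illustrated with a minimal sketch. It assumes the simplest form of reverse correlation: average the noise images labelled as containing a face and subtract the average of the remaining images. The grey-scale Gaussian noise, the 16-pixel toy template, the simulated observer, and all function names are hypothetical and are not taken from the study, which used ‘colourful’ noise and human participants.

```python
import random

def make_noise(n_pixels, rng):
    """One noise stimulus: an independent Gaussian value per pixel."""
    return [rng.gauss(0.0, 1.0) for _ in range(n_pixels)]

def classification_image(stimuli, responses):
    """Per-pixel mean of 'face' stimuli minus mean of 'no face' stimuli."""
    n_pixels = len(stimuli[0])
    yes = [s for s, r in zip(stimuli, responses) if r]
    no = [s for s, r in zip(stimuli, responses) if not r]
    if not yes or not no:
        raise ValueError("need at least one trial of each response type")
    mean_yes = [sum(s[i] for s in yes) / len(yes) for i in range(n_pixels)]
    mean_no = [sum(s[i] for s in no) / len(no) for i in range(n_pixels)]
    return [a - b for a, b in zip(mean_yes, mean_no)]

def simulated_observer(stimulus, template):
    """Hypothetical observer: reports a face when the noise happens to
    correlate positively with an internal template."""
    return sum(s * t for s, t in zip(stimulus, template)) > 0.0

rng = random.Random(0)
template = [1.0, 1.0, -1.0, -1.0] * 4  # toy stand-in for a mental representation
stimuli = [make_noise(len(template), rng) for _ in range(2000)]
responses = [simulated_observer(s, template) for s in stimuli]
ci = classification_image(stimuli, responses)
```

With enough trials, the sign pattern of `ci` follows the hidden template even though no stimulus ever contained it, which is the logic behind reconstructing a representation from pure noise.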
Mouse visual cortex as a limited resource system that self-learns an ecologically-general representation
Studies of the mouse visual system have revealed a variety of visual brain areas in a roughly hierarchical arrangement, together with a multitude of behavioral capacities, ranging from stimulus-reward associations to goal-directed navigation and object-centric discriminations. However, an overall understanding of how mouse visual cortex is organized, and how this organization supports visual behaviors, is still lacking. Here, we take a computational approach to help address these questions, providing a high-fidelity quantitative model of mouse visual cortex. By analyzing factors contributing to model fidelity, we identified key principles underlying the organization of mouse visual cortex. Structurally, we find that comparatively low resolution and shallow structure were both important for model correctness. Functionally, we find that models trained with task-agnostic, unsupervised objective functions based on the concept of contrastive embeddings were substantially better than models trained with supervised objectives. Finally, the unsupervised objective builds a general-purpose visual representation that enables the system to achieve better transfer on out-of-distribution visual scene understanding and reward-based navigation tasks. Our results suggest that mouse visual cortex is a low-resolution, shallow network that makes best use of the mouse’s limited resources to create a light-weight, general-purpose visual system, in contrast to the deep, high-resolution, and more task-specific visual system of primates.
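The abstract does not say which contrastive objective was used. As one common example of an objective built on contrastive embeddings, the sketch below implements an InfoNCE-style loss, in which each embedding should be more similar to a second view of the same input than to views of other inputs. The function names, the temperature value, and the toy two-dimensional embeddings are illustrative assumptions.

```python
import math

def normalize(v):
    """Scale a vector to unit length (assumes a non-zero vector)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def cosine(u, v):
    """Cosine similarity between two vectors."""
    return sum(a * b for a, b in zip(normalize(u), normalize(v)))

def info_nce(anchors, positives, temperature=0.1):
    """Average InfoNCE loss. positives[i] is the second view of anchors[i];
    every other entry of positives acts as a negative for anchors[i]."""
    total = 0.0
    for i, anchor in enumerate(anchors):
        logits = [cosine(anchor, p) / temperature for p in positives]
        top = max(logits)  # subtract the maximum for numerical stability
        log_sum = top + math.log(sum(math.exp(l - top) for l in logits))
        total += log_sum - logits[i]
    return total / len(anchors)

# Toy embeddings: views_b[i] is a slightly perturbed copy of views_a[i].
views_a = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
views_b = [[0.9, 0.1], [0.1, 0.9], [-0.9, -0.1]]

matched_loss = info_nce(views_a, views_b)
mismatched_loss = info_nce(views_a, views_b[::-1])  # wrong pairing
```

The loss is low when paired views sit close together in the embedding space and high when the pairing is scrambled. No labels are involved, which is what makes such objectives task-agnostic.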
Analogy and Spatial Cognition: How and Why they matter for STEM learning
"Space is the universal donor for relations" (Gentner, 2014). This quote is the foundation of my talk. I will explore how and why visual representations and analogies are related. I will also explore how considering the relation between analogy and spatial reasoning can shed light on why and how spatial thinking is correlated with learning in STEM fields. For example, I will consider children’s number sense and learning of the number line from the perspective of analogical reasoning.
On the contributions of retinal direction selectivity to cortical motion processing in mice
Cells preferentially responding to visual motion in a particular direction are said to be direction-selective, and these were first identified in the primary visual cortex. Since then, direction-selective responses have been observed in the retina of several species, including mice, indicating motion analysis begins at the earliest stage of the visual hierarchy. Yet little is known about how retinal direction selectivity contributes to motion processing in the visual cortex. In this talk, I will present our experimental efforts to narrow this gap in our knowledge. To this end, we used genetic approaches to disrupt direction selectivity in the retina and mapped neuronal responses to visual motion in the visual cortex of mice using intrinsic signal optical imaging and two-photon calcium imaging. In essence, our work demonstrates that direction selectivity computed at the level of the retina causally serves to establish specialized motion responses in distinct areas of the mouse visual cortex. This finding thus compels us to revisit our notions of how the brain builds complex visual representations and underscores the importance of the processing performed in the periphery of sensory systems.
What does the primary visual cortex tell us about object recognition?
Object recognition relies on the complex visual representations in cortical areas at the top of the ventral stream hierarchy. While these are thought to be derived from low-level stages of visual processing, this has not yet been shown. Here, I describe the results of two projects exploring the contributions of primary visual cortex (V1) processing to object recognition using artificial neural networks (ANNs). First, we developed hundreds of ANN-based V1 models and evaluated how well their single neurons approximate those in macaque V1. We found that, for some models, single neurons in intermediate layers are similar to their biological counterparts, and that the distributions of their response properties approximately match those in V1. Furthermore, we observed that models that better matched macaque V1 were also more aligned with human behavior, suggesting that object recognition is derived from low-level visual processing. Motivated by these results, we then studied how an ANN’s robustness to image perturbations relates to its ability to predict V1 responses. Despite their high performance in object recognition tasks, ANNs can be fooled by imperceptibly small, explicitly crafted perturbations. We observed that ANNs that better predicted V1 neuronal activity were also more robust to adversarial attacks. Inspired by this, we developed VOneNets, a new class of hybrid ANN vision models. Each VOneNet contains a fixed neural network front-end that simulates primate V1, followed by a neural network back-end adapted from current computer vision models. After training, VOneNets were substantially more robust, outperforming state-of-the-art methods on a set of perturbations. While current neural network architectures are arguably brain-inspired, these results demonstrate that more precisely mimicking just one stage of the primate visual system leads to new gains in computer vision applications and results in better models of the primate ventral stream and object recognition behavior.
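To make the idea of a fixed, V1-like front-end concrete, here is a minimal sketch that assumes a Gabor filter followed by half-wave rectification, a standard textbook model of a V1 simple cell. It is not the VOneNet implementation; the kernel size, wavelength, and envelope width are arbitrary illustrative values, and all function names are hypothetical.

```python
import math

def gabor_kernel(size, wavelength, theta, sigma):
    """Fixed (non-learned) Gabor kernel: a cosine grating at orientation
    theta under a circular Gaussian envelope. size should be odd."""
    half = size // 2
    kernel = []
    for y in range(-half, half + 1):
        row = []
        for x in range(-half, half + 1):
            xr = x * math.cos(theta) + y * math.sin(theta)
            envelope = math.exp(-(x * x + y * y) / (2.0 * sigma * sigma))
            row.append(envelope * math.cos(2.0 * math.pi * xr / wavelength))
        kernel.append(row)
    return kernel

def grating(size, wavelength, theta):
    """Toy stimulus: a full-contrast cosine grating patch."""
    half = size // 2
    return [
        [
            math.cos(2.0 * math.pi
                     * (x * math.cos(theta) + y * math.sin(theta)) / wavelength)
            for x in range(-half, half + 1)
        ]
        for y in range(-half, half + 1)
    ]

def simple_cell_response(patch, kernel):
    """Linear filtering followed by half-wave rectification."""
    drive = sum(p * k
                for p_row, k_row in zip(patch, kernel)
                for p, k in zip(p_row, k_row))
    return max(0.0, drive)

kernel = gabor_kernel(size=11, wavelength=4.0, theta=0.0, sigma=2.0)
matched = simple_cell_response(grating(11, 4.0, 0.0), kernel)
orthogonal = simple_cell_response(grating(11, 4.0, math.pi / 2.0), kernel)
```

The unit responds strongly to a grating at its preferred orientation and only weakly to the orthogonal one. In a hybrid model of the kind described, a bank of such fixed units would feed a trainable back-end, so only the later stages are learned.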
Memory for Latent Representations: An Account of Working Memory that Builds on Visual Knowledge for Efficient and Detailed Visual Representations
Visual knowledge obtained from our lifelong experience of the world plays a critical role in our ability to build short-term memories. We propose a mechanistic explanation of how working memory (WM) representations are built from the latent representations of visual knowledge and can then be reconstructed. The proposed model, Memory for Latent Representations (MLR), features a variational autoencoder with an architecture that corresponds broadly to the human visual system and an activation-based binding pool of neurons that binds items’ attributes to tokenized representations. The simulation results revealed that shape information for stimuli that the model was trained on can be encoded and retrieved efficiently from latents in higher levels of the visual hierarchy. On the other hand, novel patterns that are completely outside the training set can be stored from a single exposure using only latents from early layers of the visual system. Moreover, the representation of a given stimulus can have multiple codes, representing specific visual features such as shape or color, in addition to categorical information. Finally, we validated our model by testing a series of predictions against behavioral results acquired from WM tasks. The model provides a compelling demonstration of how visual knowledge yields compact visual representations for efficient memory encoding.
Visual processing beyond (rapid) serial visual presentations
A Cortical Circuit for Audio-Visual Predictions
Teamwork makes sensory streams work: our senses work together, learn from each other, and stand in for one another, the result of which is perception and understanding. Learned associations between stimuli in different sensory modalities can shape the way we perceive these stimuli (McGurk and MacDonald, 1976). During audio-visual associative learning, auditory cortex is thought to underlie multi-modal plasticity in visual cortex (McIntosh et al., 1998; Mishra et al., 2007; Zangenehpour and Zatorre, 2010). However, it is not well understood how processing in visual cortex is altered by an auditory stimulus that is predictive of a visual stimulus, or what mechanisms mediate such experience-dependent, audio-visual associations in sensory cortex. Here we describe a neural mechanism by which an auditory input can shape visual representations of behaviorally relevant stimuli through direct interactions between auditory and visual cortices. We show that the association of an auditory stimulus with a visual stimulus in a behaviorally relevant context leads to an experience-dependent suppression of visual responses in primary visual cortex (V1). Auditory cortex axons carry a mixture of auditory and retinotopically-matched visual input to V1, and optogenetic stimulation of these axons selectively suppresses V1 neurons responsive to the associated visual stimulus after, but not before, learning. Our results suggest that cross-modal associations can be stored in long-range cortical connections and that with learning these cross-modal connections function to suppress the responses to predictable input.
Top-down Modulation in Human Visual Cortex
Human vision boasts a remarkable ability to recognize objects in the surrounding environment even in the absence of a complete visual representation of these objects. This process happens almost intuitively, and it was not until scientists had to tackle the problem in computer vision that they noticed its complexity. While artificial vision systems have made great strides, exceeding human-level performance in normal vision tasks, they have yet to achieve a similar level of robustness. One source of the robustness of human vision is its extensive connectivity, which is not limited to a feedforward hierarchical pathway, like that of current state-of-the-art deep convolutional neural networks, but also comprises recurrent and top-down connections. These connections allow the human brain to enhance the neural representations of degraded images in concordance with meaningful representations stored in memory. The mechanisms by which these different pathways interact are still not understood. In this seminar, studies concerning the effect of recurrent and top-down modulation on the neural representations resulting from viewing blurred images will be presented. Those studies attempted to uncover the role of recurrent and top-down connections in human vision. The results presented challenge the notion of predictive coding as a mechanism for top-down modulation of visual information during natural vision. They show that neural representation enhancement (sharpening) appears to be the more dominant process at different levels of the visual hierarchy. They also show that inference in visual recognition is achieved through a Bayesian process that combines incoming visual information with priors from deeper processing regions in the brain.
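The Bayesian combination of incoming information and priors can be illustrated with a minimal sketch, assuming the textbook case of a one-dimensional Gaussian prior and a Gaussian likelihood. The abstract does not specify the actual model; the function name and all numbers below are hypothetical.

```python
def gaussian_posterior(prior_mean, prior_var, obs, obs_var):
    """Posterior mean and variance for a Gaussian prior combined with a
    Gaussian likelihood centred on the observation."""
    prior_precision = 1.0 / prior_var
    obs_precision = 1.0 / obs_var
    post_var = 1.0 / (prior_precision + obs_precision)
    post_mean = post_var * (prior_precision * prior_mean + obs_precision * obs)
    return post_mean, post_var

# The same observed feature value (2.0) under reliable and unreliable input.
# A blurred image is modelled here simply as a noisier observation.
sharp_mean, sharp_var = gaussian_posterior(0.0, 1.0, obs=2.0, obs_var=0.25)
blurred_mean, blurred_var = gaussian_posterior(0.0, 1.0, obs=2.0, obs_var=4.0)
```

With the reliable input the estimate stays near the observation (posterior mean 1.6), while with the noisy input it is pulled toward the prior (posterior mean 0.4). In this toy setting, that is the sense in which stored knowledge dominates when the image is degraded.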
Using navigational information to learn visual representations
COSYNE 2022
Learning a visual representation by maximizing manifold capacity
COSYNE 2023
Visual representation of different levels of abstraction along the mouse visual hierarchy
COSYNE 2023
Evidence for compositionality in fMRI visual representations
COSYNE 2025