Visual Areas
Trends in NeuroAI - Meta's MEG-to-image reconstruction
Trends in NeuroAI is a reading group hosted by the MedARC Neuroimaging & AI lab (https://medarc.ai/fmri).
Title: Brain-optimized inference improves reconstructions of fMRI brain activity
Abstract: The release of large datasets and developments in AI have led to dramatic improvements in decoding methods that reconstruct seen images from human brain activity. We evaluate the prospect of further improving recent decoding methods by optimizing for consistency between reconstructions and brain activity during inference. We sample seed reconstructions from a base decoding method, then iteratively refine these reconstructions using a brain-optimized encoding model that maps images to brain activity. At each iteration, we sample a small library of images from an image distribution (a diffusion model) conditioned on a seed reconstruction from the previous iteration. We select those that best approximate the measured brain activity when passed through our encoding model, and use these images for structural guidance during the generation of the small library in the next iteration. We reduce the stochasticity of the image distribution at each iteration, and stop when a criterion on the "width" of the image distribution is met. We show that when this process is applied to recent decoding methods, it outperforms the base decoding method as measured by human raters, a variety of image feature metrics, and alignment to brain activity. These results demonstrate that reconstruction quality can be significantly improved by explicitly aligning decoding distributions to brain activity distributions, even when the seed reconstruction is output from a state-of-the-art decoding algorithm. Interestingly, the rate of refinement varies systematically across visual cortex, with earlier visual areas generally converging more slowly and preferring narrower image distributions, relative to higher-level brain areas. Brain-optimized inference thus offers a succinct and novel method for improving reconstructions and exploring the diversity of representations across visual brain areas.
Speaker: Reese Kneeland is a Ph.D. student at the University of Minnesota working in the Naselaris lab.
Paper link: https://arxiv.org/abs/2312.07705
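The refinement loop described in this abstract can be illustrated with a toy sketch. Everything here is an assumption made for illustration and not the authors' implementation: "images" are plain vectors, the diffusion model is replaced by Gaussian sampling around the current seed, the encoding model is a random linear map, the "width" criterion is a simple threshold on the noise scale, and only the single best candidate is kept at each iteration (the abstract describes selecting several). The names `score`, `refine`, `sigma`, `decay` and `sigma_min` are hypothetical.

```python
import numpy as np

def score(predicted, measured):
    """Pearson correlation between predicted and measured activity."""
    return np.corrcoef(predicted, measured)[0, 1]

def refine(seed, measured, encode, sigma=1.0, decay=0.8, sigma_min=0.05,
           library_size=32, rng=None):
    """Iteratively refine a seed so its encoded activity matches `measured`."""
    rng = np.random.default_rng(0) if rng is None else rng
    current = np.asarray(seed, dtype=float).copy()
    while sigma > sigma_min:  # stop once the sampling distribution is narrow
        # Toy stand-in for a diffusion model conditioned on the current seed.
        noise = rng.standard_normal((library_size, current.size))
        library = np.vstack([current, current + sigma * noise])  # seed stays a candidate
        # Score every candidate by how well its predicted activity matches the data.
        scores = np.array([score(p, measured) for p in encode(library)])
        current = library[np.argmax(scores)]  # best candidate guides the next round
        sigma *= decay  # reduce stochasticity for the next iteration
    return current

# Toy demo: a random linear "encoding model" from 8-d images to 20 voxels.
rng = np.random.default_rng(1)
W = rng.standard_normal((20, 8))
encode = lambda images: images @ W.T
measured = W @ rng.standard_normal(8)   # activity evoked by an unknown target
seed = rng.standard_normal(8)           # output of a hypothetical base decoder

refined = refine(seed, measured, encode)
seed_score = score(encode(seed[None])[0], measured)
refined_score = score(encode(refined[None])[0], measured)
print(f"seed: {seed_score:.3f}  refined: {refined_score:.3f}")
```

Because the current seed always remains in the candidate library, the alignment score in this sketch cannot decrease from one iteration to the next; a real pipeline with a generative image model would not necessarily have that property.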
Restructuring cortical feedback circuits
We hardly notice when there is a speck on our glasses; the obstructed visual information seems to be magically filled in. The mechanistic basis for this fundamental perceptual phenomenon has, however, remained obscure. What enables neurons in the visual system to respond to context when the stimulus is not available? While feedforward information drives the activity in cortex, feedback information is thought to provide contextual signals that are merely modulatory. We have made the discovery that mouse primary visual cortical neurons are strongly driven by feedback projections from higher visual areas when their feedforward sensory input from the retina is missing. This drive is so strong that it makes visual cortical neurons fire as much as if they were receiving a direct sensory input. These signals are likely used to predict input from the feedforward pathway. Preliminary results show that these feedback projections are strongly influenced by experience and learning.
Hierarchical transformation of visual event timing representations in the human brain: response dynamics in early visual cortex and timing-tuned responses in association cortices
Quantifying the timing (duration and frequency) of brief visual events is vital to human perception, multisensory integration and action planning. For example, this allows us to follow and interact with the precise timing of speech and sports. Here we investigate how visual event timing is represented and transformed across the brain’s hierarchy: from sensory processing areas, through multisensory integration areas, to frontal action planning areas. We hypothesized that the dynamics of neural responses to sensory events in sensory processing areas allow derivation of event timing representations. This would allow higher-level processes such as multisensory integration and action planning to use sensory timing information, without the need for specialized central pacemakers or processes. Using 7T fMRI and neural model-based analyses, we found responses that monotonically increase in amplitude with visual event duration and frequency, becoming increasingly clear from primary visual cortex to lateral occipital visual field maps. Beginning in area MT/V5, we found a gradual transition from monotonic to tuned responses, with response amplitudes peaking at different event timings in different recording sites. While monotonic response components were limited to the retinotopic location of the visual stimulus, timing-tuned response components were independent of the recording sites' preferred visual field positions. These tuned responses formed a network of topographically organized timing maps in superior parietal, postcentral and frontal areas. From anterior to posterior timing maps, multiple events were increasingly integrated, response selectivity narrowed, and responses focused increasingly on the middle of the presented timing range.
These results suggest that responses to event timing are transformed from the human brain’s sensory areas to the association cortices, with the event’s temporal properties being increasingly abstracted from the response dynamics and locations of early sensory processing. The resulting abstracted representation of event timing is then propagated through areas implicated in multisensory integration and action planning.
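The distinction between monotonic and timing-tuned responses can be made concrete with a small sketch. The functional forms below (a linear increase with duration and a Gaussian tuning curve over duration alone) are assumptions chosen for illustration; the abstract does not specify the response models used in the study, and the names `monotonic_response`, `tuned_response`, `preferred` and `width` are hypothetical.

```python
import numpy as np

def monotonic_response(duration, gain=1.0):
    """Amplitude grows steadily with event duration (assumed linear form)."""
    return gain * duration

def tuned_response(duration, preferred, width):
    """Amplitude peaks at a preferred duration (assumed Gaussian tuning)."""
    return np.exp(-0.5 * ((duration - preferred) / width) ** 2)

# Hypothetical range of event durations, in seconds.
durations = np.linspace(0.05, 1.0, 20)

mono = monotonic_response(durations)
tuned = tuned_response(durations, preferred=0.4, width=0.15)

# A monotonic site responds most to the longest event in the range,
# whereas a tuned site responds most near its preferred timing.
peak_mono = durations[np.argmax(mono)]
peak_tuned = durations[np.argmax(tuned)]
print(peak_mono, peak_tuned)
```

In this toy setting, different recording sites with tuned responses would simply have different values of `preferred`, which is one way to picture response amplitudes "peaking at different event timings in different recording sites".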
Feedback controls what we see
We hardly notice when there is a speck on our glasses; the obstructed visual information seems to be magically filled in. The visual system uses visual context to predict the content of the stimulus. What enables neurons in the visual system to respond to context when the stimulus is not available? In cortex, sensory processing is based on a combination of feedforward information arriving from sensory organs, and feedback information that originates in higher-order areas. Whereas feedforward information drives the activity in cortex, feedback information is thought to provide contextual signals that are merely modulatory. We have made the exciting discovery that mouse primary visual cortical neurons are strongly driven by feedback projections from higher visual areas, in particular when their feedforward sensory input from the retina is missing. This drive is so strong that it makes visual cortical neurons fire as much as if they were receiving a direct sensory input.
How sleep contributes to visual perceptual learning
Sleep is crucial for the continuity and development of life. Sleep-related problems can alter brain function, and cause potentially severe psychological and behavioral consequences. However, the role of sleep in our mind and behavior is far from clear. In this talk, I will present our research on how sleep may play a role in visual perceptual learning (VPL) by using simultaneous magnetic resonance spectroscopy and polysomnography in human subjects. We measured the concentrations of neurotransmitters in the early visual areas during sleep and obtained the excitation/inhibition (E/I) ratio, which represents the amount of plasticity in the visual system. We found that the E/I ratio significantly increased during NREM sleep while it decreased during REM sleep. The E/I ratio during NREM sleep was correlated with offline performance gains over sleep, while the E/I ratio during REM sleep was correlated with the amount of learning stabilization. These findings suggest that NREM sleep increases plasticity, while REM sleep decreases it to stabilize learning once it has been enhanced. NREM and REM sleep may play complementary roles, reflected by significantly different neurochemical processing, in VPL.
A novel form of retinotopy in area V2 highlights location-dependent feature selectivity in the visual system
Topographic maps are a prominent feature of brain organization, reflecting local and large-scale representation of the sensory surface. Traditionally, such representations in early visual areas are conceived as retinotopic maps preserving ego-centric retinal spatial location while ensuring that other features of visual input are uniformly represented for every location in space. I will discuss our recent findings of a striking departure from this simple mapping in the secondary visual area (V2) of the tree shrew that is best described as a sinusoidal transformation of the visual field. This sinusoidal topography is ideal for achieving uniform coverage in an elongated area like V2 as predicted by mathematical models designed for wiring minimization, and provides a novel explanation for stripe-like patterns of intra-cortical connections and functional response properties in V2. Our findings suggest that cortical circuits flexibly implement solutions to sensory surface representation, with dramatic consequences for large-scale cortical organization. Furthermore, our work challenges the framework of relatively independent encoding of location and features in the visual system, showing instead location-dependent feature sensitivity produced by specialized processing of different features in different spatial locations. In the second part of the talk, I will propose that location-dependent feature sensitivity is a fundamental organizing principle of the visual system that achieves efficient representation of positional regularities in visual input, and reflects the evolutionary selection of sensory and motor circuits to optimally represent behaviorally relevant information. The relevant papers are: V2 retinotopy (Sedigh-Sarvestani et al., Neuron, 2021); location-dependent feature sensitivity (Sedigh-Sarvestani et al., under review, 2022).
NMC4 Short Talk: Hypothesis-neutral response-optimized models of higher-order visual cortex reveal strong semantic selectivity
Modeling neural responses to naturalistic stimuli has been instrumental in advancing our understanding of the visual system. Dominant computational modeling efforts in this direction have been deeply rooted in preconceived hypotheses. In contrast, hypothesis-neutral computational methodologies with minimal apriorism, which bring neuroscience data directly to bear on the model development process, are likely to be much more flexible and effective in modeling and understanding tuning properties throughout the visual system. In this study, we develop a hypothesis-neutral approach and characterize response selectivity in the human visual cortex exhaustively and systematically via response-optimized deep neural network models. First, we leverage the unprecedented scale and quality of the recently released Natural Scenes Dataset to constrain parametrized neural models of higher-order visual systems and achieve novel predictive precision, in some cases significantly outperforming the predictive success of state-of-the-art task-optimized models. Next, we ask what kinds of functional properties emerge spontaneously in these response-optimized models. We examine trained networks through structural (feature visualizations) as well as functional (feature verbalizations) analyses by running 'virtual' fMRI experiments on large-scale probe datasets. Strikingly, despite no category-level supervision, since the models are solely optimized for brain response prediction from scratch, the units in the networks after optimization act as detectors for semantic concepts like 'faces' or 'words', thereby providing some of the strongest evidence for categorical selectivity in these visual areas. The observed selectivity in model neurons raises another question: are the category-selective units simply functioning as detectors for their preferred category or are they a by-product of a non-category-specific visual processing mechanism?
To investigate this, we create selective deprivations in the visual diet of these response-optimized networks and study semantic selectivity in the resulting 'deprived' networks, thereby also shedding light on the role of specific visual experiences in shaping neuronal tuning. Together with this new class of data-driven models and novel model interpretability techniques, our study illustrates that DNN models of visual cortex need not be conceived as obscure models with limited explanatory power, but rather as powerful, unifying tools for probing the nature of representations and computations in the brain.
NMC4 Short Talk: Image embeddings informed by natural language improve predictions and understanding of human higher-level visual cortex
To better characterize human scene understanding, we extracted features from images using CLIP, a neural network model of visual concepts trained with supervision from natural language. We then constructed voxelwise encoding models to explain whole brain responses arising from viewing natural images from the Natural Scenes Dataset (NSD) - a large-scale fMRI dataset collected at 7T. Our results reveal that CLIP, as compared to convolution-based image classification models such as ResNet or AlexNet, as well as language models such as BERT, gives rise to representations that enable better prediction performance - up to a 0.86 correlation with test data and an r-square of 0.75 - in higher-level visual cortex in humans. Moreover, CLIP representations explain distinctly unique variance in these higher-level visual areas as compared to models trained with only images or text. Control experiments show that the improvement in prediction observed with CLIP is not due to architectural differences (transformer vs. convolution) or to the encoding of image captions per se (vs. single object labels). Together our results indicate that CLIP and, more generally, multimodal models trained jointly on images and text, may serve as better candidate models of representation in human higher-level visual cortex. The bridge between language and vision provided by jointly trained models such as CLIP also opens up new and more semantically-rich ways of interpreting the visual brain.
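A voxelwise encoding model of the kind mentioned in this abstract is often fit as a regularized linear regression from image features to each voxel's response. The sketch below is a minimal illustration under stated assumptions: it uses closed-form ridge regression, random synthetic "features" in place of real CLIP embeddings, and simulated voxel responses. The abstract does not say which regression method or hyperparameters were used, and the names `fit_ridge`, `voxelwise_corr` and `alpha` are hypothetical.

```python
import numpy as np

def fit_ridge(X, Y, alpha=1.0):
    """Closed-form ridge regression; returns one weight column per voxel."""
    n_features = X.shape[1]
    return np.linalg.solve(X.T @ X + alpha * np.eye(n_features), X.T @ Y)

def voxelwise_corr(Y_true, Y_pred):
    """Pearson correlation between measured and predicted responses, per voxel."""
    yt = Y_true - Y_true.mean(axis=0)
    yp = Y_pred - Y_pred.mean(axis=0)
    return (yt * yp).sum(axis=0) / np.sqrt((yt ** 2).sum(axis=0) * (yp ** 2).sum(axis=0))

# Synthetic stand-ins: rows are images, columns are features or voxels.
rng = np.random.default_rng(0)
n_train, n_test, n_features, n_voxels = 200, 50, 16, 5
W_true = rng.standard_normal((n_features, n_voxels))
X_train = rng.standard_normal((n_train, n_features))   # hypothetical image embeddings
X_test = rng.standard_normal((n_test, n_features))
Y_train = X_train @ W_true + 0.5 * rng.standard_normal((n_train, n_voxels))
Y_test = X_test @ W_true + 0.5 * rng.standard_normal((n_test, n_voxels))

# Fit on training images, then evaluate prediction accuracy on held-out images.
W_hat = fit_ridge(X_train, Y_train)
r = voxelwise_corr(Y_test, X_test @ W_hat)
print("per-voxel correlation:", np.round(r, 3))
```

Comparing feature spaces (for example, embeddings from different networks) would then amount to fitting one such model per feature space and comparing the held-out correlations voxel by voxel.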
Interactions between visual cortical neurons that give rise to conscious perception
I will discuss the mechanisms that determine whether a weak visual stimulus will reach consciousness or not. If the stimulus is simple, early visual cortex acts as a relay station that sends the information to higher visual areas. If the stimulus arrives at a minimal strength, it will be stored in working memory and can be reported. However, during more complex visual perceptions, which for example depend on the segregation of a figure from the background, the role of early visual cortex goes beyond that of a simple relay. It now acts as a cognitive blackboard and conscious perception depends on it. Our results inspire new approaches to create a visual prosthesis for the blind, by creating a direct interface with the visual brain. I will discuss how high-channel-number interfaces with the visual cortex might be used to restore a rudimentary form of vision in blind individuals.
Age-related dedifferentiation across representational levels and their relation to memory performance
Episodic memory performance decreases with advancing age. According to theoretical models, such memory decline might be a consequence of age-related reductions in the ability to form distinct neural representations of our past. In this talk, I want to present our new age-comparative fMRI study investigating age-related neural dedifferentiation across different representational levels. By combining univariate analyses and searchlight pattern similarity analyses, we found that older adults show reduced category-selective processing in higher visual areas, less specific item representations in occipital regions and less stable item representations. Dedifferentiation on all these representational levels was related to memory performance, with item specificity being the strongest contributor. Overall, our results emphasize that age-related dedifferentiation can be observed across the entire cortical hierarchy, which may selectively impair memory performance depending on the memory task.
Neural mechanisms of active vision in the marmoset monkey
Human vision relies on rapid eye movements (saccades) 2-3 times every second to bring peripheral targets to central foveal vision for high resolution inspection. This rapid sampling of the world defines the perception-action cycle of natural vision and profoundly impacts our perception. Marmosets have visual processing and eye movements similar to those of humans, including a fovea that supports high-acuity central vision. Here, I present a novel approach developed in my laboratory for investigating the neural mechanisms of visual processing using naturalistic free viewing and simple target foraging paradigms. First, we establish that it is possible to map receptive fields in the marmoset with high precision in visual areas V1 and MT without constraints on fixation of the eyes. Instead, we use an off-line correction for eye position during foraging combined with high resolution eye tracking. This approach allows us to simultaneously map receptive fields, even at the precision of foveal V1 neurons, while also assessing the impact of eye movements on the visual information encoded. We find that the visual information encoded by neurons varies dramatically across the saccade to fixation cycle, with most information localized to brief post-saccadic transients. In a second study we examined whether target selection prior to saccades can predictively influence how foveal visual information is subsequently processed in post-saccadic transients. Because every saccade brings a target to the fovea for detailed inspection, we hypothesized that predictive mechanisms might prime foveal populations to process the target. Using neural decoding from laminar arrays placed in foveal regions of area MT, we find that the direction of motion for a fixated target can be predictively read out from foveal activity even before its post-saccadic arrival.
These findings highlight the dynamic and predictive nature of visual processing during eye movements and the utility of the marmoset as a model of active vision. Funding sources: NIH EY030998 to JM, Life Sciences Fellowship to JY
Stereo vision in humans and insects
Stereopsis – deriving information about distance by comparing views from two eyes – is widespread in vertebrates but so far known in only one class of invertebrates, the praying mantids. Understanding stereopsis which has evolved independently in such a different nervous system promises to shed light on the constraints governing any stereo system. Behavioral experiments indicate that insect stereopsis is functionally very different from that studied in vertebrates. Vertebrate stereopsis depends on matching up the pattern of contrast in the two eyes; it works in static scenes, and may have evolved in order to break camouflage rather than to detect distances. Insect stereopsis matches up regions of the image where the luminance is changing; it is insensitive to the detailed pattern of contrast and operates to detect the distance to a moving target. Work from my lab has revealed a network of neurons within the mantis brain which are tuned to binocular disparity, including some that project to early visual areas. This is in contrast to previous theories which postulated that disparity was computed only at a single, late stage, where visual information is passed down to motor neurons. Thus, despite their very different properties, the underlying neural mechanisms supporting vertebrate and insect stereopsis may be computationally more similar than has been assumed.
Interactions between neurons during visual perception and restoring them in blindness
I will discuss the mechanisms that determine whether a weak visual stimulus will reach consciousness or not. If the stimulus is simple, early visual cortex acts as a relay station that sends the information to higher visual areas. If the stimulus arrives at a minimal strength, it will be stored in working memory. However, during more complex visual perceptions, which for example depend on the segregation of a figure from the background, the role of early visual cortex goes beyond that of a simple relay. It now acts as a cognitive blackboard and conscious perception depends on it. Our results also inspire new approaches to create a visual prosthesis for the blind, by creating a direct interface with the visual cortex. I will discuss how high-channel-number interfaces with the visual cortex might be used to restore a rudimentary form of vision in blind individuals.
High precision coding in visual cortex
Individual neurons in visual cortex provide the brain with unreliable estimates of visual features. It is not known if the single-neuron variability is correlated across large neural populations, thus impairing the global encoding of stimuli. We recorded simultaneously from up to 50,000 neurons in mouse primary visual cortex (V1) and in higher-order visual areas and measured stimulus discrimination thresholds of 0.35 degrees and 0.37 degrees respectively in an orientation decoding task. These neural thresholds were almost 100 times smaller than the behavioral discrimination thresholds reported in mice. This discrepancy could not be explained by stimulus properties or arousal states. Furthermore, the behavioral variability during a sensory discrimination task could not be explained by neural variability in primary visual cortex. Instead behavior-related neural activity arose dynamically across a network of non-sensory brain areas. These results imply that sensory perception in mice is limited by downstream decoders, not by neural noise in sensory representations.
High precision coding in visual cortex
Single neurons in visual cortex provide unreliable measurements of visual features due to their high trial-to-trial variability. It is not known if this “noise” extends its effects over large neural populations to impair the global encoding of stimuli. We recorded simultaneously from ∼20,000 neurons in mouse primary visual cortex (V1) and found that the neural populations had discrimination thresholds of ∼0.34° in an orientation decoding task. These thresholds were nearly 100 times smaller than those reported behaviourally in mice. The discrepancy between neural and behavioural discrimination could not be explained by the types of stimuli we used, by behavioural states or by the sequential nature of perceptual learning tasks. Furthermore, higher-order visual areas lateral to V1 could be decoded equally well. These results imply that the limits of sensory perception in mice are not set by neural noise in sensory cortex, but by the limitations of downstream decoders.
Cortico-cortical feedback to visual areas can explain reactivation of latent memories during working memory retention
Bernstein Conference 2024
Unifying model of contextual modulation with feedback from higher visual areas
COSYNE 2022
On the relationship between attention, gamma-frequency and inter-areal synchrony in macaque’s visual areas V1 and V4
FENS Forum 2024
Time-shift dependent analysis of gamma phase coherence between macaque visual areas V1 and V4
FENS Forum 2024