Visual Field
Vision for perception versus vision for action: dissociable contributions of visual sensory drives from primary visual cortex and superior colliculus neurons to orienting behaviors
The primary visual cortex (V1) projects directly to the superior colliculus (SC) and is believed to provide sensory drive for eye movements. Consistent with this, a majority of saccade-related SC neurons also exhibit short-latency, stimulus-driven visual responses, which are additionally feature-tuned. However, direct neurophysiological comparisons of the visual response properties of the two anatomically connected brain areas are surprisingly lacking, especially with respect to active looking behaviors. I will describe a series of experiments characterizing visual response properties of primate V1 and SC neurons across feature dimensions such as visual field location, spatial frequency, orientation, contrast, and luminance polarity. The results suggest a substantial, qualitative reformatting of SC visual responses relative to V1. For example, SC visual response latencies are actively delayed as a function of increasing spatial frequency, independently of individual neurons’ tuning preferences, and this phenomenon is directly correlated with saccadic reaction times. Such “coarse-to-fine” rank ordering of SC visual response latencies as a function of spatial frequency is much weaker in V1, suggesting a dissociation of V1 responses from saccade timing. Consistent with this, when we next explored trial-by-trial correlations of individual neurons’ visual response strengths and latencies with saccadic reaction times, we found that most SC neurons exhibited stronger and earlier visual responses on trials with faster saccadic reaction times. Moreover, these correlations were substantially higher for visual-motor neurons in the intermediate and deep layers than for more superficial visual-only neurons. No such correlations existed systematically in V1. Thus, visual responses in SC and V1 serve fundamentally different roles in active vision: V1 jumpstarts sensing and image analysis, but SC jumpstarts moving. I will finish by demonstrating, using reversible V1 inactivation, that despite the reformatting of signals from V1 to the brainstem, V1 remains a necessary gateway for visually driven oculomotor responses, even for the most reflexive eye movement phenomena. This is a fundamental difference from rodent studies demonstrating clear V1-independent processing in afferent visual pathways that bypass the geniculostriate one, and it demonstrates the importance of multi-species comparisons in the study of oculomotor control.
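To make the trial-by-trial analysis concrete, here is a minimal sketch that correlates one neuron's per-trial visual response latency and strength with saccadic reaction time. All data are synthetic and the variable names are illustrative assumptions, not the study's actual pipeline; the sign of the built-in correlations simply mirrors the pattern the abstract reports for SC visuomotor neurons.

```python
import numpy as np
from scipy.stats import pearsonr

rng = np.random.default_rng(0)
n_trials = 200

# Hypothetical per-trial measurements (ms and spikes/s), synthesized so that
# earlier/stronger visual bursts co-occur with faster saccades.
visual_latency = rng.normal(50, 5, n_trials)                      # burst onset
reaction_time = 120 + 1.5 * visual_latency + rng.normal(0, 10, n_trials)
visual_strength = 200 - 0.4 * reaction_time + rng.normal(0, 8, n_trials)

# The analysis in outline: correlate each trial's visual response latency
# (and strength) with that trial's saccadic reaction time.
r_lat, p_lat = pearsonr(visual_latency, reaction_time)
r_str, p_str = pearsonr(visual_strength, reaction_time)
print(f"latency vs RT:  r={r_lat:+.2f} (p={p_lat:.3g})")
print(f"strength vs RT: r={r_str:+.2f} (p={p_str:.3g})")
```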
Euclidean coordinates are the wrong prior for primate vision
The mapping from the visual field to V1 can be approximated by a log-polar transform. In this domain, scale becomes a left-right shift and rotation becomes an up-down shift. When fed into a standard shift-invariant convolutional network, this provides scale and rotation invariance. However, translation invariance is lost; in our model, this is compensated for by multiple fixations on an object. Due to the high concentration of cones in the fovea and the dropoff of resolution in the periphery, the central 10 degrees of visual angle take up about half of V1, with the remaining 170 degrees (or so) taking up the other half. This layout provides the basis for the central and peripheral pathways. Simulations with this model closely match human performance in scene classification, and competition between the pathways leads to the peripheral pathway being used for this task. Remarkably, in spite of its rotation invariance, this model can explain the face inversion effect. We suggest that the standard method of using image coordinates is the wrong prior for models of primate vision.
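A minimal nearest-neighbor sketch of the log-polar resampling described above; the function name and sampling choices are illustrative assumptions, not the authors' implementation. Columns index log-radius and rows index angle, so scaling about fixation becomes a left-right shift and rotation becomes an up-down shift, which is the property that lets a shift-invariant convolutional network inherit scale and rotation invariance.

```python
import numpy as np

def log_polar(image, center, n_theta=128, n_rho=128, rho_min=1.0):
    """Resample a grayscale image onto a log-polar grid centered on fixation."""
    h, w = image.shape
    cy, cx = center
    rho_max = np.hypot(max(cy, h - cy), max(cx, w - cx))
    # Log-spaced radii oversample the center, mimicking foveal magnification.
    rhos = np.exp(np.linspace(np.log(rho_min), np.log(rho_max), n_rho))
    thetas = np.linspace(0.0, 2.0 * np.pi, n_theta, endpoint=False)
    tt, rr = np.meshgrid(thetas, rhos, indexing="ij")  # rows: angle, cols: log-radius
    ys = np.clip(np.round(cy + rr * np.sin(tt)).astype(int), 0, h - 1)
    xs = np.clip(np.round(cx + rr * np.cos(tt)).astype(int), 0, w - 1)
    return image[ys, xs]
```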
Hierarchical transformation of visual event timing representations in the human brain: response dynamics in early visual cortex and timing-tuned responses in association cortices
Quantifying the timing (duration and frequency) of brief visual events is vital to human perception, multisensory integration and action planning. For example, this allows us to follow and interact with the precise timing of speech and sports. Here we investigate how visual event timing is represented and transformed across the brain’s hierarchy: from sensory processing areas, through multisensory integration areas, to frontal action planning areas. We hypothesized that the dynamics of neural responses to sensory events in sensory processing areas allow derivation of event timing representations. This would allow higher-level processes such as multisensory integration and action planning to use sensory timing information, without the need for specialized central pacemakers or processes. Using 7T fMRI and neural model-based analyses, we found responses that monotonically increase in amplitude with visual event duration and frequency, becoming increasingly clear from primary visual cortex to lateral occipital visual field maps. Beginning in area MT/V5, we found a gradual transition from monotonic to tuned responses, with response amplitudes peaking at different event timings in different recording sites. While monotonic response components were limited to the retinotopic location of the visual stimulus, timing-tuned response components were independent of the recording sites' preferred visual field positions. These tuned responses formed a network of topographically organized timing maps in superior parietal, postcentral and frontal areas. From anterior to posterior timing maps, multiple events were increasingly integrated, response selectivity narrowed, and responses focused increasingly on the middle of the presented timing range. These results suggest that responses to event timing are transformed from the human brain’s sensory areas to the association cortices, with the event’s temporal properties being increasingly abstracted from the response dynamics and locations of early sensory processing. The resulting abstracted representation of event timing is then propagated through areas implicated in multisensory integration and action planning.
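The two response classes in this analysis can be summarized by two toy response functions: amplitude rising monotonically with event duration (early visual cortex) versus amplitude peaking at a preferred duration (timing-tuned association areas). The Gaussian tuning shape and all parameter values below are illustrative assumptions, not the fitted neural models.

```python
import numpy as np

durations = np.linspace(0.05, 1.0, 20)  # event durations in seconds (illustrative)

# Monotonic component: amplitude grows with event duration, as reported
# from primary visual cortex through lateral occipital maps.
monotonic = durations / durations.max()

# Tuned component: amplitude peaks at a preferred duration, as reported
# from MT/V5 onward (Gaussian shape assumed here for illustration).
preferred, width = 0.4, 0.15
tuned = np.exp(-0.5 * ((durations - preferred) / width) ** 2)
```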
Dynamic spatial processing in insect vision
How does the visual system of insects function in vastly different light intensities, process separate parts of the visual field in parallel, and cope with eye sizes that differ between individuals? This talk will give you the answers we receive from our unique(ly adorable) model system: hawkmoths.
A novel form of retinotopy in area V2 highlights location-dependent feature selectivity in the visual system
Topographic maps are a prominent feature of brain organization, reflecting local and large-scale representation of the sensory surface. Traditionally, such representations in early visual areas are conceived as retinotopic maps preserving ego-centric retinal spatial location while ensuring that other features of visual input are uniformly represented for every location in space. I will discuss our recent findings of a striking departure from this simple mapping in the secondary visual area (V2) of the tree shrew that is best described as a sinusoidal transformation of the visual field. This sinusoidal topography is ideal for achieving uniform coverage in an elongated area like V2, as predicted by mathematical models designed for wiring minimization, and provides a novel explanation for stripe-like patterns of intra-cortical connections and functional response properties in V2. Our findings suggest that cortical circuits flexibly implement solutions to sensory surface representation, with dramatic consequences for large-scale cortical organization. Furthermore, our work challenges the framework of relatively independent encoding of location and features in the visual system, showing instead location-dependent feature sensitivity produced by specialized processing of different features in different spatial locations. In the second part of the talk, I will propose that location-dependent feature sensitivity is a fundamental organizing principle of the visual system that achieves efficient representation of positional regularities in visual input, and reflects the evolutionary selection of sensory and motor circuits to optimally represent behaviorally relevant information. Relevant papers: V2 retinotopy (Sedigh-Sarvestani et al., Neuron, 2021); location-dependent feature sensitivity (Sedigh-Sarvestani et al., under review, 2022).
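To give a schematic sense of the mapping described above, the toy model below traces the represented visual-field location as one moves along V2's long axis: azimuth advances steadily while elevation sweeps up and down sinusoidally, covering an elongated cortical strip uniformly. This is a sketch under simplifying assumptions, not the fitted model of Sedigh-Sarvestani et al. (2021).

```python
import numpy as np

def represented_visual_field(s, amp=20.0, freq=0.5):
    """Toy sinusoidal retinotopy: as cortical distance s grows along V2's
    long axis, the represented azimuth advances steadily while elevation
    oscillates sinusoidally (degrees; all parameter values are schematic)."""
    azimuth = s                          # steady progression along one axis
    elevation = amp * np.sin(freq * s)   # sinusoidal sweep across the other
    return azimuth, elevation
```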
Neural network models of binocular depth perception
Our visual experience of living in a three-dimensional world is created from the information contained in the two-dimensional images projected into our eyes. The overlapping visual fields of the two eyes mean that their images are highly correlated, and that the small differences that are present represent an important cue to depth. Binocular neurons encode this information in a way that both maximises efficiency and optimises disparity tuning for the depth structures that are found in our natural environment. Neural network models provide a clear account of how these binocular neurons encode the local binocular disparity in images. These models can be expanded to multi-layer models that are sensitive to salient features of scenes, such as the orientations and discontinuities between surfaces. These deep neural network models have also shown the importance of binocular disparity for the segmentation of images into separate objects, in addition to the estimation of distance. These results demonstrate the usefulness of machine learning approaches as a tool for understanding biological vision.
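As background for how model binocular neurons can encode local disparity, here is a minimal one-dimensional sketch of the classic disparity energy model; this is a standard textbook account under illustrative filter parameters, not the specific networks described in the talk.

```python
import numpy as np

def gabor(x, sigma=1.0, freq=0.5, phase=0.0):
    """1-D Gabor receptive field profile."""
    return np.exp(-x**2 / (2 * sigma**2)) * np.cos(2 * np.pi * freq * x + phase)

def disparity_energy(left, right, x, preferred_disparity=0.0):
    """Binocular energy response for 1-D luminance profiles `left`/`right`
    sampled at positions `x`. Preferred disparity is set by a positional
    offset between the eyes' receptive fields; summing squared outputs of a
    quadrature pair gives a phase-invariant (complex-cell-like) response."""
    energy = 0.0
    for phase in (0.0, np.pi / 2):                # quadrature pair
        f_left = gabor(x, phase=phase)
        f_right = gabor(x - preferred_disparity, phase=phase)
        simple = f_left @ left + f_right @ right  # binocular simple cell
        energy += simple**2
    return energy
```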
What Art can tell us about the Brain
Artists have been doing experiments on vision longer than neurobiologists. Some major works of art have provided insights as to how we see; some of these insights are so fundamental that they can be understood in terms of the underlying neurobiology. For example, artists have long realized that color and luminance can play independent roles in visual perception. Picasso said, "Colors are only symbols. Reality is to be found in luminance alone." This observation has a parallel in the functional subdivision of our visual systems, where color and luminance are processed, respectively, by the evolutionarily newer, primate-specific What system and the older, colorblind Where (or How) system. Many techniques developed over the centuries by artists can be understood in terms of the parallel organization of our visual systems. I will explore how the segregation of color and luminance processing is the basis for why some Impressionist paintings seem to shimmer, why some op art paintings seem to move, some principles of Matisse's use of color, and how the Impressionists painted "air". Central and peripheral vision are distinct, and I will show how the differences in resolution across our visual field make the Mona Lisa's smile elusive and produce a dynamic illusion in Pointillist paintings, Chuck Close paintings, and photomosaics. I will explore how artists have figured out important features about how our brains extract relevant information about faces and objects, and I will discuss why learning disabilities may be associated with artistic talent.
Faces influence saccade programming
Several studies have shown that face stimuli elicit extremely fast and involuntary saccadic responses toward them, relative to other categories of visual stimuli. In this talk, I will mainly focus on recent research from our team that investigated to what extent face stimuli influence the programming and execution of saccades. Two experiments were performed using a saccadic choice task: two images (one with a face, one with a vehicle) were simultaneously displayed in the left and right visual fields of participants, who had to execute a saccade toward the image (Experiment 1) or toward a cross added in the center of the image (Experiment 2) containing a target stimulus (a face or a vehicle). As expected, participants were faster to execute a saccade toward a face than toward a vehicle and made fewer errors. We also observed shorter saccades toward vehicle than face targets, even when participants were explicitly asked to direct their saccades toward a specific location (Experiment 2). Further analyses, which I will detail in the talk, showed that error saccades might be interrupted in mid-flight to initiate a concurrently programmed corrective saccade.
How do we find what we are looking for? The Guided Search 6.0 model
The talk will give a tour of Guided Search 6.0 (GS6), the latest evolution of Guided Search. Part 1 describes The Mechanics of Search. Because we cannot recognize more than a few items at a time, selective attention is used to prioritize items for processing. Selective attention to an item allows its features to be bound together into a representation that can be matched to a target template in memory or rejected as a distractor. The binding and recognition of an attended object is modeled as a diffusion process taking > 150 msec/item. Since selection occurs more frequently than that, it follows that multiple items are undergoing recognition at the same time, though asynchronously, making GS6 a hybrid serial and parallel model. If a target is not found, search terminates when an accumulating quitting signal reaches a threshold. Part 2 elaborates on the five sources of Guidance that are combined into a spatial “priority map” to guide the deployment of attention (hence “guided search”). These are (1) top-down and (2) bottom-up feature guidance, (3) prior history (e.g. priming), (4) reward, and (5) scene syntax and semantics. In GS6, the priority map is a dynamic attentional landscape that evolves over the course of search. In part, this is because the visual field is inhomogeneous. Part 3: That inhomogeneity imposes spatial constraints on search that are described by three types of “functional visual field” (FVFs): (1) a resolution FVF, (2) an FVF governing exploratory eye movements, and (3) an FVF governing covert deployments of attention. Finally, in Part 4, we will consider that the internal representation of the search target, the “search template”, is really two templates: a guiding template and a target template. Put these pieces together and you have GS6.
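As a concrete illustration of Part 2, the sketch below combines the five guidance sources into a single priority map by weighted summation. The linear-combination form, the weights, and the noise term are illustrative assumptions rather than GS6's published equations.

```python
import numpy as np

def priority_map(top_down, bottom_up, history, reward, scene,
                 weights=(1.0, 1.0, 1.0, 1.0, 1.0), noise_sd=0.1):
    """Combine GS6's five guidance sources into one spatial priority map.
    Each argument is a 2-D array over the visual field; weighted summation
    plus decision noise is an assumption for illustration."""
    sources = (top_down, bottom_up, history, reward, scene)
    combined = sum(w * m for w, m in zip(weights, sources))
    combined += np.random.default_rng().normal(0.0, noise_sd, combined.shape)
    return combined

# Attention is deployed to the current peak; in GS6 the map is dynamic,
# evolving (e.g., via inhibition of attended locations) as search unfolds.
guidance = np.random.rand(5, 32, 32)    # five toy guidance maps
prio = priority_map(*guidance)
next_location = np.unravel_index(prio.argmax(), prio.shape)
```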
Crowding and the Architecture of the Visual System
Classically, vision is seen as a cascade of local, feedforward computations. This framework has been tremendously successful, inspiring a wide range of ground-breaking findings in neuroscience and computer vision. Recently, feedforward Convolutional Neural Networks (ffCNNs), inspired by this classic framework, have revolutionized computer vision and been adopted as tools in neuroscience. However, despite these successes, there is much more to vision. I will present our work using visual crowding and related psychophysical effects as probes into visual processes that go beyond the classic framework. In crowding, perception of a target deteriorates in clutter. We focus on global aspects of crowding, in which perception of a small target is strongly modulated by the global configuration of elements across the visual field. We show that models based on the classic framework, including ffCNNs, cannot explain these effects for principled reasons and identify recurrent grouping and segmentation as a key missing ingredient. Then, we show that capsule networks, a recent kind of deep learning architecture combining the power of ffCNNs with recurrent grouping and segmentation, naturally explain these effects. We provide psychophysical evidence that humans indeed use a similar recurrent grouping and segmentation strategy in global crowding effects. In crowding, visual elements interfere across space. To study how elements interfere over time, we use the Sequential Metacontrast psychophysical paradigm, in which perception of visual elements depends on elements presented hundreds of milliseconds later. We psychophysically characterize the temporal structure of this interference and propose a simple computational model. Our results support the idea that perception is a discrete process. Together, the results presented here provide stepping-stones towards a fuller understanding of the visual system by suggesting architectural changes needed for more human-like neural computations.
Motion processing across visual field locations in zebrafish
Animals are able to perceive self-motion and navigate in their environment using optic flow information. They often perform visually guided stabilization behaviors, like the optokinetic response (OKR) or optomotor response (OMR), in order to maintain their eye and body position relative to the moving surround. But how does the animal manage to perform the appropriate behavioral response, and how are processing tasks divided between the various non-cortical visual brain areas? Experiments have shown that the zebrafish pretectum, which is homologous to the mammalian accessory optic system, is involved in the OKR and OMR. The optic tectum (superior colliculus in mammals) is involved in the processing of small stimuli, e.g. during prey capture. We have previously shown that many pretectal neurons respond selectively to rotational or translational motion. These neurons are likely detectors for specific optic flow patterns and mediate behavioral choices of the animal based on optic flow information. We investigate the motion feature extraction of brain structures that receive input from retinal ganglion cells to identify the visual computations that underlie behavioral decisions during prey capture, OKR, OMR, and other visually mediated behaviors. Our study of receptive fields shows that receptive field sizes in the pretectum (large) and tectum (small) are very different, and that pretectal responses are diverse and anatomically organized. Since calcium indicators are slow and receptive fields for motion stimuli are difficult to measure, we are also developing novel stimuli and statistical methods to infer the neuronal computations of visual brain areas.
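For intuition about the rotational versus translational selectivity described above, here is a toy generator of idealized optic flow fields of the kinds pretectal neurons distinguish; it is a schematic stimulus sketch, not the lab's actual stimuli or analysis.

```python
import numpy as np

def flow_field(ny=16, nx=16, kind="rotation"):
    """Return per-location (u, v) motion vectors for an idealized flow field."""
    y, x = np.mgrid[-1:1:ny * 1j, -1:1:nx * 1j]
    if kind == "rotation":           # whole-field rotation about the center
        u, v = -y, x
    elif kind == "translation":      # uniform horizontal translation
        u, v = np.ones_like(x), np.zeros_like(y)
    else:
        raise ValueError(kind)
    return u, v
```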
A Rare Visuospatial Disorder
Cases with visuospatial abnormalities provide opportunities for understanding the underlying cognitive mechanisms. Three cases of visual mirror-reversal have been reported: AH (McCloskey, 2009), TM (McCloskey, Valtonen, & Sherman, 2006) and PR (Pflugshaupt et al., 2007). This research reports a fourth case, BS -- with focal occipital cortical dysgenesis -- who displays highly unusual visuospatial abnormalities. They initially produced mirror reversal errors similar to those of AH, who -- like the patient in question -- showed a selective developmental deficit. Extensive examination of BS revealed phenomena such as: mirror reversal errors (sometimes affecting only parts of the visual fields) in both horizontal and vertical planes; subjective representation of visual objects and words in distinct left and right visual fields; subjective duplication of objects of visual attention (not due to diplopia); uncertainty regarding the canonical upright orientation of everyday objects; mirror reversals during saccadic eye movements on oculomotor tasks; and failure to integrate visual with other sensory inputs (e.g., they feel themself moving backwards when visual information shows they are moving forward). BS produces fewer errors under certain visual conditions. These and other findings have led the researchers to conclude that BS draws upon a subjective representation of visual space that is structured phenomenally much as it is anatomically in early visual cortex (i.e., rotated through 180 degrees, split into left and right fields, etc.). Despite this, BS functions remarkably well in their everyday life, apparently due to extensive compensatory mechanisms deployed at higher (executive) processing levels beyond the visual modality.