Segmentation
Ruben Coen-Cagli
The Laboratory for Computational Neuroscience (Coen-Cagli lab) invites applications for a postdoctoral position at Albert Einstein College of Medicine (Einstein) in the Bronx, New York City. The position is available immediately, is funded for two years through an NIH training grant to the Rose F. Kennedy IDDRC at Einstein, and targets eligible candidates interested in careers in the biomedical sciences focused on the neurobiological underpinnings of neurodevelopmental disorders associated with intellectual disability and autism. The candidate will have the opportunity to learn and apply an integrated approach, combining innovative experiments with the computational models of perceptual grouping and segmentation developed by the Coen-Cagli lab, to test theories of sensory processing in autism, in collaboration with the Cognitive Neurophysiology Laboratory (Molholm lab) at Einstein.
Continuity and segmentation - two ends of a spectrum or independent processes?
Learning through the eyes and ears of a child
Young children have sophisticated representations of their visual and linguistic environment. Where do these representations come from? How much knowledge arises through generic learning mechanisms applied to sensory data, and how much requires more substantive (possibly innate) inductive biases? We examine these questions by training neural networks solely on longitudinal data collected from a single child (Sullivan et al., 2020), consisting of egocentric video and audio streams. Our principal findings are as follows: 1) Based on visual-only training, neural networks can acquire high-level visual features that are broadly useful across categorization and segmentation tasks. 2) Based on language-only training, networks can acquire meaningful clusters of words and sentence-level syntactic sensitivity. 3) Based on paired visual and language training, networks can acquire word-referent mappings from tens of noisy examples and align their multi-modal conceptual systems. Taken together, our results show how sophisticated visual and linguistic representations can arise through data-driven learning applied to one child’s first-person experience.
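The word-referent result above can be caricatured by cross-situational learning: counting which objects co-occur with which words across many ambiguous scenes. The vocabulary, scene structure, and counting scheme below are invented for illustration and are not the neural-network models used in the study.

```python
import numpy as np

rng = np.random.default_rng(0)
vocab = ["ball", "dog", "cup"]           # hypothetical toy vocabulary
n = len(vocab)
counts = np.zeros((n, n))                # rows: words, cols: referents

# Each "scene" pairs one spoken word with its true referent plus one
# random distractor object, mimicking noisy first-person experience.
for _ in range(60):
    w = rng.integers(n)                  # word spoken (index into vocab)
    distractor = rng.integers(n)         # an unrelated object also in view
    for ref in {int(w), int(distractor)}:
        counts[w, ref] += 1

# The referent most often co-present with each word wins.
learned = counts.argmax(axis=1)
```

Because the true referent is always present when its word is spoken while each distractor appears only sometimes, the diagonal of `counts` dominates and `learned` recovers the correct word-referent mapping despite the ambiguity of every individual scene.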
Learning with less labels for medical image segmentation
Accurate segmentation of medical images is a key step in developing Computer-Aided Diagnosis (CAD) and automating various clinical tasks such as image-guided interventions. The success of state-of-the-art methods for medical image segmentation is heavily reliant upon the availability of a sizable amount of labelled data. When the required quantity of labelled data cannot be obtained, the technology becomes fragile. The principle of consensus tells us that as humans, when we are uncertain how to act in a situation, we tend to look to others to determine how to respond. In this webinar, Dr Mehrtash Harandi will show how to model the principle of consensus to learn to segment medical data with limited labelled data. In doing so, we design multiple segmentation models that collaborate with each other to learn from labelled and unlabelled data collectively.
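The consensus idea can be sketched in a deliberately tiny form: two weak "segmenters", each seeing a different noisy feature of the same pixels, exchange pseudo-labels on the unlabelled pixels where they agree. The threshold classifiers and synthetic pixel data below are illustrative stand-ins, not the webinar's actual models.

```python
import numpy as np

rng = np.random.default_rng(0)

def make_pixels(n):
    # Toy pixels: binary label seen through two independent noisy "views".
    y = rng.integers(0, 2, size=n)
    views = np.stack([y + rng.normal(0, 0.6, n), y + rng.normal(0, 0.6, n)], 1)
    return views, y

def fit_threshold(x, y):
    # 1-D classifier: threshold midway between the two class means.
    return 0.5 * (x[y == 0].mean() + x[y == 1].mean())

def predict(x, t):
    return (x > t).astype(int)

X_lab, y_lab = make_pixels(20)      # scarce labelled pixels
X_unl, y_unl = make_pixels(2000)    # plentiful unlabelled pixels (y kept for eval only)

# Two models, one per view, trained on the small labelled set.
t_a = fit_threshold(X_lab[:, 0], y_lab)
t_b = fit_threshold(X_lab[:, 1], y_lab)

# Consensus step: keep unlabelled pixels where the models agree, then
# retrain each model on labelled data plus the agreed pseudo-labels.
p_a, p_b = predict(X_unl[:, 0], t_a), predict(X_unl[:, 1], t_b)
agree = p_a == p_b
t_a = fit_threshold(np.r_[X_lab[:, 0], X_unl[agree, 0]], np.r_[y_lab, p_a[agree]])
t_b = fit_threshold(np.r_[X_lab[:, 1], X_unl[agree, 1]], np.r_[y_lab, p_b[agree]])

acc = (predict(X_unl[:, 0], t_a) == y_unl).mean()
```

The agreed-upon pixels act like the "others" of the consensus principle: each model borrows labels the collective is confident about, stabilising training when labelled data alone would be too scarce.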
Probabilistic computation in natural vision
A central goal of vision science is to understand the principles underlying the perception and neural coding of the complex visual environment of our everyday experience. In the visual cortex, foundational work with artificial stimuli, and more recent work combining natural images and deep convolutional neural networks, have revealed much about the tuning of cortical neurons to specific image features. However, a major limitation of this existing work is its focus on single-neuron response strength to isolated images. First, during natural vision, the inputs to cortical neurons are not isolated but rather embedded in a rich spatial and temporal context. Second, the full structure of population activity—including the substantial trial-to-trial variability that is shared among neurons—determines encoded information and, ultimately, perception. In the first part of this talk, I will argue for a normative approach to study encoding of natural images in primary visual cortex (V1), which combines a detailed understanding of the sensory inputs with a theory of how those inputs should be represented. Specifically, we hypothesize that V1 response structure serves to approximate a probabilistic representation optimized to the statistics of natural visual inputs, and that contextual modulation is an integral aspect of achieving this goal. I will present a concrete computational framework that instantiates this hypothesis, and data recorded using multielectrode arrays in macaque V1 to test its predictions. In the second part, I will discuss how we are leveraging this framework to develop deep probabilistic algorithms for natural image and video segmentation.
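One way to make the "probabilistic representation" hypothesis concrete, in a drastically simplified form, is the sampling idea: trial-to-trial response variability reflects posterior uncertainty about the stimulus. The one-dimensional Gaussian model below is a generic illustration of that idea, not the lab's actual V1 framework.

```python
import numpy as np

rng = np.random.default_rng(0)

def posterior_samples(y, prior_var=1.0, noise_var=0.5, n=1000):
    # Gaussian posterior over a latent feature x given observation y = x + noise.
    # Under the sampling view, each "trial" response is one posterior sample.
    post_var = 1.0 / (1.0 / prior_var + 1.0 / noise_var)
    post_mean = post_var * y / noise_var
    return rng.normal(post_mean, np.sqrt(post_var), size=n)

# A less reliable input (larger noise_var) yields a wider posterior and
# therefore larger trial-to-trial variability of the sampled responses,
# qualitatively matching contrast-dependent variability in cortex.
low_quality = posterior_samples(1.0, noise_var=2.0)
high_quality = posterior_samples(1.0, noise_var=0.1)
```

The point of the sketch is only the qualitative signature: variability is not noise to be averaged away but, on this hypothesis, part of the code for uncertainty.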
The processing of price during purchase decision making: Are there neural differences among prosocial and non-prosocial consumers?
International organizations, governments and companies are increasingly committed to developing measures that encourage the adoption of sustainable consumption patterns among the population. However, their success requires a deep understanding of the everyday purchasing decision process and the elements that shape it. Price is an element that stands out. Prior research concluded that the influence of price on purchase decisions varies across consumer profiles. Yet no consumer behavior study to date has assessed the differences in price processing between consumers who have adopted sustainable habits (prosocial) and those who have not (non-prosocial). This is the first study to use neuroimaging tools to explore the underlying neural mechanisms that reveal the effect of price on prosocial and non-prosocial consumers. Self-reported findings indicate that prosocial consumers place greater value on collective costs and benefits, while non-prosocial consumers place greater weight on price. The neural data gleaned from this analysis offers certain explanations as to the origin of these differences. Non-prosocial (vs. prosocial) consumers, in fact, exhibit greater activation in brain areas involved in reward, valuation and choice when evaluating price information. These findings could steer managers to improve market segmentation and assist institutions in designing campaigns that foster environmentally sustainable behaviors.
Neural network models of binocular depth perception
Our visual experience of living in a three-dimensional world is created from the information contained in the two-dimensional images projected into our eyes. The overlapping visual fields of the two eyes mean that their images are highly correlated, and that the small differences that are present represent an important cue to depth. Binocular neurons encode this information in a way that both maximises efficiency and optimises disparity tuning for the depth structures that are found in our natural environment. Neural network models provide a clear account of how these binocular neurons encode the local binocular disparity in images. These models can be expanded to multi-layer models that are sensitive to salient features of scenes, such as the orientations and discontinuities between surfaces. These deep neural network models have also shown the importance of binocular disparity for the segmentation of images into separate objects, in addition to the estimation of distance. These results demonstrate the usefulness of machine learning approaches as a tool for understanding biological vision.
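The core computation these binocular neurons perform, estimating local disparity from the correlation between the two eyes' images, can be caricatured in a few lines. The 1-D signals and correlation-based matcher below are illustrative, not the neural network models discussed in the talk.

```python
import numpy as np

rng = np.random.default_rng(0)

def estimate_disparity(left, right, max_d):
    # Score each candidate horizontal shift by the correlation between the
    # two eyes' signals and return the best-matching shift (toy stereo).
    shifts = list(range(-max_d, max_d + 1))
    scores = [np.dot(left, np.roll(right, d)) for d in shifts]
    return shifts[int(np.argmax(scores))]

# A 1-D "scene" seen by the left eye; the right eye sees the same scene
# shifted by the true disparity, as for a surface off the fixation plane.
left = rng.normal(size=64)
true_d = 3
right = np.roll(left, -true_d)

d_hat = estimate_disparity(left, right, max_d=5)
```

Because the two retinal signals are highly correlated, the correlation peaks at the true shift, which is exactly the redundancy that efficient binocular encoding exploits.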
Introducing YAPiC: An Open Source tool for biologists to perform complex image segmentation with deep learning
Robust detection of biological structures such as neuronal dendrites in brightfield micrographs, tumor tissue in histological slides, or pathological brain regions in MRI scans is a fundamental task in bio-image analysis. Detection of these structures requires complex decision making, which is often impossible with current image analysis software and is therefore typically executed by humans in a tedious and time-consuming manual procedure. Supervised pixel classification based on Deep Convolutional Neural Networks (DNNs) is currently emerging as the most promising technique to solve such complex region detection tasks. Here, a self-learning artificial neural network is trained with a small set of manually annotated images to eventually identify the trained structures in large image data sets in a fully automated way. While supervised pixel classification based on faster machine-learning algorithms such as Random Forests is nowadays part of the standard toolbox of bio-image analysts (e.g. Ilastik), the currently emerging tools based on deep learning are still rarely used. There is also little experience in the community regarding how much training data must be collected to obtain a reasonable prediction result with deep-learning-based approaches. Our software YAPiC (Yet Another Pixel Classifier) provides an easy-to-use Python and command-line interface and is purely designed for intuitive pixel classification of multidimensional images with DNNs. With the aim of integrating well into the current open-source ecosystem, YAPiC utilizes the Ilastik user interface in combination with a high-performance GPU server for model training and prediction. Numerous research groups at our institute have already successfully applied YAPiC to a variety of tasks. From our experience, a surprisingly small amount of sparse label data is needed to train a sufficiently good classifier for typical bioimaging applications.
Not least because of this, YAPiC has become the "standard weapon" of our core facility for detecting objects in hard-to-segment images. We would like to present some use cases, such as cell classification in high-content screening, tissue detection in histological slides, quantification of neural outgrowth in phase-contrast time series, and actin filament detection in transmission electron microscopy.
Suite2p: a multipurpose functional segmentation pipeline for cellular imaging
The combination of two-photon microscopy recordings and powerful calcium-dependent fluorescent sensors enables simultaneous recording of unprecedentedly large populations of neurons. While these sensors have matured over several generations of development, computational methods to process their fluorescence are often inefficient and the results hard to interpret. Here we introduce Suite2p: a fast, accurate, parameter-free and complete pipeline that registers raw movies, detects active and/or inactive cells (using Cellpose), extracts their calcium traces and infers their spike times. Suite2p runs faster than real time on standard workstations and outperforms state-of-the-art methods on newly developed ground-truth benchmarks for motion correction and cell detection.
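The stages such a pipeline chains together (motion registration, cell detection, trace extraction) can be mimicked end-to-end on synthetic data. The FFT cross-correlation aligner, threshold "cell detector", and alignment to a known reference image below are minimal stand-ins for illustration and are not Suite2p's actual algorithms.

```python
import numpy as np

def register(frames, ref):
    # Rigid motion correction: find each frame's integer shift relative to
    # ref via FFT cross-correlation, then undo it (toy registration stage).
    out = []
    for f in frames:
        xc = np.fft.ifft2(np.fft.fft2(f) * np.conj(np.fft.fft2(ref))).real
        dy, dx = np.unravel_index(np.argmax(xc), xc.shape)
        dy -= f.shape[0] * (dy > f.shape[0] // 2)   # wrap to signed shifts
        dx -= f.shape[1] * (dx > f.shape[1] // 2)
        out.append(np.roll(f, (-dy, -dx), axis=(0, 1)))
    return np.stack(out)

def detect_cells(mean_img, thresh=0.5):
    return mean_img > thresh                 # toy cell detection by threshold

def extract_trace(frames, mask):
    return frames[:, mask].mean(axis=1)      # mean fluorescence in the ROI

# Synthetic movie: one square "cell" that fires at frame 5, with the whole
# field jittering from frame to frame (simulated brain motion).
rng = np.random.default_rng(1)
ref = np.zeros((32, 32))
ref[10:14, 10:14] = 1.0
activity = np.zeros(10)
activity[5] = 2.0
frames = np.stack([
    np.roll(ref * (1 + a),
            (int(rng.integers(-3, 4)), int(rng.integers(-3, 4))),
            axis=(0, 1))
    for a in activity
])

registered = register(frames, ref)
mask = detect_cells(registered.mean(axis=0))
trace = extract_trace(registered, mask)      # peaks at the firing frame
```

Even in this toy form, the ordering matters: detection and extraction only make sense after registration, because motion smears the mean image and mixes pixels in and out of the ROI.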
What is Foraging?
Foraging research aims at describing, understanding, and predicting resource-gathering behaviour. Optimal Foraging Theory (OFT) is a sub-discipline that emphasises that these aims can be aided by segmenting foraging behaviour into discrete problems that can be formally described and examined with mathematical maximization techniques. Examples of such segmentation are found in the isolated treatment of issues such as patch residence time, prey selection, information gathering, risky choice, intertemporal decision making, resource allocation, competition, memory updating, group structure, and so on. Since foragers face these problems simultaneously rather than in isolation, it is unsurprising that OFT models are ‘always wrong but sometimes useful’. I will argue that a progressive optimal foraging research program should have a defined strategy for dealing with predictive failure of models. Further, I will caution against searching for brain structures responsible for solving isolated foraging problems.
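One of the isolated problems listed above, patch residence time, is the classic marginal-value-theorem case and makes a compact worked example: stay in a patch while the instantaneous gain exceeds the long-run average rate, then leave. The saturating gain function and parameter values below are invented for illustration.

```python
import numpy as np

# Within-patch gain saturates, so at some point the forager does better by
# paying the travel cost to reach a fresh patch.
A, tau, travel = 10.0, 2.0, 1.0          # hypothetical ceiling, time constant, travel time
t = np.linspace(0.01, 20.0, 4000)        # candidate residence times
gain = A * (1.0 - np.exp(-t / tau))      # diminishing returns within a patch
rate = gain / (t + travel)               # long-run intake rate, including travel
t_opt = t[rate.argmax()]                 # optimal time to leave the patch
```

The optimum sits where the marginal gain in the patch has dropped to the overall rate; leaving earlier wastes easy food, leaving later wastes time on a depleted patch. That each such sub-problem can be solved cleanly in isolation, while real foragers face them all at once, is exactly the tension the talk addresses.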
The When, Where and What of visual memory formation
The eyes send a continuous stream of information to the brain along roughly two million nerve fibers, but only a fraction of this information is stored as visual memories. This talk will detail three neurocomputational models that attempt to explain how the visual system makes on-the-fly decisions about how to encode that information. First, the STST family of models (Bowman & Wyble 2007; Wyble, Potter, Bowman & Nieuwenstein 2011) proposes mechanisms for temporal segmentation of continuous input. The conclusion of this work is that the visual system has mechanisms for rapidly creating brief episodes of attention that highlight important moments in time, and also separates each episode from temporally adjacent neighbors to benefit learning. Next, the RAGNAROC model (Wyble et al. 2019) describes a decision process for determining the spatial focus (or foci) of attention in a spatiotopic field and the neural mechanisms that provide enhancement of targets and suppression of highly distracting information. This work highlights the importance of integrating behavioral and electrophysiological data to provide empirical constraints on a neurally plausible model of spatial attention. The model also highlights how a neural circuit can make decisions in a continuous space, rather than among discrete alternatives. Finally, the binding pool (Swan & Wyble 2014; Hedayati, O’Donnell, Wyble in Prep) provides a mechanism for selectively encoding specific attributes (i.e. color, shape, category) of a visual object to be stored in a consolidated memory representation. The binding pool is akin to a holographic memory system that superimposes selected latent representations corresponding to different attributes of a given object. Moreover, it can bind features into distinct objects by linking them to token placeholders.
Future work looks toward combining these models into a coherent framework for understanding the full measure of on-the-fly attentional mechanisms and how they improve learning.
Crowding and the Architecture of the Visual System
Classically, vision is seen as a cascade of local, feedforward computations. This framework has been tremendously successful, inspiring a wide range of ground-breaking findings in neuroscience and computer vision. Recently, feedforward Convolutional Neural Networks (ffCNNs), inspired by this classic framework, have revolutionized computer vision and been adopted as tools in neuroscience. However, despite these successes, there is much more to vision. I will present our work using visual crowding and related psychophysical effects as probes into visual processes that go beyond the classic framework. In crowding, perception of a target deteriorates in clutter. We focus on global aspects of crowding, in which perception of a small target is strongly modulated by the global configuration of elements across the visual field. We show that models based on the classic framework, including ffCNNs, cannot explain these effects for principled reasons and identify recurrent grouping and segmentation as a key missing ingredient. Then, we show that capsule networks, a recent kind of deep learning architecture combining the power of ffCNNs with recurrent grouping and segmentation, naturally explain these effects. We provide psychophysical evidence that humans indeed use a similar recurrent grouping and segmentation strategy in global crowding effects. In crowding, visual elements interfere across space. To study how elements interfere over time, we use the Sequential Metacontrast psychophysical paradigm, in which perception of visual elements depends on elements presented hundreds of milliseconds later. We psychophysically characterize the temporal structure of this interference and propose a simple computational model. Our results support the idea that perception is a discrete process. Together, the results presented here provide stepping-stones towards a fuller understanding of the visual system by suggesting architectural changes needed for more human-like neural computations.
An inference perspective on meta-learning
While meta-learning algorithms are often viewed as algorithms that learn to learn, an alternative viewpoint frames meta-learning as inferring a hidden task variable from experience consisting of observations and rewards. From this perspective, learning to learn is learning to infer. This viewpoint can be useful in solving problems in meta-RL, which I’ll demonstrate through two examples: (1) enabling off-policy meta-learning, and (2) performing efficient meta-RL from image observations. I’ll also discuss how this perspective leads to an algorithm for few-shot image segmentation.
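In its simplest form, the "learning to learn is learning to infer" view reduces adaptation to a new task to posterior inference over a hidden task variable. The discrete task family below is a toy illustration of that reduction, not the meta-RL algorithms discussed in the talk.

```python
import numpy as np

# Hidden task variable: which of K candidate reward means generated the data.
# "Adapting" to a new task then means inferring this variable, not running
# gradient descent on the new task's data.
task_means = np.array([-1.0, 0.0, 2.0])    # hypothetical task family
sigma = 0.5                                # known observation noise

def task_posterior(obs):
    # Log-likelihood of the observations under each candidate task,
    # combined with a uniform prior and normalized.
    ll = -0.5 * ((obs[:, None] - task_means) ** 2).sum(0) / sigma**2
    p = np.exp(ll - ll.max())
    return p / p.sum()

obs = np.array([1.8, 2.2, 1.9])            # few-shot experience from one task
post = task_posterior(obs)                 # mass concentrates on mean 2.0
```

Three observations suffice to identify the task here; the interesting meta-RL questions arise when the task family is not enumerable and the posterior must itself be learned, which is where the amortized-inference algorithms in the talk come in.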
A journey through connectomics: from manual tracing to the first fully automated basal ganglia connectomes
The "mind of the worm", the first electron microscopy-based connectome of C. elegans, was an early sign of where connectomics is headed. It was followed by a long period of little progress in a field held back by the immense manual effort required for data acquisition and analysis. This changed over the last few years with several technological breakthroughs, which allowed data set sizes to increase by several orders of magnitude. Brain tissue can now be imaged in 3D at nanometer resolution in volumes up to a millimeter in size, revealing tissue features from synapses to the mitochondria of all contained cells. These breakthroughs in acquisition technology were paralleled by a revolution in deep-learning segmentation techniques, which likewise reduced manual analysis times by several orders of magnitude, to the point where fully automated reconstructions are becoming useful. Taken together, these advances now give neuroscientists access to the first wiring diagrams of thousands of automatically reconstructed neurons connected by millions of synapses, just one line of program code away. In this talk, I will cover these developments by describing the past few years' technological breakthroughs and discuss remaining challenges. Finally, I will show the potential of automated connectomics for neuroscience by demonstrating how hypotheses in reinforcement learning can now be tackled through virtual experiments in synaptic wiring diagrams of the songbird basal ganglia.
Human reconstruction of local image structure from natural scenes
Retinal projections often poorly represent the structure of the physical world: well-defined boundaries within the eye may correspond to irrelevant features of the physical world, while critical features of the physical world may be nearly invisible at the retinal projection. Visual cortex is equipped with specialized mechanisms for sorting these two types of features according to their utility in interpreting the scene; however, we know little or nothing about their perceptual computations. I will present novel paradigms for the characterization of these processes in human vision, alongside examples of how the associated empirical results can be combined with targeted models to shape our understanding of the underlying perceptual mechanisms. Although the emerging view is far from complete, it challenges compartmentalized notions of bottom-up/top-down object segmentation, and suggests instead that these two modes are best viewed as an integrated perceptual mechanism.
Semi-supervised sequence modeling for improved behavior segmentation
COSYNE 2022
Compartment-specific stability in CA3 pyramidal neuron dendrites revealed by automatic segmentation
COSYNE 2025
Benchmarking deep-learning based whole-brain MRI segmentation tools for morphometry
FENS Forum 2024
Characterization of dendritic spine morphology through a segmentation-clusterization approach
FENS Forum 2024
DLC2action: A flexible, powerful, and easy-to-use toolbox for action segmentation
FENS Forum 2024
Modulation of event segmentation dynamics through catecholamines: Exploring the role of learning and stimulus novelty
FENS Forum 2024