Computer Vision
Dr. Tatsuo Okubo
We are a new group at the Chinese Institute for Brain Research (CIBR), Beijing, which focuses on applying modern data science and machine learning tools to neuroscience data. We collaborate with various labs within CIBR to develop models and analysis pipelines that accelerate neuroscience research. We are looking for enthusiastic and talented machine learning engineers and data scientists to join this effort. Example projects include (but are not limited to) extracting hidden states from population neural activity, automating behavioral classification from videos, and segmenting neurons from confocal images using deep learning.
Erik C. Johnson
The Intelligent Systems Center at JHU/APL is an interdisciplinary research center for neuroscientists, AI researchers, and roboticists. Please see the individual listings for specific postings and application instructions. Postings for Neuroscience-Inspired AI researchers and Computational Neuroscience researchers may also be posted soon.
https://prdtss.jhuapl.edu/jobs/senior-neural-decoding-researcher-2219
https://prdtss.jhuapl.edu/jobs/senior-reinforcement-learning-researcher-615
https://prdtss.jhuapl.edu/jobs/senior-computer-vision-researcher-2242
https://prdtss.jhuapl.edu/jobs/artificial-intelligence-software-developer-2255
Albert Cardona
To work within the group of Dr Albert Cardona at the MRC Laboratory of Molecular Biology (LMB), within a programme aimed at whole-brain connectomics from volume electron microscopy. Specifically, we are seeking to recruit a data scientist with at least a year of experience with densely labelled volume electron microscopy data of nervous tissue. In particular, the candidate will be experienced in developing and applying machine learning frameworks for synapse detection and segmentation, neuron segmentation and proofreading, and quantification of neuronal structures in nanometre-resolution data sets imaged with volume electron microscopy, for the purpose of mapping neuronal wiring diagrams. The ideal candidate will have an academic track record in the form of authored publications on the arXiv, in computer vision conferences, and in scientific journals, as well as accessible source code repositories demonstrating past work. The ideal candidate will have experience with the Python programming language (version 3+) and with machine learning libraries with Python bindings such as Keras or PyTorch, will have written code available in accessible source code repositories where it can be evaluated by third parties, and will have deployed their code to CPU and GPU clusters as well as single servers with multiple GPUs. The ideal candidate will have applied all of the above to the generation of over-segmentations of neuronal structures, and will be familiar with post-processing (proofreading) approaches that automatically agglomerate over-segmented neuron fragments into full arbors, using biologically grounded approaches such as microtubule or endoplasmic reticulum segmentation for validation.
Prof. (Dr.) Swagatam Das
We are seeking highly qualified and motivated individuals for the positions of Assistant and Associate Professor in Artificial Intelligence (AI) and Machine Learning (ML). Successful candidates will join our esteemed faculty at the Institute for Advancing Intelligence (IAI), TCG Centre for Research and Education in Science and Technology (CREST), Kolkata, India, and contribute to our commitment to excellence in research, teaching, and academic service.
Uri
The new lab at UCSD, directed by Uri, is opening positions for machine learning/computer vision scientists. The lab is part of a new “Technology Sandbox” at UCSD, which includes a ThermoFisher cryoEM and mass spec center, a Nikon Imaging Center, and computational resources.
Justus Piater, Antonio Rodríguez-Sánchez, Samuele Tosatto
This is a university doctoral position that involves minor teaching duties. The precise research topics are negotiable within the scope of active research at IIS, including machine learning and growing levels of AI for computer vision and robotics. Of particular interest are topics in representation learning and causality for out-of-distribution situations.
N/A
The Research Training Group 2853 “Neuroexplicit Models of Language, Vision, and Action” is looking for 3 PhD students and 1 postdoc. Neuroexplicit models combine neural and human-interpretable (“explicit”) models in order to overcome the limitations that each model class has separately. They include neurosymbolic models, which combine neural and symbolic models, but also e.g. combinations of neural and physics-based models. In the RTG, we will improve the state of the art in natural language processing (“Language”), computer vision (“Vision”), and planning and reinforcement learning (“Action”) through the use of neuroexplicit models and investigate the cross-cutting design principles of effective neuroexplicit models (“Foundations”).
Birkan Tunc
We are seeking postdoctoral fellows with interest and experience in computational approaches for quantifying human social behavior. This research is conducted at the University of Pennsylvania and the Center for Autism Research at Children's Hospital of Philadelphia as part of multiple NIH grants. The applicant will be part of a large multidisciplinary team that develops AI tools to study human behavior (facial and bodily movements) during social interactions. Our research is a unique blend of machine learning, computer vision, cognitive science, bioinformatics, and mental health research. The fellow will be responsible for all or some of the following tasks, depending on their expertise:
- Developing computer vision techniques (e.g., face analysis, body movement analysis, gesture analysis)
- Developing signal processing methodologies to analyze biological and behavioral signals (e.g., head movements, joint movements)
- Developing time series analysis techniques to extract patterns in biological and behavioral signals (e.g., coordination and causality in movements of multiple people)
- Validating developed tools using in-house clinical data as well as publicly available datasets
- Performing pattern recognition on collected data (i.e., classification, regression, clustering, feature learning)
Georgios N. Yannakakis
Join our AI research group at the Institute of Digital Games, University of Malta. We currently have a number of research posts open (research associates, PhD students, and postdoctoral fellows). Be part of a research team that builds the next generation of AI algorithms that play, feel, and design games. We are looking for excellent candidates with a good grasp of as many of the following areas as possible: deep/shallow learning, affect annotation and modelling, human-computer interaction, computer vision, behaviour cloning, procedural content generation, and generative systems.
Bharath Ramesh
The International Centre for Neuromorphic Systems, Western Sydney University, invites both domestic and international students to apply for the world's first Master of Neuromorphic Engineering courses. We offer several programs, including a Graduate Certificate, a Graduate Diploma, a 1.5-year industry-oriented degree, and a two-year research-oriented Master's course in Neuromorphic Engineering. We seek dedicated, curious and open-minded scientists, engineers, physicists, electronics tinkerers, hardware and software hackers, and roboticists from diverse backgrounds. The course builds on the research background of our Neuromorphic Engineering and Event-Based Processing research staff. Successful applicants will receive significant mentorship: mentors and course instructors will equip students with digital vision and audition processing skills that are rarely taught at other universities, and will provide opportunities to apply those skills to practical projects aligned with industry needs. Although the postgraduate courses will equip graduates with many in-demand machine learning techniques, Neuromorphic Engineering researchers go beyond status-quo machine learning to find solutions to issues that block progress in AI, machine learning, sensing, and computer vision. Neuromorphic Engineering seeks to progress beyond the failures of conventional machine learning approaches, which usually fail to generalise, are not environmentally sustainable, and are poorly suited to high-stakes, time-critical, low-powered applications.
Odelia
The Department of Computer Science at the University of Miami is inviting applications for tenure-track or tenure-eligible faculty positions at the levels of Associate Professor and Professor. The successful candidates must conduct research in Data Science, including areas such as Machine Learning, Deep Learning, Computer Vision, Cognitive Cybersecurity, Blockchain, Real-time Analytics, Streaming Analytics, Cyber-analytics, and Edge Computing, and are expected to develop and maintain an internationally recognized research program. Selected candidates will be expected to teach classes at the undergraduate and graduate levels. The faculty in these positions will be housed primarily in the Department of Computer Science and will have responsibilities in the Institute for Data Science and Computing (IDSC).
N/A
The Faculty of Computer Science of HSE University invites applications for full-time, tenure-track positions of Assistant Professor in all areas of computer science, including but not limited to artificial intelligence, machine learning, computer vision, programming language theory, software engineering, system programming, algorithms, computational complexity, distributed and parallel computation, bioinformatics, human-computer interaction, and robotics. The successful candidate is expected to conduct high-quality research publishable in reputable peer-reviewed journals, with research support provided by the University.
N/A
The KINDI Center for Computing Research at the College of Engineering in Qatar University is seeking high-caliber candidates for a research faculty position at the level of assistant professor in the area of artificial intelligence (AI). The applicant should possess a Ph.D. degree in Computer Science or Computer Engineering or related fields from an internationally recognized university and should demonstrate an outstanding research record in AI and related subareas (e.g., machine/deep learning (ML/DL), computer vision, robotics, natural language processing, etc.) and fields (e.g., data science, big data analytics, etc.). Candidates with good hands-on experience are preferred. The position is available immediately.
N/A
You will be working in the Pattern Analysis and Computer Vision (PAVIS) Research Line, a multi-disciplinary and multi-cultural group where people with different backgrounds collaborate, each with their own expertise, to carry out research on Computer Vision and Artificial Intelligence. The PAVIS research line is coordinated by Dr. Alessio Del Bue. Within the team, your main responsibilities will be:
- Hardware and software prototyping of computational systems based on Computer Vision and Machine Learning technology
- Support of PAVIS facility maintenance and organization
- Support of PAVIS Technology Transfer initiatives (external projects)
- Support of PAVIS researcher activities
- Support of PAVIS operations (procurement, ICT services, troubleshooting, data management, logistics, equipment management and maintenance)
Frank
Multiple open professor positions at the Technical University of Applied Sciences Würzburg-Schweinfurt in Computer Vision, Reinforcement Learning, and Dynamical Systems.
Sebastiano Vascon
The selected candidate will work on a project of national interest on Computer Vision applied to Robotics for Health. The aim is to develop an active assistive device (a walker) for people with walking deficits. The project involves three partners (Ca' Foscari University of Venice, the University of Padova, and the University of Catania) and several technologies. The candidate will be expected to actively contribute to the laboratory activities by participating in weekly seminars, discussions, and research-related tasks.
N/A
The position integrates into an attractive environment of existing activities in artificial intelligence such as machine learning for robotics and computer vision, natural language processing, recommender systems, schedulers, virtual and augmented reality, and digital forensics. The candidate should engage in research and teaching in the general area of artificial intelligence. Examples of possible foci include: machine learning for pattern recognition, prediction, and decision making; data-driven, adaptive, learning, and self-optimizing systems; explainable and transparent AI; representation learning; generative models; neuro-symbolic AI; causality; distributed/decentralized learning; environmentally friendly, sustainable, data-efficient, and privacy-preserving AI; neuromorphic computing and hardware aspects; knowledge representation, reasoning, and ontologies. Cooperation with research groups at the Department of Computer Science, the Research Areas, and in particular the Digital Science Center of the University, as well as with business, industry, and international research institutions, is expected. The candidate should reinforce or complement existing strengths of the Department of Computer Science.
Brad Wyble
This role is centered on cutting-edge research at the nexus of machine learning, deep learning, computer vision, psychology, and biology, with foci on psychology-inspired AI and addressing significant biological questions using AI.
Michael Kampffmeyer
We are seeking a candidate to take an active role in the group's research on developing novel machine learning/computer vision methodology. The special focus of this project will be the development of deep learning methodology for learning from limited labeled data (few-shot, self-supervised, or metric learning and/or deep clustering). The position will be part of an already ongoing effort to design approaches that explore geometric properties of embedding spaces and the alignment of multiple modalities. We are looking for a motivated candidate who thinks independently and enjoys working in a team. The suitable candidate should have expertise in deep learning and a strong, documented background in mathematics.
N/A
The Research Training Group 2853 “Neuroexplicit Models of Language, Vision, and Action” is looking for 6 PhD students and 1 Postdoc. Neuroexplicit models combine neural and human-interpretable (“explicit”) models in order to overcome the limitations that each model class has separately. They include neurosymbolic models, which combine neural and symbolic models, but also e.g. combinations of neural and physics-based models. In the RTG, we will improve the state of the art in natural language processing (“Language”), computer vision (“Vision”), and planning and reinforcement learning (“Action”) through the use of neuroexplicit models and investigate the cross-cutting design principles of effective neuroexplicit models (“Foundations”).
Sebastiano Vascon
The selected candidate will work on a project of national interest on Computer Vision applied to Robotics for Health. The aim is to develop an active assistive device (a walker) for people with walking deficits. The project involves three partners: Ca' Foscari University of Venice, the University of Padova, and the University of Catania. The candidate will be expected to actively contribute to the laboratory activities by participating in weekly seminars, discussions, and research-related tasks.
N/A
We seek to appoint a full-time Machine Learning Research Engineer to contribute to the development of new technologies for cutting-edge vision systems in the context of an industrial project in collaboration with a large multinational company. The project carries out innovative research on the topics of Visual Question Answering and fast adaptation of vision-language models. The project team will be responsible for all phases of the research development, including method design and implementation, data preparation and benchmarking, task planning, and frequent reporting. Within the team, your main responsibilities and duties will depend on your expertise and experience.
Prof. Ioannis Pitas
The Artificial Intelligence and Information Analysis Laboratory (AIIA Lab, AIIA.CVML R&D group) of the School of Informatics, Aristotle University of Thessaloniki (AUTH), Greece, has two open postdoctoral research positions. The interested applicant must have a strong theoretical and/or applied/programming background in Machine Learning and Computer Vision, with an emphasis on Deep Learning. A strong publication record is desirable. Potential (not exclusive) application domains include big data analysis, robotics/autonomous systems, and digital media. A very competitive salary is offered.
N/A
PAVIS is looking to strengthen its activities in 3D multi-modal scene understanding. The research will focus on novel ML and CV methods that efficiently incorporate constraints from physical world models and semantic priors derived from vision, language models, or other modalities. The project will explore the interplay between vision and large language models to address tasks in 3D reasoning, visual (re-)localization, active vision, and neural/geometrical novel view rendering. The aim is to develop models applicable to interdisciplinary research, including drug discovery and robotics, utilizing in-house robotics platforms and HPC computational facilities.
Error Consistency between Humans and Machines as a function of presentation duration
Within the last decade, Deep Artificial Neural Networks (DNNs) have emerged as powerful computer vision systems that match or exceed human performance on many benchmark tasks such as image classification. But whether current DNNs are suitable computational models of the human visual system remains an open question: while DNNs have proven to be capable of predicting neural activations in primate visual cortex, psychophysical experiments have shown behavioral differences between DNNs and human subjects, as quantified by error consistency. Error consistency is typically measured by briefly presenting natural or corrupted images to human subjects and asking them to perform an n-way classification task under time pressure. But for how long should stimuli ideally be presented to guarantee a fair comparison with DNNs? Here we investigate the influence of presentation time on error consistency, to test the hypothesis that higher-level processing drives the behavioral differences. We systematically vary presentation times of backward-masked stimuli from 8.3 ms to 266 ms and measure human performance and reaction times on natural, lowpass-filtered, and noisy images. Our experiment constitutes a fine-grained analysis of human image classification under both image corruptions and time pressure, showing that even drastically time-constrained humans who are exposed to the stimuli for only two frames, i.e. 16.6 ms, can still solve our 8-way classification task with success rates well above chance. We also find that human-to-human error consistency is already stable at 16.6 ms.
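For readers unfamiliar with the metric, the following is a minimal sketch of how trial-by-trial error consistency is commonly computed: a Cohen's-kappa-style comparison of which trials two observers classify correctly or incorrectly. The function name, array layout, and toy responses below are illustrative assumptions, not material from the study described above.

```python
import numpy as np

def error_consistency(correct_a, correct_b):
    """Trial-by-trial error consistency between two observers.

    correct_a, correct_b: boolean arrays, True where observer a / b classified
    the i-th trial correctly. Both observers must have seen the same trials
    in the same order.
    """
    correct_a = np.asarray(correct_a, dtype=bool)
    correct_b = np.asarray(correct_b, dtype=bool)

    # Observed agreement: fraction of trials where both are right or both are wrong.
    c_obs = np.mean(correct_a == correct_b)

    # Agreement expected by chance given only the two accuracies.
    p_a, p_b = correct_a.mean(), correct_b.mean()
    c_exp = p_a * p_b + (1 - p_a) * (1 - p_b)

    # Kappa: 0 = no more agreement than expected by chance, 1 = identical error patterns.
    return (c_obs - c_exp) / (1 - c_exp)

# Toy example with made-up responses from a human observer and a DNN.
human = np.array([1, 1, 0, 1, 0, 1, 1, 0], dtype=bool)
dnn = np.array([1, 0, 0, 1, 0, 1, 1, 1], dtype=bool)
print(error_consistency(human, dnn))
```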
Deep learning applications in ophthalmology
Deep learning techniques have revolutionized the field of image analysis and played a disruptive role in the ability to quickly and efficiently train image analysis models that perform as well as human beings. This talk will cover the beginnings of the application of deep learning in the field of ophthalmology and vision science, and survey a variety of applications of deep learning as a method for scientific discovery and for uncovering latent associations.
Modern Approaches to Behavioural Analysis
The goal of neuroscience is to understand how the nervous system controls behaviour, not only in the simplified environments of the lab but also in the natural environments for which nervous systems evolved. In pursuing this goal, neuroscience research is supported by an ever-larger toolbox, ranging from optogenetics to connectomics. However, these tools are often coupled with reductionist approaches for linking nervous systems and behaviour. This course will introduce advanced techniques for measuring and analysing behaviour, as well as three fundamental principles necessary for understanding biological behaviour: (1) morphology and environment; (2) action-perception closed loops and purpose; and (3) individuality and historical contingencies [1]. [1] Gomez-Marin, A., & Ghazanfar, A. A. (2019). The life of behavior. Neuron, 104(1), 25-36.
Identity-Expression Ambiguity in 3D Morphable Face Models
3D Morphable Models are my favorite class of generative models and are commonly used to model faces. They are typically applied to ill-posed problems such as 3D reconstruction from 2D data. I'll start my presentation with an introduction to 3D Morphable Models and show what they are capable of doing. I'll then focus on our recent finding, the Identity-Expression Ambiguity: we demonstrate that non-orthogonality of the variation in identity and expression can cause identity-expression ambiguity in 3D Morphable Models, and that in practice expression and identity are far from orthogonal and can explain each other surprisingly well. Whereas previously reported ambiguities arise only in an inverse rendering setting, the identity-expression ambiguity emerges in the 3D shape generation process itself. The goal of this presentation is to demonstrate the ambiguity and discuss its potential consequences in a computer vision setting as well as for understanding face perception mechanisms in the human brain.
What does the primary visual cortex tell us about object recognition?
Object recognition relies on the complex visual representations in cortical areas at the top of the ventral stream hierarchy. While these are thought to be derived from low-level stages of visual processing, this has not yet been shown. Here, I describe the results of two projects exploring the contributions of primary visual cortex (V1) processing to object recognition using artificial neural networks (ANNs). First, we developed hundreds of ANN-based V1 models and evaluated how well their single neurons approximate those in macaque V1. We found that, for some models, single neurons in intermediate layers are similar to their biological counterparts, and that the distributions of their response properties approximately match those in V1. Furthermore, we observed that models that better matched macaque V1 were also more aligned with human behavior, suggesting that object recognition builds on low-level visual processing. Motivated by these results, we then studied how an ANN's robustness to image perturbations relates to its ability to predict V1 responses. Despite their high performance in object recognition tasks, ANNs can be fooled by imperceptibly small, explicitly crafted perturbations. We observed that ANNs that better predicted V1 neuronal activity were also more robust to adversarial attacks. Inspired by this, we developed VOneNets, a new class of hybrid ANN vision models. Each VOneNet contains a fixed neural network front-end that simulates primate V1, followed by a neural network back-end adapted from current computer vision models. After training, VOneNets were substantially more robust, outperforming state-of-the-art methods on a set of perturbations. While current neural network architectures are arguably brain-inspired, these results demonstrate that more precisely mimicking just one stage of the primate visual system leads to new gains in computer vision applications and results in better models of the primate ventral stream and of object recognition behavior.
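To make the hybrid front-end/back-end idea concrete, here is a simplified sketch of the general pattern: a fixed, untrainable bank of oriented Gabor filters acting as a V1-like front-end, feeding a small trainable convolutional back-end. This is not the authors' VOneNet implementation (which additionally models simple and complex cells and stochastic responses); the Gabor parameters, layer sizes, and class names are assumptions chosen for illustration.

```python
import math
import torch
import torch.nn as nn

def gabor_kernel(size, theta, sigma=2.0, lam=6.0, gamma=0.5):
    """Return a (size x size) Gabor filter with orientation theta (radians)."""
    half = size // 2
    ys, xs = torch.meshgrid(
        torch.arange(-half, half + 1, dtype=torch.float32),
        torch.arange(-half, half + 1, dtype=torch.float32),
        indexing="ij",
    )
    x_t = xs * math.cos(theta) + ys * math.sin(theta)
    y_t = -xs * math.sin(theta) + ys * math.cos(theta)
    envelope = torch.exp(-(x_t**2 + (gamma * y_t) ** 2) / (2 * sigma**2))
    return envelope * torch.cos(2 * math.pi * x_t / lam)

class FixedV1FrontEnd(nn.Module):
    """Frozen bank of oriented Gabor filters followed by a rectifying nonlinearity."""
    def __init__(self, n_orientations=8, kernel_size=15):
        super().__init__()
        self.conv = nn.Conv2d(1, n_orientations, kernel_size,
                              padding=kernel_size // 2, bias=False)
        with torch.no_grad():
            for i in range(n_orientations):
                self.conv.weight[i, 0] = gabor_kernel(kernel_size,
                                                      math.pi * i / n_orientations)
        for p in self.parameters():
            p.requires_grad = False   # the front-end stays fixed; only the back-end is trained

    def forward(self, x):
        return torch.relu(self.conv(x))

class HybridNet(nn.Module):
    """Fixed V1-like front-end followed by a small trainable convolutional back-end."""
    def __init__(self, n_classes=10):
        super().__init__()
        self.front = FixedV1FrontEnd()
        self.back = nn.Sequential(
            nn.Conv2d(8, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(64, n_classes),
        )

    def forward(self, x):
        return self.back(self.front(x))

# Forward pass on a random grayscale image batch.
model = HybridNet()
print(model(torch.randn(2, 1, 64, 64)).shape)  # torch.Size([2, 10])
```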
If we can make computers play chess, why can't we make them see?
If we can make computers play chess and even Jeopardy and Go, then why can't we make them see like us? How does our brain solve the problem of seeing? I will describe some of our recent insights into understanding object recognition in the brain using behavioral, neuronal and computational methods.
StereoSpike: Depth Learning with a Spiking Neural Network
Depth estimation is an important computer vision task, useful in particular for navigation in autonomous vehicles or for object manipulation in robotics. Here we solved it using an end-to-end neuromorphic approach, combining two event-based cameras and a Spiking Neural Network (SNN) with a slightly modified U-Net-like encoder-decoder architecture, which we named StereoSpike. More specifically, we used the Multi Vehicle Stereo Event Camera Dataset (MVSEC). It provides a depth ground truth, which was used to train StereoSpike in a supervised manner using surrogate gradient descent. We propose a novel readout paradigm to obtain a dense analog prediction (the depth of each pixel) from the spikes of the decoder. We demonstrate that this architecture generalizes very well, even better than its non-spiking counterparts, leading to state-of-the-art test accuracy. To the best of our knowledge, this is the first time that such a large-scale regression problem has been solved by a fully spiking network. Finally, we show that low firing rates (<10%) can be obtained via regularization, with a minimal cost in accuracy. This means that StereoSpike could be implemented efficiently on neuromorphic chips, opening the door to low-power, real-time embedded systems.
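Surrogate gradient descent, mentioned above, trains a spiking network by keeping the binary spike function in the forward pass while substituting a smooth surrogate derivative in the backward pass. The sketch below illustrates that general mechanism in PyTorch; it is not the StereoSpike code, and the leaky integrate-and-fire dynamics, surrogate shape, and parameter values are assumed for illustration.

```python
import torch

class SurrogateSpike(torch.autograd.Function):
    """Heaviside step in the forward pass, smooth surrogate derivative in the backward pass."""

    @staticmethod
    def forward(ctx, membrane_potential, slope):
        ctx.save_for_backward(membrane_potential)
        ctx.slope = slope
        return (membrane_potential > 0).float()          # binary spikes (0 or 1)

    @staticmethod
    def backward(ctx, grad_output):
        (v,) = ctx.saved_tensors
        # Derivative of a "fast sigmoid", used in place of the step function's derivative.
        surrogate = 1.0 / (1.0 + ctx.slope * v.abs()) ** 2
        return grad_output * surrogate, None             # no gradient for the slope constant

def lif_step(inp, v, beta=0.9, threshold=1.0, slope=10.0):
    """One Euler step of a leaky integrate-and-fire neuron with subtractive reset."""
    v = beta * v + inp                                    # leaky integration of the input current
    spikes = SurrogateSpike.apply(v - threshold, slope)   # spike wherever v crosses the threshold
    v = v - spikes * threshold                            # subtract the threshold where spikes occurred
    return spikes, v

# Toy usage: push random inputs through one spiking layer for 10 time steps
# and backpropagate through the total spike count.
weights = torch.randn(16, 8, requires_grad=True)          # 8 inputs -> 16 spiking neurons
v = torch.zeros(4, 16)                                    # membrane potentials for a batch of 4
total_spikes = torch.zeros(())
for t in range(10):
    x = torch.randn(4, 8)
    spikes, v = lif_step(x @ weights.t(), v)
    total_spikes = total_spikes + spikes.sum()
total_spikes.backward()
print(weights.grad.shape)                                 # gradients flow despite the binary spikes
```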
Measuring relevant features of the social and physical environment with imagery
The efficacy of images for creating quantitative measures of urban perception has been explored in psychology, social science, urban planning, and architecture over the last 50 years. The ability to scale these measurements has become possible only in the last decade, due to increased urban surveillance in the form of street view and satellite imagery and the accessibility of such data. This talk will present a series of projects that make use of imagery and CNNs to predict, measure, and interpret the social and physical environments of our cities.
Tuts, a Talk and AGI !!
A panel discussion on "What might we still require to achieve AGI?", a set of Reinforcement Learning and Computer Vision tutorials, and a talk from George Konidaris.
Learning to aggress – Behavioral and circuit mechanisms of aggression reward
Aggression is an ethologically complex behavior with equally complex underlying mechanisms. Here, I present data on one form of aggression, appetitive or rewarding aggression, and the behavioral, cellular, and system-level mechanisms guiding this behavior. First, I will present one way in which appetitive aggression is modeled in mice, and extend aggression motivation to the concept of compulsive aggression seeking and relapse. I will then briefly highlight recent advances in computer vision and machine learning for automated scoring of aggressive behavior and the role of specific cell types in controlling aggression reward, and close with preliminary data on the whole-brain aggression reward functional connectome obtained using light sheet fluorescence microscopy (LSFM).
A machine learning way to analyse white matter tractography streamlines / Application of artificial intelligence in correcting motion artifacts and reducing scan time in MRI
1. Embedding is all you need: A machine learning way to analyse white matter tractography streamlines - Dr Shenjun Zhong, Monash Biomedical Imaging
Embedding white matter streamlines of various lengths into fixed-length latent vectors enables users to analyse them with general data mining techniques. However, finding a good embedding schema is still a challenging task, as existing methods based on spatial coordinates rely on manually engineered features and/or labelled datasets. In this webinar, Dr Shenjun Zhong will discuss his novel deep learning model that identifies a latent space and solves the problem of streamline clustering without needing labelled data. Dr Zhong is a Research Fellow and Informatics Officer at Monash Biomedical Imaging. His research interests are sequence modelling, reinforcement learning, and federated learning in the general medical imaging domain.
2. Application of artificial intelligence in correcting motion artifacts and reducing scan time in MRI - Dr Kamlesh Pawar, Monash Biomedical Imaging
Magnetic Resonance Imaging (MRI) is a widely used imaging modality in clinics and research. Although MRI is useful, it comes with the overhead of longer scan times compared to other medical imaging modalities. The longer scan times also make patients uncomfortable, and even subtle movements during the scan may result in severe motion artifacts in the images. In this seminar, Dr Kamlesh Pawar will discuss how artificial intelligence techniques can reduce scan time and correct motion artifacts. Dr Pawar is a Research Fellow at Monash Biomedical Imaging. His research interests include deep learning, MR physics, MR image reconstruction, and computer vision.
Silicon retinas that make spike events
The story of event cameras starts at the very beginnings of neuromorphic engineering, with Misha Mahowald and Carver Mead. The chip design of these "silicon retina" cameras is the most crucial aspect that might enable them to reach mass production and widespread use. Having a usable camera is just the beginning, because we then need to think of our use of the data as though we were some type of artificial "silicon cortex". That step has only just started, but the last few years have brought some remarkable results from the computer vision community. This talk will have a lot of live demonstrations.
Top-down Modulation in Human Visual Cortex
Human vision displays a remarkable ability to recognize objects in the surrounding environment even in the absence of a complete visual representation of these objects. This process happens almost intuitively, and it was not until scientists had to tackle this problem in computer vision that they noticed its complexity. While current artificial vision systems have made great strides, exceeding human-level performance on normal vision tasks, they have yet to achieve a similar level of robustness. One source of this robustness is the brain's extensive connectivity, which is not limited to a feedforward hierarchical pathway like that of current state-of-the-art deep convolutional neural networks but also comprises recurrent and top-down connections. These connections allow the human brain to enhance the neural representations of degraded images in concordance with meaningful representations stored in memory. The mechanisms by which these different pathways interact are still not understood. In this seminar, studies concerning the effect of recurrent and top-down modulation on the neural representations resulting from viewing blurred images will be presented. These studies attempted to uncover the role of recurrent and top-down connections in human vision. The results presented challenge the notion of predictive coding as a mechanism for the top-down modulation of visual information during natural vision. They show that neural representation enhancement (sharpening) appears to be the more dominant process at different levels of the visual hierarchy. They also show that inference in visual recognition is achieved through a Bayesian process combining incoming visual information and priors from deeper processing regions in the brain.
Machine Learning as a tool for positive impact : case studies from climate change
Climate change is one of our generation's greatest challenges, with increasingly severe consequences for global ecosystems and populations. Machine Learning has the potential to address many important challenges in climate change, on both the mitigation (reducing its extent) and adaptation (preparing for unavoidable consequences) sides. To present the extent of these opportunities, I will describe some of the projects that I am involved in, spanning from generative models to computer vision and natural language processing. There are many opportunities for fundamental innovation in this field, advancing the state of the art in Machine Learning while ensuring that this fundamental progress translates into positive real-world impact.
Crowding and the Architecture of the Visual System
Classically, vision is seen as a cascade of local, feedforward computations. This framework has been tremendously successful, inspiring a wide range of ground-breaking findings in neuroscience and computer vision. Recently, feedforward Convolutional Neural Networks (ffCNNs), inspired by this classic framework, have revolutionized computer vision and been adopted as tools in neuroscience. However, despite these successes, there is much more to vision. I will present our work using visual crowding and related psychophysical effects as probes into visual processes that go beyond the classic framework. In crowding, perception of a target deteriorates in clutter. We focus on global aspects of crowding, in which perception of a small target is strongly modulated by the global configuration of elements across the visual field. We show that models based on the classic framework, including ffCNNs, cannot explain these effects for principled reasons and identify recurrent grouping and segmentation as a key missing ingredient. Then, we show that capsule networks, a recent kind of deep learning architecture combining the power of ffCNNs with recurrent grouping and segmentation, naturally explain these effects. We provide psychophysical evidence that humans indeed use a similar recurrent grouping and segmentation strategy in global crowding effects. In crowding, visual elements interfere across space. To study how elements interfere over time, we use the Sequential Metacontrast psychophysical paradigm, in which perception of visual elements depends on elements presented hundreds of milliseconds later. We psychophysically characterize the temporal structure of this interference and propose a simple computational model. Our results support the idea that perception is a discrete process. Together, the results presented here provide stepping-stones towards a fuller understanding of the visual system by suggesting architectural changes needed for more human-like neural computations.
Shape from shading in nature: does it provide optimal camouflage?
Blindspots in Computer Vision - How can neuroscience guide AI?
Scientists have worked to recreate human vision in computers for the past 50 years. But how much about human vision do we actually know? And can the brain be useful in furthering computer vision? This talk will take a look at the similarities and differences between (modern) computer vision and human vision, as well as the important crossovers, collaborations, and applications that define the interface between computational neuroscience and computer vision. If you want to know more about how the brain sees (really sees), how computer vision developments are inspired by the brain, or how to apply AI to neuroscience, this talk is for you.
Computer vision and image processing applications on astrocyte-glioma interactions in 3D cell culture
FENS Forum 2024
Impact of barrel cortex lesions and sensory deprivation on perceptual decision-making: Insights from computer vision and time series clustering of freely moving behavioral strategies
FENS Forum 2024