Deep Learning Models
deep learning models
From Spiking Predictive Coding to Learning Abstract Object Representation
In a first part of the talk, I will present Predictive Coding Light (PCL), a novel unsupervised learning architecture for spiking neural networks. In contrast to conventional predictive coding approaches, which only transmit prediction errors to higher processing stages, PCL learns inhibitory lateral and top-down connectivity to suppress the most predictable spikes and passes a compressed representation of the input to higher processing stages. We show that PCL reproduces a range of biological findings and exhibits a favorable tradeoff between energy consumption and downstream classification performance on challenging benchmarks. A second part of the talk will feature our lab’s efforts to explain how infants and toddlers might learn abstract object representations without supervision. I will present deep learning models that exploit the temporal and multimodal structure of their sensory inputs to learn representations of individual objects, object categories, or abstract super-categories such as „kitchen object“ in a fully unsupervised fashion. These models offer a parsimonious account of how abstract semantic knowledge may be rooted in children's embodied first-person experiences.
Automated generation of face stimuli: Alignment, features and face spaces
I describe a well-tested Python module that does automated alignment and warping of faces images, and some advantages over existing solutions. An additional tool I’ve developed does automated extraction of facial features, which can be used in a number of interesting ways. I illustrate the value of wavelet-based features with a brief description of 2 recent studies: perceptual in-painting, and the robustness of the whole-part advantage across a large stimulus set. Finally, I discuss the suitability of various deep learning models for generating stimuli to study perceptual face spaces. I believe those interested in the forensic aspects of face perception may find this talk useful.
No Free Lunch from Deep Learning in Neuroscience: A Case Study through Models of the Entorhinal-Hippocampal Circuit
Research in Neuroscience, as in many scientific disciplines, is undergoing a renaissance based on deep learning. Unique to Neuroscience, deep learning models can be used not only as a tool but interpreted as models of the brain. The central claims of recent deep learning-based models of brain circuits are that they shed light on fundamental functions being optimized or make novel predictions about neural phenomena. We show, through the case-study of grid cells in the entorhinal-hippocampal circuit, that one may get neither. We rigorously examine the claims of deep learning models of grid cells using large-scale hyperparameter sweeps and theory-driven experimentation, and demonstrate that the results of such models are more strongly driven by particular, non-fundamental, and post-hoc implementation choices than fundamental truths about neural circuits or the loss function(s) they might optimize. We discuss why these models cannot be expected to produce accurate models of the brain without the addition of substantial amounts of inductive bias, an informal No Free Lunch result for Neuroscience.
Implementing structure mapping as a prior in deep learning models for abstract reasoning
Building conceptual abstractions from sensory information and then reasoning about them is central to human intelligence. Abstract reasoning both relies on, and is facilitated by, our ability to make analogies about concepts from known domains to novel domains. Structure Mapping Theory of human analogical reasoning posits that analogical mappings rely on (higher-order) relations and not on the sensory content of the domain. This enables humans to reason systematically about novel domains, a problem with which machine learning (ML) models tend to struggle. We introduce a two-stage neural net framework, which we label Neural Structure Mapping (NSM), to learn visual analogies from Raven's Progressive Matrices, an abstract visual reasoning test of fluid intelligence. Our framework uses (1) a multi-task visual relationship encoder to extract constituent concepts from raw visual input in the source domain, and (2) a neural module net analogy inference engine to reason compositionally about the inferred relation in the target domain. Our NSM approach (a) isolates the relational structure from the source domain with high accuracy, and (b) successfully utilizes this structure for analogical reasoning in the target domain.
Understanding neural dynamics in high dimensions across multiple timescales: from perception to motor control and learning
Remarkable advances in experimental neuroscience now enable us to simultaneously observe the activity of many neurons, thereby providing an opportunity to understand how the moment by moment collective dynamics of the brain instantiates learning and cognition. However, efficiently extracting such a conceptual understanding from large, high dimensional neural datasets requires concomitant advances in theoretically driven experimental design, data analysis, and neural circuit modeling. We will discuss how the modern frameworks of high dimensional statistics and deep learning can aid us in this process. In particular we will discuss: (1) how unsupervised tensor component analysis and time warping can extract unbiased and interpretable descriptions of how rapid single trial circuit dynamics change slowly over many trials to mediate learning; (2) how to tradeoff very different experimental resources, like numbers of recorded neurons and trials to accurately discover the structure of collective dynamics and information in the brain, even without spike sorting; (3) deep learning models that accurately capture the retina’s response to natural scenes as well as its internal structure and function; (4) algorithmic approaches for simplifying deep network models of perception; (5) optimality approaches to explain cell-type diversity in the first steps of vision in the retina.
Theoretical and computational approaches to neuroscience with complex models in high dimensions across multiple timescales: from perception to motor control and learning
Remarkable advances in experimental neuroscience now enable us to simultaneously observe the activity of many neurons, thereby providing an opportunity to understand how the moment by moment collective dynamics of the brain instantiates learning and cognition. However, efficiently extracting such a conceptual understanding from large, high dimensional neural datasets requires concomitant advances in theoretically driven experimental design, data analysis, and neural circuit modeling. We will discuss how the modern frameworks of high dimensional statistics and deep learning can aid us in this process. In particular we will discuss: how unsupervised tensor component analysis and time warping can extract unbiased and interpretable descriptions of how rapid single trial circuit dynamics change slowly over many trials to mediate learning; how to tradeoff very different experimental resources, like numbers of recorded neurons and trials to accurately discover the structure of collective dynamics and information in the brain, even without spike sorting; deep learning models that accurately capture the retina’s response to natural scenes as well as its internal structure and function; algorithmic approaches for simplifying deep network models of perception; optimality approaches to explain cell-type diversity in the first steps of vision in the retina.
Synthesizing Machine Intelligence in Neuromorphic Computers with Differentiable Programming
The potential of machine learning and deep learning to advance artificial intelligence is driving a quest to build dedicated computers, such as neuromorphic hardware that emulate the biological processes of the brain. While the hardware technologies already exist, their application to real-world tasks is hindered by the lack of suitable programming methods. Advances at the interface of neural computation and machine learning showed that key aspects of deep learning models and tools can be transferred to biologically plausible neural circuits. Building on these advances, I will show that differentiable programming can address many challenges of programming spiking neural networks for solving real-world tasks, and help devise novel continual and local learning algorithms. In turn, these new algorithms pave the road towards systematically synthesizing machine intelligence in neuromorphic hardware without detailed knowledge of the hardware circuits.