ePoster

Feature-based letter perception – A neurocognitively plausible, transparent model approach

Janos Pauli, Benjamin Gagl
Bernstein Conference 2024 (2024)
Goethe University, Frankfurt, Germany


Abstract

Applying the principles of predictive coding to reading has successfully increased our understanding of the neuro-cognitive processes underlying the extraction of meaning from text. Central to this framework is the integration of prior knowledge with sensory input to highlight the informative parts of a visual percept, thereby increasing the efficiency of representation and processing in the visual pathway. Current computer vision models (e.g., deep learning models) can easily predict letter identities; however, their inner workings remain opaque, making it extremely hard to infer how the model implements the task. This shortcoming emphasizes the need for a transparent modeling approach that respects neuro-cognitive processing principles, such as predictive coding, to increase our understanding of the computational processes implemented in the human visual system. The first visual word recognition models that consider the neuro-cognitive level have been promising but lack a full implementation of the image recognition processes underlying reading. Here, we implement a transparent, image-computable letter detection model, the missing piece for building a visual word recognition model that operates on image input alone (i.e., a computer vision model). We aim to create a font-invariant letter detection model based on biologically inspired components described in the retina and the visual cortex, within a transparent neurocognitive model based on the principles of predictive coding. We use an extensive letter dataset (N = 1,976 letters; 25% of the data held out for testing) consisting of all Latin uppercase letters in 19 font styles and four font sizes. The images are convolved with a minimal set of Gabor filters of different orientations and frequencies to mimic simple and complex cell activity in the visual cortex. In the first step, we generate pixel-level and filter-level features, which we then use to train a random forest classifier. Model testing is carried out on unseen test data, with and without increasing noise levels. The feature model shows superior classification performance compared to a pixel model both with (Fig. 1) and without noise. Thus, we learn that noise-resistant, font-invariant letter identification can be implemented transparently, allowing, in a next step, the investigation of how the underlying neuro-cognitive processes are implemented based on brain data, as well as the integration into word recognition models to make them image-computable.
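
To make the described pipeline concrete, the sketch below outlines one possible implementation in Python: a small Gabor filter bank, pooled filter responses as features, and a random forest evaluated on held-out letters at increasing noise levels. The filter parameters, the pooling of responses, the additive Gaussian noise model, and the library choices (scikit-image, scikit-learn) are illustrative assumptions, not the exact setup used in the poster.

```python
import numpy as np
from scipy.signal import fftconvolve
from skimage.filters import gabor_kernel
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score

# Hypothetical minimal Gabor bank: 4 orientations x 2 spatial frequencies.
ORIENTATIONS = [0, np.pi / 4, np.pi / 2, 3 * np.pi / 4]
FREQUENCIES = [0.1, 0.25]


def gabor_bank():
    """Build the filter bank (real parts, mimicking simple-cell receptive fields)."""
    return [np.real(gabor_kernel(frequency=f, theta=t))
            for t in ORIENTATIONS for f in FREQUENCIES]


def letter_features(image, kernels):
    """Convolve one grayscale letter image with each Gabor kernel and pool the
    rectified responses into a compact, complex-cell-like feature vector."""
    feats = []
    for k in kernels:
        response = fftconvolve(image, k, mode="same")
        feats.extend([np.abs(response).mean(), response.var()])
    return np.array(feats)


def add_noise(images, sigma, rng):
    """Additive Gaussian pixel noise as one simple way to degrade the input."""
    return np.clip(images + rng.normal(0.0, sigma, images.shape), 0.0, 1.0)


def evaluate(train_imgs, train_y, test_imgs, test_y, featurize, sigmas):
    """Train a random forest on clean training letters and test it on the
    held-out letters at increasing noise levels; returns accuracy per level."""
    rng = np.random.default_rng(0)
    clf = RandomForestClassifier(n_estimators=200, random_state=0)
    clf.fit(np.stack([featurize(img) for img in train_imgs]), train_y)
    scores = {}
    for sigma in sigmas:
        noisy = add_noise(test_imgs, sigma, rng)
        X_test = np.stack([featurize(img) for img in noisy])
        scores[sigma] = accuracy_score(test_y, clf.predict(X_test))
    return scores


# Pixel model:   evaluate(..., featurize=lambda img: img.ravel(), ...)
# Feature model: evaluate(..., featurize=lambda img: letter_features(img, gabor_bank()), ...)
```

Comparing the two `featurize` choices on the same train/test split reproduces, in spirit, the pixel-model versus feature-model contrast described in the abstract.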

Unique ID: bernstein-24/feature-based-letter-perception-2bad78d0