Voice Recognition
Face and voice perception as a tool for characterizing perceptual decisions and metacognitive abilities across the general population and psychosis spectrum
Humans constantly make perceptual decisions about human faces and voices. These decisions regularly have to be made on uncertain sensory evidence, which results from noisy input and noisy neural processes. Efficiently adapting one’s internal decision system, including prior expectations and subsequent metacognitive assessments, to these challenges is crucial in everyday life. However, the exact decision mechanisms, and whether they represent modifiable states, remain unknown both in the general population and in patients with psychosis. Using data from a laboratory-based sample of healthy controls and patients with psychosis, as well as a complementary, large online sample of healthy controls, I will demonstrate how a combination of perceptual face and voice recognition decision fidelity, metacognitive ratings, and Bayesian computational modelling may be used in the future as indicators to differentiate between non-clinical and clinical states.
Developing a test to assess the ability of Zurich’s police cadets to discriminate, learn and recognize voices
The goal of this pilot study is to develop a test that can identify people with extraordinary voice recognition and discrimination skills (for forensic purposes). Since interest in this field emerged, three studies have been published with the goal of finding people with potential super-recognition skills in voice processing. One of them is a discrimination test and two are recognition tests, but none combines the two test scenarios, and their test designs cannot be directly compared to a casework scenario in forensic phonetics. The pilot study at hand attempts to bridge this gap and analyses whether the skills of voice discrimination and voice recognition correlate. The study is guided by a practical, forensic application, which further complicates the process of creating a viable test. The participants in the pilot are drawn from different classes of police cadets, which means the test can be repeated and adjusted over time.
The quest for the cortical algorithm
The cortical algorithm hypothesis states that there is one common computational framework for solving diverse cognitive problems such as vision, voice recognition and motion control. In my talk, I propose a strategy to guide the search for this algorithm and present a few ideas on what some of its components might look like. I'll explain why a highly interdisciplinary approach drawing on neuroscience, computer science, mathematics and physics is needed to make further progress on this important question.
The Jena Voice Learning and Memory Test (JVLMT)
The ability to recognize someone’s voice spans a broad spectrum, with phonagnosia at the low end and super recognition at the high end. Yet there is no standardized test to measure the individual ability to learn and recognize newly learnt voices with samples of speech-like phonetic variability. We have developed the Jena Voice Learning and Memory Test (JVLMT), a 20-minute test based on item response theory and applicable across different languages. The JVLMT consists of three phases in which participants are familiarized with eight speakers in two stages and then perform a three-alternative forced-choice recognition task, using pseudo-sentences devoid of semantic content. Acoustic (dis)similarity analyses were used to create items with different levels of difficulty. Test scores are based on 22 Rasch-conforming items. Items were selected and validated in online studies based on 232 and 454 participants, respectively. Mean accuracy is 0.51 with an SD of 0.18. The JVLMT showed high and moderate correlations with convergent validation tests (Bangor Voice Matching Test; Glasgow Voice Memory Test) and a weak correlation with a discriminant validation test (Digit Span). Empirical (marginal) reliability is 0.66. Four participants with super recognition (at least 2 SDs above the mean) and seven participants with phonagnosia (at least 2 SDs below the mean) were identified. The JVLMT is a promising screening tool for voice recognition abilities in scientific and neuropsychological contexts.
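As a rough illustration of the two ideas mentioned in this abstract, the sketch below shows the standard Rasch (one-parameter logistic) item response function and a 2-SD screening rule. It is not the authors' scoring procedure. The function names are hypothetical, and the sketch assumes that the 2-SD cut-offs are applied to proportion-correct accuracy using the reported mean (0.51) and SD (0.18); the abstract does not state which score scale the cut-offs refer to.

```python
import math

# Values reported in the abstract; applying them to raw accuracy is an assumption.
MEAN_ACCURACY = 0.51
SD_ACCURACY = 0.18


def rasch_p_correct(theta: float, difficulty: float) -> float:
    """Standard Rasch model: probability of a correct response for a person
    with ability `theta` on an item with the given `difficulty` (both in logits)."""
    return 1.0 / (1.0 + math.exp(-(theta - difficulty)))


def screen(accuracy: float, mean: float = MEAN_ACCURACY,
           sd: float = SD_ACCURACY, k: float = 2.0) -> str:
    """Hypothetical screening rule: flag scores at least k SDs from the mean."""
    z = (accuracy - mean) / sd
    if z >= k:
        return "possible super recognition"
    if z <= -k:
        return "possible phonagnosia"
    return "typical range"


if __name__ == "__main__":
    # A person whose ability equals the item difficulty has a 50% success chance.
    print(rasch_p_correct(theta=0.0, difficulty=0.0))
    for acc in (0.10, 0.50, 0.95):
        print(acc, screen(acc))
```

Under this assumption the cut-offs fall at roughly 0.87 and 0.15 proportion correct. The lower value lies below the one-in-three chance level of a three-alternative task, which suggests the published cut-offs may be defined on a different scale, such as estimated ability.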