Speech
Prof David Brang
We are seeking a full-time post-doctoral research fellow to study computational and neuroscientific models of perception and cognition. The research fellow will be jointly supervised by Dr. David Brang (https://sites.lsa.umich.edu/brang-lab/) and Dr. Zhongming Liu (https://libi.engin.umich.edu). The goal of this collaboration is to build computational models of cognitive and perceptual processes using combined electrocorticography (ECoG) and fMRI data. The successful applicant will also have freedom to conduct additional research based on their interests, using a variety of methods: ECoG, fMRI, DTI, lesion mapping, and EEG. The ideal start date is between spring and fall 2021, and the position is expected to last for at least two years, with the possibility of extension for subsequent years. We are also recruiting a Post-Doc for research on multisensory interactions (particularly how vision modulates speech perception) using cognitive neuroscience techniques, or to help with our large-scale brain tumor collaboration with Shawn Hervey-Jumper at UCSF (https://herveyjumperlab.ucsf.edu). In the latter collaboration we collect iEEG (from ~50 patients/year) and lesion mapping data (from ~150 patients/year) in patients with brain tumors to study sensory and cognitive functions. The goals of this project are to better understand the physiology of tumors, study causal mechanisms of brain functions, and generalize iEEG/ECoG findings from epilepsy patients to a second patient population.
Tejas Savalia
The Department of Psychological and Brain Sciences at the University of Massachusetts, Amherst is inviting applications for a tenure-track, academic-year faculty position at the Assistant Professor level in its Cognition and Cognitive Neuroscience Psychology program, starting in Fall 2024. We are seeking outstanding applicants with expertise in any area of cognitive psychology or cognitive neuroscience, including interdisciplinary fields connected to cognitive psychology, whose work complements and broadens existing strengths in our program. The program has current strengths in attention, decision-making, psycholinguistics, and mathematical modeling, with connections to our Behavioral Neuroscience, Clinical Psychology, Developmental Science, and Social Psychology programs. Across the university, our faculty have strong connections to Linguistics, Information and Computer Sciences, and Speech, Language, and Hearing Sciences, as well as the Initiative in Cognitive Science, the Computational and Social Science Institute, the Institute for Diversity Sciences, and the Institute for Applied Life Sciences.
Jörn Anemüller
We are looking to fill a fully funded 3-year Ph.D. student position in the field of deep learning-based signal processing algorithms for speech enhancement and computational audition. The position is funded by the German Research Foundation (DFG) within the Collaborative Research Centre SFB 1330 “Hearing Acoustics” at the Department of Medical Physics and Acoustics, University of Oldenburg. Within project B3 of the research centre, the Computational Audition Group develops machine learning algorithms for signal processing of speech and audio data.
Steve Schneider
The School of Computer Science and Electronic Engineering is seeking to recruit a full-time Lecturer in Natural Language Processing to grow our AI research. The School is home to two established research centres with expertise in AI and Machine Learning: the Computer Science Research Centre and the Centre for Vision, Speech and Signal Processing (CVSSP). This post is aligned with the Nature Inspired Computing and Engineering group within Computer Science. The role welcomes applicants from across natural language processing, including language modelling, language generation (machine translation/summarisation), explainability and reasoning in NLP, and/or aligned multimodal challenges for NLP (vision-language, audio-language, and so on), and we are particularly interested in candidates who enhance our current strengths and bring complementary areas of AI expertise. Surrey has an established international reputation in AI research, ranked 1st in the UK for computer vision and top 10 for AI, computer vision, machine learning and natural language processing (CSRankings.org), and 7th in the UK for REF2021 outputs in Computer Science research. Computer Science and CVSSP are at the core of the Surrey Institute for People-Centred AI (PAI), established in 2021 as a pan-University initiative which brings together leading AI research with cross-discipline expertise across health, social, behavioural, and engineering sciences, and business, law, and the creative arts to shape future AI to benefit people and society. PAI leads a portfolio of £100m in grant awards, including major research activities in creative industries and healthcare, and two doctoral training programmes with funding for over 100 PhD researchers: the UKRI AI Centre for Doctoral Training in AI for Digital Media Inclusion, and the Leverhulme Trust Doctoral Training Network in AI-Enabled Digital Accessibility.
Simulating Thought Disorder: Fine-Tuning Llama-2 for Synthetic Speech in Schizophrenia
Relating circuit dynamics to computation: robustness and dimension-specific computation in cortical dynamics
Neural dynamics represent the hard-to-interpret substrate of circuit computations. Advances in large-scale recordings have highlighted the sheer spatiotemporal complexity of circuit dynamics within and across circuits, portraying in detail the difficulty of interpreting such dynamics and relating them to computation. Indeed, even in extremely simplified experimental conditions, one observes high-dimensional temporal dynamics in the relevant circuits. This complexity can potentially be addressed by the notion that not all changes in population activity have equal meaning, i.e., a small change in the evolution of activity along a particular dimension may have a bigger effect on a given computation than a large change in another. We term such conditions dimension-specific computation. Considering motor preparatory activity in a delayed response task, we utilized neural recordings performed simultaneously with optogenetic perturbations to probe circuit dynamics. First, we revealed a remarkable robustness in the detailed evolution of certain dimensions of the population activity, beyond what was thought to be the case experimentally and theoretically. Second, the robust dimension in activity space carries nearly all of the decodable behavioral information, whereas other, non-robust dimensions contain nearly no decodable information, as if the circuit were set up to make informative dimensions stiff, i.e., resistive to perturbations, leaving uninformative dimensions sloppy, i.e., sensitive to perturbations. Third, we show that this robustness can be achieved by a modular organization of circuitry, whereby modules whose dynamics normally evolve independently can correct each other’s dynamics when an individual module is perturbed, a common design feature in robust systems engineering. Finally, I will present recent work extending this framework to understanding the neural dynamics underlying the preparation of speech.
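As a toy illustration of dimension-specific computation (simulated data only, not the recordings described above; the array sizes and the simple threshold decoder are my assumptions), the sketch below places the decodable signal along a single population dimension and compares decoding accuracy along that dimension versus a random one.

```python
# Minimal sketch: decoding a binary behavioral variable from one "coding
# dimension" of simulated population activity, illustrating that a single
# dimension can carry most of the decodable information.
import numpy as np

rng = np.random.default_rng(0)
n_trials, n_neurons = 200, 50
labels = rng.integers(0, 2, n_trials)               # hypothetical left/right choices

coding_dim = rng.normal(size=n_neurons)             # direction carrying the signal
coding_dim /= np.linalg.norm(coding_dim)
activity = rng.normal(size=(n_trials, n_neurons))   # "sloppy" background variability
activity += np.outer(labels - 0.5, coding_dim) * 3.0

def decode_along(direction):
    """Project trials onto a direction and decode choice with a simple threshold."""
    proj = activity @ direction
    return np.mean((proj > proj.mean()) == labels)

random_dim = rng.normal(size=n_neurons)
random_dim /= np.linalg.norm(random_dim)

print("accuracy along coding dimension:", decode_along(coding_dim))
print("accuracy along a random dimension:", decode_along(random_dim))
```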
The representation of speech conversations in the human auditory cortex
LLMs and Human Language Processing
This webinar convened researchers at the intersection of Artificial Intelligence and Neuroscience to investigate how large language models (LLMs) can serve as valuable “model organisms” for understanding human language processing. Presenters showcased evidence that brain recordings (fMRI, MEG, ECoG) acquired while participants read or listened to unconstrained speech can be predicted by representations extracted from state-of-the-art text- and speech-based LLMs. In particular, text-based LLMs tend to align better with higher-level language regions, capturing more semantic aspects, while speech-based LLMs excel at explaining early auditory cortical responses. However, purely low-level features can drive part of these alignments, complicating interpretations. New methods, including perturbation analyses, highlight which linguistic variables matter for each cortical area and time scale. Further, “brain tuning” of LLMs—fine-tuning on measured neural signals—can improve semantic representations and downstream language tasks. Despite open questions about interpretability and exact neural mechanisms, these results demonstrate that LLMs provide a promising framework for probing the computations underlying human language comprehension and production at multiple spatiotemporal scales.
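A minimal sketch of the encoding-model logic behind these alignment results, using random stand-in data (the array sizes, ridge penalty, and use of scikit-learn are my choices, not the presenters'): LLM-derived features are regressed onto voxel responses and evaluated on held-out samples.

```python
# Ridge "encoding model": predict brain responses from LLM-derived features,
# then score voxel-wise prediction accuracy on held-out data.
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n_samples, n_features, n_voxels = 500, 768, 100     # hypothetical sizes
X = rng.normal(size=(n_samples, n_features))        # stand-in for LLM hidden states
Y = X @ rng.normal(size=(n_features, n_voxels)) * 0.1 + rng.normal(size=(n_samples, n_voxels))

X_tr, X_te, Y_tr, Y_te = train_test_split(X, Y, test_size=0.2, random_state=0)
model = Ridge(alpha=100.0).fit(X_tr, Y_tr)
pred = model.predict(X_te)

# Voxel-wise accuracy: correlation between predicted and held-out responses.
r = [np.corrcoef(pred[:, v], Y_te[:, v])[0, 1] for v in range(n_voxels)]
print("mean voxel-wise correlation:", np.mean(r))
```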
Sophie Scott - The Science of Laughter from Evolution to Neuroscience
Keynote Address to British Association of Cognitive Neuroscience, London, 10th September 2024
Llama 3.1 Paper: The Llama Family of Models
Modern artificial intelligence (AI) systems are powered by foundation models. This paper presents a new set of foundation models, called Llama 3. It is a herd of language models that natively support multilinguality, coding, reasoning, and tool usage. Our largest model is a dense Transformer with 405B parameters and a context window of up to 128K tokens. This paper presents an extensive empirical evaluation of Llama 3. We find that Llama 3 delivers comparable quality to leading language models such as GPT-4 on a plethora of tasks. We publicly release Llama 3, including pre-trained and post-trained versions of the 405B parameter language model and our Llama Guard 3 model for input and output safety. The paper also presents the results of experiments in which we integrate image, video, and speech capabilities into Llama 3 via a compositional approach. We observe this approach performs competitively with the state-of-the-art on image, video, and speech recognition tasks. The resulting models are not yet being broadly released as they are still under development.
Exploring the cerebral mechanisms of acoustically-challenging speech comprehension - successes, failures and hope
Comprehending speech under acoustically challenging conditions is an everyday task that we can often execute with ease. However, accomplishing this requires the engagement of cognitive resources, such as auditory attention and working memory. The mechanisms that contribute to the robustness of speech comprehension are of substantial interest in the context of mild to moderate hearing impairment, in which affected individuals typically report specific difficulties in understanding speech in background noise. Although hearing aids can help to mitigate this, they do not represent a universal solution; thus, finding alternative interventions is necessary. Given that age-related hearing loss (“presbycusis”) is inevitable, developing new approaches is all the more important in the context of aging populations. Moreover, untreated hearing loss in middle age has been identified as the most significant potentially modifiable predictor of dementia in later life. I will present research that has used a multi-methodological approach (fMRI, EEG, MEG and non-invasive brain stimulation) to try to elucidate the mechanisms that comprise the cognitive “last mile” of acoustically challenging speech comprehension and to find ways to enhance them.
Dyslexia, Rhythm, Language and the Developing Brain
Recent insights from auditory neuroscience provide a new perspective on how the brain encodes speech. Using these recent insights, I will provide an overview of key factors underpinning individual differences in children’s development of language and phonology, providing a context for exploring atypical reading development (dyslexia). Children with dyslexia are relatively insensitive to acoustic cues related to speech rhythm patterns. This lack of rhythmic sensitivity is related to the atypical neural encoding of rhythm patterns in speech by the brain. I will describe our recent data from infants as well as children, demonstrating developmental continuity in the key neural variables.
Prosody in the voice, face, and hands changes which words you hear
Speech may be characterized as conveying both segmental information (i.e., about vowels and consonants) as well as suprasegmental information - cued through pitch, intensity, and duration - also known as the prosody of speech. In this contribution, I will argue that prosody shapes low-level speech perception, changing which speech sounds we hear. Perhaps the most notable example of how prosody guides word recognition is the phenomenon of lexical stress, whereby suprasegmental F0, intensity, and duration cues can distinguish otherwise segmentally identical words, such as "PLAto" vs. "plaTEAU" in Dutch. Work from our group showcases the vast variability in how different talkers produce stressed vs. unstressed syllables, while also unveiling the remarkable flexibility with which listeners can learn to handle this between-talker variability. It also emphasizes that lexical stress is a multimodal linguistic phenomenon, with the voice, lips, and even hands conveying stress in concert. In turn, human listeners actively weigh these multisensory cues to stress depending on the listening conditions at hand. Finally, lexical stress is presented as having a robust and lasting impact on low-level speech perception, even down to changing vowel perception. Thus, prosody - in all its multisensory forms - is a potent factor in speech perception, determining what speech sounds we hear.
Silences, Spikes and Bursts: Three-Part Knot of the Neural Code
When a neuron breaks silence, it can emit action potentials in a number of patterns. Some responses are so sudden and intense that electrophysiologists felt the need to single them out, labeling action potentials emitted at a particularly high frequency with a metonym – bursts. Is there more to bursts than a figure of speech? After all, sudden bouts of high-frequency firing are expected to occur whenever inputs surge. In this talk, I will discuss the implications of seeing the neural code as having three syllables: silences, spikes and bursts. In particular, I will describe recent theoretical and experimental results that implicate bursting in the implementation of top-down attention and the coordination of learning.
The speaker identification ability of blind and sighted listeners
Previous studies have shown that blind individuals outperform sighted controls in a variety of auditory tasks; however, only a few studies have investigated blind listeners’ speaker identification abilities. In addition, existing studies in the area show conflicting results. The presented empirical investigation with 153 blind (74 of them congenitally blind) and 153 sighted listeners is the first of its kind and scale in which long-term memory effects on blind listeners’ speaker identification abilities are examined. For the empirical investigation, all listeners were evenly assigned to one of nine subgroups (3 x 3 design) in order to investigate the influence of two parameters, each with three levels, on blind and sighted listeners’ speaker identification performance. The parameters were (a) time interval, i.e. an interval of 1, 3 or 6 weeks between the first exposure to the voice to be recognised (familiarisation) and the speaker identification task (voice lineup); and (b) signal quality, i.e. voice recordings were presented in studio quality, in mobile-phone quality, or as recordings of whispered speech. Half of the presented voice lineups were target-present lineups in which the previously heard target voice was included. The other half consisted of target-absent lineups which contained solely distractor voices. Blind individuals outperformed sighted listeners only under studio-quality conditions. Furthermore, for blind and sighted listeners no significant performance differences were found with regard to the three investigated time intervals of 1, 3 and 6 weeks. Blind as well as sighted listeners were significantly better at picking the target voice from target-present lineups than at indicating that the target voice was absent in target-absent lineups. Within the blind group, no significant correlations were found between identification performance and onset or duration of blindness. Implications for the field of forensic phonetics are discussed.
Motor contribution to auditory temporal predictions
Temporal predictions are fundamental instruments for facilitating sensory selection, allowing humans to exploit regularities in the world. Recent evidence indicates that the motor system instantiates predictive timing mechanisms, helping to synchronize temporal fluctuations of attention with the timing of events in a task-relevant stream, thus facilitating sensory selection. Accordingly, in the auditory domain, auditory-motor interactions are observed during perception of speech and music, two temporally structured sensory streams. I will present a behavioral and neurophysiological account of this theory and will detail the parameters governing the emergence of this auditory-motor coupling, through a set of behavioral and magnetoencephalography (MEG) experiments.
Pitch and Time Interact in Auditory Perception
Research into pitch perception and time perception has typically treated the two as independent processes. However, previous studies of music and speech perception have suggested that pitch and timing information may be processed in an integrated manner, such that the pitch of an auditory stimulus can influence a person’s perception, expectation, and memory of its duration and tempo. Typically, higher-pitched sounds are perceived as faster and longer in duration than lower-pitched sounds with identical timing. We conducted a series of experiments to better understand the limits of this pitch-time integrality. Across several experiments, we tested whether the higher-equals-faster illusion generalizes across the broader frequency range of human hearing by asking participants to compare the tempo of a repeating tone played in one of six octaves to a metronomic standard. When participants heard tones from all six octaves, we consistently found an inverted U-shaped effect of the tone’s pitch height, such that perceived tempo peaked between A4 (440 Hz) and A5 (880 Hz) and decreased at lower and higher octaves. However, we found that the decrease in perceived tempo at extremely high octaves could be abolished by exposing participants to high-pitched tones only, suggesting that pitch-induced timing biases are context sensitive. We additionally tested how the timing of an auditory stimulus influences the perception of its pitch, using a pitch discrimination task in which probe tones occurred early, late, or on the beat within a rhythmic context. Probe timing strongly biased participants to rate later tones as lower in pitch than earlier tones. Together, these results suggest that pitch and time exert a bidirectional influence on one another, providing evidence for integrated processing of pitch and timing information in auditory perception. Identifying the mechanisms behind this pitch-time interaction will be critical for integrating current models of pitch and tempo processing.
Hierarchical transformation of visual event timing representations in the human brain: response dynamics in early visual cortex and timing-tuned responses in association cortices
Quantifying the timing (duration and frequency) of brief visual events is vital to human perception, multisensory integration and action planning. For example, this allows us to follow and interact with the precise timing of speech and sports. Here we investigate how visual event timing is represented and transformed across the brain’s hierarchy: from sensory processing areas, through multisensory integration areas, to frontal action planning areas. We hypothesized that the dynamics of neural responses to sensory events in sensory processing areas allows derivation of event timing representations. This would allow higher-level processes such as multisensory integration and action planning to use sensory timing information, without the need for specialized central pacemakers or processes. Using 7T fMRI and neural model-based analyses, we found responses that monotonically increase in amplitude with visual event duration and frequency, becoming increasingly clear from primary visual cortex to lateral occipital visual field maps. Beginning in area MT/V5, we found a gradual transition from monotonic to tuned responses, with response amplitudes peaking at different event timings in different recording sites. While monotonic response components were limited to the retinotopic location of the visual stimulus, timing-tuned response components were independent of the recording sites' preferred visual field positions. These tuned responses formed a network of topographically organized timing maps in superior parietal, postcentral and frontal areas. From anterior to posterior timing maps, multiple events were increasingly integrated, response selectivity narrowed, and responses focused increasingly on the middle of the presented timing range. These results suggest that responses to event timing are transformed from the human brain’s sensory areas to the association cortices, with the event’s temporal properties being increasingly abstracted from the response dynamics and locations of early sensory processing. The resulting abstracted representation of event timing is then propagated through areas implicated in multisensory integration and action planning.
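A toy contrast of the two response forms compared in this analysis, with made-up parameter values (the gain, preferred duration, and tuning width below are illustrative assumptions, not fitted values from the study):

```python
# Monotonic vs. timing-tuned response models of event duration.
import numpy as np

durations = np.linspace(0.05, 1.0, 20)               # s, hypothetical event durations

def monotonic_response(d, gain=1.0):
    """Response amplitude increases with event duration."""
    return gain * d

def tuned_response(d, preferred=0.4, width=0.15):
    """Response peaks at a preferred event duration."""
    return np.exp(-0.5 * ((d - preferred) / width) ** 2)

print(np.round(monotonic_response(durations), 2))
print(np.round(tuned_response(durations), 2))
```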
A Framework for a Conscious AI: Viewing Consciousness through a Theoretical Computer Science Lens
We examine consciousness from the perspective of theoretical computer science (TCS), a branch of mathematics concerned with understanding the underlying principles of computation and complexity, including the implications and surprising consequences of resource limitations. We propose a formal TCS model, the Conscious Turing Machine (CTM). The CTM is influenced by Alan Turing's simple yet powerful model of computation, the Turing machine (TM), and by the global workspace theory (GWT) of consciousness originated by cognitive neuroscientist Bernard Baars and further developed by him, Stanislas Dehaene, Jean-Pierre Changeux, George Mashour, and others. However, the CTM is not a standard Turing Machine. It’s not the input-output map that gives the CTM its feeling of consciousness, but what’s under the hood. Nor is the CTM a standard GW model. In addition to its architecture, what gives the CTM its feeling of consciousness is its predictive dynamics (cycles of prediction, feedback and learning), its internal multi-modal language Brainish, and certain special Long Term Memory (LTM) processors, including its Inner Speech and Model of the World processors. Phenomena generally associated with consciousness, such as blindsight, inattentional blindness, change blindness, dream creation, and free will, are considered. Explanations derived from the model draw confirmation from consistencies at a high level, well above the level of neurons, with the cognitive neuroscience literature. Reference. L. Blum and M. Blum, "A theory of consciousness from a theoretical computer science perspective: Insights from the Conscious Turing Machine," PNAS, vol. 119, no. 21, 24 May 2022. https://www.pnas.org/doi/epdf/10.1073/pnas.2115934119
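A very loose toy of a global-workspace-style cycle, heavily simplified relative to the formal CTM (the Processor class, its weights, and the reward signal below are my own stand-ins, not definitions from the paper): processors submit weighted chunks, the winner is broadcast to all processors, and the winner updates from feedback.

```python
# Toy competition-and-broadcast loop in the spirit of global workspace models.
import random

class Processor:
    def __init__(self, name):
        self.name, self.weight = name, 1.0
    def propose(self):
        return (self.weight * random.random(), f"chunk from {self.name}")
    def receive(self, broadcast):
        pass                                         # update internal state from broadcast
    def learn(self, reward):
        self.weight = max(0.1, self.weight + 0.1 * reward)

processors = [Processor(n) for n in ("Inner Speech", "Model of the World", "Vision")]
for cycle in range(3):
    bids = [(p.propose(), p) for p in processors]
    (score, chunk), winner = max(bids, key=lambda b: b[0][0])
    for p in processors:
        p.receive(chunk)                             # "conscious" broadcast to all processors
    winner.learn(reward=random.choice([-1, 1]))      # crude stand-in for prediction feedback
    print(f"cycle {cycle}: broadcast {chunk!r} (score {score:.2f})")
```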
Language Representations in the Human Brain: A naturalistic approach
Natural language is strongly context-dependent and can be perceived through different sensory modalities. For example, humans can easily comprehend the meaning of complex narratives presented through auditory speech, written text, or visual images. To understand how complex language-related information is represented in the human brain, we need to map the different linguistic and non-linguistic information perceived under different modalities across the cerebral cortex. To map this information to the brain, I suggest following a naturalistic approach: observing the human brain performing tasks in its naturalistic setting, designing quantitative models that transform real-world stimuli into specific hypothesis-related features, and building predictive models that can relate these features to brain responses. In my talk, I will present models of brain responses collected using functional magnetic resonance imaging while human participants listened to or read natural narrative stories. Using natural text and vector representations derived from natural language processing tools, I will present how we can study language processing in the human brain across modalities, at different levels of temporal granularity, and across different languages.
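A hedged sketch of the kind of cross-modality test implied here, on simulated data (the shared feature matrix, array sizes, and RidgeCV settings are assumptions, not the speaker's pipeline): fit an encoding model on responses to listening and ask how well it predicts held-out responses to listening versus reading of the same narrative features.

```python
# Cross-modality generalization of an encoding model on simulated data.
import numpy as np
from sklearn.linear_model import RidgeCV

rng = np.random.default_rng(1)
n_time, n_feat, n_vox = 600, 300, 50                 # hypothetical sizes
features = rng.normal(size=(n_time, n_feat))         # NLP-derived features of the narrative
W = rng.normal(size=(n_feat, n_vox))                 # shared weights stand in for amodal semantics
listening = features @ W + rng.normal(size=(n_time, n_vox))
reading = features @ W + rng.normal(size=(n_time, n_vox))

half = n_time // 2                                   # first half for fitting, second for testing
enc = RidgeCV(alphas=[1.0, 10.0, 100.0]).fit(features[:half], listening[:half])
pred = enc.predict(features[half:])
within = np.mean([np.corrcoef(pred[:, v], listening[half:, v])[0, 1] for v in range(n_vox)])
across = np.mean([np.corrcoef(pred[:, v], reading[half:, v])[0, 1] for v in range(n_vox)])
print(f"within-modality r = {within:.2f}, across-modality r = {across:.2f}")
```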
Artificial Intelligence and Racism – What are the implications for scientific research?
As questions of race and justice have risen to the fore across the sciences, the ALBA Network has invited Dr Shakir Mohamed (Senior Research Scientist at DeepMind, UK) to deliver a keynote speech on Artificial Intelligence and racism, and the implications for scientific research. The keynote will be followed by a discussion chaired by Dr Konrad Kording (Department of Neuroscience, University of Pennsylvania, US; neuromatch co-founder).
Electrophysiological investigations of natural speech and language processing
Representation of speech temporal structure in human cortex
Towards an inclusive neurobiology of language
Understanding how our brains process language is one of the fundamental issues in cognitive science. In order to reach such understanding, it is critical to cover the full spectrum of manners in which humans acquire and experience language. However, due to a myriad of socioeconomic factors, research has disproportionately focused on monolingual English speakers. In this talk, I present a series of studies that systematically target fundamental questions about bilingual language use across a range of conversational contexts, both in production and comprehension. The results lay the groundwork to propose a more inclusive theory of the neurobiology of language, with an architecture that assumes a common selection principle at each linguistic level and can account for attested features of both bilingual and monolingual speech in, but crucially also out of, experimental settings.
Hearing in an acoustically varied world
In order for animals to thrive in their complex environments, their sensory systems must form representations of objects that are invariant to changes in some dimensions of their physical cues. For example, we can recognize a friend’s speech in a forest, a small office, and a cathedral, even though the sound reaching our ears will be very different in these three environments. I will discuss our recent experiments into how neurons in auditory cortex can form stable representations of sounds in this acoustically varied world. We began by using a normative computational model of hearing to examine how the brain may recognize a sound source across rooms with different levels of reverberation. The model predicted that reverberations can be removed from the original sound by delaying the inhibitory component of spectrotemporal receptive fields in the presence of stronger reverberation. Our electrophysiological recordings then confirmed that neurons in ferret auditory cortex apply this algorithm to adapt to different room sizes. Our results demonstrate that this neural process is dynamic and adaptive. These studies provide new insights into how we can recognize auditory objects even in highly reverberant environments, and direct further research questions about how reverb adaptation is implemented in the cortical circuit.
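A toy numerical illustration of the delayed-inhibition idea, not the authors' normative model (the envelope, room response, and kernel shapes below are invented for illustration): an excitatory lobe followed by a delayed inhibitory lobe partially cancels the slowly decaying energy that reverberation adds, and how well a given delay restores the dry envelope depends on the room's decay.

```python
# Excitatory-inhibitory kernel applied to a reverberant envelope.
import numpy as np

fs = 100                                             # Hz, envelope sampling rate (hypothetical)
t = np.arange(0, 1.0, 1 / fs)
dry = (np.sin(2 * np.pi * 4 * t) > 0.9).astype(float)   # sparse "sound onsets"

def reverberate(x, decay):
    """Convolve with an exponentially decaying room response (decay in seconds)."""
    h = np.exp(-np.arange(0, 0.5, 1 / fs) / decay)
    return np.convolve(x, h)[: len(x)]

def ei_kernel(inhib_delay, length=20):
    """Excitation at lag 0, inhibition delayed by inhib_delay samples."""
    k = np.zeros(length)
    k[0] = 1.0
    k[inhib_delay] = -1.0
    return k

wet = reverberate(dry, decay=0.2)
for delay in (1, 3, 6, 12):
    out = np.convolve(wet, ei_kernel(delay))[: len(wet)]
    print(f"inhibitory delay {delay / fs * 1000:.0f} ms: "
          f"correlation with dry envelope = {np.corrcoef(out, dry)[0, 1]:.2f}")
```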
Conflict in Multisensory Perception
Multisensory perception is often studied through the effects of inter-sensory conflict, such as in the McGurk effect, the Ventriloquist illusion, and the Rubber Hand Illusion. Moreover, Bayesian approaches to cue fusion and causal inference overwhelmingly draw on cross-modal conflict to measure and to model multisensory perception. Given the prevalence of conflict, it is remarkable that accounts of multisensory perception have so far neglected the theory of conflict monitoring and cognitive control, established about twenty years ago. I hope to make a case for the role of conflict monitoring and resolution during multisensory perception. To this end, I will present EEG and fMRI data showing that cross-modal conflict in speech, resulting in either integration or segregation, triggers neural mechanisms of conflict detection and resolution. I will also present data supporting a role of these mechanisms during perceptual conflict in general, using Binocular Rivalry, surrealistic imagery, and cinema. Based on this preliminary evidence, I will argue that it is worth considering the potential role of conflict in multisensory perception and its incorporation in a causal inference framework. Finally, I will raise some potential problems associated with this proposal.
Development of multisensory perception and attention and their role in audiovisual speech processing
Speak your mind: cortical predictions of speech sensory feedback
Encoding and perceiving the texture of sounds: auditory midbrain codes for recognizing and categorizing auditory texture and for listening in noise
Natural soundscapes, such as those of a forest, a busy restaurant, or a busy intersection, are generally composed of a cacophony of sounds that the brain needs to interpret either independently or collectively. In certain instances, sounds - such as those from moving cars, sirens, and people talking - are perceived in unison and recognized collectively as a single sound (e.g., city noise). In other instances, such as in the cocktail party problem, multiple sounds compete for attention so that the surrounding background noise (e.g., speech babble) interferes with the perception of a single sound source (e.g., a single talker). I will describe results from my lab on the perception and neural representation of auditory textures. Textures, such as a babbling brook, restaurant noise, or speech babble, are stationary sounds consisting of multiple independent sound sources that can be quantitatively defined by summary statistics of an auditory model (McDermott & Simoncelli 2011). How and where summary statistics are represented in the auditory system, and which neural codes potentially contribute to their perception, however, remain largely unknown. Using high-density multi-channel recordings from the auditory midbrain of unanesthetized rabbits and complementary perceptual studies on human listeners, I will first describe neural and perceptual strategies for encoding and perceiving auditory textures. I will demonstrate how distinct statistics of sounds, including the sound spectrum and high-order statistics related to the temporal and spectral correlation structure of sounds, contribute to texture perception and are reflected in neural activity. Using decoding methods, I will then demonstrate how various low- and high-order neural response statistics can differentially contribute to a variety of auditory tasks, including texture recognition, discrimination, and categorization. Finally, I will show examples from our recent studies of how high-order sound statistics and the accompanying neural activity underlie difficulties in recognizing speech in background noise.
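In the spirit of the summary-statistics account (McDermott & Simoncelli 2011), though not their actual auditory model, here is a minimal sketch of computing a few texture statistics from channel envelopes (using a plain spectrogram and white-noise input as simplifying assumptions):

```python
# Simple texture statistics: channel means, envelope variances, and the
# cross-channel correlation structure of a spectrogram-like representation.
import numpy as np
from scipy.signal import spectrogram

rng = np.random.default_rng(0)
fs = 16000
x = rng.normal(size=fs * 2)                       # stand-in for 2 s of texture audio

f, t, S = spectrogram(x, fs=fs, nperseg=256)      # crude "cochleagram": frequency x time
env = np.sqrt(S)                                  # channel envelopes

stats = {
    "channel_mean": env.mean(axis=1),             # spectrum-like marginal means
    "channel_var": env.var(axis=1),               # envelope variance per channel
    "cross_corr": np.corrcoef(env),               # spectral correlation structure
}
print({k: v.shape for k, v in stats.items()})
```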
Multisensory speech perception
Exploring the neurogenetic basis of speech, language, and vocal communication
Speech as a biomarker in ataxia: What can it tell us and how should we use it?
The Jena Voice Learning and Memory Test (JVLMT)
The ability to recognize someone’s voice spans a broad spectrum, with phonagnosia at the low end and super recognition at the high end. Yet there is no standardized test to measure the individual ability to learn and recognize newly learnt voices with samples of speech-like phonetic variability. We have developed the Jena Voice Learning and Memory Test (JVLMT), a 20-minute test based on item response theory and applicable across different languages. The JVLMT consists of three phases in which participants are familiarized with eight speakers in two stages and then perform a three-alternative forced-choice recognition task, using pseudo-sentences devoid of semantic content. Acoustic (dis)similarity analyses were used to create items with different levels of difficulty. Test scores are based on 22 Rasch-conform items. Items were selected and validated in online studies based on 232 and 454 participants, respectively. Mean accuracy is 0.51 with an SD of 0.18. The JVLMT showed high and moderate correlations with convergent validation tests (Bangor Voice Matching Test; Glasgow Voice Memory Test) and a weak correlation with a discriminant validation test (Digit Span). Empirical (marginal) reliability is 0.66. Four participants with super recognition (at least 2 SDs above the mean) and 7 participants with phonagnosia (at least 2 SDs below the mean) were identified. The JVLMT is a promising screening tool for voice recognition abilities in a scientific and neuropsychological context.
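For readers unfamiliar with item response theory, a minimal Rasch (1PL) illustration of how ability and item difficulty jointly determine expected scores (the difficulty values and ability levels below are invented; this is not the JVLMT scoring code, and it ignores the 1/3 guessing floor implied by the three-alternative format):

```python
# Rasch model: P(correct) is a logistic function of ability minus item difficulty.
import numpy as np

def rasch_p(theta, b):
    """Probability of a correct response for ability theta and item difficulty b."""
    return 1.0 / (1.0 + np.exp(-(theta - b)))

item_difficulties = np.linspace(-2, 2, 22)        # 22 items, as in the JVLMT
for theta in (-1.0, 0.0, 1.5):
    expected_score = rasch_p(theta, item_difficulties).sum()
    print(f"ability {theta:+.1f}: expected raw score {expected_score:.1f} / 22")
```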
Direction selectivity in hearing: monaural phase sensitivity in octopus neurons
The processing of temporal sound features is fundamental to hearing, and the auditory system displays a plethora of specializations, at many levels, to enable such processing. Octopus neurons are the most extreme temporally specialized cells in the auditory brain (and perhaps the entire brain), which makes them intriguing but also difficult to study. Notwithstanding the scant physiological data, these neurons have been a favorite cell type of modeling studies, which have proposed that octopus cells have critical roles in pitch and speech perception. We used a range of in vivo recording and labeling methods to examine the hypothesis that tonotopic ordering of cochlear afferents combines with dendritic delays to compensate for cochlear delay - which would explain the highly entrained responses of octopus cells to sound transients. Unexpectedly, the experiments revealed that these neurons have marked selectivity to the direction of fast frequency glides, which is tied in a surprising way to intrinsic membrane properties and subthreshold events. The data suggest that octopus cells have a role in temporal comparisons across frequency and may play a role in auditory scene analysis.
Learning Speech Perception and Action through Sensorimotor Interactions
Decoding the neural processing of speech
Understanding speech in noisy backgrounds requires selective attention to a particular speaker. Humans excel at this challenging task, while current speech recognition technology still struggles when background noise is loud. The neural mechanisms by which we process speech remain, however, poorly understood, not least due to the complexity of natural speech. Here we describe recent progress obtained by applying machine learning to neuroimaging data from humans listening to speech in different types of background noise. In particular, we develop statistical models that relate characteristic features of speech, such as pitch, amplitude fluctuations and linguistic surprisal, to neural measurements. We find neural correlates of speech processing both at the subcortical level, related to the pitch, and at the cortical level, related to amplitude fluctuations and linguistic structures. We also show that some of these measures can be used to diagnose disorders of consciousness. Our findings may be applied in smart hearing aids that automatically adjust speech processing to assist a user, as well as in the diagnosis of brain disorders.
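A sketch of the kind of statistical model described, on simulated data (the sampling rate, lag range, and ridge penalty are my assumptions): a temporal response function that maps time-lagged speech features, here an amplitude envelope, onto a neural signal via regularized linear regression. In practice such models are fit with cross-validation and with several features (e.g., pitch, envelope, surprisal) stacked into the design matrix.

```python
# Temporal response function (TRF) estimated by ridge regression.
import numpy as np

rng = np.random.default_rng(0)
fs = 64                                            # Hz, hypothetical feature/EEG rate
n = fs * 60
envelope = np.abs(rng.normal(size=n))              # stand-in speech envelope
true_trf = np.hanning(fs // 4)                     # hypothetical ~250 ms neural response
eeg = np.convolve(envelope, true_trf)[:n] + rng.normal(size=n)

# Design matrix of lagged copies of the feature (lags 0 to ~250 ms).
lags = np.arange(fs // 4)
X = np.column_stack([np.roll(envelope, lag) for lag in lags])
X[: lags.max()] = 0                                # discard wrapped samples

lam = 10.0                                         # ridge penalty
trf_hat = np.linalg.solve(X.T @ X + lam * np.eye(len(lags)), X.T @ eeg)
pred = X @ trf_hat
print("prediction accuracy (r):", np.corrcoef(pred, eeg)[0, 1].round(2))
```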
Do deep learning latent spaces resemble human brain representations?
In recent years, artificial neural networks have demonstrated human-like or super-human performance in many tasks including image or speech recognition, natural language processing (NLP), playing Go, chess, poker and video-games. One remarkable feature of the resulting models is that they can develop very intuitive latent representations of their inputs. In these latent spaces, simple linear operations tend to give meaningful results, as in the well-known analogy QUEEN-WOMAN+MAN=KING. We postulate that human brain representations share essential properties with these deep learning latent spaces. To verify this, we test whether artificial latent spaces can serve as a good model for decoding brain activity. We report improvements over state-of-the-art performance for reconstructing seen and imagined face images from fMRI brain activation patterns, using the latent space of a GAN (Generative Adversarial Network) model coupled with a Variational AutoEncoder (VAE). With another GAN model (BigBiGAN), we can decode and reconstruct natural scenes of any category from the corresponding brain activity. Our results suggest that deep learning can produce high-level representations approaching those found in the human brain. Finally, I will discuss whether these deep learning latent spaces could be relevant to the study of consciousness.
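A conceptual sketch of the decoding step only, using random stand-in data (the array sizes and ridge penalty are assumptions; the real pipeline uses latent codes from a trained GAN/VAE and its generator for reconstruction): learn a linear mapping from brain activity patterns to latent vectors, which a pretrained generator would then turn into images.

```python
# Linear decoding from simulated "fMRI" patterns to latent vectors.
import numpy as np
from sklearn.linear_model import Ridge

rng = np.random.default_rng(0)
n_images, n_voxels, latent_dim = 300, 2000, 120    # hypothetical sizes
latents = rng.normal(size=(n_images, latent_dim))  # latent codes of the seen images
brain = latents @ rng.normal(size=(latent_dim, n_voxels)) + rng.normal(size=(n_images, n_voxels))

decoder = Ridge(alpha=1000.0).fit(brain[:250], latents[:250])
z_hat = decoder.predict(brain[250:])               # decoded latents for held-out trials

# In the real analysis, z_hat would be passed to the generator to reconstruct
# the seen or imagined stimulus; here we just score latent recovery.
acc = np.mean([np.corrcoef(z_hat[i], latents[250 + i])[0, 1] for i in range(50)])
print("mean latent reconstruction correlation:", round(float(acc), 2))
```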
Kamala Harris and the Construction of Complex Ethnolinguistic Political Identity
Over the past 50 years, sociolinguistic studies on black Americans have expanded in both theoretical and technical scope, and newer research has moved beyond seeing speakers, especially black speakers, as a monolithic sociolinguistic community (Wolfram 2007, Blake 2014). Yet there remains a dearth of critical work on complex identities existing within black American communities as well as how these identities are reflected and perceived in linguistic practice. At the same time, linguists have begun to take greater interest in the ways in which public figures, such as politicians, may illuminate the wider social meaning of specific linguistic variables. In this talk, I will present results from analyses of multiple aspects of ethnolinguistic variation in the speech of Vice President Kamala Harris during the 2019-2020 Democratic Party Primary debates. Together, these results show how VP Harris expertly employs both enregistered and subtle linguistic variables, including aspects of African American Language morphosyntax, vowels, and intonational phonology in the construction and performance of a highly specific sociolinguistic identity that reflects her unique positions politically, socially, and racially. The results of this study expand our knowledge about how the complexities of speaker identity are reflected in sociolinguistic variation, as well as press on the boundaries of what we know about how speakers in the public sphere use variation to reflect both who they are and who we want them to be.
Space for Thinking - Spatial Reference Frames and Abstract Concepts
People from cultures around the world tend to borrow from the domain of space to represent abstract concepts. For example, in the domain of time, we use spatial metaphors (e.g., describing the future as being in front and the past behind), accompany our speech with spatial gestures (e.g., gesturing to the left to refer to a past event), and use external tools that project time onto a spatial reference frame (e.g., calendars). Importantly, these associations are also present in the way we think and reason about time, suggesting that space and time are also linked in the mind. In this talk, I will explore the developmental origins and functional implications of these types of cross-dimensional associations. To start, I will discuss the roles that language and culture play in shaping how children in the US and India represent time. Next, I will use word learning and memory as test cases for exploring why cross-dimensional associations may be cognitively advantageous. Finally, I will talk about future directions and the practical implications for this line of work, with a focus on how encouraging spatial representations of abstract concepts could improve learning outcomes.
Low dimensional models and electrophysiological experiments to study neural dynamics in songbirds
Birdsong emerges when a set of highly interconnected brain areas manages to generate a complex output. The similarities between birdsong production and human speech have positioned songbirds as unique animal models for studying the learning and production of this complex motor skill. In this work, we developed a low dimensional model for a neural network in which the variables were the average activities of different neural populations within the nuclei of the song system. This neural network is active during production, perception and learning of birdsong. We performed electrophysiological experiments to record neural activity from one of these nuclei and found that the low dimensional model could reproduce the neural dynamics observed during the experiments. The model could also reproduce the respiratory motor patterns used to generate song. We showed that sparse activity in one of the neural nuclei could drive more complex activity downstream in the neural network. This interdisciplinary work shows how low dimensional neural models can be a valuable tool for studying the emergence of complex motor tasks.
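A generic rate-model sketch of this kind of low-dimensional description, with made-up coupling weights and time constants rather than the published parameters: each variable is the average activity of one population, sparse input drives one nucleus, and the network is integrated with a simple Euler scheme.

```python
# Population rate model: dx/dt = (-x + W f(x) + I) / tau, integrated with Euler steps.
import numpy as np

def f(x):
    return np.tanh(x)                                # population input-output function

W = np.array([[0.0,  1.2,  0.0, 0.0],                # hypothetical coupling between nuclei
              [-0.8, 0.0,  1.0, 0.0],
              [0.0, -0.5,  0.0, 1.5],
              [0.0,  0.0, -1.0, 0.0]])
tau, dt = 10.0, 0.1                                  # arbitrary time units
x = np.zeros(4)
drive = np.zeros(4)

trace = []
for step in range(5000):
    drive[0] = 1.0 if (step % 1000) < 50 else 0.0    # sparse upstream input to one nucleus
    x = x + dt / tau * (-x + W @ f(x) + drive)
    trace.append(x.copy())
trace = np.array(trace)                              # population activities over time
print(trace.shape, trace.max(axis=0).round(2))
```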
Monkey Talk – what studies about nonhuman primate vocal communication reveal about the evolution of speech
The evolution of speech is considered to be one of the hardest problems in science. Studies of the communicative abilities of our closest living relatives, the nonhuman primates, aim to contribute to a better understanding of the emergence of this uniquely human capability. Following a brief introduction to the key building blocks that make up the human speech faculty, I will focus on the question of meaning in nonhuman primate vocalizations. While nonhuman primate calls may be highly context specific, thus giving rise to the notion of ‘referentiality’, comparisons across closely related species suggest that this specificity is evolved rather than learned. Yet, as in humans, the structure of calls varies with arousal and affective state, and there is some evidence for effects of sensory-motor integration in vocal production. Thus, the vocal production of nonhuman primates bears little resemblance to the symbolic and combinatorial features of human speech, while basic production mechanisms are shared. Listeners, in contrast, are able to learn the meaning of new sounds. A recent study using an artificial predator shows that this learning may be extremely rapid. Furthermore, listeners are able to integrate information from multiple sources to make adaptive decisions, which renders the vocal communication system as a whole relatively flexible and powerful. In conclusion, constraints on the side of vocal production, including limits in social cognition and motivation to share experiences, rather than constraints on the side of the recipient, explain the differences in communicative abilities between humans and other animals.
Towards a speech neuroprosthesis
I will review advances in understanding the cortical encoding of speech-related oral movements. These discoveries are being translated to develop algorithms to decode speech from population neural activity.
Unsupervised deep learning identifies semantic disentanglement in single inferotemporal neurons
Irina is a research scientist at DeepMind, where she works in the Frontiers team. Her work aims to bring together insights from the fields of neuroscience and physics to advance general artificial intelligence through improved representation learning. Before joining DeepMind, Irina was a British Psychological Society Undergraduate Award winner for her achievements as an undergraduate student in Experimental Psychology at Westminster University, followed by a DPhil at the Oxford Centre for Computational Neuroscience and Artificial Intelligence, where she focused on understanding the computational principles underlying speech processing in the auditory brain. During her DPhil, Irina also worked on developing poker AI, applying machine learning in the finance sector, and speech recognition at Google Research. https://arxiv.org/pdf/2006.14304.pdf
Neural control of vocal interactions in songbirds
During conversations we rapidly switch between listening and speaking, which often requires withholding or delaying our speech in order to hear others and avoid overlapping. This capacity for vocal turn-taking is exhibited by non-linguistic species as well; however, the neural circuit mechanisms that enable us to regulate the precise timing of our vocalizations during interactions are unknown. We aim to identify the neural mechanisms underlying the coordination of vocal interactions. To this end, we paired zebra finches with a vocal robot (1 Hz call playback) and measured the birds’ call response times. We found that individual birds called with a stereotyped delay with respect to the robot call. Pharmacological inactivation of the premotor nucleus HVC revealed its necessity for the temporal coordination of calls. We further investigated the contributing neural activity within HVC by performing intracellular recordings from premotor neurons and inhibitory interneurons in calling zebra finches. We found that inhibition precedes excitation before and during call onset. To test whether inhibition guides call timing, we pharmacologically limited the impact of inhibition on premotor neurons. As a result, zebra finches converged on a similar delay time, i.e. birds called more rapidly after the vocal robot call, suggesting that HVC inhibitory interneurons regulate the coordination of social contact calls. In addition, we aim to investigate the vocal turn-taking capabilities of the common nightingale. Male nightingales learn over 100 different song motifs, which they use to attract mates or defend territories. Previously, it has been shown that nightingales counter-sing with each other following a temporal structure similar to human vocal turn-taking. These animals are also able to spontaneously imitate a motif of another nightingale. The neural mechanisms underlying this behaviour are not yet understood. In my lab, we further probe the capabilities of these animals in order to assess the dynamic range of their vocal turn-taking flexibility.
Rhythm-structured predictive coding for contextualized speech processing
Bernstein Conference 2024
Brain-Rhythm-based Inference (BRyBI) for time-scale invariant speech processing
COSYNE 2023
Cross-trial alignment reveals a low-dimensional cortical manifold of naturalistic speech production
COSYNE 2023
Altered sensory prediction error signaling and dopamine function drive speech hallucinations in schizophrenia
COSYNE 2025
Bayesian integration of audiovisual speech by DNN models is similar to human observers
COSYNE 2025
Geometric Signatures of Speech Recognition: Insights from Deep Neural Networks to the Brain
COSYNE 2025
Human precentral gyrus neurons link speech sequences from listening to speaking
COSYNE 2025
Attentional modulation of the cortical contribution to the frequency-following response evoked by continuous speech
FENS Forum 2024
EEG beta de-synchronization signs the efficacy of a rehabilitation treatment for speech impairment in Parkinson’s disease population
FENS Forum 2024
Brain-rhythm-based inference (BRyBI) for time-scale invariant speech processing
FENS Forum 2024
The cortical frequency-following response to continuous speech in musicians and non-musicians
FENS Forum 2024
Decoding envelope and frequency-following responses to speech using deep neural networks
FENS Forum 2024
Decoding of selective attention to speech in CI patients using linear and non-linear methods
FENS Forum 2024
Decoding spatiotemporal processing of speech and melody in the brain
FENS Forum 2024
The effects and interactions of top-down influences on speech perception
FENS Forum 2024
EEG-based source analysis of the neural response at the fundamental frequency of speech
FENS Forum 2024
Examining speech disfluency through the analysis of grey matter densities in 5-year-olds using voxel-based morphometry
FENS Forum 2024
The neural processing of natural audiovisual speech in noise in autism: A TRF approach
FENS Forum 2024
Web-based speech transcription tool for efficient quantification of memory performance
FENS Forum 2024