Authors & Affiliations
T. Anderson Keller
Abstract
Mounting evidence suggests that traveling waves are a key dynamical motif in biological neural information processing systems [Muller et al., 2018], yet their precise computational role remains under investigation. While the Transformer architecture [Vaswani et al., 2023] has achieved state-of-the-art performance across various data modalities, it relies heavily on an explicit 'context', processing the entire input sequence simultaneously. This approach is both biologically implausible and computationally inefficient, incurring quadratic complexity in sequence length. Recent interest has therefore shifted toward State Space Models (SSMs) [Gu et al., 2022] as a potential 'context-free' alternative to Transformers, as they achieve comparable language modeling performance [Gu and Dao, 2024]. In this work, we show that these SSMs inherently implement a form of traveling wave dynamics with a fixed velocity ($\nu = 1$ neuron/timestep). However, this fixed wave velocity limits their capacity to approximate the Transformer's context mechanism, leading to poor performance on context-dependent tasks such as sequence copying [Jelassi et al., 2024]. Building on this insight, we introduce the Nu-Wave SSM, a framework that generalizes SSMs by incorporating variable wave velocities ($\nu \neq 1$). By allowing adjustable wave dynamics, our model more faithfully approximates an explicit context. Empirically, we demonstrate that our model learns exponentially faster and achieves significantly lower error rates on large-scale memory-dependent tasks, matching Transformer performance previously thought unattainable by SSMs, and showing that Transformer-like 'context' can be implemented effectively and in a more biologically plausible manner through wave dynamics. Our findings bridge the gap between artificial and biological neural processing, offering new insights into the role of traveling waves in neural information processing and memory. This work underscores the importance of further investigating traveling waves in natural neural systems, and helps explain their adoption in many state-of-the-art recurrent neural network language models [Beck et al., 2024].
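To make the fixed-velocity claim concrete, here is a minimal worked example in our own notation (an illustrative sketch, not necessarily the paper's exact parameterization): take a linear SSM whose state transition matrix $A$ is a unit shift operator along the hidden dimension.

$$
h_t = A\,h_{t-1} + B\,x_t, \qquad (A h)[i] = h[i-1] \;\Longrightarrow\; h_t[i] = h_{t-1}[i-1] + (B\,x_t)[i],
$$

so each input perturbation propagates along the hidden dimension at exactly $\nu = 1$ neuron per timestep. Under the same assumptions, replacing the unit shift with a shift of $\nu$ positions, $h_t[i] = h_{t-1}[i-\nu] + (B\,x_t)[i]$, yields the variable-velocity wave dynamics that the Nu-Wave SSM makes adjustable.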