Authors & Affiliations
Matteo Ferrante, Tommaso Boccato, Nicola Toschi, Rufin VanRullen
Abstract
In neuroscience, the question of how the brain combines individual visual concepts into more complex representations remains unresolved [1,2]. While previous studies have shown that artificial intelligence (AI) systems can encode and manipulate concepts through vector arithmetic (e.g., in language models), it is unclear whether the human brain operates in a similar, compositional manner for visual perception [2]. We ask whether brain activity patterns can be algebraically manipulated to represent compositional visual information and, if so, whether we can decode these patterns into meaningful visual changes. To this end, we introduce a novel framework termed "brain algebra." We hypothesize that neural activity patterns, measured via fMRI, can be algebraically combined to reflect compositional changes in perception. We construct "conceptual perturbations" by averaging fMRI responses to visual stimuli that share a specific semantic concept (e.g., "winter" or "summer"). These perturbations are then added to the neural activity corresponding to a base image (e.g., a man on a skateboard), creating a novel brain activity pattern that is hypothesized to represent the base image modified by the "added" concept (e.g., the man on a skateboard becomes a man on a snowboard in a wintry scene). To test this hypothesis, we decode the perturbed brain patterns into images using an fMRI-to-image decoding model (Brain-Diffuser [3]). This model reconstructs visual stimuli from the corresponding neural activity, which allows us to visually inspect whether the brain algebra operations lead to predictable compositional changes. Our findings provide compelling evidence for the compositionality of visual representations in the brain. When conceptual perturbations are added to base brain patterns, the decoded images reflect coherent and meaningful modifications.
These results suggest that the brain may combine visual concepts in a manner analogous to algebraic operations on the internal representations of AI models.
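The perturbation step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the toy three-voxel patterns, and the `alpha` scaling factor are all assumptions made for illustration, and the abstract does not say whether any baseline subtraction or normalization is applied before averaging. The decoding step (Brain-Diffuser) is not shown.

```python
from statistics import fmean


def concept_perturbation(responses):
    """Voxel-wise mean of fMRI responses to stimuli sharing one concept.

    `responses` is a list of equal-length lists, one per stimulus
    (hypothetical layout; real data would typically be arrays).
    """
    if not responses:
        raise ValueError("need at least one response")
    n_voxels = len(responses[0])
    if any(len(r) != n_voxels for r in responses):
        raise ValueError("all responses must have the same number of voxels")
    # zip(*responses) groups the values of each voxel across stimuli.
    return [fmean(voxel_values) for voxel_values in zip(*responses)]


def apply_perturbation(base, perturbation, alpha=1.0):
    """Add a concept perturbation to a base pattern.

    `alpha` is a hypothetical scaling factor (assumption, not from the
    abstract); alpha=1.0 gives the plain addition described there.
    """
    if len(base) != len(perturbation):
        raise ValueError("base and perturbation must have the same length")
    return [b + alpha * p for b, p in zip(base, perturbation)]


# Toy example: two "winter" responses and one base pattern, three voxels each.
winter_responses = [[1.0, 0.0, 2.0], [3.0, 0.0, 0.0]]
base_pattern = [0.5, 0.5, 0.5]

winter = concept_perturbation(winter_responses)
perturbed = apply_perturbation(base_pattern, winter)
print(perturbed)  # [2.5, 0.5, 1.5]
```

In the framework described above, the resulting `perturbed` pattern would then be passed to the fMRI-to-image decoder in place of the original base pattern.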