ePoster

Navigating through the Latent Spaces in Generative Models

Antoniya Boyanova, Tahmineh Koosha, Marco Rothermel, Hamidreza Jamalabadi
Bernstein Conference 2024
Goethe University, Frankfurt, Germany



Abstract

Understanding how visual attributes influence cognitive properties, such as memorability and emotional valence, is crucial for advances in computational neuroscience and AI-driven image processing. Previous work introduced a research framework that uses generative models to learn the complex nonlinear mapping between visual attributes and cognitive properties (Goetschalckx et al., 2019). This approach identifies transformations in the latent space of a generative model which, when applied, generate images with tailored visual attributes that enhance or diminish targeted cognitive properties. Moreover, these transformed images causally affect human performance. What remains to be elucidated is the characterisation of these latent spaces and how the choice of latent space affects the cognitive properties of modified visual stimuli. To this end, we compare the application of the framework to the latent spaces of two generative models: Very Deep Variational Autoencoders (VDVAE) and Versatile Diffusion (VD). The main difference is that VDVAEs capture predominantly low-level image features through a compressed latent representation, while Versatile Diffusion captures both low- and high-level features by progressively refining noise into a coherent image. To compare their latent spaces, we adapted the previously used training procedure to these generative models and learned two transformations: one that manipulates the memorability and one that manipulates the emotional valence of the generated images. Subsequently, we tested the transferability of the learned transformations: we applied the transformations solely in the latent space of the VDVAE and then used its generated output to initialise the reverse diffusion process in the VD. Our results show that both models significantly manipulate cognitive properties, successfully altering memorability and emotional valence in both VDVAE and VD.
The transferability assessment revealed that transformations in VDVAE's latent space retained their impact when initialising VD's reverse diffusion process. This suggests a promising direction for cross-model manipulation of cognitive properties, enhancing the applicability of generative models in computational neuroscience. Additionally, we discuss potential contributions of this approach to fMRI-to-image reconstructions.
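The transfer procedure described above can be sketched schematically. This is a minimal toy illustration, not the authors' implementation: all model components, dimensions, and the noise schedule are hypothetical stand-ins (a linear latent shift for the learned transformation, a random-projection "decoder" in place of the VDVAE, and a DDPM-style forward-noising step standing in for initialising VD's reverse diffusion from an existing image).

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative sizes only; the real models use different shapes.
LATENT_DIM = 256
IMG_SHAPE = (64, 64, 3)

def shift_latent(z, direction, alpha):
    """Move a latent code along a learned attribute direction
    (e.g. memorability or valence); alpha sets strength and sign."""
    return z + alpha * direction

def decode_stub(z):
    """Placeholder decoder standing in for the VDVAE: maps a latent
    vector to an image-shaped array via a fixed random projection."""
    W = rng.standard_normal((z.size, int(np.prod(IMG_SHAPE))))
    return np.tanh(z @ W).reshape(IMG_SHAPE)

def noise_to_timestep(x0, t, alphas_cumprod):
    """DDPM-style forward noising: instead of starting reverse
    diffusion from pure noise, noise an existing image to an
    intermediate timestep t and denoise from there."""
    a_bar = alphas_cumprod[t]
    eps = rng.standard_normal(x0.shape)
    return np.sqrt(a_bar) * x0 + np.sqrt(1.0 - a_bar) * eps

# 1) Apply the learned transformation in the VDVAE-style latent space.
direction = rng.standard_normal(LATENT_DIM)
direction /= np.linalg.norm(direction)
z = rng.standard_normal(LATENT_DIM)
z_shifted = shift_latent(z, direction, alpha=2.0)

# 2) Decode the shifted latent, then noise the result to timestep t;
#    a diffusion model would run its reverse process from x_t.
x0 = decode_stub(z_shifted)
alphas_cumprod = np.cumprod(np.linspace(0.999, 0.98, 1000))
x_t = noise_to_timestep(x0, t=500, alphas_cumprod=alphas_cumprod)
```

The key design point this sketch illustrates is that the transformation is applied only once, in the compressed latent space, and the diffusion model inherits its effect purely through the initialisation image.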

Unique ID: bernstein-24/navigating-through-latent-spaces-61b442e1