ePoster

Efficient learning of deep non-negative matrix factorisation networks

Mahbod Nouri, David Rotermund, Alberto García Ortiz, Klaus Pawelzik
Bernstein Conference 2024
Goethe University, Frankfurt, Germany

Abstract

Networks composed of Non-Negative Matrix Factorization (NNMF) [1] modules can serve as abstractions of real neural networks. In particular, NNMF networks can easily be extended to perform computations based on stochastic spikes [2,3]. However, training deep networks of NNMF modules with Autograd approaches that follow exact gradients currently requires prohibitively large amounts of memory and is very slow. Here we present an approximate backpropagation (BP) method for optimizing deep NNMF networks. NNMF layers have latent variables that are updated iteratively towards a fixed point. The trick is to construct an approximation of the backpropagated error from the dynamics of the latent variables. With this, the derivation of the learning rule is straightforward, and only the final values of the latent variables enter the gradient instead of their full history. Furthermore, we show how to make the NNMF backpropagation rule compatible with ADAM optimizers by using an auxiliary form of the weights that evades their constraints, such as positivity and normalization. As an example, we introduce a novel network architecture that combines each NNMF layer with a 1x1 convolutional Perceptron layer. The performance of this hybrid exceeds that of pure NNMF networks and of pure multi-layer Perceptrons with the same number of neurons and weight parameters. On benchmark classification data (CIFAR10), the "base" network reached 81.0% correct with Perceptrons and 81.5% with NNMF. Adding the 1x1 convolutional Perceptron layers improves the performance of the base networks to 82.8% and 83.7%, respectively. When, in addition, the pooling layers of the NNMF network were replaced by 2x2 convolutions with stride 2, the performance increased to 85.4%. Our efficient learning algorithm not only makes deep NNMF networks more accessible for engineering applications, but also demonstrates that realistic neural computations based on stochastic spikes can outperform CNNs, which rely on noiseless signals [4]. The novel extension of deep NNMF networks with intermediate Perceptron layers might realize increasingly faithful representations of the classes along the hierarchy. We thus hope to tackle the question of interpretability, i.e., how a neural network processes its data, which would make the black box more transparent and thereby also promises to explain neuronal responses downstream in the visual system.
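
The following is a minimal PyTorch-style sketch (not the authors' implementation) of the two ideas described above: an NNMF layer whose latent variables are iterated to a fixed point with only the final update kept in the autograd graph, and an auxiliary, unconstrained weight parametrization from which non-negative, normalized effective weights are derived so that ADAM can be applied directly. The multiplicative update rule, layer sizes, iteration count, and all names are illustrative assumptions.

```python
import torch


class NNMFLayer(torch.nn.Module):
    """Sketch of an NNMF module with approximate backpropagation:
    the latent variables h are driven towards a fixed point by
    multiplicative updates, but only the final update enters the
    autograd graph (illustrative assumption, not the authors' code)."""

    def __init__(self, in_features: int, out_features: int, n_iterations: int = 20):
        super().__init__()
        # Auxiliary, unconstrained weights; the effective weights are made
        # non-negative and normalized on the fly, so ADAM never has to
        # respect those constraints explicitly.
        self.weight_aux = torch.nn.Parameter(torch.randn(out_features, in_features))
        self.n_iterations = n_iterations

    def effective_weight(self) -> torch.Tensor:
        w = torch.abs(self.weight_aux)                    # positivity
        return w / (w.sum(dim=1, keepdim=True) + 1e-9)    # normalization

    def _update(self, h, x, w):
        # Multiplicative NNMF update: reconstruct the input from h,
        # compare with the true input, then rescale and renormalize h.
        reconstruction = h @ w + 1e-9
        h = h * ((x / reconstruction) @ w.t())
        return h / (h.sum(dim=1, keepdim=True) + 1e-9)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: non-negative input of shape (batch, in_features)
        w = self.effective_weight()
        h = torch.full((x.shape[0], w.shape[0]), 1.0 / w.shape[0], device=x.device)
        with torch.no_grad():                  # fixed-point iteration, no history
            for _ in range(self.n_iterations - 1):
                h = self._update(h, x, w)
        return self._update(h, x, w)           # only this step enters the gradient


if __name__ == "__main__":
    layer = NNMFLayer(in_features=3 * 32 * 32, out_features=128)
    optimizer = torch.optim.Adam(layer.parameters(), lr=1e-3)
    x = torch.rand(8, 3 * 32 * 32)             # dummy non-negative input
    loss = layer(x).sum()                       # placeholder loss
    loss.backward()
    optimizer.step()
```

In this sketch, ADAM updates the unconstrained weight_aux, while the effective weights used in the forward pass remain non-negative and normalized at every step; gradients reach weight_aux only through the final latent update, mirroring the approximation that discards the full history of the fixed-point iteration.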

Unique ID: bernstein-24/efficient-learning-deep-non-negative-dff0a4e4