Deep Kernel Methods
deep kernel methods
Deep kernel methods
Deep neural networks (DNNs) with the flexibility to learn good top-layer representations have eclipsed shallow kernel methods without that flexibility. Here, we take inspiration from deep neural networks to develop a new family of deep kernel method. In a deep kernel method, there is a kernel at every layer, and the kernels are jointly optimized to improve performance (with strong regularisation). We establish the representational power of deep kernel methods, by showing that they perform exact inference in an infinitely wide Bayesian neural network or deep Gaussian process. Next, we conjecture that the deep kernel machine objective is unimodal, and give a proof of unimodality for linear kernels. Finally, we exploit the simplicity of the deep kernel machine loss to develop a new family of optimizers, based on a matrix equation from control theory, that converges in around 10 steps.