Deep-Kernel-GP

Deep Kernel Learning: Gaussian Process regression where the input is a neural network mapping of x, trained by maximizing the marginal likelihood.

Dependencies

The package depends on numpy and scipy (scipy.linalg). The examples additionally use matplotlib and scikit-learn.

Introduction

Instead of learning a mapping X --> Y directly with a neural network or with GP regression, we learn the composed mapping X --> Z --> Y, where the first step is performed by a neural network and the second by GP regression.

This way we can use GP regression to learn functions on data where the assumption that y(x) is drawn from a Gaussian process with one of the standard covariance functions might not be fair. For instance, we can learn functions with image pixels as inputs, or functions whose length scale varies with the input.

The parameters of the neural net are trained by maximizing the log marginal likelihood implied by z(x_train) and y_train.
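Concretely, the objective is the standard GP log marginal likelihood, evaluated on the network outputs (written here in standard GP notation; \sigma^2 denotes the learned noise variance and w the network weights):

\log p(y \mid X, w) = -\tfrac{1}{2}\, y^\top (K_z + \sigma^2 I)^{-1} y - \tfrac{1}{2} \log \lvert K_z + \sigma^2 I \rvert - \tfrac{n}{2} \log 2\pi, \qquad (K_z)_{ij} = k\big(z(x_i), z(x_j)\big)

Both the kernel hyperparameters and the weights w enter through K_z, so gradients of this objective flow back through the network.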

Deep Kernel Learning, A. G. Wilson et al.

Using Deep Belief Nets to Learn Covariance Kernels for Gaussian Processes, R. Salakhutdinov and G. Hinton.

Examples

Basic usage follows a scikit-learn-like API:

# Import paths below assume this repository's dknet package layout.
from dknet import NNRegressor
from dknet.layers import Dense, CovMat
from dknet.optimizers import Adam, SciPyMin

layers = []
layers.append(Dense(32, activation='tanh'))  # hidden layer of the neural net
layers.append(Dense(1))                      # output z(x) is 1-dimensional
layers.append(CovMat(kernel='rbf'))          # builds the GP covariance matrix from z

opt = Adam(1e-3)  # or opt = SciPyMin('l-bfgs-b')

gp = NNRegressor(layers, opt=opt, batch_size=x_train.shape[0],
                 maxiter=1000, gp=True, verbose=True)
gp.fit(x_train, y_train)          # maximizes the log marginal likelihood
y_pred, std = gp.predict(x_test)  # predictive mean and standard deviation

The example creates a mapping z(x) where both x and z are 1-d vectors, using a neural network with one hidden layer. The CovMat layer builds a covariance matrix from z using the covariance function v*exp(-0.5*|z1-z2|**2) plus a noise term s, where v and s are learned during training.
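For reference, that covariance corresponds to the following (an illustrative numpy sketch, not the package's internal code; adding the noise s only on the training diagonal is the usual GP convention and an assumption here):

import numpy as np

def rbf_cov(z1, z2, v):
    # v * exp(-0.5 * |z1 - z2|**2) for all pairs of rows in z1 and z2
    sq = np.sum((z1[:, None, :] - z2[None, :, :]) ** 2, axis=-1)
    return v * np.exp(-0.5 * sq)

# Training covariance: the noise term s enters on the diagonal
# K = rbf_cov(z, z, v) + s * np.eye(len(z))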

v and s are available after training as gp.layers[-1].var and gp.layers[-1].s_alpha. The gp.fast_forward() function can be used to extract the learned z(x); it skips the last layer, which would otherwise produce an array of size [batch_size, batch_size].
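Putting this together (a short sketch reusing the fitted gp and the attribute names above; fast_forward is assumed to take the input array):

v = gp.layers[-1].var               # learned signal variance v
s = gp.layers[-1].s_alpha           # learned noise term s
z_train = gp.fast_forward(x_train)  # latent features z(x), CovMat layer skipped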

Learning a function with varying length scale

In the example.py script, deep kernel learning (DKL) is used to learn from samples of the function sin(64(x+0.5)**4).

Learning this function with a neural network alone would be hard, since rapidly oscillating functions are challenging to fit with NNs. Learning it with GP regression and a squared exponential covariance function would also be suboptimal, since we would have to commit to one fixed length scale. Unless we have a lot of samples, we would be forced to give up precision on the slowly varying part of the function. A minimal sketch of such an experiment is shown below.
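In this sketch, the sample count, input range, and network size are illustrative assumptions and may differ from what example.py actually uses:

import numpy as np
from dknet import NNRegressor
from dknet.layers import Dense, CovMat
from dknet.optimizers import Adam

# Target oscillates faster as x increases, i.e. its length scale varies
x_train = np.random.uniform(-1.0, 1.0, size=(80, 1))
y_train = np.sin(64.0 * (x_train + 0.5) ** 4)
x_test = np.linspace(-1.0, 1.0, 500).reshape(-1, 1)

layers = [Dense(32, activation='tanh'), Dense(1), CovMat(kernel='rbf')]
gp = NNRegressor(layers, opt=Adam(1e-3), batch_size=x_train.shape[0],
                 maxiter=1000, gp=True, verbose=False)
gp.fit(x_train, y_train)
y_pred, std = gp.predict(x_test)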

[Figure: DKL prediction]

[Figure: z(x) function learned by the neural network]

We see that DKL solves the problem quite nicely given the limited data. We also see that for x < -0.5, the standard deviation of the DKL model does not capture the prediction error.
