
Example for GPR with uncertain inputs #2090

Open
avivajpeyi opened this issue Nov 1, 2023 · 1 comment

Comments

@avivajpeyi

In #893, there is some discussion of accounting for uncertain inputs in GPflow. Specifically, @st-- suggested using:

def uncertain_conditional(Xnew_mu, Xnew_var, feat, kern, q_mu, q_sqrt, *,

However, as a new user, it is unclear to me how this should be implemented.

In scikit-learn, one may account for uncertain inputs with:


import numpy as np
import matplotlib.pyplot as plt
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel

np.random.seed(1)

def f(x):
    return x * np.sin(x)


def generate_observations(n=10):
    X = np.random.uniform(0.1, 9.9, n).reshape(-1, 1)
    y = f(X).ravel()
    dy = 0.5 + 1.0 * np.random.random(y.shape)
    noise = np.random.normal(0, dy)
    y += noise
    return X, y, dy


def train_gp(X, y, dy):
    kernel = 1.0 * RBF(length_scale=1.0) + WhiteKernel(noise_level=1.0)
    if dy is None:
        alpha = 1e-10
    else:
        # alpha is the per-point noise *variance* added to the kernel diagonal,
        # so square the standard deviations dy
        alpha = dy ** 2
    gp = GaussianProcessRegressor(kernel=kernel, alpha=alpha, n_restarts_optimizer=100)
    gp.fit(X, y)
    return gp



def main():
    # Generate observation + train GP
    X_obs, y_obs, dy_obs = generate_observations()
    gp = train_gp(X_obs, y_obs, dy_obs)
    gp_no_unc = train_gp(X_obs, y_obs, None)

    # Generate plotting data
    x = np.linspace(0, 10, 1000).reshape(-1, 1)
    y_true = f(x)
    y_pred, sigma = gp.predict(x, return_std=True)
    y_pred_no_unc, sigma_no_unc = gp_no_unc.predict(x, return_std=True)

    # Plot
    plt.figure(figsize=(10, 6))
    plt.plot(x, y_true, ls=':', color='k')
    plt.errorbar(X_obs.ravel(), y_obs, dy_obs, fmt='.', color='k', markersize=10, label='Observations')

    plt.plot(x, y_pred_no_unc, color="tab:orange")
    plt.fill_between(x.ravel(), y_pred_no_unc - 1.96 * sigma_no_unc, y_pred_no_unc + 1.96 * sigma_no_unc, alpha=0.1, color="tab:orange",
                     label='GP no input uncertainty')

    plt.plot(x, y_pred, color="tab:blue")
    plt.fill_between(x.ravel(), y_pred - 1.96 * sigma, y_pred + 1.96 * sigma, alpha=0.3, color="tab:blue",
                     label='GP with input uncertainty')

    plt.xlabel('$x$')
    plt.ylabel('$f(x)$')
    plt.ylim(-10, 12)
    plt.xlim(0,10)
    plt.legend(loc='upper left')
    plt.show()


if __name__ == '__main__':
    main()

[Screenshot (2023-11-02): the resulting plot comparing the two GP fits]

It would be great if there could be a GPflow example of something like this! Maybe an 'uncertain inputs' section could be added to this page:
https://gpflow.github.io/GPflow/2.6.0/notebooks/basics/regression.html
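For context, one library-agnostic workaround is to propagate input uncertainty by Monte Carlo: sample inputs from their error distribution, push each sample through the fitted GP, and moment-match the resulting mixture. A minimal numpy sketch (the helper name and interface below are purely illustrative, not from GPflow or scikit-learn):

```python
import numpy as np

def predict_with_uncertain_inputs(predict, x_mu, x_var, n_samples=2000, seed=0):
    """Propagate Gaussian input noise through an arbitrary predictive
    function by Monte Carlo, then moment-match the resulting mixture."""
    rng = np.random.default_rng(seed)
    d = x_mu.shape[-1]
    # Draw plausible inputs around each uncertain test point.
    xs = rng.normal(x_mu, np.sqrt(x_var), size=(n_samples,) + x_mu.shape)
    mu, var = predict(xs.reshape(-1, d))
    mu = mu.reshape(n_samples, -1)
    var = var.reshape(n_samples, -1)
    # Law of total expectation / total variance over the input samples.
    mean = mu.mean(axis=0)
    variance = var.mean(axis=0) + mu.var(axis=0)
    return mean, variance

# Toy check with a deterministic "GP": for f(x) = 2x with zero predictive
# variance, the output variance should be roughly 4 * x_var.
predict = lambda X: (2.0 * X[:, 0], np.zeros(len(X)))
mean, variance = predict_with_uncertain_inputs(
    predict, np.array([[1.0]]), np.array([[0.25]])
)
```

Here `predict` would be replaced by the fitted model's predictive function (e.g. a wrapper around `gp.predict(..., return_std=True)` with the std squared).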

@uri-granta
Contributor

Apologies for the late response!

If you want constant noise variance, then you can just use the noise_variance parameter:

model = gpflow.models.GPR((X, Y), kernel=kernel, noise_variance=1e-10)

For varying output noise, you can instead pass in a Gaussian likelihood constructed with a Function that calculates the variance at each input point. This is discussed in detail in this tutorial.

In your sklearn example you seem to be using noise variances from a precalculated tensor(?). If so, you could probably emulate it using something like this (though obviously this will not allow you to use predict_y on points that weren't seen in training):

import tensorflow as tf

class LookupFunction(gpflow.functions.Function):
    """Look up a precomputed noise variance for each training input.

    Note: gpflow.likelihoods.Gaussian interprets this value as a
    *variance*, so pass dy**2 if dy holds standard deviations.
    """

    def __init__(self, X, dy):
        super().__init__()
        self.X = X
        self.dy = dy

    def __call__(self, X):
        # Compare every query row against every stored training row ...
        matches = tf.reduce_all(tf.equal(X[:, None, :], self.X[None, :, :]), -1)
        # ... and gather the stored value at each matching index.
        indices = tf.where(matches)[:, -1]
        return tf.gather(self.dy, indices)


likelihood = gpflow.likelihoods.Gaussian(LookupFunction(X, dy))
model = gpflow.models.GPR((X, Y), kernel=kernel, likelihood=likelihood)
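(As an aside on what the varying-noise route does under the hood: the per-point variances simply end up added to the diagonal of the training covariance matrix. A minimal numpy sketch of exact GP regression with heteroscedastic noise, using an RBF kernel purely for illustration:)

```python
import numpy as np

def rbf(A, B, lengthscale=1.0, variance=1.0):
    # Squared-exponential kernel between the rows of A and B.
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return variance * np.exp(-0.5 * d2 / lengthscale**2)

def gp_posterior(X, y, noise_var, X_new):
    """Exact GP posterior mean/variance with a per-point noise variance
    vector added to the diagonal of the training covariance."""
    K = rbf(X, X) + np.diag(noise_var)   # heteroscedastic diagonal
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))
    Ks = rbf(X_new, X)
    mean = Ks @ alpha
    v = np.linalg.solve(L, Ks.T)
    var = np.diag(rbf(X_new, X_new) - v.T @ v)
    return mean, var
```

With near-zero noise variances the posterior mean interpolates the training targets; larger entries in noise_var let the fit deviate more at those points.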

Finally, it's also possible to model the variance using a second GP, as described in this tutorial on Heteroskedastic likelihoods.
