
Fixed broadcasting rules for gpflow.models.model.predict_y, partially resolves #1461. #1597

Status: Open. Wants to merge 1 commit into base: develop.

Conversation

@mohitrajpal1 mohitrajpal1 commented Oct 11, 2020

PR type: new feature

Related issue(s)/PRs: #1461

Summary

Proposed changes

This change fixes the broadcasting rules for predict_y so that full_cov/full_output_cov = True is handled correctly. It relies on slightly abusing TensorFlow broadcasting rules in a manner compatible with the shape of the covariance matrix/tensor when full_cov/full_output_cov = True.
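As a rough illustration of the idea (a NumPy sketch under my own assumptions, not the actual GPflow code), the reshape trick amounts to viewing the full [N, P, N, P] posterior covariance as a single [N*P, N*P] matrix so that the Gaussian likelihood's noise variance lands on its diagonal:

```python
import numpy as np

# NumPy sketch of the reshape trick (illustrative only; names like
# `noise_variance` are stand-ins, not GPflow attributes).
N, P = 3, 2
f_var = np.zeros((N, P, N, P))   # stand-in for the posterior covariance
noise_variance = 0.1             # stand-in for the likelihood variance

f_var_mat = f_var.reshape(N * P, N * P)                  # [N*P, N*P]
y_var_mat = f_var_mat + noise_variance * np.eye(N * P)   # noise on the diagonal
y_var = y_var_mat.reshape(N, P, N, P)                    # back to [N, P, N, P]
```

After reshaping back, only the "diagonal" entries y_var[i, p, i, p] have picked up the noise variance, which is the behaviour one wants from a Gaussian likelihood applied to a full covariance.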

What alternatives have you considered?

I think at this time we should keep predict_log_density as-is for full_cov/full_output_cov = True. The reasoning is that with large covariance matrices/tensors there is no efficient way to compute these densities exactly without hitting OOM issues. I strongly prefer keeping predict_log_density as-is over a version that either takes far too long to run or only approximates the solution.

Minimal working example

import gpflow
import gpflow as gpf
import numpy as np

for foc in [False, True]:
    for oc in [False, True]:
        print(f"full_output_cov={foc}, full_cov={oc}")
        rng = np.random.RandomState(123)
        N = 100  # Number of training observations
        X = rng.rand(N, 1) * 2 - 1  # X values
        M = 50  # Number of inducing locations
        kernel = gpflow.kernels.SquaredExponential()
        Z = X[:M, :].copy()  # Initialize inducing locations to the first M inputs in the dataset
        m = gpflow.models.SVGP(kernel, gpflow.likelihoods.Gaussian(), Z, num_data=N)

        pX = np.linspace(10, 20, 4)[:, None]  # Test locations
        _, pYv = m.predict_y(pX, full_output_cov=foc, full_cov=oc)  # Predict Y values at test locations
        _, pFv = m.predict_f(pX, full_output_cov=foc, full_cov=oc)
        print(pYv.shape)
        print(pYv - pFv)


        num_elements = 7
        L = 2
        Zinit = [gpf.inducing_variables.InducingPoints(np.linspace(0, 100, 60)[:, None]) for _ in range(L)]
        kern_list = [gpflow.kernels.SquaredExponential() for _ in range(L)]

        #linear model of coregionalization
        kernel = gpf.kernels.LinearCoregionalization(
          kern_list, W=np.random.randn(num_elements, L))

        # create multi-output inducing variables from Z
        iv = gpf.inducing_variables.SeparateIndependentInducingVariables(
          Zinit
        )


        # initialize mean of variational posterior to be of shape MxL
        q_mu = np.zeros((60, L))
        # initialize \sqrt(Σ) of variational posterior to be of shape LxMxM
        q_sqrt = np.repeat(np.eye(60)[None, ...], L, axis=0) * 1.0

        # create SVGP model as usual and optimize
        m = gpf.models.SVGP(
          kernel, gpf.likelihoods.Gaussian(), inducing_variable=iv, q_mu=q_mu, q_sqrt=q_sqrt
        )

        _, pYv = m.predict_y(pX, full_output_cov=foc, full_cov=oc)  # Predict Y values at test locations
        _, pFv = m.predict_f(pX, full_output_cov=foc, full_cov=oc)
        print(pYv.shape)
        print(pYv - pFv)

PR checklist

  • New features: code is well-documented
    • detailed docstrings (API documentation)
    • notebook examples (usage demonstration)
  • The bug case / new feature is covered by unit tests
  • Code has type annotations
  • I ran the black+isort formatter (make format)
  • I locally tested that the tests pass (make check-all)

Release notes

Fully backwards compatible: yes

If not, why is it worth breaking backwards compatibility:

Commit message (for release notes): Fixed broadcasting rules for gpflow.models.model.predict_y, partially resolves #1461.

@mohitrajpal1 (Author) commented Oct 11, 2020

I've opened this PR a bit early because I'd like a review of the design choices made. I will add unit tests once we're happy with the code quality and design decisions, before closing the PR.

Regarding the automatic style rules enforced by black/isort: they recommend many drastic changes to these well-trafficked files. Would we like to apply those changes?

Thanks,

Mohit

@vdutor (Contributor) left a comment

Hi, many thanks for your PR! I've left you an initial set of comments that will help me better understand what problem you are trying to solve. Happy to iterate over this.

return self.likelihood.predict_mean_and_var(f_mean, f_var)

if full_cov and full_output_cov:
    f_var_mat = tf.reshape(f_var, [1, f_var.shape[0] * f_var.shape[1], f_var.shape[2] * f_var.shape[3]])
Review comment (Contributor):

Would be nice if you could add expected shapes of tensors at the end of the line, for example

f_mean, f_var = ...  # [N, P, N, P] or [N, P, P], etc.

return self.likelihood.predict_mean_and_var(f_mean, f_var)

if full_cov and full_output_cov:
    f_var_mat = tf.reshape(f_var, [1, f_var.shape[0] * f_var.shape[1], f_var.shape[2] * f_var.shape[3]])
Review comment (Contributor):

You probably want to use tf.shape(tensor) instead of tensor.shape: the former will work in both compiled and eager mode, while the latter only works in eager mode.
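A small sketch of the reviewer's point (assuming TensorFlow 2; the function name and shapes are hypothetical): inside a tf.function with a dynamic input signature, the static `tensor.shape` entries can be None, while tf.shape(tensor) is evaluated at runtime and always yields concrete values.

```python
import tensorflow as tf

# Static shape is [None, None] here, so arithmetic on tensor.shape
# entries would fail in graph mode; tf.shape works regardless.
@tf.function(input_signature=[tf.TensorSpec(shape=[None, None])])
def flatten_to_row(f_var):
    n = tf.shape(f_var)[0] * tf.shape(f_var)[1]  # runtime shape, graph-safe
    return tf.reshape(f_var, [1, n])

out = flatten_to_row(tf.zeros([4, 7]))
```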

if full_cov and full_output_cov:
    f_var_mat = tf.reshape(f_var, [1, f_var.shape[0] * f_var.shape[1], f_var.shape[2] * f_var.shape[3]])
    f_mean_pred, f_var_pred = self.likelihood.predict_mean_and_var(f_mean, f_var_mat)
    return f_mean_pred, tf.reshape(f_var_pred, f_var.shape)
Review comment (Contributor):

Shape comments would be helpful here as well.

return tf.identity(Fmu), Fvar + self.variance
rank = tf.rank(Fvar).numpy()
if rank == 2:
    return tf.identity(Fmu), Fvar + self.variance
Review comment (Contributor):

I know this is not related to your PR, but I think we can drop the tf.identity calls here, as tf.identity is a no-op and this looks like legacy code.
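For reference, tf.identity is indeed value-preserving in eager mode, so dropping it leaves the returned tensors unchanged (a tiny sketch, assuming TensorFlow 2):

```python
import tensorflow as tf

# tf.identity returns a tensor with the same values, so the pair
# (Fmu, Fvar) is unaffected by removing the call.
Fmu = tf.constant([[0.5], [1.5]])
same = tf.identity(Fmu)
equal = bool(tf.reduce_all(same == Fmu))
```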

rank = tf.rank(Fvar).numpy()
if rank == 2:
    return tf.identity(Fmu), Fvar + self.variance
else:
Review comment (Contributor):

Can we please have an elif that matches the exact ranks you want to target here, and another else that raises a NotImplementedError for the ones we do not support?

Development

Successfully merging this pull request may close these issues.

predict_y and predict_log_density support for full_cov or full_output_cov
2 participants