predict_f and predict_y as NamedTuple #1657

antonykamp · 2021-03-28T06:19:44Z

PR type: enhancement

Related issue: #1567

Summary

Proposed changes

predict_f and predict_y now return an instance of MeanAndVariance, a subclass of NamedTuple, so the user can access the mean and the variance individually by name.
A hint about this enhancement has been added to the example "Basic (Gaussian likelihood) GP regression model".
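
Why a NamedTuple keeps existing call sites working can be sketched without GPflow. The class below is only a stand-in: it uses plain lists of floats instead of tf.Tensor, and `fake_predict_f` is a hypothetical placeholder, not GPflow's implementation.

```python
from typing import List, NamedTuple


class MeanAndVariance(NamedTuple):
    """Stand-in for the proposed type; the PR itself uses tf.Tensor fields."""

    mean: List[float]
    variance: List[float]


def fake_predict_f(num_points: int) -> MeanAndVariance:
    # Hypothetical placeholder: zero mean and unit variance for every point.
    return MeanAndVariance(mean=[0.0] * num_points, variance=[1.0] * num_points)


result = fake_predict_f(5)

# New style: access by field name.
mu = result.mean
var = result.variance

# Old style still works, because a NamedTuple is a regular tuple.
mu_old, var_old = result
assert mu_old is result[0] and var_old is result[1]
assert isinstance(result, tuple)
```

Unpacking and integer indexing return the same objects as the named fields, which is what makes the change non-breaking for code written against the plain tuple.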

What alternatives have you considered?

I considered shorter names for .mean and .variance but rejected the idea, because the resulting code is a) more readable with the full names and b) already shorter than before.

Minimal working example

import gpflow
import numpy as np

rng = np.random.RandomState(0)
Ntest, D = 10, 2  # number of test points and input dimension (example values)

data = rng.randn(100, D), rng.randn(100, 1)
Xtest = rng.randn(Ntest, D)
kernel = gpflow.kernels.Matern32() + gpflow.kernels.White()
model_gp = gpflow.models.GPR(data, kernel=kernel)

# instead of 'model_gp.predict_f(Xtest)[0]' or 'mu_f, _ = model_gp.predict_f(Xtest)'
mu_f = model_gp.predict_f(Xtest).mean
# instead of 'model_gp.predict_y(Xtest)[1]' or '_, var_y = model_gp.predict_y(Xtest)'
var_y = model_gp.predict_y(Xtest).variance

PR checklist

New features: code is well-documented
- detailed docstrings (API documentation)
- notebook examples (usage demonstration)
The bug case / new feature is covered by unit tests
Code has type annotations
I ran the black+isort formatter (make format)
I locally tested that the tests pass (make check-all)

Release notes

Fully backwards compatible: yes

Commit message (for release notes):

MeanAndVariance as NamedTuple
Added test about backwards compatibility
Added hint in docs, format

(Is likely to be rebased)

st--

Thanks for the contribution - I think this will be really helpful! I'd just like to hear from some of the other people involved with the GPflow project what their thoughts on the naming of the variance field are. :)

st-- · 2021-03-29T16:22:54Z

doc/source/notebooks/basics/regression.pct.py

@@ -179,6 +179,9 @@
 _ = plt.xlim(-0.1, 1.1)


+# %% [markdown]
+# Moreover, you can get the mean and variance data individually for example as `m.predict_f(xx).mean` instead of `m.predict_f(xx)[0]`.


Suggested change

# Moreover, you can get the mean and variance data individually for example as `m.predict_f(xx).mean` instead of `m.predict_f(xx)[0]`.

# Moreover, you can get the mean and variance predictions individually, using `m.predict_f(xx).mean` instead of `m.predict_f(xx)[0]` and `m.predict_f(xx).variance` instead of `m.predict_f(xx)[1]`.

(Note: if we rename the variance field, we'll have to update this.)

st-- · 2021-03-29T16:24:14Z

gpflow/models/model.py

-MeanAndVariance = Tuple[tf.Tensor, tf.Tensor]
+
+class MeanAndVariance(NamedTuple):
+    """ NamedTuple to access mean- and variance-function separately """


Suggested change

""" NamedTuple to access mean- and variance-function separately """

""" NamedTuple that holds mean and variance as named fields """

st-- · 2021-03-29T16:25:40Z

gpflow/models/model.py

+    """ NamedTuple to access mean- and variance-function separately """
+
+    mean: tf.Tensor
+    variance: tf.Tensor


More of a general question for the other maintainers (@vdutor @markvdw @awav) and anyone else: what should the name of this field be? In code we often abbreviate the variance to "var", but it sometimes also represents the covariance... or maybe that should be a different type? MeanAndVariance and MeanAndCov (returning a different one depending on full_cov etc.)?

I like the idea of the two types. Haven't thought about it in detail.

Do you always know for sure whether you've got the cov or the var?

If you pass in full_cov=False and full_output_cov=False, you get the marginals back. If one of full_cov or full_output_cov is True, you get the covariance over inputs or outputs, respectively. If both are True, you should get the N P x N P covariance matrix (though this combination isn't actually implemented in several cases, I believe). So the output type is solely determined by the full_cov and full_output_cov arguments.
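
The mapping from the two flags to the kind of output can be sketched as a small lookup. The concrete shapes below are an assumption based on GPflow's usual [N, P] / [P, N, N] / [N, P, P] / [N, P, N, P] convention and are not stated in this thread; `variance_shape` is a hypothetical helper, not part of GPflow.

```python
from typing import Tuple


def variance_shape(N: int, P: int, full_cov: bool, full_output_cov: bool) -> Tuple[int, ...]:
    """Assumed shape of the second return value for N inputs and P outputs."""
    if full_cov and full_output_cov:
        return (N, P, N, P)  # joint covariance over inputs and outputs
    if full_cov:
        return (P, N, N)  # covariance over inputs, one matrix per output
    if full_output_cov:
        return (N, P, P)  # covariance over outputs, one matrix per input
    return (N, P)  # marginal variances only


for flags in [(False, False), (True, False), (False, True), (True, True)]:
    print(flags, variance_shape(3, 2, *flags))
```

Since the flags alone decide the result, a caller who knows the flag values statically also knows which of the two proposed types to expect.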

Btw, you can use typing.overload and typing.Literal to give more information than just MeanAndVariance | MeanAndCov, if you want to do that.

I've created a small example, which should work as wanted:

But I'm not comfortable with the if-else statement :/

from typing import Literal, NamedTuple, overload


class MeanAndVariance(NamedTuple):
    mean: int
    variance: int


class MeanAndCovariance(NamedTuple):
    mean: int
    covariance: int


@overload
def predict_f(auto_cov: Literal[False]) -> MeanAndVariance: ...


@overload
def predict_f(auto_cov: Literal[True]) -> MeanAndCovariance: ...


def predict_f(auto_cov: bool = False) -> (MeanAndVariance | MeanAndCovariance):
    # calculations
    return MeanAndCovariance(1, 2) if auto_cov else MeanAndVariance(1, 2)

I don't think there's a good way around the if-else, and I think it's better to be explicit than the ambiguity of having to remember whether it's a [N, Q] or [Q, N, N] tensor ...:) I'd be happy with this.

As was pointed out, typing.Literal is only available from Python 3.8 and up :/

@antonykamp the typing_extensions module provides backports for older versions of Python, it does seem to include Literal. :)
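
One common way to use the backport is a version-gated import, sketched below. The alias names `FullCov` and `MarginalOnly` are hypothetical and only illustrate that the imported `Literal` behaves the same from either source; how the PR actually imports it is not shown in this thread.

```python
import sys

if sys.version_info >= (3, 8):
    from typing import Literal
else:
    # On older Pythons this needs the third-party typing_extensions package.
    from typing_extensions import Literal

# Hypothetical aliases for the two flag values used in the overloads.
FullCov = Literal[True]
MarginalOnly = Literal[False]
```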

I added this pattern with overload and Tuple to the abstract method predict_f in gpflow/models/models.py. Should the overloads of predict_f be marked as abstract too? In any case, I see no way to test the Ellipsis body of each overloaded function :/

Also, I wanted to ask if the parameters of the model constructor should be listed in the parametrization?

st-- · 2021-04-12T15:37:15Z

Hi @antonykamp could you make sure to update RELEASE.md and CONTRIBUTORS.md as part of this PR? (See #1660 and #1661).

antonykamp · 2021-04-15T06:19:09Z

> Hi @antonykamp could you make sure to update RELEASE.md and CONTRIBUTORS.md as part of this PR? (See #1660 and #1661).

I'll take care of it after the "literal-overload-naming"-question is resolved. Otherwise, I would have to change it multiple times probably :)

antonykamp · 2022-12-13T20:41:11Z

Closed because it's old :)

antonykamp added 3 commits March 22, 2021 09:40

MeanAndVariance as NamedTuple

05873f7

Added test about backwards compatibility

b318676

Added hint in docs, format

35e3617

antonykamp changed the title ~~Antonykamp/predict as named tuple~~ predict_f and predict_y as NamedTuple Mar 28, 2021

st-- reviewed Mar 29, 2021

Implement review I

c737f20

antonykamp added 3 commits May 4, 2021 08:54

add overload and Literal operator

9a0ad7e

add unittests

Add documentation

1bafd34

remove MeanAndCovariance from predict_y

415bad7

antonykamp closed this Dec 13, 2022

sc336 reopened this Dec 14, 2022

Inline review comments, by author and date:

st-- · Mar 29, 2021
st-- · Mar 29, 2021
st-- · Mar 29, 2021
joelberkeley · Mar 31, 2021
joelberkeley · Mar 31, 2021
st-- · Mar 31, 2021
joelberkeley · Mar 31, 2021 (edited)
antonykamp · Apr 18, 2021
st-- · Apr 21, 2021
antonykamp · Apr 25, 2021 (edited)
st-- · Apr 26, 2021
antonykamp · May 4, 2021
