Add gradient calculation for the covariance between points in GPyModelWrapper #347

Open
BrunoKM wants to merge 19 commits into main

Conversation

@BrunoKM (Contributor) commented Feb 17, 2021

This change adds a method get_covariance_between_points_gradients to the GPyModelWrapper, providing the gradients of the existing get_covariance_between_points method implemented in the wrapper.

These gradients are useful when sequentially optimising any batch acquisition function that relies on the covariance between batch points (such as Multi-point Expected Improvement), i.e. when optimising with respect to one element of the batch at a time.

A second change is the vectorization of the gradient calculation for the full covariance in the GPyModelWrapper class. Replacing the for-loop with numpy array operations results in a ~50% speed-up for larger batches of points (> 50).
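As an illustration only (not the code in this PR), here is a minimal sketch of the kind of loop-to-NumPy vectorization described above; dk_dx stands in for the per-dimension kernel gradients (e.g. what GPy's kern.dK_dX returns for each input dimension):

import numpy as np

q, n, d = 50, 200, 3
rng = np.random.default_rng(0)
dk_dx = rng.standard_normal((d, q, n))  # stand-in for per-dimension kernel gradients

# Loop version: fill the diagonal slices of a (d, q*q, n) tensor one input dimension at a time.
grad_loop = np.zeros((d, q * q, n))
for i in range(d):
    grad_loop[i, ::q + 1, :] = dk_dx[i]

# Vectorized version: the same assignment for all d dimensions at once.
grad_vec = np.zeros((d, q * q, n))
grad_vec[:, ::q + 1, :] = dk_dx

assert np.array_equal(grad_loop, grad_vec)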

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

@codecov-io commented Feb 17, 2021

Codecov Report

Merging #347 (6738faa) into master (b793989) will increase coverage by 0.02%.
The diff coverage is 100.00%.


@@            Coverage Diff             @@
##           master     #347      +/-   ##
==========================================
+ Coverage   89.43%   89.46%   +0.02%     
==========================================
  Files         123      123              
  Lines        4051     4061      +10     
  Branches      462      461       -1     
==========================================
+ Hits         3623     3633      +10     
  Misses        332      332              
  Partials       96       96              
Impacted Files Coverage Δ
emukit/model_wrappers/gpy_model_wrappers.py 84.66% <100.00%> (+1.09%) ⬆️

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data

@apaleyes (Collaborator) commented Mar 9, 2021

Apologies for taking time on this one; it's a bit hard for me to make complete sense of this change. I promise to get through it!

@BrunoKM (Contributor Author) commented Mar 10, 2021

No worries, let me know if there is anything I can do to explain and document it better in the code.

@apaleyes (Collaborator) left a comment:

I have two major concerns:

  1. This PR introduces a new method that isn't declared in any model interfaces. That's backwards from Emukit's point of view. Are you planning to use this method anywhere? Perhaps create a new acquisition? If so, let's define an interface, or augment an existing one if that makes more sense.
  2. Both changes carry some pretty heavy matrix algebra, which is really hard to follow. Can you please add some comments throughout, to clarify what's going on? Thanks!

There are also some minor comments, which can be found below.

Comment on lines 240 to 246
# Gradients of k(x1, x_train) and k(x1, x2) with respect to x1, one slice per input dimension i.
dkx1X_dx = np.zeros((d, q1*q1, n))
dkx1x2_dx = np.zeros((d, q1*q1, q2))
for i in range(d):
    dkx1X_dx[i, ::q1 + 1, :] = kern.dK_dX(x1, x_train, i)
    dkx1x2_dx[i, ::q1 + 1, :] = kern.dK_dX(x1, x2, i)
# Reshape so that entry [i, j, j, :] holds d k(x1[j], .) / d x1[j, i]; off-diagonal entries stay zero,
# since k(x1[j], .) depends only on x1[j].
dkx1X_dx = dkx1X_dx.reshape((d, q1, q1, n))
dkx1x2_dx = dkx1x2_dx.reshape((d, q1, q1, q2))
Collaborator:

Naming here is hard to get through: dkx1X? Can it be improved?

:return: gradient of the covariance matrix of shape (q1, q2) between outputs at X1 and X2
(return shape is (q1, q2, q1, d)).
"""
dcov_dx1 = dCov(X1, X2, self.model.X, self.model.kern, self.model.posterior.woodbury_inv)
Collaborator:

Three inputs come directly from the model. I wonder if it makes sense to just pass the model in, or not extract dCov as a stateless method at all. Consider this: all this method is doing is calling dCov, literally nothing else. And so far dCov is only used here. It seems too specific to me to ever get reused. Is that impression correct?

Contributor Author:

I think that's fair to say; I was primarily just trying to mirror the implementation of dSigma that was already in the module. Happy to not separate it out.

@BrunoKM (Contributor Author) commented Apr 11, 2021

To answer the two points:

  1. This PR introduces a new method that isn't declared in any model interfaces. That's backwards from Emukit's point of view. Are you planning to use this method anywhere? Perhaps create a new acquisition? If so, let's define an interface, or augment an existing one if that makes more sense.

This is indeed intended for an implementation of a new acquisition. It comes in handy whenever optimising any of many batch acquisition functions sequentially, i.e. optimising the batch acquisition with respect to one point in the batch at a time. Currently, get_covariance_between_points is part of the IEntropySearchModel interface, so maybe something like IDifferentiableEntropySearchModel would make sense? (A rough sketch of what such an interface could declare follows this reply.)

  2. Both changes carry some pretty heavy matrix algebra, which is really hard to follow. Can you please add some comments throughout, to clarify what's going on? Thanks!

Fair point, I'll try and make it easier to follow.
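For concreteness, here is a rough, hypothetical sketch of what such an interface could declare; the class name is illustrative only (the PR eventually adds ICrossCovarianceDifferentiable to emukit/core/interfaces/models.py, quoted further below), and the method names and shapes follow the docstrings in this PR:

import numpy as np


class IDifferentiableCrossCovarianceModel:  # hypothetical name, for illustration only
    def get_covariance_between_points(self, X1: np.ndarray, X2: np.ndarray) -> np.ndarray:
        """Posterior covariance between outputs at X1 (shape (q1, d)) and X2 (shape (q2, d)); returns shape (q1, q2)."""
        raise NotImplementedError

    def get_covariance_between_points_gradients(self, X1: np.ndarray, X2: np.ndarray) -> np.ndarray:
        """Gradient of the above covariance with respect to X1; returns shape (q1, q2, d)."""
        raise NotImplementedError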

@codecov-commenter commented Jun 11, 2021

Codecov Report

Merging #347 (c280f6d) into main (cdcb0d0) will decrease coverage by 0.02%.
The diff coverage is 93.10%.


@@            Coverage Diff             @@
##             main     #347      +/-   ##
==========================================
- Coverage   89.01%   88.99%   -0.03%     
==========================================
  Files         128      128              
  Lines        4244     4254      +10     
  Branches      609      609              
==========================================
+ Hits         3778     3786       +8     
- Misses        369      371       +2     
  Partials       97       97              
Impacted Files Coverage Δ
emukit/core/interfaces/__init__.py 100.00% <ø> (ø)
emukit/core/interfaces/models.py 61.76% <60.00%> (-0.31%) ⬇️
emukit/model_wrappers/gpy_model_wrappers.py 84.24% <100.00%> (+0.55%) ⬆️

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data

@apaleyes (Collaborator) left a comment:

I think we are good to go here; just one outdated docstring to update.

Comment on lines 126 to 130
:param x1: Prediction inputs of shape (q1, d)
:param x2: Prediction inputs of shape (q2, d)
:param x_train: Training inputs of shape (n_train, d)
:param kern: Covariance of the GP model
:param w_inv: Woodbury inverse of the posterior fit of the GP
Collaborator:

I suppose this is a bit out of date now?

Contributor Author:

Removed redundant flags

:return: nd array of shape (q1, q2, d) representing the gradient of the posterior covariance between x1 and x2
with respect to x1. res[i, j, k] is the gradient of Cov(y1[i], y2[j]) with respect to x1[i, k]
"""
# Get the relevent shapes
Collaborator:

Suggested change
# Get the relevent shapes
# Get the relevant shapes

Contributor Author:

Fixed
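For readers following the shapes, a hedged usage sketch of the method discussed in this thread (the model setup is illustrative, and the gradient method is the one added by this PR, so this assumes the PR branch):

import numpy as np
import GPy
from emukit.model_wrappers import GPyModelWrapper

X_train = np.random.rand(20, 2)
Y_train = np.sin(X_train.sum(axis=1, keepdims=True))
model = GPyModelWrapper(GPy.models.GPRegression(X_train, Y_train))

X1 = np.random.rand(5, 2)  # q1 = 5, d = 2
X2 = np.random.rand(3, 2)  # q2 = 3
cov = model.get_covariance_between_points(X1, X2)                  # expected shape (5, 3)
dcov_dx1 = model.get_covariance_between_points_gradients(X1, X2)   # expected shape (5, 3, 2), per the docstring above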

emukit/model_wrappers/gpy_model_wrappers.py (comment outdated, resolved)
Comment on lines 221 to 224
# Tensor for the gradients of the covariance between points x_predict (of shape (q, d)) and the
# training inputs (of shape (n, d)), with respect to x_predict
dkxX_dx = np.zeros((d, q*q, n))
# Tensor for the gradients of the full covariance matrix at points x_predict (of shape (q, q)) with respect to
# x_predict (of shape (q, d))
dkxx_dx = np.zeros((d, q*q, q))
Collaborator:

Naming here is a bit unfortunate. I know it was already there, but do you have any ideas how to improve it? The only way to tell the two vars apart is one upper/lower case X, and that's quite hard to read.

Contributor Author:

I renamed it to something more verbose, let me know if you prefer it!

Contributor Author:

I tried to improve it, let me know what you think!

@mmahsereci (Contributor) commented Jun 11, 2021

What about the interface discussion? Is this resolved, @apaleyes? Where exactly is the new method required in Emukit, i.e., is there an acquisition function that requires the method?

@apaleyes (Collaborator):

Hey @mmahsereci, good to hear from you again!

@BrunoKM promised us an acquisition later down the line. But that's a good point that I forgot all about - the function shall be there as an implementation of some interface. Thanks for catching!

@mmahsereci (Contributor):

That sounds great, thanks for catching me up @apaleyes! And, yes I am back now it seems. :)

@BrunoKM (Contributor Author) commented Jun 14, 2021

@apaleyes sorry, resolving the interfaces has been on my to-do stack for the past couple of weeks; I will try to get around to it soon.

Yes, this is for a new acquisition that is still in the pipeline :)

@apaleyes (Collaborator) commented Dec 9, 2021

@BrunoKM are you still planning to work on this PR (and its follow-up)? Or shall we just close it?

@BrunoKM (Contributor Author) commented Dec 11, 2021

Thanks for the follow-up @apaleyes!

I'm still happy to finish off this PR (if I don't do it before the end of the year, feel free to close it).

I've been a bit on the fence about whether to push the new acquisition soon afterwards. It's one with particularly tricky gradients: manually writing a gradient function for it in pure NumPy turned out to be a pain. The current implementation internally uses either jax or pytorch to get the gradient function automatically. I'd assume adding these dependencies is not in line with the philosophy of the rest of the EmuKit repo.
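For context, a minimal sketch of the autodiff route mentioned above, using a toy stand-in acquisition (this is not the acquisition in the pipeline; it only illustrates taking gradients with respect to a single batch element via jax):

import jax
import jax.numpy as jnp


def acquisition_value(x_new, x_batch):
    # Toy stand-in for a batch acquisition that depends on the covariance between the
    # candidate point x_new and the already-selected batch x_batch (RBF-like cross-covariance).
    sq_dists = jnp.sum((x_batch - x_new) ** 2, axis=1)
    cross_cov = jnp.exp(-0.5 * sq_dists)
    return -jnp.sum(cross_cov)  # e.g. penalise redundancy within the batch


# Gradient with respect to the single batch element currently being optimised.
grad_fn = jax.grad(acquisition_value, argnums=0)

x_new = jnp.array([0.2, 0.8])
x_batch = jnp.array([[0.1, 0.9], [0.5, 0.5]])
print(grad_fn(x_new, x_batch))  # gradient of shape (2,)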

I'm still hoping I'll be able to add the manual gradients later down the line and push the new acquisition function, but due to limited time, I can't say how soon that might be.

@BrunoKM (Contributor Author) commented Jan 1, 2022

I added an interface that specifies the derivative method for the covariance between points. Let me know your thoughts!

As for the new acquisition function, I think I'll be able to add it without an evaluate_with_gradients method relatively soon. An update with evaluate_with_gradients could follow later down the line.

@apaleyes (Collaborator) left a comment:

Looks great, just one comment.

@@ -72,6 +72,35 @@ def get_joint_prediction_gradients(self, X: np.ndarray) -> Tuple[np.ndarray, np.
raise NotImplementedError


class ICrossCovarianceDifferentiable:
def get_covariance_between_points(self, X1: np.ndarray, X2: np.ndarray) -> np.ndarray:
Collaborator:

We already have an interface that defines nearly the same method: https://github.com/EmuKit/emukit/blob/main/emukit/bayesian_optimization/interfaces/models.py

While this isn't that big of an issue, it would be nice if we could separate them somehow. A few ideas:

  • Just rename this method
  • Rename the other method
  • Drag this method into a separate interface

Pick whichever you prefer!
