🐁 💪 Add MuRP #347
base: master
Conversation
src/pykeen/nn/functional.py (outdated)
""" | ||
h = clamp_norm(h, maxnorm=1) | ||
t = clamp_norm(t, maxnorm=1) | ||
r_mat = clamp_norm(r_mat, maxnorm=1) |
Isn't `r_mat` a matrix? By default, the norm is computed along the last dimension only.
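To make the concern concrete, here is a small, purely illustrative sketch (plain torch, toy shapes; not code from the PR) showing that a norm taken along the last dimension constrains each row of a matrix separately rather than the matrix as a whole:

import torch

r_mat = torch.randn(2, 3, 3)

row_norms = r_mat.norm(p=2, dim=-1, keepdim=True)                 # (2, 3, 1): one norm per row
frob_norms = r_mat.pow(2).sum(dim=(-2, -1), keepdim=True).sqrt()  # (2, 1, 1): one norm per matrix

# A clamp_norm-style renormalisation along dim=-1 rescales only rows whose norm
# exceeds 1, so the overall matrix norm can still be larger than 1.
mask = row_norms > 1
r_rowwise = torch.where(mask, r_mat / row_norms, r_mat)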
Yes, `r_mat` is a matrix. Here's the code from the original implementation, which seems to be doing the same as the non-matrix ones:
`rvh` is the vector in the original code, i.e., `r_vec` for us; `Ru` is the matrix, i.e., `r_mat` for us.
Oops, I have made a mistake then.
Fixed in b15914c
Now that I took a closer look at the paper and the code, both are vectors, since the matrix is modelled as a diagonal matrix.
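For context, a minimal sketch (plain torch, toy sizes; illustrative only) of why a diagonal relation matrix can be stored and applied as a vector: multiplying by a diagonal matrix is the same as elementwise multiplication with its diagonal.

import torch

d = 4
r_diag = torch.randn(d)   # the "matrix", stored as its diagonal entries
h = torch.randn(d)        # a head entity embedding

via_matrix = torch.diag(r_diag) @ h   # explicit diagonal-matrix multiplication
via_vector = r_diag * h               # elementwise product with the diagonal vector

assert torch.allclose(via_matrix, via_vector)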
    return torch.tanh(normv) * v / normv
def p_exp_map(v: torch.FloatTensor, eps: float = 1.0e-10) -> torch.FloatTensor:
    v_norm = v.norm(p=2, dim=-1, keepdim=True)
    return v_norm.tanh() * v / v_norm.clamp_min(eps)
is this because you don't need to clamp before running tanh()?
As far as I saw it, the clamping was necessary to avoid division by zero. However, `v_norm` was also used as the numerator, where there is no risk of numerical problems. So now, the clamping is only applied in the denominator.
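A small usage example (using the `p_exp_map` shown above) of why clamping only the denominator is sufficient: for a zero vector, `tanh(0) * v` is already zero, so only the division needs protection.

import torch

def p_exp_map(v: torch.FloatTensor, eps: float = 1.0e-10) -> torch.FloatTensor:
    v_norm = v.norm(p=2, dim=-1, keepdim=True)
    return v_norm.tanh() * v / v_norm.clamp_min(eps)

v = torch.zeros(2, 3)
assert not torch.isnan(p_exp_map(v)).any()   # the result is simply the zero vector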
… which embeddings are poincare
I'm not at all happy with this, but there's very little information actually passed to the optimizer, so it's hard to recover it. That's why I built an object-ID-to-tensor mapping, then pre-labeled it inside the model itself.
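One possible reading of that design, as a hypothetical sketch (names are illustrative, not the PR's actual code): record by tensor identity which parameters live on the Poincaré ball, since the optimizer itself only receives anonymous parameter tensors.

from torch import nn

class ToyModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.entity_embeddings = nn.Embedding(10, 4)   # Poincaré-ball parameters
        self.relation_vectors = nn.Embedding(5, 4)     # ordinary Euclidean parameters

model = ToyModel()
# Object-ID-to-label mapping, built inside the model where the names are still known.
poincare_by_id = {
    id(model.entity_embeddings.weight): True,
    id(model.relation_vectors.weight): False,
}

# The optimizer can later look a parameter up by identity:
for p in model.parameters():
    needs_riemannian_update = poincare_by_id.get(id(p), False)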
@@ -20,10 +22,17 @@
class RiemannianSGD(Optimizer):
    """A variant of :class:`torch.optim.SGD` generalized for riemannian manifolds."""

    def __init__(self, params, lr=required, param_names=None):
    def __init__(self, params: Sequence[Parameter], param_names: Sequence[str], poincare: Set[str], lr=required):
IIRC, `params` and `param_names` should match in length? Then why not provide `params: Mapping[str, Parameter]` instead?
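A hedged sketch of what that suggested signature could look like (illustrative only, not the PR's code): passing a name-to-parameter mapping keeps names and parameters aligned by construction, with no length check needed.

from typing import Mapping, Set

from torch.nn import Parameter
from torch.optim.optimizer import Optimizer, required


class RiemannianSGD(Optimizer):
    def __init__(self, params: Mapping[str, Parameter], poincare: Set[str], lr=required):
        # Names and parameters come from the same mapping, so they cannot go out of sync.
        self.param_names = list(params.keys())
        self.poincare = poincare
        super().__init__(list(params.values()), dict(lr=lr))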
        defaults = dict(lr=lr)
        super().__init__(params, defaults)
        self.param_names = [] if param_names is None else param_names
        self._pykeen_extras = {
Why are these parameters saved in a dictionary rather than two attributes?
def euclidean_update(p, d_p, lr):
    p.data = p.data - lr * d_p
The result is also assigned to `p.data`. So this is redundant?
def poincare_update(p, d_p, lr):
    v = -lr * d_p
    p.data = full_p_exp_map(p.data, v)
same here
src/pykeen/optimizers/rsgd.py (outdated)
def poincare_grad(p, d_p):
    p_sqnorm = torch.clamp(torch.sum(p.data ** 2, dim=-1, keepdim=True), 0, 1 - 1e-5)
The `1 - 1.0e-05` is part of the Poincaré disk model, right? I am not very deep into the topic, but is RiemannianSGD restricted to this model? At least when skimming this text it seemed like the method is more general. So maybe we should document this limitation(?) somewhere.
By the way, the last mentioned blog post refers to a library, https://github.com/geoopt/geoopt, which seems to implement RiemannianSGD along with other optimizers in PyTorch.
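For reference, a sketch of the standard Poincaré-ball gradient rescaling that a function like this typically implements (based on the usual formula, not necessarily the PR's exact code): the Riemannian gradient is the Euclidean gradient rescaled by the inverse metric ((1 - ||p||^2)^2) / 4, and the clamp to 1 - 1e-5 keeps points strictly inside the unit ball so the factor never collapses to zero.

import torch

def poincare_grad(p: torch.Tensor, d_p: torch.Tensor) -> torch.Tensor:
    # Squared norm, clamped away from 1 so the point stays strictly inside the ball.
    p_sqnorm = torch.clamp(torch.sum(p.data ** 2, dim=-1, keepdim=True), 0, 1 - 1e-5)
    # Rescale the Euclidean gradient by the inverse of the Poincaré-ball metric.
    return d_p * ((1 - p_sqnorm) ** 2 / 4)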
Co-authored-by: Max Berrendorf <berrendorf@dbs.ifi.lmu.de>
This PR is a (first/bad) attempt to incorporate MuRP (for #343). It heavily borrows from Balažević's original implementation at https://github.com/ibalazevic/multirelational-poincare and does its best to give attribution. This PR does the following:
Still to do: