Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ENH: add decorator to fix adjoint weightings #1177

Open
wants to merge 15 commits into
base: master
Choose a base branch
from

Conversation

kohr-h
Copy link
Member

@kohr-h kohr-h commented Oct 5, 2017

Update: The general logic is settled, details are in discussion.


This is a discussion PR. I think we could go somewhere along these lines to make adjoints "automatically correct" with respect to weighted spaces, in particular when the weights differ.

This solution should always be correct where it applies, however it wouldn't yet help in special cases like RayTransform where the weights need to be computed per axis.
The larger question is what to do instead of the monkey-patching here. I can imagine a bunch of scenarios:

  1. Do this automatically in Operator. This means that users implement the unweighted adjoint, and the weighted one is implemented with a helper function similar to this.
    This would require us to communicate this "magic" clearly and prominently.
  2. As in 1., but depending on some flag or whatever that is provided by the user. It could be a class attribute like _auto_weighting or so. Don't do anything by default.
    This option would still need good communication but leads to less unexpected behavior on the user side.
  3. Just offer a tool (e.g. a decorator) for adjoint, like @auto_weighting that users can choose to use.
    This option has the least surprise and the simplest migration path, we would just have to use it a lot ourselves so users pick it up.

Opinions?

For the records: What I don't want is to let adjoint return an operator composition. That would be ugly and a usability nightmare, since operator properties will be buried deep down.


TODO:

  • What to do if one of the spaces (or both) don't define weighting at all?
    We could treat no weighting as 1.0, but need to be careful with implementation since the current variant uses in-place arithmetic. It should be fine, though, since the weighting factors that are 1.0 are just skipped.
  • Make decorator for the (exact) Operator.norm method (should be way simpler though)
  • Fix existing operators.
    • Add domain or range where missing not in this PR
    • Fix operator impls for changes with differing domain/range
  • Add unit tests with domain and range differing in weighting
  • Document the decorator in the online doc.
  • Think about streamlining certain types, e.g. OperatorComp to avoid dumb "multiplications by 1"
  • There seems to be an issue when adjoint uses another property, e.g. inverse, needs investigation
  • Remove the repr changes that are now in DONTMERGE: improve repr by a lot #1276

@adler-j
Copy link
Member

adler-j commented Oct 5, 2017

In my opinion, adding a utility is the only option that is not outrageously surprising to users. This could be, as you suggest, in the form of a decorator, but I'd simply make it a free function that is either called on the adjoint:

@property
def adjoint(self):
    ...
    return auto_weighting(adjoint)

or in the operator itself

class AdjointyOperator:
    def __call__(self, x):
        return auto_weighting(adjoint)

@kohr-h
Copy link
Member Author

kohr-h commented Oct 5, 2017

In my opinion, adding a utility is the only option that is not outrageously surprising to users.

I think the same. Only for the sake of not introducing bias in the discussion I left it kind of open.

Note that the option of decorating the adjoint, which is equivalent to your first code box, would still require some monkey-patching. However, in this case I would find it more acceptable since it happens "almost" during object creation.

The second option is cleaner in that respect, but has the disadvantage that it would make it harder to use existing (unweighted) operators off the shelf. Things like return ThisOtherOperator() in adjoint will not be "auto-weightable" since that would require a _call method to decorate.

So I'm leaning towards the operator decoration option.

And BTW, this has interesting implications for OperatorComp where you could start pushing around weighting constants all the way to the smallest space, for instance.

I'll think a bit more about this.

@kohr-h
Copy link
Member Author

kohr-h commented Oct 5, 2017

, which is equivalent to your first code box,

roughly :-)

@kohr-h kohr-h force-pushed the issue-1068__ajoint_weighting branch from 9dbb725 to 2a92199 Compare October 5, 2017 14:21
@kohr-h
Copy link
Member Author

kohr-h commented Oct 5, 2017

Here's a more serious suggestion. Obviously needs more tests, but does the job.

@kohr-h
Copy link
Member Author

kohr-h commented Oct 5, 2017

Also obviously, the implementation does not make use of the optimize flag yet and just does it unconditionally. Further, it does it badly by computing those 1 / something arrays. That needs to change.

@kohr-h kohr-h force-pushed the issue-1068__ajoint_weighting branch 2 times, most recently from e25fa7d to c211b66 Compare October 5, 2017 15:56
@pep8speaks
Copy link

pep8speaks commented Oct 5, 2017

Checking updated PR...

Line 147:1: E303 too many blank lines (3)

  • Complete extra results for this file :

class ResizingOperator(Operator):^---

Line 517:80: E501 line too long (80 > 79 characters)

  • Complete extra results for this file :
    ProductSpace(uniform_discr([-1., -1.], [ 1.,  1.], (1, 2)), 2).element([                                                                               ^---
Comment last updated on February 15, 2018 at 22:06 Hours UTC

@kohr-h kohr-h force-pushed the issue-1068__ajoint_weighting branch 2 times, most recently from 64e16b2 to 85c64fb Compare October 5, 2017 16:01
@kohr-h
Copy link
Member Author

kohr-h commented Oct 5, 2017

So this is now a serious PR. I've added tests and an example, and I think this concept would actually work. Comments welcome.

@kohr-h kohr-h changed the title WIP: fix adjoint weightings ENH: add decorator to fix adjoint weightings Oct 5, 2017
@kohr-h kohr-h added this to Needs review in PR Status Oct 5, 2017
@kohr-h
Copy link
Member Author

kohr-h commented Oct 5, 2017

Once we settle on a solution, I think this PR should also go through the "standard library" and fix all instances of operators that need fixing. I'll make a TODO list at the top.

@kohr-h
Copy link
Member Author

kohr-h commented Oct 5, 2017

Open question: what to do with spaces that have no weighting attribute? (See TODOs)

@kohr-h
Copy link
Member Author

kohr-h commented Oct 5, 2017

Ping @matthiasje this may be interesting to you, too. You also had trouble with unscaled adjoints, right?

@mehrhardt
Copy link
Contributor

Yes, in the most elegant definition of TGV I would like to use weighted spaces for the second derivative. It then needs proper adjoints and prox operators. This should be used in
examples/solvers/pdhg_denoising_tgv.py. What would be an easy way for me to test this? Can I checkout your branch?

@kohr-h
Copy link
Member Author

kohr-h commented Oct 6, 2017

Sure, you can do

git remote add kohr-h https://github.com/odlgroup/odl.git
git fetch kohr-h
git checkout issue-1068__ajoint_weighting  # sorry about the typo :-)

@mehrhardt
Copy link
Contributor

@kohr-h, I tried the TGV example again with weighted spaces. It does not work as weighting is not implemented for product spaces?
NotImplementedError: weighted spaces not supported

@kohr-h
Copy link
Member Author

kohr-h commented Oct 12, 2017

What happens if you remove that check and add the @auto_weighting decorator to ProductSpaceOperator.adjoint?

@mehrhardt
Copy link
Contributor

I tried the following but it did not help significantly. It does then result in other errors about things not being well-defined (see below).

+++ b/odl/operator/pspace_ops.py
@@ -18,6 +18,7 @@ from odl.operator.operator import Operator
 from odl.operator.default_ops import ZeroOperator
 from odl.space import ProductSpace
 
+from odl.space.space_utils import auto_weighting
 
 __all__ = ('ProductSpaceOperator',
            'ComponentProjection', 'ComponentProjectionAdjoint',
@@ -148,15 +149,11 @@ class ProductSpaceOperator(Operator):
             if not isinstance(domain, ProductSpace):
                 raise TypeError('`domain` {!r} not a ProductSpace instance'
                                 ''.format(domain))
-            if domain.is_weighted:
-                raise NotImplementedError('weighted spaces not supported')
 
         if range is not None:
             if not isinstance(range, ProductSpace):
                 raise TypeError('`range` {!r} not a ProductSpace instance'
                                 ''.format(range))
-            if range.is_weighted:
-                raise NotImplementedError('weighted spaces not supported')
 
         # Convert ops to sparse representation
         self.ops = scipy.sparse.coo_matrix(operators)
@@ -312,6 +309,7 @@ class ProductSpaceOperator(Operator):
         return ProductSpaceOperator(deriv_matrix, self.domain, self.range)
 
     @property
+    @auto_weighting
     def adjoint(self):
         """Adjoint of this operator.
normE = E.norm(estimate=True)
normE2 = E2.norm(estimate=True)


Traceback (most recent call last):

  File "<ipython-input-2-b66a9577355f>", line 1, in <module>
    normE = E.norm(estimate=True)

  File "/home/ehrhardt/repositories/git_myODL/odl/operator/operator.py", line 751, in norm
    return power_method_opnorm(self, **kwargs)

  File "/home/ehrhardt/repositories/git_myODL/odl/operator/oputils.py", line 229, in power_method_opnorm
    op.adjoint(tmp, out=x)

  File "/home/ehrhardt/repositories/git_myODL/odl/operator/operator.py", line 686, in __call__
    result = self._call_in_place(x, out=out, **kwargs)

  File "/home/ehrhardt/repositories/git_myODL/odl/space/space_utils.py", line 421, in _call
    x = new_ran_w * x

TypeError: unsupported operand type(s) for *: 'numpy.ndarray' and 'ProductSpaceElement'

@mehrhardt
Copy link
Contributor

Maybe it would be the best if you have a look yourself. I tried to run pdhg_denoising_tgv.py with the following modification after the definition of derivates:

# Create symmetrized operator and weighted space.
# TODO: As the weighted space is currently not supported in ODL we find a
# workaround.
W = odl.ProductSpace(U, 3, weighting=[1, 1, 2])
E = odl.operator.ProductSpaceOperator(
    [[Dx, 0], [0, Dy], [0.5*Dy, 0.5*Dx]], range=W)

E2 = odl.operator.ProductSpaceOperator(
    [[Dx, 0], [0, Dy], [0.5 * Dy, 0.5 * Dx], [0.5 * Dy, 0.5 * Dx]])
W2 = E.range

normE = E.norm(estimate=True)
normE2 = E2.norm(estimate=True)

@kohr-h
Copy link
Member Author

kohr-h commented Oct 18, 2017

Without actually looking, I believe this has to do with #1062 in some sense. What happens here is that the code tries to multiply the result, a product space element, with a Numpy array that has the weights, one for each product space component. That doesn't work, unfortunately, which is part of a bigger issue.

I can't find the discussion we had before, but let me make a suggestion how to solve this and #1062, and probably some other issues, too. I would suggest this:

>>> pspace = odl.rn(2) * odl.rn(3)
>>> z = pspace.one()
>>> z * [1, 2]  # should work, doesn't currently
ProductSpace(rn(2), rn(3)).element([
    [1.0, 1.0],
    [2.0, 2.0, 2.0]
])
>>> z * np.array([1, 2])  # same as above
ProductSpace(rn(2), rn(3)).element([
    [1.0, 1.0],
    [2.0, 2.0, 2.0]
])
>>> pspace = odl.rn(2) ** 2
>>> z = pspace.one()
>>> z * np.array([1, 2])[None, :]  # broadcast along pspace axis, now produces Numpy array
ProductSpace(rn(2), rn(3)).element([
    [1.0, 2.0],
    [1.0, 2.0]
])

The principle would be as follows:

  • For non-power product spaces, only 1D arrays of the same length as the space make sense. So the only sensible thing is to do the equivalent of z[i] * arr[i] for i in range(length).
  • Since power spaces are just a special case, the same thing must work for them.
  • To broadcast in a different manner, one has to explicitly add empty axes into the array, and the same broadcasting rules as in Numpy apply.

That would make it unambiguous as to how to interpret

>>> pspace = odl.rn(2) ** 2
>>> z = pspace.one()
>>> z * [1, 2]  # along axis 0 or axis 1??

It would always be axis 0, and for axis 1 one would have to do z * [[1, 2]] or with Numpy, z * np.array([1, 2])[None, :]. And it would solve the current issue that the latter variant produces a Numpy array, which I think is a bug.

@kohr-h kohr-h mentioned this pull request Oct 19, 2017
@kohr-h
Copy link
Member Author

kohr-h commented Oct 23, 2017

In view of the current issue with product spaces, I think I'll implement a workaround for that case to move things forward.

@kohr-h kohr-h force-pushed the issue-1068__ajoint_weighting branch from 85c64fb to e7c7da4 Compare November 11, 2017 15:20
@kohr-h
Copy link
Member Author

kohr-h commented Nov 11, 2017

In view of the current issue with product spaces, I think I'll implement a workaround for that case to move things forward.

Will do that now.

@kohr-h
Copy link
Member Author

kohr-h commented Nov 11, 2017

Now the decorator works also for operators on product spaces. Needs a proper review now. @mehrhardt if you like, you can test your code again.

Anyway, I think this is a rather important issue since many adjoints are wrong as of now, we should give it some priority.

Copy link
Member

@adler-j adler-j left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some overall questions:

Does this commute with @property?

How does this handle adjoints that are defined in terms of other operators?

Shouldn't this be a operator util instead of a space util?

@@ -0,0 +1,58 @@
"""Example demonstrating the usage of the ``auto_weighting`` decorator."""
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This needs to be somewhat fleshed out, especially given that new users come here

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can do

@@ -281,6 +284,303 @@ def rn(shape, dtype=None, impl='numpy', **kwargs):
return rn


class auto_weighting(OptionalArgDecorator):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggest rename to auto_adjoint_weighting to make this more clear

``M: R^n -> R^m``, and its unweighted adjoint is defined by the
transposed matrix ``M^T: R^m -> R^n``.

Similarly, when an integration operator ``A: L^2(I) -> R`` on an
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wouldnt always using the R^n -> R style be better?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What do you mean by "using"? In this example?


Parameters
----------
unweighted_adjoint : `Operator`
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't the type of this be "method" or something like that?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Technically yes. I'll change it.

elif dom_w_type == 'const' and ran_w_type == 'const':
if unweighted_adjoint.domain.size < unweighted_adjoint.range.size:
new_dom_w = 1.0
new_ran_w = ran_w / dom_w
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are we sure this always makes sense? What about operators C -> R etc?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, because weightings can never be complex. (Note to self: make sure that this is checked.)

Copy link
Member

@adler-j adler-j left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some further reviewing and a question.

Could this somehow become buggy if adjoint is an attribute instead of a method?

return ScalingOp(self.range, self.domain, self.c)

op1 = ScalingOp(rn, rn, 1.5)
op2 = ScalingOp(rn_w, rn_w, 1.5)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Prefer not using 1.5 everywhere

def test_auto_weighting_cached_adjoint():
"""Check if auto_weighting plays well with adjoint caching."""
rn = odl.rn(2)
rn_w = odl.rn(2, weighting=2)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use a "random" value here

def __init__(self, dom, ran, c):
super(InvalidScalingOp, self).__init__(dom, ran, linear=True)
self.c = c
self._adjoint = None
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What does this line do?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That must be copy-paste from the caching test, should be removed.

@kohr-h
Copy link
Member Author

kohr-h commented Dec 17, 2017

Could this somehow become buggy if adjoint is an attribute instead of a method?

That cannot happen since adjoint is a property on Operator and cannot be overwritten by an attribute:

>>> class A(odl.Operator):
...     def __init__(self):
...         super(A, self).__init__(odl.rn(2), odl.rn(2))
...         self.adjoint = None
>>>
>>> a = A()
Traceback (most recent call last):
    [...]

    self.adjoint = None

AttributeError: can't set attribute

@kohr-h
Copy link
Member Author

kohr-h commented Dec 17, 2017

Does this commute with @Property?

No. I tried to hack it in, but the wrapping does not seem to work as expected. I'll leave it as it is, but I can improve the error message to give users a hint.

@kohr-h
Copy link
Member Author

kohr-h commented Dec 17, 2017

How does this handle adjoints that are defined in terms of other operators?

Not sure what you mean. Like compositions and such?

@kohr-h
Copy link
Member Author

kohr-h commented Dec 17, 2017

Shouldn't this be a operator util instead of a space util?

Yes I guess so.

@kohr-h kohr-h force-pushed the issue-1068__ajoint_weighting branch from 8a39fc6 to 77e1fba Compare December 17, 2017 22:30
@kohr-h kohr-h moved this from Needs review to In revision in PR Status Dec 18, 2017
@kohr-h kohr-h force-pushed the issue-1068__ajoint_weighting branch from 3fe7cec to 962a8bc Compare December 24, 2017 11:14
@kohr-h kohr-h force-pushed the issue-1068__ajoint_weighting branch from f67f8cf to 9a0a073 Compare February 15, 2018 22:06
@kohr-h kohr-h mentioned this pull request Jun 30, 2018
21 tasks
@kohr-h kohr-h mentioned this pull request Aug 27, 2018
3 tasks
@kohr-h
Copy link
Member Author

kohr-h commented Aug 27, 2018

Current status here is that there's some mysterious issue when adjoint defers to inverse.

I'll think a bit about better ways to do this, without doing brain surgery on the operator returned by adjoint. Maybe a metaclass after all.

@kohr-h kohr-h added this to In Progress in Release 1.0.0 Sep 11, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
PR Status
In revision
Release 1.0.0
  
In Progress
Development

Successfully merging this pull request may close these issues.

None yet

4 participants