Emulate in-place operators like += by replacing layout #3084

Open
jpivarski opened this issue Apr 18, 2024 · 3 comments
Labels
feature New feature or request

Comments

@jpivarski
Member

Description of new feature

Although Awkward layouts can't be updated in place, high-level ak.Array objects are a slightly mutable layer on top of the immutable core. For example, ak.Array.__setitem__ allows fields to be added by replacing the ak.Array.layout with the result of ak.with_field.

Sometimes, users want to use operators like +=, -=, *=, etc. for conciseness. We currently don't allow

>>> one = ak.Array([[1.1, 2.2, 3.3], [], [4.4, 5.5]])
>>> two = ak.Array([[10, 20, 30], [], [40, 50]])
>>> one += two

and the way we report the error is like this:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/jpivarski/irishep/awkward/src/awkward/_operators.py", line 75, in func
    return ufunc(self, other, out=(self,))
TypeError: operand type(s) all returned NotImplemented from __array_ufunc__(<ufunc 'add'>, '__call__', <Array [[1.1, 2.2, 3.3], [], [4.4, 5.5]] type='3 * var * float64'>, <Array [[10, 20, 30], [], [40, 50]] type='3 * var * int64'>, out=(<Array [[1.1, 2.2, 3.3], [], [4.4, 5.5]] type='3 * var * float64'>,)): 'Array', 'Array', 'Array'

That is, we don't support the out argument of ufuncs. The connection between += and np.add with out can be unclear, though. What we're doing is following a regular rule, but the consequences are obscure to anyone who doesn't know this level of detail about NumPy ufuncs, or even the fact that binary operators dispatch to NumPy ufuncs in the first place.
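The link between += and the out argument can be reproduced without Awkward. The following is a minimal sketch, assuming a hypothetical NoOutArray class (not part of Awkward or NumPy) that inherits NumPy's NDArrayOperatorsMixin and, like ak.Array, declines any ufunc call that passes out:

```python
import numpy as np
from numpy.lib.mixins import NDArrayOperatorsMixin


class NoOutArray(NDArrayOperatorsMixin):
    """Hypothetical stand-in for ak.Array: supports ufuncs, but not `out=`."""

    def __init__(self, data):
        self.data = np.asarray(data)

    def __array_ufunc__(self, ufunc, method, *inputs, out=None, **kwargs):
        if out is not None:
            # Declining here is what makes NumPy raise the TypeError.
            return NotImplemented
        args = [x.data if isinstance(x, NoOutArray) else x for x in inputs]
        return NoOutArray(getattr(ufunc, method)(*args, **kwargs))


one = NoOutArray([1.1, 2.2, 3.3])
two = NoOutArray([10, 20, 30])

# Binary +: the mixin calls np.add(one, two) with no `out`, so it succeeds.
three = one + two

# In-place +=: the mixin calls np.add(one, two, out=(one,)), which is declined.
try:
    one += two
    error = None
except TypeError as err:
    error = err
```

Here, one + two works because the mixin never passes out, while one += two fails because the mixin's __iadd__ always does.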

This feature request is to add

    def __iadd__(self, other):
        self.layout = (self + other).layout
        return self

and similar methods (__isub__, __imul__, etc.) to the high-level ak.Array, instead of letting the NDArrayOperatorsMixin formalism have its way. It should probably be done in src/awkward/_operators.py, replacing

def _inplace_binary_method(ufunc, name):
    """Implement an in-place binary method with a ufunc, e.g., __iadd__."""
    def func(self, other):
        return ufunc(self, other, out=(self,))
    func.__name__ = f"__i{name}__"
    return func
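
One possible shape for the replacement helper is sketched below. This is only an illustration under stated assumptions, not the actual Awkward implementation: ToyArray is a hypothetical stand-in for ak.Array, and its layout is a read-only NumPy array standing in for an immutable Awkward layout.

```python
import numpy as np
from numpy.lib.mixins import NDArrayOperatorsMixin


def _inplace_binary_method(ufunc, name):
    """Emulate an in-place binary method by replacing the layout."""

    def func(self, other):
        # Compute out of place, then adopt the result's layout.
        self.layout = ufunc(self, other).layout
        return self

    func.__name__ = f"__i{name}__"
    return func


class ToyArray(NDArrayOperatorsMixin):
    """Hypothetical stand-in for ak.Array with a read-only layout."""

    def __init__(self, data):
        layout = np.array(data)
        layout.flags.writeable = False  # mimic an immutable layout
        self.layout = layout

    def __array_ufunc__(self, ufunc, method, *inputs, out=None, **kwargs):
        if out is not None:
            return NotImplemented  # `out=` remains unsupported
        args = [x.layout if isinstance(x, ToyArray) else x for x in inputs]
        return ToyArray(getattr(ufunc, method)(*args, **kwargs))

    # Override the mixin's default __iadd__, which would pass out=(self,).
    __iadd__ = _inplace_binary_method(np.add, "add")


one = ToyArray([1.1, 2.2, 3.3])
alias = one
old_layout = one.layout

one += ToyArray([10, 20, 30])
```

After the +=, one and alias are still the same Python object and both see the new values, while old_layout is untouched: only the high-level wrapper changed, not the layout it used to hold.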

I'm not sure what happens to the copyright in that case (this file, alone, is copyright the NumPy developers because it was lifted from NumPy), since this one function would be different from the original NDArrayOperatorsMixin.


Keep in mind that this would mean that in-place operators like += would do nothing for performance (they would not reduce memory latency or total memory use); they would just be a syntactic convenience. But it would be in line with similar treatment in dask.array, dask-histogram, and soon dask-awkward.
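
For contrast, this is what a true in-place update buys in plain NumPy, used here only as an analogy: += on an ndarray writes into the existing buffer, whereas the out-of-place form allocates a new one. An emulated += that replaces the layout would behave like the second case.

```python
import numpy as np

# True in-place update: the existing buffer is reused.
x = np.array([1.0, 2.0, 3.0])
x_before = x
x += 1.0
reused = np.shares_memory(x, x_before)

# Out-of-place update followed by rebinding: a new buffer is allocated.
y = np.array([1.0, 2.0, 3.0])
y_before = y
y = y + 1.0
allocated_new = not np.shares_memory(y, y_before)
```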

@agoose77
Collaborator

I am mildly in favour of this, as long as it can be replicated in Dask (I don't think we want people to have to think too hard about which kind of array they're working with)!

@jpivarski
Member Author

Dask consistency motivated it: apparently, dask.array already does it. Issues with the same titles have been opened in Awkward and dask-awkward.

@agoose77
Collaborator

Oh, wonderful! It's nice that although it's user-facing mutability, we're still immutable (effectively) in our internals :)

@jpivarski added this to Set aside (don't do) in Finalization on May 7, 2024