Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

sparse + dask array not supported #5259

Open
shoyer opened this issue Aug 12, 2019 · 9 comments
Open

sparse + dask array not supported #5259

shoyer opened this issue Aug 12, 2019 · 9 comments
Labels

Comments

@shoyer
Copy link
Member

shoyer commented Aug 12, 2019

cc @hameerabbasi

dask+sparse works, but not sparse+dask.

I think dask should probably implement this case, via it's __array_ufunc__ method. Here's the traceback:

In [1]: import sparse

In [2]: import dask.array as da

In [3]: import numpy as np

In [4]: x = np.arange(3)

In [5]: y = sparse.COO(x)

In [6]: z = da.from_array(y, chunks=3)

In [7]: z + y
Out[7]: dask.array<add, shape=(3,), dtype=int64, chunksize=(3,)>

In [8]: y + z
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-8-ff96b8185f8d> in <module>
----> 1 y + z

~/miniconda3/envs/xarray-py37-dev/lib/python3.7/site-packages/numpy/lib/mixins.py in func(self, other)
     23         if _disables_array_ufunc(other):
     24             return NotImplemented
---> 25         return ufunc(self, other)
     26     func.__name__ = '__{}__'.format(name)
     27     return func

TypeError: operand type(s) all returned NotImplemented from __array_ufunc__(<ufunc 'add'>, '__call__', <COO: shape=(3,), dtype=int64, nnz=2, fill_value=0>, dask.array<array, shape=(3,), dtype=int64, chunksize=(3,)>): 'COO', 'Array'
@mrocklin
Copy link
Member

What do you think would be the right approach here? Call da.from_array on all non-dask arrays?

Currently we have this code, which seems problematic:

dask/dask/array/core.py

Lines 1106 to 1110 in 6f38240

def __array_ufunc__(self, numpy_ufunc, method, *inputs, **kwargs):
out = kwargs.get("out", ())
for x in inputs + out:
if not isinstance(x, (np.ndarray, Number, Array)):
return NotImplemented

I wonder if just removing that check would make things work

@mrocklin mrocklin added the array label Aug 12, 2019
@hameerabbasi
Copy link
Contributor

hameerabbasi commented Aug 12, 2019

I think probably adding a hasattr(..., '__array_ufunc__') would help here.

@shoyer
Copy link
Member Author

shoyer commented Aug 12, 2019 via email

@hameerabbasi
Copy link
Contributor

Maybe there needs to be a way to declare something as “ndarray-compatible”, this has come up before.

@shoyer
Copy link
Member Author

shoyer commented Aug 12, 2019

Maybe there needs to be a way to declare something as “ndarray-compatible”, this has come up before.

Right, but remember that other "ndarray-compatible" arrays might need to wrap dask arrays. For example, if someone wrote a duck type compatible version of autograd.

@jakirkham
Copy link
Member

What do you think would be the right approach here? Call da.from_array on all non-dask arrays?

We call da.asarray in other places to handle non-Dask Arrays. Seems reasonable to do it here too.

@shoyer
Copy link
Member Author

shoyer commented Aug 15, 2019

We call da.asarray in other places to handle non-Dask Arrays. Seems reasonable to do it here too.

yes, though with overriding numpy ufuncs we still have to be careful to explicitly skip types like xarray.DataArray

@hameerabbasi
Copy link
Contributor

Right, but remember that other "ndarray-compatible" arrays might need to wrap dask arrays. For example, if someone wrote a duck type compatible version of autograd.

Wouldn’t autograd be wrapped by Dask Array instead?

@jakirkham
Copy link
Member

We call da.asarray in other places to handle non-Dask Arrays. Seems reasonable to do it here too.

yes, though with overriding numpy ufuncs we still have to be careful to explicitly skip types like xarray.DataArray

This is what I'm hoping the asduckarray function will do for us in libraries like Dask. 😉

xref: numpy/numpy#13831 (for others interested in learning more)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

4 participants