Import DLPack tensors directly into NumPy (without going via PyTorch or TF) #55

Open
vadimkantorov opened this issue Jul 28, 2020 · 9 comments

Comments

@vadimkantorov

vadimkantorov commented Jul 28, 2020

I made an experimental wrapper: https://github.com/vadimkantorov/pydlpack/blob/master/dlpack.py#L107

The most difficult part is managing memory / capsules. Currently it's sort of move-semantics (and deallocation is done in C). I'm sure you'd be able to do it better.

It would be a nice illustration, in addition to the existing example of borrowing from NumPy.

A more complete usecase of mine: https://github.com/vadimkantorov/readaudio

@vadimkantorov
Author

I guess for proper refcounting-like semantics (so that NumPy doesn't call the deleter too early while other array views still exist) something like weakref would be needed: https://stackoverflow.com/questions/37988849/safer-way-to-expose-a-c-allocated-memory-buffer-using-numpy-ctypes, but I'm not completely sure.
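As a minimal sketch of the weakref idea mentioned above (not the code from pydlpack or the linked answer), the snippet below attaches a finalizer to a NumPy array that views foreign memory. The ctypes buffer stands in for DLPack-managed memory, and `wrap_foreign_buffer` and the `released` list are hypothetical names used only for illustration. It relies on CPython reference counting and on views keeping their parent array alive through `.base`.

```python
import ctypes
import gc
import weakref

import numpy as np

released = []  # records deleter calls so the behavior can be inspected


def wrap_foreign_buffer(n):
    """Return a float32 array viewing a ctypes buffer (a stand-in for DLPack memory)."""
    buf = (ctypes.c_float * n)()
    arr = np.frombuffer(buf, dtype=np.float32)
    # Hypothetical deleter: runs only once `arr` itself is garbage collected.
    weakref.finalize(arr, released.append, "deleter called")
    return arr


arr = wrap_foreign_buffer(8)
view = arr[2:5]                # the view keeps `arr` alive through view.base
del arr
gc.collect()
still_alive = list(released)   # snapshot taken while the view still exists
del view
gc.collect()                   # now nothing references the wrapping array
```

In this sketch the finalizer is tied to the wrapping array, so a real deleter would run only after the last view is gone.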

@junrushao
Member

Zero-copy borrowing from NumPy is not difficult; it does not have to involve weakref or capsules. I have some examples here: https://github.com/dmlc/dlpack/blob/master/apps/from_numpy/main.py.
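For readers who want a feel for what borrowing from NumPy involves, here is a small sketch that is independent of the linked script. It only extracts the pieces a DLPack tensor descriptor would need (data pointer, shape, strides) with ctypes; the variable names are illustrative. Note that NumPy reports strides in bytes, whereas DLPack expects them in elements.

```python
import ctypes

import numpy as np

arr = np.arange(6, dtype=np.int64).reshape(2, 3)

# Raw pointer to the array's memory; no copy is made.
data_ptr = arr.ctypes.data_as(ctypes.POINTER(ctypes.c_int64))

# Shape as a C array of int64, the layout a DLTensor-style struct would hold.
shape = (ctypes.c_int64 * arr.ndim)(*arr.shape)

# NumPy strides are in bytes; convert them to element counts.
strides_in_elements = tuple(s // arr.itemsize for s in arr.strides)

# Writing through the pointer is visible in the array, confirming zero-copy.
data_ptr[0] = 42
```

The array must stay alive for as long as anything uses `data_ptr`, which is exactly the lifetime question discussed in this thread.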

@szha
Member

szha commented Aug 4, 2020

I think for the case of zero-copy into NumPy, if the original array doesn't give up ownership of the data buffer, we do need to make sure that NumPy doesn't release the buffer. I thought this would be something that the OWNDATA flag in NumPy arrays already deals with (judging from the name), though I haven't looked into the details yet.
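A quick way to inspect the OWNDATA flag is sketched below. A `bytearray` stands in for memory owned by some other library; this is an illustration of the flag, not a DLPack import. The flag tells NumPy not to free the memory itself, while keeping the real owner alive is handled separately through the array's `.base` reference.

```python
import numpy as np

owner = bytearray(16)                            # memory owned by another object
borrowed = np.frombuffer(owner, dtype=np.uint8)  # zero-copy view of that memory
owned = np.zeros(16, dtype=np.uint8)             # memory allocated by NumPy

borrowed[0] = 7                                  # writes through to `owner`

print(borrowed.flags.owndata, owned.flags.owndata)
```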

@vadimkantorov
Author

Yeah. It shouldn't release the buffer, and it shouldn't call the deleter either while other arrays still reference the data (ideally it should also work when torch.from_numpy is called on such a NumPy array).

@junrushao
Member

A quick heads-up: we prototyped a simple pure-Python library that allows zero-copy exchange between DLPack-compatible array APIs and NumPy ndarrays: https://github.com/jwfromm/numpy_dlpack. Lifetime and ownership are properly taken care of, unless we missed something.

Do you guys think we should contribute the implementation to this repo?

@rgommers
Contributor

Thanks for sharing @junrushao1994.

> Do you guys think we should contribute the implementation to this repo?

I'm not sure that will be helpful in the long run, or whether it's worth spending time reviewing that all the corner cases are handled correctly (from a quick scan of your code, I'd say there'll be a few things it doesn't handle). We just need to finish numpy/numpy#19083, which implements DLPack support in NumPy itself.

@junrushao
Member

junrushao commented Oct 10, 2021

Thank you @rgommers! Yeah, I believe numpy/numpy#19083 is definitely a nicer way to let NumPy interact with DLPack natively, and of course in the long run we should go all in on the NumPy-native approach that PR brings :-)

Alternatively, this repo could serve as a pure-Python example of exchanging data with any NumPy-like array using DLPack in a non-intrusive way.

Here is my proposal:

  • Contribute dlpack.py to python/dlpack/dlpack.py, so that it can be shared across codebases
  • Contribute from_numpy.py and to_numpy.py to python/dlpack/ so that they can help when NumPy's DLPack interface doesn't exist
  • Complete the scripts by detecting whether NumPy's ndarray has the __dlpack__ or from_dlpack APIs. If so, use the NumPy-native APIs; otherwise, fall back to this non-intrusive approach
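The detection step in the last bullet could look roughly like the sketch below. The helper name `to_numpy` and its `fallback` parameter are hypothetical; the fallback stands in for the ctypes-based path and is not the repository's actual to_numpy.py. The sketch also checks `np._from_dlpack`, the private name mentioned in the NumPy 1.22.0 release notes.

```python
import numpy as np


def to_numpy(obj, fallback):
    """Convert `obj` to an ndarray, preferring NumPy's native DLPack import.

    `fallback` is a hypothetical callable implementing the non-intrusive path.
    """
    # Newer NumPy exposes from_dlpack; 1.22 exposed the private _from_dlpack.
    native = getattr(np, "from_dlpack", None) or getattr(np, "_from_dlpack", None)
    if native is not None and hasattr(obj, "__dlpack__"):
        return native(obj)
    return fallback(obj)


x = np.arange(4)
# np.asarray is used here only as a placeholder fallback for the demo.
y = to_numpy(x, fallback=np.asarray)
```

Either branch is expected to return an array that shares memory with `x` in this demo, since a NumPy array is its own simplest DLPack producer.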

@vadimkantorov
Author

Hmm. I now see that this ctypes example is committed! Good news. One difference from my https://github.com/vadimkantorov/pydlpack/blob/master/dlpack.py#L107 is that when I created the array_interface from a DLPack tensor, I included a way to call the wrapped dl_managed_tensor.deleter when the NumPy array needs to be destroyed. This piece seems to be missing from to_numpy.py?

@jakirkham

jakirkham commented Jun 8, 2023

Am seeing this dlpack mention in the NumPy 1.22.0 release notes:

Add NEP 47-compatible dlpack support

Add a ndarray.__dlpack__() method which returns a dlpack C structure wrapped in a PyCapsule. Also add a np._from_dlpack(obj) function, where obj supports __dlpack__(), and returns an ndarray.

(gh-19083)

Given NumPy now supports this, should we close?

5 participants