Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Specify DLPack helper C APIs #74

Open
leofang opened this issue May 28, 2021 · 5 comments
Open

Specify DLPack helper C APIs #74

leofang opened this issue May 28, 2021 · 5 comments

Comments

@leofang
Copy link
Contributor

leofang commented May 28, 2021

In the recent discussions scattered everywhere, it appears that some functions should better be just implemented by DLPack so that the downstream libraries do not have to reinvent the wheels. The possibilities include

  1. tensor deleter (see the discussion starting Specify Python embedding of DLPack tensors #51 (comment))
  2. query if CPU-accessible (see Add kDLROCMHost and kDLCUDAManaged #71 (comment))
  3. C APIs corresponding to __dlpack__ and __dlpack_device__ in the Python Array API standard for handing streams (Standardize C interface for stream exchange #65)
  4. C API that could be exposed as a new Python attribute __dlpack_info__ for returning API and ABI versions (and potentially more, see Future ABI compatibility  #34, Add ABI version. #72)

cc: @tqchen @rgommers @seberg @eric-wieser @kkraus14 @jakirkham @hameerabbasi @vadimkantorov @oleksandr-pavlyk @szha @veritas9872

@leofang
Copy link
Contributor Author

leofang commented Jun 4, 2021

cc: @kmaehashi @emcastillo @dalcinl (for vis)

@leofang
Copy link
Contributor Author

leofang commented Jul 28, 2021

Hi guys, I am bringing some steroid to stimulate progress toward closing this issue!

  1. C API that could be exposed as a new Python attribute __dlpack_info__ for returning API and ABI versions (and potentially more, see Future ABI compatibility  #34, Add ABI version. #72)

The need to retrieve API/ABI versions has been brought up repeatedly in several occasions that I am aware of, including in a recent weekly Array API call. Furthermore, @emcastillo has kindly demonstrated a toy case to show that #71 can be a backward incompatible change:

import torch
import torch.utils.dlpack
import cupy

# Use managed memory
cupy.cuda.set_allocator(cupy.cuda.MemoryPool(cupy.cuda.malloc_managed).malloc)

def get_torch_managed(size):
    return torch.utils.dlpack.from_dlpack(cupy.empty(size).toDlpack())

a = get_torch_managed((3, 2, 1))

The code here uses CuPy to allocate CUDA managed memory and exposes it to PyTorch. The problem is that before #71 landed in DLPack v0.6 managed memory could be disguised as normal device memory by library developers in order to facilitate the exchange. However, if the Producer (CuPy here) supports kDLCUDAManaged from #71 but the Consumer (PyTorch here) does not, the above code snippet breaks during hand-shaking as the Consumer does not recognize it.

tl;dr: We really need to be able to query the supported API/ABI version.

@seberg
Copy link
Contributor

seberg commented Jul 28, 2021

tl;dr: We really need to be able to query the supported API/ABI version.

In the example wouldn't you also need to request the API version (as in "Please give me the struct, with a max version of X.Y")? The exporter (which supports the newer version) must know that the importer is stuck on the older one.

As helpful as signalling the exported version is, it does not replace a Python side (i.e. __dlpack__ spec) for requesting an older version, that would be necessary to do more than raise an informative error that the exported version is too new.

@leofang
Copy link
Contributor Author

leofang commented Aug 13, 2021

Sorry @seberg I dropped the ball...

as in "Please give me the struct, with a max version of X.Y"

I am not sure if I understand it correctly. I assume you meant if the Consumer only supports v0.5, the Producer (on v0.6) should be able to export a v0.5 struct. But, wouldn't this lead to a huge maintenance burden to keep track of all versions of DLPack as time goes by?

I would incline to have an error raised here, but currently the errors raised in each library for such a scenario is cryptic; as of now the informative error ("the exported version is too new") you said is not possible to generate due to lack of API version-query...

@seberg
Copy link
Contributor

seberg commented Aug 13, 2021

Yeah, that is what I thought. I am not sure that in this case the "usupported (new?) device" error would be worse than "incompatible version", but then you could argue that adding devices could just be a minor version bump anyway.

Even without requesting, an exporter could choose to export an old version for easier compatibility (if it has feature parity for the use-case). I guess requesting a version could tape over some transitions, but not sure that it is important.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants