Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[NOTICE] ABI Update For adding Version to DLPack #110

Open
tqchen opened this issue Oct 8, 2022 · 10 comments
Open

[NOTICE] ABI Update For adding Version to DLPack #110

tqchen opened this issue Oct 8, 2022 · 10 comments

Comments

@tqchen
Copy link
Member

tqchen commented Oct 8, 2022

Dear DLPack community:

After quite a bit of discussions and coordinations, we are planning to do a ABI breaking change to add versioning and read only field to DLPack. DLPack has been a minimum stable ABI for exchanging Array data and we would like for it to continue stay that way.

In the meantime, we would like to have opportunities to carefully evolve DLPack, of course in still carefully considered manner. After long discussions, we have decided to make the following change.

  • Add a new data structure DLManagedTensorVersioned, which contains a new version field
  • Add DLPackVersion struct to indicate both api and abi versions.
  • Add flags field for boolean options, with support for READ_ONLY bit mask

We also propose to change Data API exchange protocol, to allow new versions of DLPack to return capsule with name "vdltensor"(instead of the old "dltensor"), this would .

The change is still ABI compatible, as the new data structure ABI comes with the new class DLManagedTensorVersioned. The data api however would involve an ABI update and update of all DLPack importer/exporters.

Such a move certainly impact a lot of packages and we would like to plan it carefully, as a result, we would like to have at least one month of notice to let everyone chime in, also see if we have enough volunteers to help update the data api exchanges in various packages.

struct DLManagedTensor {
   DLTensor dl_tensor;
   void * manager_ctx;
   void (*deleter)(struct DLManagedTensor * self);
};

/*!
 * \brief The DLPack and DLPack ABI versions of the tensor.
 */
typedef struct {
  /*! \brief DLPack version. */
  uint32_t dlpack;
  /*! \brief DLPack ABI version. */
  uint32_t abi;
} DLPackVersion;

struct DLManagedTensorVersioned {
    DLPackVersion version;
    void * manager_ctx;
    void (*deleter)(struct DLManagedTensorVersioned * self);
    uint64_t flags;
    DLTensor dl_tensor;
}

typedef DLPACK_BIT_MASK_READ_ONLY  1
  • DataAPI change: PyCapsule - rename to vdltensor, so there is proper error messages because of name mismatch
@tqchen
Copy link
Member Author

tqchen commented Oct 8, 2022

cc @leofang @rgommers @oleksandr-pavlyk @tirthasheshpatel @seberg @kkraus14 @wjakob

@rgommers
Copy link
Contributor

Thanks @tqchen, great to see this move forward.

also see if we have enough volunteers to help update the data api exchanges in various packages.

DataAPI change: PyCapsule - rename to vdltensor, so there is proper error messages because of name mismatch

I can open a PR with a proposed update to the spec once this notice period closes, and coordinate getting the implementations updated. IIRC that's a two-step process, first update the from_dlpack side to understand both old and new name, and then update the exporters to switch to the new name some time later. I'll check for the details once we're actually doing this.

@leofang
Copy link
Contributor

leofang commented Nov 3, 2022

I've raised for awareness in today's array API meeting (because I forgot two weeks ago...) and asked for help to spread the words.

@tqchen
Copy link
Member Author

tqchen commented Nov 9, 2022

PR that implements the change #113

@wjakob
Copy link

wjakob commented Nov 9, 2022

It looks great! The issue+PR make the memory layout of the data structures perfectly clear, but it may also be worth addressing the Python bits. For example, tensor types in array frameworks currently implement a __dlpack__ method that returns a capsule with a "dltensor" identifier. If some packages now switch over to "vdltensor", it will introduce breakage that may be considered problematic in a transition to versioned DLpack (For example, suppose that CuPy and PyTorch adopt the new interface with a half-year delay and cannot talk to each other in the meantime -- that would be bad!). My suggestion would be that versioned tensors are returned by a new __vdlpack__ method implemented in addition to the existing __dlpack_. That way, the old interface can be phased out once every tool can deal with the versioned variant.

@wjakob
Copy link

wjakob commented Nov 9, 2022

My other request is that the PR adds some explanation of what to do when loading a DLManagedTensorVersioned that has a mis-matched DLPack or ABI version (when comparing the versions of the header file available in the tensor provider and tensor consumer). As the spec evolves, this might become a very common situation. What is allowed? What is disallowed?

@tqchen
Copy link
Member Author

tqchen commented Nov 9, 2022

Thanks @wjakob , would love to see what others think about __vdlpack__ function, this would primarily be data-api decision.

cc @leofang @rgommers @oleksandr-pavlyk @tirthasheshpatel @seberg @kkraus14

@seberg
Copy link
Contributor

seberg commented Nov 10, 2022

Can we move the discussion about Python somewhere else? I have to think about it a bit more, and I think it would also make sense to think about it a bit more from the C perspective. The reason is what Keith allured to: Currently Python is ahead of C in defining the exchange protocol. For example, it has a protocol around "streams" (I figured out what I don't quite love about it, is that we pass stream without the device).
But if we have a clear picture for an exchange protocol in C, that should inform the Python one.

Generally, I am not in favor of a new name, I don't see what it adds compared to renaming the capsule only. But passing version= or even the allocated struct are other things that could be considered.

@leofang
Copy link
Contributor

leofang commented Nov 10, 2022

Agreed with Sebastian, it'd be better if we focus on the C perspective here (we are talking about ABI breaks, after all), and have the Python discussion in a separate issue. I also think the existing name is fine, I thought this was actually discussed somewhere in this repo, I need to look in up.

@tqchen
Copy link
Member Author

tqchen commented Nov 10, 2022

Happy to move the python discussion

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

5 participants