[protocol] Clarify the exact meaning of the buffer's dtype (Dtype tuple in get_buffers()) #273

Open
jorisvandenbossche opened this issue Oct 3, 2023 · 0 comments · May be fixed by #272

Comments

@jorisvandenbossche
Member

If my assumption is incorrect

Just as a sanity check I checked, and pandas/vaex/cuDF/modin all return the type described in Column.dtype for the get_buffers() values. A TODO I realize for dataframe-interchange-tests is to generalise test_dtype and use it in test_get_buffers.

Note that this is actually not correct, depending on how you interpret it. Yes, the buffer's dtype is expressed as the same kind of Dtype tuple, but it should not necessarily be the same tuple that Column.dtype returns, as the buffer can have a different dtype than the column.
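To make the distinction concrete, here is a minimal sketch of how the two tuples could differ under the "buffer describes its own storage" interpretation. It does not call any real implementation: the `DtypeKind` enum is a local stand-in whose values are assumed to mirror the protocol's enum, the format strings are Arrow C format strings, and all variable names are hypothetical.

```python
from enum import IntEnum


class DtypeKind(IntEnum):
    # Local stand-in; values are assumed to mirror the protocol's DtypeKind enum.
    INT = 0
    UINT = 1
    FLOAT = 2
    BOOL = 20
    STRING = 21
    DATETIME = 22
    CATEGORICAL = 23


# Each tuple is (kind, bit width, Arrow C format string, endianness).

# A string column: logically UTF-8 strings, but the data buffer holds raw bytes.
string_column_dtype = (DtypeKind.STRING, 8, "u", "=")
string_data_buffer_dtype = (DtypeKind.UINT, 8, "C", "=")

# A categorical column with int8 codes: the data buffer holds the integer codes.
categorical_column_dtype = (DtypeKind.CATEGORICAL, 8, "c", "=")
categorical_data_buffer_dtype = (DtypeKind.INT, 8, "c", "=")

# A plain float64 column: here the column and its data buffer happen to agree.
float_column_dtype = (DtypeKind.FLOAT, 64, "g", "=")
float_data_buffer_dtype = (DtypeKind.FLOAT, 64, "g", "=")

pairs = {
    "string": (string_column_dtype, string_data_buffer_dtype),
    "categorical": (categorical_column_dtype, categorical_data_buffer_dtype),
    "float64": (float_column_dtype, float_data_buffer_dtype),
}

for name, (column_dtype, buffer_dtype) in pairs.items():
    # Report whether the buffer's dtype tuple matches the column's dtype tuple.
    print(name, "same" if column_dtype == buffer_dtype else "different")
```

Under this reading, a test that asserts the buffer's tuple always equals Column.dtype would only hold for simple numeric columns like the float64 case.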

It seems that we all interpreted this wrongly and all implementations got this wrong (or the text about "the data buffer's associated dtype" is wrong), see apache/arrow#37598, pandas-dev/pandas#54781, pola-rs/polars#10787 (and the same for StaticFrame mentioned above, from a quick look).

Originally posted by @jorisvandenbossche in #87 (comment)

@jorisvandenbossche jorisvandenbossche changed the title Clarify the exact meaning of the buffer's dtype (Dtype tuple in get_buffers()) [protocol] Clarify the exact meaning of the buffer's dtype (Dtype tuple in get_buffers()) Oct 12, 2023