Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

BUG: PyArray_BufferConverter is unsafe #15450

Open
eric-wieser opened this issue Jan 27, 2020 · 5 comments
Open

BUG: PyArray_BufferConverter is unsafe #15450

eric-wieser opened this issue Jan 27, 2020 · 5 comments
Labels
00 - Bug Project Possible project, may require specific skills and long commitment

Comments

@eric-wieser
Copy link
Member

eric-wieser commented Jan 27, 2020

This function:

  1. Calls PyObject_GetBuffer
  2. Extracts a data pointer
  3. Calls PyBuffer_Release
  4. Returns the pointer to the caller

PyBuffer_Release calls PyBufferProcs.bf_releasebuffer(PyObject *exporter, Py_buffer *view), which according to the docs may "free all memory associated with view", and gives no requirement that view remains alive as long as exporter.

So in principle a type that allocates a buffer for itself on the fly in bf_getbuffer and deletes it when the last uses releases it will cause a use-after-free in numpy.

I don't know if any implementers of the buffer protocol actually do this, but my reading of it is that they are permitted to.

@seberg
Copy link
Member

seberg commented Jan 27, 2020

This is strange. I think I would like if the buffer interface specified that while the buffer struct fields (such as strides, etc.) get free'd by PyBuffer_Release the actual memory pointed to must be owned by the original object (this would solve our problem with the very annoying _dealloc_cached_buffer_info). I think this is the intention and true, but...

Now this function returns a PyArray_Chunk and not a buffer, so if you make the clarification that I say above, it is actually completely fine.

@eric-wieser
Copy link
Member Author

Your interpretation sounds more reasonable - perhaps a pull request against python/cpython would be wise, to update the docs to tell people not to implement what I described above.

@seberg
Copy link
Member

seberg commented Jan 27, 2020

Was going to open, but then went to lunch first: https://bugs.python.org/issue39471 I may look into proposing actually changes to the text. If we can agree on my interpretation, we can clean up very ugly and very slow code in the buffer protocol implementation (it currently slows down scalar math by 20+%). (EDIT: But I suppose only after Python fixes their ArgParse code :()

@eric-wieser
Copy link
Member Author

That adds some clarity, thanks - I hadn't realized that python was using the same reading as bf_releasebuffer as I was.

@seberg
Copy link
Member

seberg commented Jan 27, 2020

Damn, I had overread that before (from the PEP):

Exporters will need to define a bf_releasebuffer function if they can re-allocate their memory, strides, shape, suboffsets, or format variables which they might share through the struct bufferinfo.

Which reads like it is valid to give the buffer a new chunk of memory :(.

Although: looking at the implementation of array (python one), all it does is use it to ensure that it disables resizing if a buffer to it exists.

@seberg seberg added sprintable Issue fits the time-frame and setting of a sprint Project Possible project, may require specific skills and long commitment and removed sprintable Issue fits the time-frame and setting of a sprint labels May 12, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
00 - Bug Project Possible project, may require specific skills and long commitment
Projects
None yet
Development

No branches or pull requests

2 participants