
Add NumPy Scalars #3544

Open · wants to merge 262 commits into master
Conversation

sun1638650145

Description

Compared with using py::buffer, py::numpy_scalar provides more concise and easier-to-use scalar handling. I think PR #2060 is a very good solution, but it seems to have been abandoned, so I made a few small modifications and resubmitted it as a new PR.
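For context on what the PR is trying to preserve, here is a small Python-side sketch (py::numpy_scalar is the C++-side type this PR proposes; the sketch only illustrates the Python objects involved, not the binding itself):

```python
import numpy as np

# A NumPy scalar carries its exact dtype and width, which is the
# information a generic py::buffer-based binding tends to lose.
x = np.int32(42)
print(type(x).__name__)   # 'int32'
print(x.dtype)            # int32
print(x.itemsize)         # 4 (bytes)

# A plain Python int, by contrast, has no fixed width or dtype.
y = 42
print(type(y).__name__)   # 'int'
```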

@rwgk
Collaborator

rwgk commented Dec 10, 2021

Logging my thoughts after looking for a bit.

That's a lot of code (cost).
What's the benefit?

It makes sense to distinguish between ints of different sizes etc. for large arrays, but how often does that matter for scalars in real life? Passing a one-element array could do the trick, too. It is probably a minor nuisance that is worth spending time and code on only if it becomes widespread clutter.

I will wait for other maintainers to chime in.

@sun1638650145
Author

@rwgk I understand your thoughts, thanks for your suggestions.

@sun1638650145
Author

It is not certain whether this PR will be merged, so if you want to use this feature now, you can apply this patch directly.

@bjodah
Contributor

bjodah commented Feb 1, 2022

I won't speak to the cost/benefit ratio, but I thought I'd mention that I have found myself writing a small wrapper function like this:

template<typename T>
py::object create_numpy_scalar(T val) {
    // Requires an initialized NumPy C-API: call _import_array() before use.
    py::object dt = py::dtype::of<T>();
    PyObject *scal = PyArray_Scalar(&val, (PyArray_Descr *)dt.ptr(), py::int_(sizeof(T)).ptr());
    return py::reinterpret_steal<py::object>(scal);
}

to wrap e.g. a long double scalar to be passed to a Python function, e.g. def my_jacobian(some_scalar, some_vector, an_out_matrix). Of course, passing an ndarray with shape == (1,) as some_scalar, and writing an extra line of code in my_jacobian to convert it to a NumPy scalar (if needed for the semantics of that function), should work equally well, I suppose?
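That alternative, converting inside the callback, might look like the following sketch (my_jacobian and its body are hypothetical, matching the signature named above; the conversion line is the point):

```python
import numpy as np

def my_jacobian(some_scalar, some_vector, an_out_matrix):
    # The "extra line of code": collapse a shape-(1,) (or any
    # one-element) array to a NumPy scalar of the same dtype.
    some_scalar = np.asarray(some_scalar).reshape(())[()]
    # Hypothetical body: fill the output matrix in place.
    an_out_matrix[:] = some_scalar * np.outer(some_vector, some_vector)
    return an_out_matrix

v = np.array([1.0, 2.0], dtype=np.longdouble)
out = np.empty((2, 2), dtype=np.longdouble)
my_jacobian(np.array([3.0], dtype=np.longdouble), v, out)
print(out)   # [[3. 6.], [6. 12.]]
```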

@sun1638650145
Author

  1. The cost of ~300 lines seems very high, but in fact most of them are repetitive code to support the different-precision int, float and complex types (these macro definitions really can't be replaced by functions to reduce the line count).

  2. I understand the situation you mentioned. I previously used py::buffer to achieve the same functionality, but that approach has 3 obvious shortcomings:

    • When inspecting __doc__ on the Python side, the data type is lost; buffer is always displayed instead of e.g. np.int32.
    • Representing a scalar with a one-element array also hurts readability on the Python side.
    • The readability of the C++ code becomes extremely low, and it takes a lot of comments to explain why it is done this way.

@bjodah, thank you very much. My original idea was to shift this cost from the user to us; whether on the pybind side or the Python side, I hope the result is simple and easy to use. English is not my native language; please excuse any typos.

@ax3l
Collaborator

ax3l commented Feb 18, 2022

> That's a lot of code (cost).
> What's the benefit?

I have been bitten by this in the past as well and the current inconsistency for 0-dim data (scalars) makes generic ND data handling on API interfaces pretty wild/complicated.

As long as this goes in the separate numpy.h header, I think better numpy scalar support is reasonable to add.
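The 0-dim inconsistency ax3l refers to can be seen directly in NumPy: scalars and 0-dim arrays hold the same value but behave differently, which is why generic N-D handling needs special-case branches.

```python
import numpy as np

scalar = np.float64(3.0)   # a NumPy scalar
zero_d = np.array(3.0)     # a 0-dim ndarray with the same value

print(np.isscalar(scalar), np.isscalar(zero_d))   # True False
print(scalar.ndim, zero_d.ndim)                   # 0 0
print(isinstance(scalar, np.ndarray))             # False
print(isinstance(zero_d, np.ndarray))             # True

# Indexing a 0-dim array with an empty tuple recovers the scalar form:
print(type(zero_d[()]).__name__)                  # 'float64'
```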

@henryiii
Collaborator

I'm personally fine with following numpy more closely (that is, adding features already supported by numpy, within reason).

@rwgk
Collaborator

rwgk commented Feb 18, 2022

> As long as this goes in the separate numpy.h header, I think better numpy scalar support is reasonable to add.

Thanks for pointing that out, @ax3l; I hadn't given it enough weight previously.

I see the CI is currently very unhappy (42 failing). @sun1638650145, after I see you've got it back to green, I'll run the Google global testing with this PR. Could you please tag me with a "CI All Green" message or similar when it's ready?

@sun1638650145
Author

> As long as this goes in the separate numpy.h header, I think better numpy scalar support is reasonable to add.
>
> Thanks for pointing out @ax3l, I hadn't given that enough weight previously.
>
> I see the CI is currently very unhappy (42 failing). @sun1638650145, after I see you got it back to green, I'll run the Google global testing with this PR. Could you please tag me with an CI All Green message or similar when it's ready?

OK, I'll try to fix the bug as soon as I can.
