[FEA] Enable cp.asarray(cudf.RangeIndex) #15618

Open
isVoid opened this issue Apr 30, 2024 · 2 comments
Labels
feature request New feature or request

Comments

@isVoid
Contributor

isVoid commented Apr 30, 2024

Is your feature request related to a problem? Please describe.
Currently, attempting to explicitly materialize a cupy array from a cudf.RangeIndex via cp.asarray raises this error:

(Pdb) import cudf
(Pdb) cp.asarray(cudf.RangeIndex(0, 100))
*** TypeError: Implicit conversion to a host NumPy array via __array__ is not allowed, To explicitly construct a GPU matrix, consider using .to_cupy()
To explicitly construct a host matrix, consider using .to_numpy().

Offline discussion with @vyasr and @pentschev suggested that this usage should work transparently. The benefit is that cp.asarray(obj) would then work for all cudf objects.

Describe the solution you'd like
The most straightforward way is to enable RangeIndex.__array__, which is currently disabled. The rationale is that when __array__ is invoked, the intention to convert to a numpy array is clear. However, additional care is needed when it is invoked within a cuDF API. According to @vyasr, we should leverage the frame-tracking tooling to check whether __array__ is invoked internally by cuDF or externally. If internally, we should raise an error suggesting that the to_cupy method be used instead. If externally, the call should work, but it could emit a warning that this is less efficient than to_cupy.
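To make the internal-vs-external check concrete, here is a minimal, standard-library-only sketch of the idea. It does not reproduce cuDF's actual frame-tracking tooling: `ToyRangeIndex`, `_caller_is_internal`, `to_host_list`, and the `INTERNAL_PREFIX` constant are all hypothetical names, and the method returns a plain list where a real `__array__` would have to return a NumPy array.

```python
import sys
import warnings

# Assumption: modules whose name starts with this prefix count as "internal".
INTERNAL_PREFIX = "cudf"


def _caller_is_internal(depth=2, prefix=INTERNAL_PREFIX):
    """Return True if the frame `depth` levels up lives in an internal module.

    depth=2 skips this helper (0) and the method that called it (1),
    landing on whoever invoked that method.
    """
    frame = sys._getframe(depth)
    module_name = frame.f_globals.get("__name__", "")
    return module_name == prefix or module_name.startswith(prefix + ".")


class ToyRangeIndex:
    """Hypothetical stand-in for cudf.RangeIndex, backed by a Python range."""

    def __init__(self, start, stop):
        self._range = range(start, stop)

    def to_host_list(self):
        # Placeholder for an explicit host conversion such as to_numpy().
        return list(self._range)

    def __array__(self, dtype=None, copy=None):
        if _caller_is_internal():
            # Internal callers should use the explicit device path instead.
            raise TypeError(
                "Implicit host conversion inside the library; use to_cupy()."
            )
        # External callers are allowed, with a hint about efficiency.
        warnings.warn(
            "Converting via __array__ copies to host; to_cupy() may be cheaper.",
            UserWarning,
            stacklevel=2,
        )
        return self.to_host_list()
```

Called from user code, `ToyRangeIndex(0, 3).__array__()` warns and returns the host data; called from a module whose `__name__` starts with `cudf`, it raises. Checking a single caller frame is a simplification: functions implemented in C, such as `np.asarray`, add no Python frame, but other Python wrappers would, so real tooling would likely need to walk further up the stack.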

@isVoid isVoid added the feature request New feature or request label Apr 30, 2024
@er-eis
Contributor

er-eis commented May 1, 2024

If in either case we want the user to call to_cupy, why not call it directly in __array__?

@vyasr
Contributor

vyasr commented May 7, 2024

Apologies for the slow response here. The main reason not to call it directly in __array__ is that it would be surprising to users if np.asarray(cudf.RangeIndex(...)) returned a cupy array instead of a numpy array. In addition, some types are representable in both cudf and numpy but not in cupy, so the above conversion would actually fail if we implicitly converted to cupy instead of numpy.

Projects
Status: In Progress