Describe the bug
When using cuml's train_test_split on a cudf dataframe with a string column, it fails with "TypeError: String Arrays is not yet implemented in cudf". The corresponding cudf issue refers back to cuml, and there seems to be no news (see here).
Using sklearn or dask-ml for splitting works as expected, so I'm not sure if it even is a cudf issue.
If I'm not mistaken, we don't currently test this method with string columns: when we originally added it to cuML, we didn't expect string columns as input. That said, we will check whether this is expected behavior or a bug and update the issue. Thanks @AndreasKarasenko!
I was doing some cudf issue triage and came across the linked issue rapidsai/cudf#12989. Please take a look at my last comment; it might provide some insight into what caused this incompatibility in cuml. It looks like this used to work in 23.02, so maybe cuml started converting to a cupy array internally during 23.04 development.
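If the failure really does come from converting the dataframe to an array, one possible workaround is to split row positions instead of the data and then select rows with `.iloc`. The sketch below is only an illustration: `split_by_position` is a hypothetical helper, not part of cuml or cudf, and it is shown on a pandas dataframe. It assumes a cudf dataframe accepts the same positional indexing, which has not been verified here.

```python
import numpy as np
import pandas as pd


def split_by_position(df, test_size=0.25, seed=0):
    """Hypothetical helper: shuffle row positions, then slice with .iloc.

    The column values are never converted to an array, so string
    columns are passed through untouched.
    """
    n_rows = len(df)
    n_test = int(np.ceil(test_size * n_rows))
    # Shuffle the positions 0..n_rows-1 reproducibly.
    order = np.random.default_rng(seed).permutation(n_rows)
    test_idx, train_idx = order[:n_test], order[n_test:]
    return df.iloc[train_idx], df.iloc[test_idx]


# Stand-in data; with cudf this would be a cudf.DataFrame (assumption).
df = pd.DataFrame({
    "text": ["good", "bad", "fine", "poor", "great", "awful", "okay", "nice"],
    "label": [1, 0, 1, 0, 1, 0, 1, 1],
})
train, test = split_by_position(df, test_size=0.25, seed=0)
print(len(train), len(test))
```

Splitting positions rather than values keeps the helper independent of column dtypes, at the cost of not supporting options such as stratification.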
Steps/Code to reproduce bug
This code uses a mix of the HPO example and the Naive Bayes example.
Expected behavior
It should split the dataframe like sklearn or dask.
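For comparison, here is a minimal sketch of the sklearn behavior referred to above, using a small made-up pandas dataframe (the column names and values are illustrative, not taken from the original reproduction). The string column is split without any conversion error.

```python
import pandas as pd
from sklearn.model_selection import train_test_split

# Illustrative data with one string column and one numeric label column.
df = pd.DataFrame({
    "text": ["good", "bad", "fine", "poor", "great", "awful", "okay", "nice"],
    "label": [1, 0, 1, 0, 1, 0, 1, 1],
})

# sklearn splits the rows and leaves the string values as they are.
X_train, X_test, y_train, y_test = train_test_split(
    df[["text"]], df["label"], test_size=0.25, random_state=0
)
print(X_train.shape, X_test.shape)  # (6, 1) (2, 1)
```

The expectation in this issue is that cuml's train_test_split returns the analogous four pieces when given a cudf dataframe with a string column.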
Environment details: