Dask Nunique bug under dask 2024.2.1 #10982
Comments
The query doesn't make much sense: you are computing nunique on one of the grouping columns, which will always return 1. We should fix this anyway, though.
LOL.
PRs to fix this are welcome.
Is there a newbie guide for where to start? My knowledge of Python is average, and I have no experience in building, sharing, and maintaining a library. I don't even know how to read the source code of the nunique method so that I could try to better understand what is happening.
As for the bug example, it's interesting that the "marca" column raises an error while any other column seems to work. My first thought was that it is related to the category dtype (a very buggy dtype), but I've tried changing it to string and the error persists.
The error happens because Marca is part of your grouping keys; it's not dtype related.
But it does not happen if I use the MCI column.
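To illustrate the maintainer's point that nunique on a grouping column always returns 1, here is a minimal plain-Python sketch of what a grouped distinct count computes. It is not dask's or pandas' implementation. The helper `group_nunique`, the `modelo` column, and the rows are hypothetical; only the column name `marca` comes from this issue.

```python
from collections import defaultdict

# Hypothetical rows; only the column name "marca" is taken from the issue.
rows = [
    {"marca": "A", "modelo": "x"},
    {"marca": "A", "modelo": "y"},
    {"marca": "B", "modelo": "x"},
    {"marca": "B", "modelo": "x"},
]


def group_nunique(rows, key, target):
    """Count the distinct `target` values inside each group of `key`."""
    distinct = defaultdict(set)
    for row in rows:
        distinct[row[key]].add(row[target])
    return {group: len(values) for group, values in distinct.items()}


# Counting distinct values of the grouping key itself: every group
# contains exactly one value of its own key, so each count is 1.
on_key = group_nunique(rows, key="marca", target="marca")

# Counting distinct values of a non-key column gives a meaningful result.
on_other = group_nunique(rows, key="marca", target="modelo")

print(on_key)    # {'A': 1, 'B': 1}
print(on_other)  # {'A': 2, 'B': 1}
```

The result on the key column is trivially 1 for every group, which is why the query is uninformative, although it should still not raise an error.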
Hello guys,
Take this CSV as an example dataframe. I am sorry, but I could not build an example dataframe in code that reproduces this bug.
teste.csv
Now, let's open the example dataframe and run the query under dask 2023.10.0:
Runs ok!
Now, let's do the same test under dask 2024.2.1:
Environment: