fix grouped nanmean with bool #131

Open
dcherian opened this issue Oct 2, 2023 · 3 comments

Comments

@dcherian
Contributor

dcherian commented Oct 2, 2023

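import numpy as np
import numbagg

# Grouped mean over a boolean array comes back as an integer:
numbagg.grouped.group_nanmean(np.array([True, True, False]), [0, 0, 0], axis=-1)
# array([0], dtype=int32)
# NumPy returns a float for the same input:
np.nanmean(np.array([True, True, False]))
# 0.6666666666666666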

Should I just promote these to float instead?
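For illustration, the arithmetic behind the two results can be sketched with NumPy alone. This does not call numbagg, and the integer path is only one plausible reading of how the `0` arises (an assumption, not a confirmed description of numbagg's internals): a mean of 2/3 stored in an integer dtype truncates to 0, while promoting to float first keeps the fraction.

```python
import numpy as np

flags = np.array([True, True, False])

# Integer path (assumed, for illustration): the mean 2/3 is not representable,
# so storing it as int32 truncates it to 0.
as_int = flags.astype(np.int32)
int_mean = np.array(as_int.sum() / as_int.size).astype(np.int32)

# Float path: promoting before averaging keeps the fractional part.
float_mean = flags.astype(np.float64).mean()

print(int_mean, float_mean)
```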

@dcherian
Contributor Author

dcherian commented Oct 2, 2023

Another edge case, this time in the result dtype of nansum:

numbagg.grouped.group_nansum(np.array([True, True, False]), [0, 0, 0], axis=-1).dtype
# int32
np.nansum(np.array([True, True, False]), axis=-1).dtype
# int64
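As a sketch of where the `int64` comes from: NumPy accumulates boolean input in its default integer type, `np.int_`, which is 64-bit on most Linux and macOS builds. A caller that wants the same result dtype could cast the booleans to `np.int_` up front. The cast below is an assumption about what a caller might do, not a description of numbagg's behavior.

```python
import numpy as np

flags = np.array([True, True, False])

# NumPy sums booleans in its default integer type (platform dependent).
numpy_dtype = np.nansum(flags, axis=-1).dtype

# Casting up front (illustrative assumption) means a kernel that preserves
# its input dtype would return the same dtype as NumPy does.
as_default_int = flags.astype(np.int_)

print(numpy_dtype, as_default_int.dtype)
```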

@max-sixty
Collaborator

max-sixty commented Oct 2, 2023

This is somewhat downstream of #121. There's a bullet tucked away there "Always keep types — current state (except for bools...)"

The current state is that we maintain types, apart from a quirk where we convert bools to int32.

I think the closest thing would be for flox to convert to whatever type it wants before calling numbagg, so that these results don't depend on any bool handling in numbagg. Then later we can optimize by solving #121. WDYT? Or is this more complicated than converting bools to floats, so that flox would then need to handle lots of edge cases?
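A minimal sketch of that caller-side conversion, under stated assumptions: `promote_then_call` is a hypothetical helper (not part of flox or numbagg), and `dtype_preserving_mean` is a NumPy stand-in for a kernel that keeps its input dtype, used here only so the example runs without numbagg.

```python
import numpy as np


def promote_then_call(kernel, values, labels, target_dtype=np.float64):
    """Hypothetical caller-side helper: cast bools before handing off to a kernel."""
    values = np.asarray(values)
    if values.dtype == np.bool_:
        values = values.astype(target_dtype)
    return kernel(values, labels)


def dtype_preserving_mean(values, labels):
    """Stand-in kernel (not numbagg): group means returned in the input dtype."""
    labels = np.asarray(labels)
    n_groups = int(labels.max()) + 1
    sums = np.bincount(labels, weights=values, minlength=n_groups)
    counts = np.bincount(labels, minlength=n_groups)
    return (sums / counts).astype(values.dtype)


flags = np.array([True, True, False])
raw = dtype_preserving_mean(flags, [0, 0, 0])  # stays boolean
promoted = promote_then_call(dtype_preserving_mean, flags, [0, 0, 0])  # float64
print(raw.dtype, promoted.dtype, promoted)
```

With the conversion done by the caller, the result type no longer depends on how the kernel treats booleans.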

@dcherian
Contributor Author

dcherian commented Oct 2, 2023

Yeah, I'll just promote for nanmean now. I think that's easiest.
