fix grouped nanmean with bool #131

dcherian · 2023-10-02T21:54:06Z

numbagg.grouped.group_nanmean(np.array([True, True, False]), [0, 0, 0], axis=-1)
# array([0], dtype=int32)
np.nanmean(np.array([True, True, False]))
# 0.6666666666666666

Should I just promote these to float instead?
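For reference, a minimal NumPy-only sketch of what promoting to float would look like on the caller side. It does not call numbagg; the variable names are illustrative, and float64 is assumed as the target dtype because that is what `np.nanmean` returns for a bool input.

```python
import numpy as np

values = np.array([True, True, False])

# Assumption: cast bools to float64 before averaging so that the result
# is a float mean rather than a truncated integer.
promoted = values.astype(np.float64) if values.dtype == np.bool_ else values

result = np.nanmean(promoted)
reference = np.nanmean(values)  # NumPy's own answer for the bool input
```

With this cast, `result` and `reference` agree, so the same promoted array could presumably be handed to a grouped mean without relying on the library's bool handling.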

dcherian · 2023-10-02T21:56:01Z

Another edge case

numbagg.grouped.group_nansum(np.array([True, True, False]), [0, 0, 0], axis=-1).dtype
# int32
np.nansum(np.array([True, True, False]), axis=-1).dtype
# int64
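A rough sketch of how a caller could get a NumPy-compatible dtype for the sum case. The helper `grouped_nansum_like_numpy` is hypothetical and written in plain NumPy; it is not numbagg's implementation. It assumes that bools should be cast to NumPy's default integer (`np.int_`), which is int64 on most 64-bit Linux and macOS builds but can differ by platform and NumPy version.

```python
import numpy as np


def grouped_nansum_like_numpy(values, labels, num_groups):
    """Hypothetical pure-NumPy grouped nansum that mirrors np.nansum's dtype for bools."""
    values = np.asarray(values)
    labels = np.asarray(labels)
    if values.dtype == np.bool_:
        # Assumption: match np.nansum, which accumulates bools in the default integer.
        values = values.astype(np.int_)
    elif np.issubdtype(values.dtype, np.floating):
        # Treat NaN as zero, as nansum does.
        values = np.where(np.isnan(values), 0, values)
    out = np.zeros(num_groups, dtype=values.dtype)
    # Unbuffered add so repeated labels accumulate correctly.
    np.add.at(out, labels, values)
    return out


sums = grouped_nansum_like_numpy(np.array([True, True, False]), [0, 0, 0], num_groups=1)
numpy_dtype = np.nansum(np.array([True, True, False]), axis=-1).dtype
```

Here `sums.dtype` and `numpy_dtype` should be the same on a given platform, which is the property the comparison above is checking.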

max-sixty · 2023-10-02T22:08:31Z

This is somewhat downstream of #121. There's a bullet tucked away there "Always keep types — current state (except for bools...)"

The current state is that we maintain types, apart from a quirk where we convert bools to int32.

I think the closest thing would be for flox to convert into whatever type it wants before calling numbagg, so that the results don't depend on any handling in numbagg. Then later we can optimize by solving #121. WDYT? Or is it more complicated than converting bools to floats, such that flox would then need to handle lots of edge cases?
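One way the "convert before calling" idea could look on the caller side, as a sketch only. The names `BOOL_TARGET_DTYPE` and `cast_for_reduction` are hypothetical and are not part of flox or numbagg; the target dtypes are assumptions chosen to match what `np.nanmean` and `np.nansum` return for bool input.

```python
import numpy as np

# Hypothetical table: which dtype a bool array is cast to before each reduction.
BOOL_TARGET_DTYPE = {
    "nanmean": np.dtype(np.float64),  # assumed: match np.nanmean's float result
    "nansum": np.dtype(np.int_),      # assumed: match np.nansum's default integer
}


def cast_for_reduction(values, reduction):
    """Return values cast to the caller's preferred dtype; non-bool input is untouched."""
    values = np.asarray(values)
    if values.dtype == np.bool_ and reduction in BOOL_TARGET_DTYPE:
        return values.astype(BOOL_TARGET_DTYPE[reduction])
    return values


mask = np.array([True, True, False])
for_mean = cast_for_reduction(mask, "nanmean")
for_sum = cast_for_reduction(mask, "nansum")
```

The cast arrays would then be passed to the grouped function, so the output dtype follows from the caller's choice rather than from the bool-to-int32 quirk.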

dcherian · 2023-10-02T22:23:31Z

Yeah, I'll just promote for nanmean now. I think that's easiest.

dcherian mentioned this issue Oct 2, 2023

Add engine="numbagg" xarray-contrib/flox#72

Merged

