Nullable integers conversion #17659

Closed
Kreol64 opened this issue Oct 27, 2020 · 3 comments

Kreol64 commented Oct 27, 2020

Hi,

pandas recently introduced a nullable integer dtype:
https://pandas.pydata.org/pandas-docs/stable/user_guide/integer_na.html

I would expect NumPy to automatically convert such arrays to a float dtype and fill the missing values with np.nan. However, it converts them to object instead:

pd.DataFrame({'col': [1, np.nan, 3]}).astype('UInt8').values.dtype

which leads to errors like this one:

np.nanmax(pd.DataFrame({'col': [1, np.nan, 3]}).astype('UInt8').values)

raises "TypeError: boolean value of NA is ambiguous".

Is that expected?

Member

charris commented Oct 27, 2020

> Is that expected?

I would say yes, as nullable integers are not a NumPy type.
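
For illustration, here is a minimal sketch of why the result ends up as object, assuming pandas and NumPy are installed. NumPy has no dtype that can hold both unsigned integers and pd.NA, so the conversion falls back to an array of Python objects. The variable names are only illustrative, and the exact behavior may vary between pandas versions.

```python
import numpy as np
import pandas as pd

# A nullable unsigned-integer array with one missing value.
nullable = pd.array([1, None, 3], dtype="UInt8")

# NumPy has no matching dtype for this, so the conversion
# falls back to an array of Python objects.
as_numpy = np.asarray(nullable)
print(as_numpy.dtype)

# The missing slot holds the pd.NA singleton rather than np.nan.
missing = as_numpy[1]
print(missing is pd.NA)
```

Because pd.NA refuses to be coerced to a boolean, comparison-based reductions such as np.nanmax fail on an array like this.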

Member

seberg commented Oct 28, 2020

I agree, this is basically by design and means things are working as expected. It is not impossible that NumPy might understand this at some point, but it is unlikely to happen soon.

By using the new UInt8 dtype, you are in a sense choosing to treat NaN as not the same thing as NA, and pandas now has pd.NA to make this distinction clearer.
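
If NaN semantics are what you want on the NumPy side, one possible workaround is to request the conversion explicitly. The sketch below assumes a pandas version whose `Series.to_numpy` accepts the `dtype` and `na_value` parameters. It is one way to do the conversion, not an approach recommended in this thread.

```python
import numpy as np
import pandas as pd

# Same frame as in the original report.
df = pd.DataFrame({"col": [1, np.nan, 3]}).astype("UInt8")

# Explicitly ask for float64 and map pd.NA to np.nan,
# instead of relying on the default object conversion.
values = df["col"].to_numpy(dtype="float64", na_value=np.nan)

# With a plain float array, the NaN-aware reduction works.
result = np.nanmax(values)
print(values.dtype, result)
```

Doing this deliberately gives up the NA/NaN distinction, so it only makes sense when the downstream NumPy code expects NaN as the missing-value marker.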

arw2019 commented Oct 28, 2020

@Kreol64 I opened pandas-dev/pandas#37460 to discuss this on the pandas side
