Understanding behavior of Simple Imputer with categorical values #19445
Looks like an unsupported edge case, but I agree it can be surprising. It will work as expected if you specify Unless you really need to, try to use numpy arrays instead of pandas dataframes. @thomasjpfan may have thoughts on how to best handle this.
Thank you @theonlypoi for reaching out the We use the
To solve the issue, you can replace:
df_2 = pd.DataFrame({"col_1": [np.nan, np.nan, np.nan]}, dtype="category")
with:
df_2 = pd.DataFrame({"col_1": [np.nan, np.nan, np.nan]}, dtype=object)
That will solve the issue 😉.
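A rough sketch of why the dtype could matter, using NumPy only. It assumes the all-missing categorical column reaches the imputer as a float array, which is what the error message suggests but is not confirmed in this thread. `fill_missing` is a hypothetical helper written for illustration, not scikit-learn code.

```python
import numpy as np


def fill_missing(values, fill_value):
    """Return a copy of `values` with NaN entries replaced by `fill_value`.

    Hypothetical helper for illustration only; this is not scikit-learn code.
    """
    filled = values.copy()
    # NaN is the only value that compares unequal to itself.
    mask = np.array([v != v for v in filled.tolist()], dtype=bool)
    filled[mask] = fill_value
    return filled


# Assumption: an all-NaN column with no categories ends up as float64.
float_column = np.array([np.nan, np.nan, np.nan])
# With dtype=object the array can hold arbitrary Python objects, strings included.
object_column = np.array([np.nan, np.nan, np.nan], dtype=object)

try:
    fill_missing(float_column, "missing")
    float_error = None
except ValueError as exc:
    # A float array cannot store the string fill value.
    float_error = str(exc)

object_result = fill_missing(object_column, "missing")

print(float_error)
print(object_result)
```

The float array rejects the string fill value with the same kind of `ValueError` reported in the question, while the object array accepts it. This would also fit the observation that df_2 works once it contains a non-missing string value.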
I am trying to understand the behaviour of an already fitted SimpleImputer on a dataframe that has only missing values.
Adding sample code for better understanding.
Now, if I do imputer.transform(df_1), it works correctly and imputes the missing value in df_1. However, if I try imputer.transform(df_2), it raises
ValueError: could not convert string to float: 'missing'
If I modify the df_2 dataframe to include one non-missing value, then it works fine.
missing
in df_2 as the imputer is already fitted?