Understanding behavior of Simple Imputer with categorical values #19445

Answered by alfaro96
theonlypoi asked this question in Q&A

I'm trying to understand the behaviour of an already fitted SimpleImputer when it transforms a DataFrame that has only missing values.

Here is some sample code to illustrate.

import pandas as pd
import numpy as np
from sklearn.impute import SimpleImputer

df_1 = pd.DataFrame({"col_1": ["A", "A", "B", np.nan, "A"]}, dtype="category")
df_2 = pd.DataFrame({"col_1": [np.nan, np.nan, np.nan]}, dtype="category")

imputer = SimpleImputer(strategy="constant", fill_value="missing")
imputer.fit(df_1)

Now, if I call imputer.transform(df_1), it works correctly and imputes the missing value in df_1.
However, if I call imputer.transform(df_2), it raises ValueError: could not convert string to float: '…
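
For anyone hitting the same error, here is a minimal sketch of one possible workaround. It is not taken from the selected answer in this thread. It rests on an assumption: a categorical column that holds only NaN may be converted to a float array before imputation, so the string fill_value cannot be stored in it. Casting both frames to object before fit and transform keeps them on a dtype that can hold strings. The names df_1_obj and df_2_obj are only illustrative, and the behaviour of the original snippet may differ between scikit-learn and pandas versions.

```python
import numpy as np
import pandas as pd
from sklearn.impute import SimpleImputer

df_1 = pd.DataFrame({"col_1": ["A", "A", "B", np.nan, "A"]}, dtype="category")
df_2 = pd.DataFrame({"col_1": [np.nan, np.nan, np.nan]}, dtype="category")

# Assumption: casting to object keeps both frames on a dtype that can hold
# the string fill value, even when a column contains nothing but NaN.
df_1_obj = df_1.astype(object)
df_2_obj = df_2.astype(object)

imputer = SimpleImputer(strategy="constant", fill_value="missing")
imputer.fit(df_1_obj)

# Every NaN in the all-missing frame is replaced by the fill value.
result = imputer.transform(df_2_obj)
print(result.ravel().tolist())
```

The cast discards the categorical dtype, so if later steps rely on it, it would have to be restored after imputation.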

Replies: 2 comments 4 replies

Comment 1: 3 replies (@theonlypoi, @theonlypoi, @NicolasHug)

Comment 2: 1 reply (@thomasjpfan)

Answer selected by theonlypoi
4 participants