RelativeFeatures inside a pipeline #675

Answered by solegalli
orlandoflv asked this question in Q&A

My first thought is that SimpleImputer takes a pandas dataframe and returns a NumPy array, so the column names are lost.

When feature-engine transformers take numpy arrays, they transform them into pandas dataframes and add artificial column names:

X.columns = [f"x{i}" for i in range(X.shape[1])]
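
To make the effect concrete, here is a minimal sketch of what I think is happening. The dataframe and its column names (height, weight) are made up for illustration, and the last two lines only mimic the renaming shown above; they are not a call into feature-engine itself:

```python
import numpy as np
import pandas as pd
from sklearn.impute import SimpleImputer

# Hypothetical data; the columns in the original question are not shown.
df = pd.DataFrame({
    "height": [1.0, np.nan, 3.0],
    "weight": [10.0, 20.0, np.nan],
})

# With default settings, SimpleImputer returns a NumPy array, not a dataframe.
imputed = SimpleImputer(strategy="mean").fit_transform(df)
print(type(imputed).__name__)  # ndarray

# Mimic the renaming applied when an array is turned back into a dataframe.
X = pd.DataFrame(imputed)
X.columns = [f"x{i}" for i in range(X.shape[1])]
print(list(X.columns))  # ['x0', 'x1']
```

Any later step that looks for "height" or "weight" by name will not find them, because the dataframe now only has x0 and x1.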

Datetimes is not raising an error because it does not have a Scikit-learn transformer before it, so it takes in the original dataframe with the correct column names.

I suggest using the MeanMedianImputer() instead of the SimpleImputer() in numerical_trans_pipe. It returns a pandas dataframe, so the column names are kept for the next step. That is probably the simplest solution.

Answer selected by orlandoflv