How can I perform multiple transformations of columns with some columns being same across the transformations #24261
Replies: 2 comments 3 replies
-
Moving this into a discussion
-
Thank you. A couple of questions.

In option #1, where a column can be listed more than once in a column transformer: the transformations are applied to that column in parallel (if it is listed for two transformations), not serially as we would want. For example, we may want to impute a column and then apply StandardScaler to the same column.

In option #2, using two column transformers in a pipeline: the first column transformer will not return the column names, right? In that case, how should I specify the columns for the second column transformer? Should I work out the order of the output columns and specify indices? Another challenge is using one-hot encoding in the first column transformer with multiple columns, where it would be difficult to get the indices (although it is feasible in some cases where the categories are fixed and limited).
-
For my use case, I want to perform target encoding on some columns (say c1, c2, c3) and imputation on another column (c4). Once the target encoding and imputation are done, I want to apply StandardScaler to all of these and more columns (c1, c2, c3, c4, c5, c6, c7, c8).
Note that I also have other columns that need transformations such as OHE, but those columns are disjoint from c1...c8 above.
I am not sure how to use a column transformer and a pipeline when different transformations share some of the same columns (as in the example above). Is there a way to do this as part of the pipeline? Thank you.