Data Processing in Pipeline #243

Answered by SvenKlaassen
benTC74 asked this question in Q&A
Hi,
thank you. These are great questions.

  1. Doesn't the ColumnTransformer also work on NumPy arrays? For example:
import doubleml as dml
import numpy as np
from doubleml.datasets import make_plr_CCDDHNR2018
from sklearn.ensemble import RandomForestRegressor
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import Normalizer, StandardScaler, MinMaxScaler
from sklearn.pipeline import make_pipeline

data = make_plr_CCDDHNR2018(alpha=0.5, dim_x=5, return_type='DataFrame')
print(data.head())

ct = ColumnTransformer(
    [("norm", Normalizer(norm='l1'), [0, 1]),  # apply to columns 0 and 1
     ("standard", StandardScaler(), slice(2, 4))],  # apply to columns 2 and 3
     remai…
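
Since the snippet above is cut off, here is a separate, minimal sketch of the same idea rather than a reconstruction of the original code. It uses only NumPy and scikit-learn on synthetic data (the data, the `remainder="passthrough"` setting, and the random forest settings are assumptions chosen for illustration). It shows that `ColumnTransformer` accepts a plain NumPy array when columns are selected by integer position or slice, and that the result can be chained with a learner in a pipeline.

```python
import numpy as np
from sklearn.compose import ColumnTransformer
from sklearn.ensemble import RandomForestRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import Normalizer, StandardScaler

# Synthetic data (assumption): 200 rows, 5 covariates, plain NumPy arrays.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
y = X[:, 0] + 0.5 * X[:, 2] + rng.normal(scale=0.1, size=200)

# Columns are selected by integer position, so no DataFrame is required.
ct = ColumnTransformer(
    [("norm", Normalizer(norm="l1"), [0, 1]),        # columns 0 and 1
     ("standard", StandardScaler(), slice(2, 4))],   # columns 2 and 3
    remainder="passthrough",  # assumption: keep column 4 unchanged
)

# Output column order: "norm" block, "standard" block, then the remainder.
X_t = ct.fit_transform(X)

# The transformer can be the first step of a pipeline ending in a learner.
pipe = make_pipeline(ct, RandomForestRegressor(n_estimators=20, random_state=0))
pipe.fit(X, y)
preds = pipe.predict(X)
```

Note that scikit-learn's default is `remainder="drop"`, which would discard column 4 here. The resulting `pipe` exposes the usual `fit`/`predict` interface, so the preprocessing is refit each time the pipeline is fit on a new subsample.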

Answer selected by benTC74