How can I use MimicExplainer with Voting Classifier? [Question] #544

Open
RNakamura90 opened this issue Nov 9, 2022 · 5 comments

@RNakamura90

I'm building a Voting Ensemble Classifier model with transformation pipelines.

pipe_xgbc = Pipeline(steps=[
    ('transformer', variable_transformer),
    ('classifier', clf1)])

pipe_lgbc = Pipeline(steps=[
    ('transformer', variable_transformer),
    ('classifier', clf2)])

pipe_rf = Pipeline(steps=[
    ('transformer', variable_transformer),
    ('classifier', clf3)])

classifiers = [
    ('xgbc', pipe_xgbc),
    ('lgbc', pipe_lgbc),
    ('rf', pipe_rf)
]

voting = VotingClassifier(classifiers,
                        voting='soft',
                        weights=[1, 1, 1])

fited_soft_voting = voting.fit(X_train, y_train)

But when I try to use the MimicExplainer function with an LGBM surrogate model, I get the following error:


explainer = MimicExplainer(
    fited_soft_voting[-1][1],
    X_train,
    LGBMExplainableModel,
    augment_data=True,
    max_num_of_augmentations=10,
    features=X_train.columns,
    transformations=variable_transformer.transformers,
    allow_all_transformations=True
)

---------------------------------------------------------------------------
TypeError: 'VotingClassifier' object is not subscriptable

@imatiach-msft
Collaborator

imatiach-msft commented Nov 9, 2022

Hi @RNakamura90, sorry about the trouble you are having. I am guessing that the issue is with this object:

fited_soft_voting

The error:

"TypeError: 'VotingClassifier' object is not subscriptable"

suggests to me that fited_soft_voting is being indexed, which a VotingClassifier does not support:

fited_soft_voting[-1][1]

Do you see any error when you pass just fited_soft_voting instead of fited_soft_voting[-1][1]?
Can you call

fited_soft_voting.predict(X_train)

and confirm that this predict call succeeds? The explainer only calls the model's predict and predict_proba methods.
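A minimal sketch of the two checks above, assuming only scikit-learn and a small synthetic dataset in place of X_train and y_train. The estimators and the names `fitted`, `pipe_lr`, and `pipe_rf` are hypothetical stand-ins, not the models from the original post. It shows that a Pipeline can be indexed while a VotingClassifier cannot, and that predict and predict_proba work on the whole ensemble.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier, VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Tiny synthetic dataset (hypothetical stand-in for X_train / y_train).
rng = np.random.default_rng(0)
X = rng.normal(size=(60, 4))
y = (X[:, 0] + X[:, 1] > 0).astype(int)

pipe_lr = Pipeline(steps=[
    ("scaler", StandardScaler()),
    ("classifier", LogisticRegression()),
])
pipe_rf = Pipeline(steps=[
    ("scaler", StandardScaler()),
    ("classifier", RandomForestClassifier(n_estimators=10, random_state=0)),
])

voting = VotingClassifier([("lr", pipe_lr), ("rf", pipe_rf)], voting="soft")
fitted = voting.fit(X, y)

# A Pipeline supports integer indexing: pipe[-1] is its last step.
last_step = pipe_lr[-1]

# A VotingClassifier does not, so indexing it raises the TypeError from the thread.
try:
    fitted[-1]
    indexing_failed = False
except TypeError as exc:
    indexing_failed = True
    print(f"TypeError: {exc}")

# Sanity checks on the whole ensemble, which is what an explainer would call.
labels = fitted.predict(X)
probabilities = fitted.predict_proba(X)
print(labels.shape, probabilities.shape)
```

If both calls succeed on your own data, passing the fitted ensemble itself (rather than an indexed element) is the thing to try next.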

@RNakamura90
Author

Hello @imatiach-msft, thanks for the answer.
When I use only fited_soft_voting.predict(X_train), I get another error:

too many values to unpack (expected 2)

For context, this is the step that comes before the construction of the Pipelines for each algorithm:

numeric_transformer = Pipeline(steps=[
    ('imputer', SimpleImputer(strategy='mean')),
    ('scaler', StandardScaler())
])

categorical_transformer = Pipeline(steps=[
    ('imputer', SimpleImputer(strategy='most_frequent')),
    ('onehot', OneHotEncoder(handle_unknown='ignore'))
])

variable_transformer = ColumnTransformer(
    transformers=[
        ('numeric', numeric_transformer, numerical),
        ('categorical', categorical_transformer, categorical)],
    remainder='passthrough')

@imatiach-msft
Collaborator

Hi @RNakamura90 ,

When I use only fited_soft_voting.predict(X_train), I get another error:

too many values to unpack (expected 2)

This is very unexpected to me. You should be able to call predict or predict_proba on the X_train dataset using the fitted estimator. Please see the guide:
https://scikit-learn.org/stable/modules/generated/sklearn.ensemble.VotingClassifier.html

Can you print the types of fited_soft_voting and X_train? Are they just a sklearn.ensemble.VotingClassifier and a pandas DataFrame, or something different?
If you can send me a small notebook that reproduces the issue, I can take a look as well; that might be easier.
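One way to report the types asked for above is a small helper like the following. This is only a sketch: `describe` is a hypothetical name, and the objects passed to it here are placeholders for fited_soft_voting and X_train.

```python
def describe(name, obj):
    """Return a one-line summary: fully qualified type, plus shape if present."""
    cls = type(obj)
    summary = f"{name}: {cls.__module__}.{cls.__qualname__}"
    shape = getattr(obj, "shape", None)  # DataFrames and arrays expose .shape
    if shape is not None:
        summary += f", shape={shape}"
    return summary


# Placeholder objects; in the thread these would be fited_soft_voting and X_train.
print(describe("model", object()))
print(describe("data", [[1, 2], [3, 4]]))
```

Pasting the two printed lines into the issue would show whether the objects are the expected scikit-learn and pandas types.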

@imatiach-msft
Collaborator

Also, what is the stack trace for this error:

too many values to unpack (expected 2)

I wonder if it's coming from the pipeline transformations or the underlying models.
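To capture the full stack trace as text for pasting into the issue, something like the sketch below could be used. `run_and_capture` and `unpack_pair` are hypothetical helpers; `unpack_pair` only imitates the error message and says nothing about where the real error originates.

```python
import traceback


def run_and_capture(func, *args, **kwargs):
    """Call func; return (result, None) on success or (None, traceback text)."""
    try:
        return func(*args, **kwargs), None
    except Exception:
        return None, traceback.format_exc()


def unpack_pair(item):
    """Stand-in failure: unpacking more than two values into two names."""
    first, second = item
    return first, second


result, trace = run_and_capture(unpack_pair, (1, 2, 3))
print(trace)
```

In the real case the call would be along the lines of `run_and_capture(fited_soft_voting.predict, X_train)`, and the frames listed in the trace indicate which layer raised the error.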

@imatiach-msft
Collaborator

imatiach-msft commented Nov 10, 2022

I see you are passing the same variable_transformer instance to all three pipelines. I wonder if each pipeline needs its own copy. It would be great to see the full stack trace, as I think that would determine where the issue is coming from. If you are not able to call predict/predict_proba on the model, then it won't work with the explainers.
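If you want to rule out the shared transformer, `sklearn.base.clone` gives each pipeline its own unfitted copy with the same parameters. This is a sketch with stand-in estimators; `make_pipeline_with_copy` is a hypothetical helper. Note that VotingClassifier already fits clones of the estimators it is given, so this may not change the outcome.

```python
from sklearn.base import clone
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Stand-in for variable_transformer from the thread.
shared_transformer = StandardScaler()


def make_pipeline_with_copy(transformer, classifier):
    """Build a pipeline around a fresh, unfitted clone of the transformer."""
    return Pipeline(steps=[
        ("transformer", clone(transformer)),
        ("classifier", classifier),
    ])


pipe_a = make_pipeline_with_copy(shared_transformer, LogisticRegression())
pipe_b = make_pipeline_with_copy(shared_transformer, LogisticRegression())

# Each pipeline now holds a distinct transformer object.
print(pipe_a["transformer"] is pipe_b["transformer"])
```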
