Stacked ensembler: use same CV data splitter as the rest of automl #1592
Comments
@angela97lin does my explanation make sense / was there a reason we chose not to do this when we were setting up stacking? :)
@dsherry I think your explanation makes sense! IIRC, when we were setting up stacking and were trying to make it more performant / make stacking run faster, we wanted to default to something that didn't have too many folds--hence the …
Dug into this a bit more, and tried to wire the data split method used by AutoML into the stacked ensemble component. However, I ran into this issue (after addressing the API updates necessary to get our …
This is the error thrown when we try to call: …
The reason for this is that scikit-learn validates that the cv passed in is indeed a cross-validation method; it isn't happy with single splits such as …. As such, I think the best plan for now is to do the easy thing to support scikit-learn 0.24 and set the default cv's …
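To make the scikit-learn constraint concrete, here is a minimal sketch of what its `check_cv` validation accepts and rejects. The splitter classes below are hypothetical stand-ins for illustration, not evalml's actual classes; scikit-learn requires the `split()` / `get_n_splits()` cross-validator interface.

```python
# Sketch (assumption: SingleSplit / ProperSplitter are illustrative stand-ins,
# not evalml's real splitters) of scikit-learn's cv validation behavior.
from sklearn.model_selection import check_cv


class SingleSplit:
    """A single train/validation split object lacking the CV interface."""
    pass


class ProperSplitter:
    """Minimal object satisfying scikit-learn's cross-validator contract."""

    def split(self, X, y=None, groups=None):
        n = len(X)
        cut = int(n * 0.8)
        yield list(range(cut)), list(range(cut, n))

    def get_n_splits(self, X=None, y=None, groups=None):
        return 1


# Anything exposing split() passes through check_cv unchanged...
splitter = ProperSplitter()
assert check_cv(splitter) is splitter

# ...but an object without that interface (and that isn't an int or an
# iterable of splits) is rejected with a ValueError.
try:
    check_cv(SingleSplit())
except ValueError:
    print("rejected: SingleSplit has no split() interface")
```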
Problem
Currently, the stacked ensembler has its own setup for CV. `IterativeAlgorithm` calls the `_make_stacked_ensembler` util, but doesn't currently thread through the data splitter from automl search. The data splitter set up by default in the stacked ensembler doesn't set `shuffle=True`, which could lead to poor performance if the input dataset has an ordering. It also won't have the same settings for other parameters like `n_folds`, which isn't ideal.
Also, this difference is preventing us from supporting sklearn 0.24.0. Fixing this issue should allow us to support that version.
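A short sketch of why the missing `shuffle=True` matters: on a dataset sorted by class, unshuffled folds can leave an entire class out of the training split. This uses plain scikit-learn `KFold` to illustrate the general point, not evalml's actual default splitter.

```python
# Sketch: unshuffled CV on ordered data vs. shuffled CV.
import numpy as np
from sklearn.model_selection import KFold

X = np.arange(100).reshape(-1, 1)
y = np.array([0] * 50 + [1] * 50)  # ordered: all 0s first, then all 1s

# Without shuffling, the first fold's training set contains only class 1.
train_idx, _ = next(KFold(n_splits=2, shuffle=False).split(X, y))
print(set(y[train_idx]))  # {1} -- an entire class is missing from training

# With shuffling, both classes appear in each training fold.
train_idx, _ = next(KFold(n_splits=2, shuffle=True, random_state=0).split(X, y))
print(set(y[train_idx]))  # {0, 1}
```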
Fix
Let's have automl pass its data splitter through `IterativeAlgorithm` down into the stacked ensembler.
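A hedged sketch of what the pass-through could look like. The function and parameter names below are hypothetical illustrations, not evalml's real API; it uses scikit-learn's `StackingClassifier`, whose `cv` parameter accepts a splitter object, so the same splitter used by automl search can be reused by the ensemble.

```python
# Sketch of the proposed fix (make_stacked_ensembler and its signature are
# hypothetical, not evalml's actual util): automl's data splitter is threaded
# down so the stacked ensemble uses the same cv as the rest of the search.
from sklearn.ensemble import StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold
from sklearn.tree import DecisionTreeClassifier


def make_stacked_ensembler(estimators, cv=None):
    """Build a stacking estimator that honors a search-wide cv splitter."""
    # Fall back to a shuffled splitter when none is passed in, so ordered
    # datasets don't end up with homogeneous folds.
    cv = cv or StratifiedKFold(n_splits=3, shuffle=True, random_state=0)
    return StackingClassifier(
        estimators=estimators,
        final_estimator=LogisticRegression(),
        cv=cv,
    )


# Usage: the splitter object from automl search is passed straight through.
shared_cv = StratifiedKFold(n_splits=3, shuffle=True, random_state=0)
ensemble = make_stacked_ensembler(
    [("lr", LogisticRegression()), ("dt", DecisionTreeClassifier())],
    cv=shared_cv,
)
assert ensemble.cv is shared_cv
```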