Small data support for ensembles: use out-of-sample preds to train metalearner #1898
Labels: enhancement (an improvement to an existing feature), performance (issues tracking performance improvements)
Problem
When automl is given only a few samples (fewer than 1k total), we should adjust our splitting strategy to make the most of the data we have.
Background
#1746 tracks creating a separate 20% split for stacked ensemblers.
Proposal
If our data is "small", we should instead do the following:
We still need to resolve: what data do we use to validate the ensembler?
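For reference, here is a minimal sketch of what "out-of-sample preds to train the metalearner" (from the issue title) could look like. It is not the project's implementation: the function names, the fit-and-predict interface for base learners, and the two toy learners are all hypothetical, and it uses plain Python rather than any real pipeline API. The idea is that each sample's metalearner features come from base learners fit on folds that exclude that sample, so no separate ensembling split is held out.

```python
import random

def k_fold_indices(n_samples, n_folds, seed=0):
    """Shuffle row indices and deal them into n_folds disjoint folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::n_folds] for i in range(n_folds)]

def out_of_fold_predictions(base_learners, X, y, n_folds=5, seed=0):
    """Build metalearner features from out-of-sample base predictions.

    base_learners: hypothetical interface, each a function
        fit_predict(X_train, y_train, X_test) -> list of predictions.
    Returns one row per sample and one column per base learner, where each
    value was predicted by a learner that never saw that sample in training.
    """
    n = len(X)
    meta_features = [[0.0] * len(base_learners) for _ in range(n)]
    for held_out in k_fold_indices(n, n_folds, seed):
        held_out_set = set(held_out)
        train = [i for i in range(n) if i not in held_out_set]
        X_train = [X[i] for i in train]
        y_train = [y[i] for i in train]
        X_held = [X[i] for i in held_out]
        for j, fit_predict in enumerate(base_learners):
            preds = fit_predict(X_train, y_train, X_held)
            for i, pred in zip(held_out, preds):
                meta_features[i][j] = pred
    return meta_features

def mean_learner(X_train, y_train, X_test):
    """Toy base learner: always predicts the training-set mean."""
    mean = sum(y_train) / len(y_train)
    return [mean] * len(X_test)

def nearest_neighbor_learner(X_train, y_train, X_test):
    """Toy base learner: predicts the target of the closest training row."""
    def sq_dist(a, b):
        return sum((u - v) ** 2 for u, v in zip(a, b))
    preds = []
    for row in X_test:
        closest = min(range(len(X_train)), key=lambda i: sq_dist(X_train[i], row))
        preds.append(y_train[closest])
    return preds

# Tiny synthetic dataset standing in for a "small" (<1k rows) problem.
X = [[float(i)] for i in range(20)]
y = [2.0 * row[0] for row in X]

meta_X = out_of_fold_predictions([mean_learner, nearest_neighbor_learner], X, y)
# The metalearner would then be fit on (meta_X, y), so all 20 rows are used.
```

Every row contributes to metalearner training here, which is the point for small data. The sketch does not address the open question above of what data to validate the ensembler on.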