-
By default, the resource is just the number of samples, and that is the quantity on which the halving happens. However, the resource could be another parameter instead, e.g. the number of trees in a random forest. In that case, the halving strategy would be applied to the number of trees.
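To make that concrete, here is a minimal sketch of using a model parameter as the halving resource instead of the default number of samples. The dataset, parameter values, and `max_resources=64` are illustrative choices, not anything prescribed by scikit-learn:

```python
# Sketch: halve on the number of trees rather than on n_samples.
# Note that HalvingRandomSearchCV is still experimental and must be
# enabled explicitly via the import below.
from sklearn.experimental import enable_halving_search_cv  # noqa: F401
from sklearn.model_selection import HalvingRandomSearchCV
from sklearn.ensemble import RandomForestClassifier
from sklearn.datasets import make_classification

X, y = make_classification(n_samples=400, random_state=0)

# The resource parameter (n_estimators) must NOT appear in the
# search space itself.
param_distributions = {
    "max_depth": [2, 4, 8, None],
    "min_samples_split": [2, 5, 10],
}

search = HalvingRandomSearchCV(
    RandomForestClassifier(random_state=0),
    param_distributions,
    resource="n_estimators",  # halve on number of trees, not samples
    max_resources=64,         # surviving candidates get up to 64 trees
    random_state=0,
).fit(X, y)

# n_resources_ lists the budget (here: trees) used at each iteration.
print(search.n_resources_)
print(search.best_params_)
```

Every candidate sees the full dataset here; only the number of trees grows between iterations, which is exactly the swap described above.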
-
I was reading over the HalvingRandomSearchCV docs, but I still couldn't quite understand what you mean by "resources". Here is my current understanding; please feel free to correct me:
Unlike RandomizedSearchCV or GridSearchCV, this method does not use the entire dataset for cross-validation from the start. In the first iterations, only a small fraction of the data is drawn for each candidate model, which is then evaluated with a standard K-fold CV. I am guessing that both the training folds and the validation fold come from this fractional subset?
If so, then technically speaking, won't we always expect to see scores improve across iterations, simply and most directly because later iterations train on more data?
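The allocation described above can be observed directly. This is a small sketch, with an arbitrary synthetic dataset and `factor=2` chosen for illustration, that prints the sample budget used at each halving iteration under the default `resource="n_samples"`:

```python
# Sketch: inspect how many samples each successive-halving iteration
# uses. The per-iteration K-fold CV is run on that subsample only.
from sklearn.experimental import enable_halving_search_cv  # noqa: F401
from sklearn.model_selection import HalvingRandomSearchCV
from sklearn.linear_model import LogisticRegression
from sklearn.datasets import make_classification

X, y = make_classification(n_samples=1000, random_state=0)

search = HalvingRandomSearchCV(
    LogisticRegression(max_iter=1000),
    {"C": [0.01, 0.1, 1, 10]},
    factor=2,        # each iteration doubles the sample budget
    random_state=0,
).fit(X, y)

# Samples allocated per iteration; the "iter" column of cv_results_
# shows which iteration each candidate's score belongs to.
print(search.n_resources_)
```

Because early scores are computed on smaller subsamples, comparing raw scores across iterations does conflate candidate quality with training-set size, which is the effect the question points at.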