Hyperparameter search with threshold-dependent metrics #25321
vitaliset started this conversation in Show and tell
When optimizing hyperparameters, threshold-dependent metrics make sklearn.model_selection.BaseSearchCV-like search methods use the estimator's .predict method instead of .predict_proba. This can be harmful because 0.5 might not be the best threshold, especially in imbalanced learning scenarios: I often see an f1_score of 0, with the threshold probably off by a lot.

What is the most scikit-learnable way to deal with this right now? I have written a post about this and some of the ways I see it, but I would love to hear others' opinions! :D
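To make the failure mode concrete, here is a minimal sketch of one workaround: score against predict_proba at a custom cutoff instead of .predict's implicit 0.5. The dataset, model, and the 0.25 threshold are all illustrative, and make_scorer's needs_proba=True flag is the pre-1.4 spelling (newer releases spell it response_method="predict_proba").

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score, make_scorer
from sklearn.model_selection import GridSearchCV

def f1_at_threshold(y_true, y_proba, threshold=0.25):
    # Binarize the positive-class probabilities at a custom cutoff
    # instead of relying on .predict's implicit 0.5.
    return f1_score(y_true, (y_proba >= threshold).astype(int))

X, y = make_classification(n_samples=1_000, weights=[0.95], random_state=0)

search = GridSearchCV(
    LogisticRegression(max_iter=1_000),
    param_grid={"C": [0.1, 1.0, 10.0]},
    # needs_proba=True hands the scorer the predict_proba output (the
    # positive-class column for binary problems) rather than hard labels.
    scoring=make_scorer(f1_at_threshold, needs_proba=True),
)
search.fit(X, y)
```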
Also, as far as I can see, sklearn.model_selection.CutoffClassifier (from #16525) will solve this problem once you have a fixed model. But for hyperparameter optimization, can we use this estimator to tune the base_estimator's params? From my understanding of the sklearn API, it would not be possible, because get_params and set_params would not look at the base_estimator. Is there a way to work around this?

I can see this working if something like make_pipeline(estimator, CutoffClassifier) worked, for instance. But I don't think it will, the way it is being designed (it will behave the way sklearn.calibration.CalibratedClassifierCV does).
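For what it's worth, standard scikit-learn meta-estimators do expose the wrapped model's settings: get_params(deep=True) recurses into any constructor parameter that is itself an estimator, so a search can reach it with the double-underscore syntax. Here is a sketch using CalibratedClassifierCV as a stand-in for CutoffClassifier, whose final design may differ; the wrapped-model parameter is called estimator in scikit-learn >= 1.2 (base_estimator in older releases).

```python
from sklearn.calibration import CalibratedClassifierCV
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV

X, y = make_classification(n_samples=500, weights=[0.9], random_state=0)

# get_params(deep=True) recurses into the wrapped model, so its
# hyperparameters appear under the "estimator__" prefix.
meta = CalibratedClassifierCV(estimator=LogisticRegression(max_iter=1_000))
print("estimator__C" in meta.get_params(deep=True))  # True

# The search can therefore tune the inner model through the wrapper.
search = GridSearchCV(meta, param_grid={"estimator__C": [0.1, 1.0, 10.0]})
search.fit(X, y)
print(search.best_params_)
```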
Reply:

It has been a while since I last looked at this PR, but you should be able to tune the parameters of the base estimator as well if you place the …
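If I have it right that the #16525 proposal is what later shipped as sklearn.model_selection.TunedThresholdClassifierCV in scikit-learn 1.5, then the combination asked about above works out of the box. The sketch below assumes that release: the outer search tunes the inner model's C while each candidate gets its own tuned cutoff.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, TunedThresholdClassifierCV

X, y = make_classification(n_samples=1_000, weights=[0.95], random_state=0)

# The wrapper refits the inner model and picks the decision threshold
# that maximizes the given metric on internal validation splits.
model = TunedThresholdClassifierCV(
    LogisticRegression(max_iter=1_000), scoring="f1"
)

# "estimator__C" reaches through the wrapper into LogisticRegression.
search = GridSearchCV(
    model, param_grid={"estimator__C": [0.1, 1.0, 10.0]}, scoring="f1"
)
search.fit(X, y)
```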