When trying to evaluate a classif.catboost model using repeated cross-validation with predict_type = 'prob', the predicted probabilities are all 0. This doesn't happen with other models like classif.lightgbm. regr.catboost works fine, as does classif.catboost with predict_type = 'response'. Training a model on the whole dataset and predicting probabilities from that works fine.
Reproducible example
library(mlr3) # ver 0.19.0
library(mlr3extralearners) # ver 0.8.0
# catboost ver 1.2.5

iris_task <- tsk("iris")
lrn_catboost_classif <- lrn("classif.catboost", predict_type = "prob")
resamp <- rsmp("repeated_cv", repeats = 1, folds = 10)
rr <- resample(
  task = iris_task,
  learner = lrn_catboost_classif,
  resampling = resamp
)
#> INFO [12:21:56.406] [mlr3] Applying learner 'classif.catboost' on task 'iris' (iter 1/10)
#> INFO [12:21:57.425] [mlr3] Applying learner 'classif.catboost' on task 'iris' (iter 2/10)
#> INFO [12:21:58.257] [mlr3] Applying learner 'classif.catboost' on task 'iris' (iter 3/10)
#> INFO [12:21:59.080] [mlr3] Applying learner 'classif.catboost' on task 'iris' (iter 4/10)
#> INFO [12:21:59.918] [mlr3] Applying learner 'classif.catboost' on task 'iris' (iter 5/10)
#> INFO [12:22:00.726] [mlr3] Applying learner 'classif.catboost' on task 'iris' (iter 6/10)
#> INFO [12:22:01.539] [mlr3] Applying learner 'classif.catboost' on task 'iris' (iter 7/10)
#> INFO [12:22:02.373] [mlr3] Applying learner 'classif.catboost' on task 'iris' (iter 8/10)
#> INFO [12:22:03.171] [mlr3] Applying learner 'classif.catboost' on task 'iris' (iter 9/10)
#> INFO [12:22:03.994] [mlr3] Applying learner 'classif.catboost' on task 'iris' (iter 10/10)

rr$score(measures = msr("classif.logloss")) # all same and horrible
#>     task_id       learner_id resampling_id iteration classif.logloss
#>      <char>           <char>        <char>     <int>           <num>
#>  1:    iris classif.catboost   repeated_cv         1        34.53878
#>  2:    iris classif.catboost   repeated_cv         2        34.53878
#>  3:    iris classif.catboost   repeated_cv         3        34.53878
#>  4:    iris classif.catboost   repeated_cv         4        34.53878
#>  5:    iris classif.catboost   repeated_cv         5        34.53878
#>  6:    iris classif.catboost   repeated_cv         6        34.53878
#>  7:    iris classif.catboost   repeated_cv         7        34.53878
#>  8:    iris classif.catboost   repeated_cv         8        34.53878
#>  9:    iris classif.catboost   repeated_cv         9        34.53878
#> 10:    iris classif.catboost   repeated_cv        10        34.53878
#> Hidden columns: task, learner, resampling, prediction

rr$prediction() # responses appear random, and all probs are 0
#> <PredictionClassif> for 150 observations:
#>  row_ids     truth   response prob.setosa prob.versicolor prob.virginica
#>        7    setosa  virginica           0               0              0
#>       23    setosa versicolor           0               0              0
#>       42    setosa  virginica           0               0              0
#>      ---       ---        ---         ---             ---            ---
#>      108 virginica     setosa           0               0              0
#>      133 virginica  virginica           0               0              0
#>      150 virginica  virginica           0               0              0

model <- lrn_catboost_classif$train(task = iris_task)
predict(model, predict_type = "prob", newdata = iris[c(1, 75, 150), ]) # works fine
#>            setosa   versicolor    virginica
#> [1,] 0.9994297973 0.0003649189 0.0002052838
#> [2,] 0.0004712565 0.9992043693 0.0003243742
#> [3,] 0.0013331991 0.0027428026 0.9959239983
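
For anyone trying to confirm the symptom on their own data, here is a small base-R sketch of two sanity checks. The helper `rows_not_summing_to_one()` is hypothetical (it is not part of mlr3), and the clipping constant `1e-15` is an assumption about how the log loss measure bounds probabilities before taking the log. Under that assumption, an all-zero probability matrix would give exactly the constant score seen in every fold.

```r
# Hypothetical helper (not part of mlr3): returns the indices of rows in a
# probability matrix that do not sum to 1 within a tolerance.
rows_not_summing_to_one <- function(prob, tol = 1e-6) {
  which(abs(rowSums(prob) - 1) > tol)
}

# Probabilities copied from the direct predict() call in the report.
good <- matrix(
  c(0.9994297973, 0.0003649189, 0.0002052838,
    0.0004712565, 0.9992043693, 0.0003243742,
    0.0013331991, 0.0027428026, 0.9959239983),
  nrow = 3, byrow = TRUE,
  dimnames = list(NULL, c("setosa", "versicolor", "virginica"))
)

# All-zero matrix mimicking the probabilities returned after resampling.
bad <- matrix(0, nrow = 3, ncol = 3, dimnames = dimnames(good))

bad_rows_good <- rows_not_summing_to_one(good) # no rows flagged
bad_rows_bad <- rows_not_summing_to_one(bad)   # every row flagged

# Log loss of one observation whose true-class probability is clipped to eps.
eps <- 1e-15 # assumed clipping constant, see note above
clipped_logloss <- -log(eps)
print(clipped_logloss)
#> [1] 34.53878
```

With the packages installed, the same helper could be applied to `rr$prediction()$prob` to flag the affected rows.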
Created on 2024-05-11 with reprex v2.1.0