The small bug with multiclass xgboost #1208
Comments
It is not. preds.size() gives the total number of probability entries, which equals num_training * num_class.
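To make the size relationship concrete, here is a minimal Python sketch of the condition tested by the check quoted in this issue. The helper names `expected_pred_size` and `sizes_match` are made up for illustration and are not part of xgboost; the row and class counts are arbitrary.

```python
def expected_pred_size(num_rows: int, num_class: int) -> int:
    """Number of probability entries: one per (row, class) pair."""
    return num_rows * num_class


def sizes_match(pred_size: int, num_labels: int, num_class: int) -> bool:
    """Mirror of the CHECK condition: preds.size() == num_class * labels.size()."""
    return pred_size == num_class * num_labels


# 7 training rows and 10 classes: 7 is not a multiple of 10, yet the
# condition holds, because preds has 7 * 10 = 70 entries.
num_rows, num_class = 7, 10
pred_size = expected_pred_size(num_rows, num_class)
print(pred_size, sizes_match(pred_size, num_rows, num_class))  # 70 True
```

The point of the sketch is that the condition compares the prediction count against labels times classes, so it places no divisibility requirement on the number of training rows.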
Ok, I see, thanks! However, when I run xgboost from Python on a dataset with 1024389 training examples and num_class = 100, it fails with this error:
XGBoostError: [21:26:57] src/objective/multiclass_obj.cc:43: Check failed: preds.size() == (static_cast<size_t>(param_.num_class) * info.labels.size()) SoftmaxMultiClassObj: label size and pred size does not match
When I reduce the number of training examples to 1024300, it works fine. So I suppose this is a memory problem: my laptop cannot fit a 1024389 x 100 matrix in memory. Could that be the cause of this error?
This could happen if your laptop is 32-bit and the size of the prediction vector exceeds the integer range.
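As a rough arithmetic aid, the sizes quoted in this thread can be compared against the signed 32-bit limit as sketched below. The 4-bytes-per-entry figure is an assumption (single-precision floats), and the sketch says nothing about intermediate values inside xgboost or about how much memory a given machine actually has available.

```python
INT32_MAX = 2**31 - 1  # largest value of a signed 32-bit integer

num_rows = 1024389   # training examples reported in this thread
num_class = 100      # classes reported in this thread

entries = num_rows * num_class   # total prediction entries
bytes_as_float32 = entries * 4   # assumption: 4 bytes per entry

print(entries)                        # 102438900
print(entries <= INT32_MAX)           # True
print(bytes_as_float32)               # 409755600
print(bytes_as_float32 <= INT32_MAX)  # True
```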
Hi, I am getting the same issue. @tqchen
@diefimov I was getting the same issue. In my case the problem was the eval_metric: I believe 'auc' doesn't work with multi-class. I simply changed the metric and my code worked.
@shang-vikas What did you change your metric to from 'auc' to fix the issue?
Changing the metric to "mlogloss" or "merror" fixed the issue for me. |
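A minimal sketch of what that change might look like in a Python parameter dict follows. The helper `make_params` and the constant `MULTICLASS_METRICS` are hypothetical names used only for illustration, and the choice of `multi:softprob` as the objective is an assumption, since the thread does not say which softmax objective was used.

```python
# Metrics named in this thread as working for multi-class training.
MULTICLASS_METRICS = {"mlogloss", "merror"}


def make_params(num_class: int, eval_metric: str = "mlogloss") -> dict:
    """Build a parameter dict for a multi-class softmax objective."""
    if eval_metric not in MULTICLASS_METRICS:
        raise ValueError(
            f"{eval_metric!r} is not one of {sorted(MULTICLASS_METRICS)}"
        )
    return {
        "objective": "multi:softprob",  # assumption: objective not stated in the thread
        "num_class": num_class,
        "eval_metric": eval_metric,
    }


params = make_params(num_class=100)
# With xgboost installed, this dict would typically be passed to
# xgb.train(params, dtrain), where dtrain is an xgb.DMatrix.
```

Rejecting metrics outside the small allow-list is a design choice of this sketch; it simply surfaces a mismatched metric before training starts.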
I have converted my sparse matrices to
I have this error using sparse matrices:
Seems some rows of
Code for training:
Output:
Minimum example to reproduce error:
Output:
This check:
CHECK(preds.size() == (static_cast<size_t>(param_.num_class) * info.labels.size()))
<< "SoftmaxMultiClassObj: label size and pred size does not match";
means that the number of training examples must be divisible by the number of classes (if I understood correctly). For example, if the number of classes is 10, then the number of training examples would have to be a multiple of 10. Of course, that does not hold in the majority of cases. Is it a bug? Thanks!