speed for a single record #12

geoHeil · 2016-12-13T15:02:50Z

Did you know about dmlc/xgboost#1849 (comment)?

Apparently, in the current version xgboost4j is faster than this library for batch predictions.
Do you have a test that compares predicting a single new record rather than 200k records? As described in the linked xgboost issue, xgboost4j's API supports only batch mode. What about your library?

I tested on a dataset of 200,000 records on Spark. xgboost4j-spark took 1775736 milliseconds, including implicit data transformations. xgboost-predictor-java took 4620104 milliseconds including data transformations and 2907550 milliseconds without them. I think xgboost4j's batch prediction is faster, and I will keep using xgboost4j.

CasyWang · 2017-05-03T11:40:45Z

Any conclusion here?

slevental · 2017-06-26T08:53:38Z

The benchmark results posted in README.md look quite misleading. They claim that the current JVM version is a few orders of magnitude faster than xgboost4j, and if you run the benchmark you will get similar results. However, if you dig deeper you will find that xgboost4j spends most of its time creating the DMatrix object, which is not in sparse format (by default) and is huge: 100x100000. I believe that using a sparse matrix format would boost performance. I also ran the benchmark with a DMatrix of size 80x100, which is more suitable for my case, and xgboost4j performed better (30-40% faster).
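To make the dense-versus-sparse point concrete, here is a minimal sketch in plain Python (not xgboost4j code) of a CSR-style layout: only non-zero cells are stored, alongside their column indices and per-row offsets. The function names and the toy 4x1000 matrix are hypothetical and chosen only for illustration. Whether zeros can safely be dropped depends on how the model treats missing values, so take this as an illustration of the size difference, not a drop-in recipe.

```python
def dense_to_csr(rows):
    """Convert a list of equal-length rows to CSR arrays, dropping zeros."""
    row_headers = [0]   # row_headers[i]..row_headers[i + 1] delimits row i
    col_indices = []    # column index of each stored value
    values = []         # the non-zero values themselves
    for row in rows:
        for col, value in enumerate(row):
            if value != 0.0:
                col_indices.append(col)
                values.append(value)
        row_headers.append(len(values))
    return row_headers, col_indices, values


def csr_row(row_headers, col_indices, values, i):
    """Return row i as a {column: value} mapping of its stored entries."""
    start, end = row_headers[i], row_headers[i + 1]
    return dict(zip(col_indices[start:end], values[start:end]))


# Toy data (hypothetical): 4 rows x 1000 columns, two non-zero cells per row.
n_rows, n_cols = 4, 1000
dense = [[0.0] * n_cols for _ in range(n_rows)]
for i in range(n_rows):
    dense[i][i] = 1.0
    dense[i][n_cols - 1 - i] = 2.0

row_headers, col_indices, values = dense_to_csr(dense)
dense_cells = n_rows * n_cols   # cells a dense layout has to hold
stored_cells = len(values)      # cells the CSR layout has to hold

print(f"dense cells: {dense_cells}, stored cells: {stored_cells}")
```

For mostly-zero inputs the sparse form holds far fewer cells, which is why the cost of building a dense 100x100000 matrix can dominate a benchmark.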

edumucelli · 2017-10-25T21:02:12Z

I have benchmarked several of the available libraries, among them XGBoost4j and XGBoost-Predictor. You can take a look here if you are interested.

ssimeonov mentioned this issue Jul 3, 2017

[jvm-packages] Models saved using xgboost4j-spark cannot be loaded in Python xgboost dmlc/xgboost#2480

Closed
