
XGBoost Performance Issues #631

Closed · voganrc opened this issue Jan 22, 2020 · 12 comments · Fixed by #645

voganrc (Contributor) commented Jan 22, 2020

Hello,

I ran some JMH benchmarks that show MLeap to be significantly slower than other libraries for evaluating XGBoost models.

[chart: JMH benchmark results]

Here you can see throughput (ops / sec) as a function of library and batch size, where:

xgboost4j = https://github.com/dmlc/xgboost/tree/master/jvm-packages
xgboost-predictor-java = https://github.com/komiya-atsushi/xgboost-predictor-java
yelp-xgboost = https://github.com/Yelp/xgboost-predictor-java
mleap = https://github.com/combust/mleap
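
For reference, a minimal sketch of this style of JMH throughput benchmark (the class name, feature dimension, and batch sizes here are illustrative, not the exact harness used):

```scala
import org.openjdk.jmh.annotations._
import org.openjdk.jmh.infra.Blackhole

@State(Scope.Benchmark)
class XgbThroughputBench {
  @Param(Array("1", "10", "100", "1000")) // swept batch sizes
  var batchSize: Int = _

  var batch: Array[Array[Double]] = _

  @Setup
  def setup(): Unit = {
    // Random dense rows; a real run would also load a trained model here.
    batch = Array.fill(batchSize)(Array.fill(50)(scala.util.Random.nextDouble()))
  }

  @Benchmark
  def predictBatch(bh: Blackhole): Unit = {
    // Swap in the library under test (mleap, xgboost4j, xgboost-predictor...).
    batch.foreach(row => bh.consume(row.sum)) // placeholder workload
  }
}
```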

Given that MLeap makes use of xgboost4j-spark, does anyone know why it would have half the throughput of xgboost4j? Also, is there a reason why MLeap does not show the same constant throughput scaling that xgboost4j does?

Thanks!
-Ryan

lucagiovagnoli (Member) commented Jan 23, 2020

Hi Anca! So, we've run some tests above and noticed that xgboost4j is much slower than xgboost-predictor-java :(
Historically we've used a fork of xgboost-predictor at Yelp (yelp-xgboost), so we're hitting a performance issue when running MLeap because xgboost4j seems thousands of times slower :/

  • I was thinking of making a PR to allow deserializing the model binary as either an xgboost-predictor OR an xgboost4j model (based on users' preference). Fortunately, it seems that xgboost-predictor supports loading from xgboost4j binaries; see the ModelReader. I just wanted to check what you think before we get to it. A sketch of that loading path is below.
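
For context, a minimal sketch of loading a booster binary with xgboost-predictor (the path and feature values here are illustrative, not from the PR):

```scala
import java.io.FileInputStream

import biz.k11i.xgboost.Predictor
import biz.k11i.xgboost.util.FVec

// Load a booster binary written by xgboost4j; xgboost-predictor's
// ModelReader parses the native model format.
val predictor = new Predictor(new FileInputStream("/tmp/xgb.model"))

// Score one dense row; `false` means zeros are treated as real values,
// not as missing.
val row = FVec.Transformer.fromArray(Array(0.1, 0.2, 0.3), false)
val prediction: Array[Double] = predictor.predict(row)
```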

PS: I also noticed that @hollinwilkins looked into xgboost-predictor in the past; he commented on the xgboost-predictor project about deploying it to Maven Central (comment here). I wonder if they considered that rather than xgboost4j, and why it didn't work out?

@ancasarb

ancasarb (Member) commented

Hey @lucagiovagnoli, this seems fine to me. Do you know if xgboost-predictor-java is available in Maven already, or is that ticket still pending?

lucagiovagnoli (Member) commented

I've just noticed they are! It's mentioned in their README, and I found them on Bintray :)

ancasarb (Member) commented

OK, cool :) It would be nice to see the difference in performance between the two :)

lucagiovagnoli (Member) commented Jan 31, 2020

Hi @ancasarb,
I just found out that MLeap was actually using xgboost-predictor in 2017/2018 (#259)! That sounds great. I wonder if you know why it was removed in PR #401. Were there any lessons learned or big blockers in using it? (I'm wondering if it was just to switch back to the "official" implementation.)

It seems some others, like @ytjia, are suffering from performance degradation too. We've also fixed all known bugs in our fork https://github.com/Yelp/xgboost-predictor-java and plan to contribute these fixes back upstream.

ancasarb (Member) commented

I'd say it was probably due to Maven availability and being able to run on Travis etc., but it was Hollin who did the integration.

What performance degradation are you referring to from @ytjia? Is there an issue for that?

lucagiovagnoli (Member) commented

My bad, I misused "degradation". I meant that xgboost4j is so slow for us that it's impractical to use at scale for online inference. I found this comment from @ytjia on PR #401 mentioning they are having the same problem.

I really liked @hollinwilkins' work integrating xgboost-predictor in PR #259. If there's no real blocker, I have a PR (almost ready) to include both xgboost4j (status quo) and xgboost-predictor (high performance) in MLeap, so developers can deserialize xgboost models using the library they prefer :)
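
A hypothetical shape for that choice (names and API here are made up for illustration; see the eventual PR for the real mechanism):

```scala
import java.io.FileInputStream

// Hypothetical runtime toggle; neither these names nor this API are MLeap's.
sealed trait XgbRuntime
case object Xgboost4j extends XgbRuntime
case object XgboostPredictor extends XgbRuntime

def loadBooster(path: String, runtime: XgbRuntime): AnyRef = runtime match {
  case Xgboost4j        => ml.dmlc.xgboost4j.scala.XGBoost.loadModel(path)
  case XgboostPredictor => new biz.k11i.xgboost.Predictor(new FileInputStream(path))
}
```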

ancasarb (Member) commented Feb 4, 2020

That sounds like a good plan @lucagiovagnoli.

changhiskhan commented
I can add some color on the performance issues mentioned by @ytjia. I believe xgboost4j uses a static synchronized JNI entry point to communicate with the underlying xgboost code (which is not thread-safe). This isn't a problem in single-threaded applications, but in many real-time prediction setups the model sits behind a multi-threaded service handling concurrent requests. We ended up using xgboost-predictor-java as a workaround.
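
To illustrate the contention (a sketch, not xgboost4j's actual source):

```scala
// A static synchronized entry point means every request thread contends
// on one global lock for the full duration of the native call.
object GlobalNativeLock {
  // Stand-in for a `static synchronized native` method.
  def predict(features: Array[Float]): Array[Float] = this.synchronized {
    Thread.sleep(1) // simulate the (non-thread-safe) native work
    Array(0.5f)     // dummy score
  }
}
```

With N service threads calling predict() concurrently, the calls execute one at a time, so aggregate throughput stays pinned at single-thread levels no matter how many threads the service adds.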

indranilr commented Feb 5, 2020

Very useful analysis by @voganrc. We are looking at a Spark xgboost4j model for online inference, but after reading this thread it appears xgboost-predictor-java is a better choice for such scenarios. It would be quite useful if MLeap could be configured to use xgboost-predictor-java.

Hopefully, this enhancement will be part of the next MLeap release.

lucagiovagnoli self-assigned this Feb 5, 2020
indranilr commented
> I believe xgboost4j uses a static synchronized JNI entry point to communicate with the underlying xgboost code (which is not thread-safe). […] We ended up using xgboost-predictor-java as a workaround.

To clarify my understanding a little: can a Spark model exported from XGBoost4J-Spark (e.g. xgbClassificationModel.write.overwrite().save(xgbClassificationModelPath)) be used to predict with xgboost-predictor-java? Or would I need to export the XGBoost4J-Spark model to the native format first (e.g. xgbClassificationModel.nativeBooster.saveModel(nativeModelPath)) and then use xgboost-predictor-java to load and serve the model?
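
Based on the ModelReader note earlier in this thread, the native-booster export route is the one that should line up; a sketch (paths are illustrative):

```scala
import java.io.FileInputStream

import biz.k11i.xgboost.Predictor

// The Spark ML writer (model.write.save(...)) produces a pipeline directory
// with Spark metadata, which xgboost-predictor does not read. Export the
// underlying native booster instead (Spark side):
// xgbClassificationModel.nativeBooster.saveModel("/tmp/xgb-native.model")

// Serving side: xgboost-predictor loads that native binary directly.
val predictor = new Predictor(new FileInputStream("/tmp/xgb-native.model"))
```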

lucagiovagnoli (Member) commented

@indranilr @changhiskhan feel free to review #645, which should solve this. There are instructions on how to use the Predictor in the PR.
