Parity between C++ xgboost and xgboost-predictor-java #21

Open
seme0021 opened this issue Sep 21, 2017 · 2 comments

@seme0021 commented Sep 21, 2017

Hi -

Have you done any parity tests between the scored output of the C++ models and the Java models?

I'm asking because I'm seeing large differences (greater than 1) in regression predictions when the features are passed as double-precision values.

Using these training parameters:

    "eta" -> 0.3,
    "max_depth" -> 2,
    "objective" -> "reg:linear",
    "early_stopping_rounds" ->2,
    "num_round" -> 15,
    "nworkers" -> 2

When I cast the doubles to floats before building the FVec, the results are much closer, agreeing to within a 0.0001 tolerance.
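
In case it helps others hitting the same thing, here's a minimal sketch of that workaround: narrow every feature to float before building the FVec. It assumes the Predictor and FVec.Transformer.fromArray APIs from xgboost-predictor-java; the model path and feature values are hypothetical.

    import java.io.FileInputStream;
    import java.io.InputStream;

    import biz.k11i.xgboost.Predictor;
    import biz.k11i.xgboost.util.FVec;

    public class FloatCastWorkaround {
        public static void main(String[] args) throws Exception {
            // Hypothetical model file, for illustration only.
            try (InputStream in = new FileInputStream("model.bin")) {
                Predictor predictor = new Predictor(in);

                // Features that arrive as doubles (e.g. from a Spark pipeline).
                double[] doubles = {0.9499999879, 42.0, 3.5};

                // Narrow each value to float so comparisons against the
                // model's float split values match what the C++ core computes.
                float[] floats = new float[doubles.length];
                for (int i = 0; i < doubles.length; i++) {
                    floats[i] = (float) doubles[i];
                }

                FVec features = FVec.Transformer.fromArray(floats, /* treatsZeroAsNA */ false);
                double[] prediction = predictor.predict(features);
                System.out.println(prediction[0]);
            }
        }
    }

The cast has to happen before the values go into the FVec; wrapping the raw double[] and letting the predictor compare doubles reproduces the discrepancy.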

@miceno commented Oct 13, 2017

Hi,

We also noticed this discrepancy. Is there any way to improve it?

Thank you!!!

@cpfarrell

I would recommend always passing features in as floats. XGBoost is explicit that it treats values as 32-bit floats for performance reasons (see dmlc/xgboost#1410 for one example). If a model has been trained with XGBoost, its split values are stored as floats, so feeding it doubles can yield inaccurate predictions when a feature value lands on just the wrong side of a split.
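
To make that concrete, here is a self-contained sketch of the failure mode. The split value 0.95f and the feature value are made up for illustration, and it uses xgboost's convention that a feature goes to the left child when fvalue < split:

    public class FloatSplitDemo {
        public static void main(String[] args) {
            // A split value as the model stores it: a 32-bit float.
            // 0.95f is actually 0.94999998807907104...
            float split = 0.95f;

            // A double feature that sits between the float split value and
            // the exact decimal 0.95.
            double feature = 0.9499999879;

            // Compared as a double, the feature is below the split, so the
            // tree would take the left branch (fvalue < split).
            System.out.println(feature < split);          // true

            // Narrowed to float first, it rounds onto the split value itself,
            // so the comparison takes the right branch instead.
            System.out.println((float) feature < split);  // false
        }
    }

Each such flipped split sends the row down a different subtree, which is how per-feature errors on the order of 1e-8 can turn into prediction differences greater than 1.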
