how do you explain the output of breakDown #16

Open
Nadu123 opened this issue Apr 22, 2018 · 4 comments

Comments


Nadu123 commented Apr 22, 2018

Hi,
I am following the Random Forest example: https://pbiecek.github.io/breakDown/articles/break_randomForest.html
I am still not sure how to interpret the output of breakDown.
In the random forest example we get:
[image: breakDown output from the random forest example]
What can be said about the final prediction? Is it that there is an 88% probability that this employee left, and that this value is explained by each feature's contribution to the prediction?

thanks

Owner

pbiecek commented Apr 22, 2018

The random forest predicts a probability of 88% for this employee (= final prediction).
The average prediction from the random forest is 14.8% (= intercept).
The increase from the average to the final prediction is attributed to individual variables using the algorithm described in
https://arxiv.org/abs/1804.01955
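
To make the relationship between the intercept, the per-variable contributions, and the final prediction concrete, here is a minimal sketch in Python (the package itself is written in R). It is not the breakDown implementation: `toy_model`, the four data rows, and the fixed variable order are all hypothetical, and the order is hard-coded here only to keep the example short. The point it illustrates is that the intercept plus the contributions adds up to the prediction for the observation.

```python
# Toy illustration only: the model, the data, and the variable order are
# made up and are not taken from the breakDown package or its vignette.

def toy_model(row):
    # Hypothetical scoring function returning a value in [0, 1].
    score = 0.1 + 0.5 * (row["projects"] <= 2) + 0.3 * (row["satisfaction"] < 0.5)
    return min(score, 1.0)

def mean_prediction(rows):
    # Average model output over a set of rows.
    return sum(toy_model(r) for r in rows) / len(rows)

def sequential_contributions(rows, observation, order):
    # Start from the average prediction (the "intercept"), then fix one
    # variable at a time to the observation's value in every row and record
    # how much the average prediction moves at each step.
    current = [dict(r) for r in rows]
    baseline = mean_prediction(current)
    previous = baseline
    contributions = {}
    for name in order:
        for r in current:
            r[name] = observation[name]
        updated = mean_prediction(current)
        contributions[name] = updated - previous
        previous = updated
    return baseline, contributions

data = [
    {"projects": 2, "satisfaction": 0.4},
    {"projects": 4, "satisfaction": 0.8},
    {"projects": 5, "satisfaction": 0.6},
    {"projects": 3, "satisfaction": 0.3},
]
observation = {"projects": 2, "satisfaction": 0.4}

intercept, contributions = sequential_contributions(
    data, observation, order=["projects", "satisfaction"]
)
final_prediction = toy_model(observation)

print("intercept:", intercept)
for name, value in contributions.items():
    print(name, "contribution:", value)
print("final prediction:", final_prediction)
```

Once every variable has been fixed, all rows equal the observation, so the running average equals the model's prediction for that observation. That is why the pieces sum to the final prediction regardless of the order chosen, although the individual contributions can change with the order.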

Author

Nadu123 commented Apr 23, 2018

Thanks for the rocket-quick reply!
One more thing: whether we use the step-up or step-down strategy, aren't the features supposed to be ordered by their contribution?
In the case above, aren't we saying that the leading cause of this employee's 88% probability of leaving is that she is assigned only 2 projects, since that feature has more weight than the others?

thanks

@mylanhong

@pbiecek
Hello, the algorithm described in https://arxiv.org/abs/1804.01955 was difficult for me to understand.
I have a question: for a logistic regression model (e.g., Y is good or bad), what does it mean when the final_decision is a negative value?

Owner

pbiecek commented Aug 5, 2018

The presented scores are logits.
Logits can be transformed to probabilities.
A negative logit means that the corresponding probability is below 0.5.
Read more here: https://en.wikipedia.org/wiki/Logistic_regression
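
As a quick illustration of the logit-to-probability transformation mentioned above, the sketch below applies the standard logistic function p = 1 / (1 + exp(-logit)). The function name `logit_to_probability` and the sample values are chosen only for this example and are not part of breakDown.

```python
import math

def logit_to_probability(logit):
    # Standard logistic (inverse-logit) function.
    return 1.0 / (1.0 + math.exp(-logit))

# Arbitrary sample logits: negative, zero, and positive.
for value in (-2.0, 0.0, 2.0):
    print(value, "->", logit_to_probability(value))
```

A logit of 0 maps to exactly 0.5, so any negative score corresponds to a probability below 0.5 and any positive score to a probability above 0.5.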
