Something off with coefficients? #9
Comments
Yes, actually there is a pretty cool reason why you do not want to have beta*value as separate contributions (see below). In the This is to make contributions resistant to shifting of the X variables. It is easy to get such individual contributions.
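To illustrate the shifting argument, here is a minimal sketch. It assumes the centered contributions are the ones returned by predict(..., type = "terms"), as mentioned later in this thread, and it uses the built-in mtcars data purely as a stand-in (the variables wt and hp are not from this issue).

```r
# Illustrative data: the built-in mtcars dataset (an assumption, not this issue's data).
fit <- lm(mpg ~ wt + hp, data = mtcars)

# Same model after shifting one predictor by an arbitrary constant.
shifted <- transform(mtcars, wt = wt + 100)
fit_shifted <- lm(mpg ~ wt + hp, data = shifted)

obs <- mtcars[1, ]
obs_shifted <- shifted[1, ]

# Raw beta * value contributions change with the shift ...
raw <- coef(fit)["wt"] * obs$wt
raw_shifted <- coef(fit_shifted)["wt"] * obs_shifted$wt

# ... while the centered contributions, beta * (value - mean(value)), do not.
centered <- predict(fit, newdata = obs, type = "terms")
centered_shifted <- predict(fit_shifted, newdata = obs_shifted, type = "terms")

print(c(raw = unname(raw), raw_shifted = unname(raw_shifted)))
print(centered[, "wt"])
print(centered_shifted[, "wt"])
```

The slope is the same in both fits, so the raw product moves with the shift, whereas the centered term subtracts the (equally shifted) mean and stays put.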
Thank you for the explanation! May I suggest giving the user the option to use either the centered or the regular x values, as well as providing some explanation in the documentation? This is a great chart, but it is confusing without any explanation of the use of type = "terms".
Yes, some documentation is required. The winter semester has just ended, so I will have some time to work on it.
Dear @pbiecek, following @alathrop, it would be great to have an option to show the plain contribution of each term (beta*value) rather than the centered values. I completely understand your point of view, but in other contexts such a plot would be relevant, e.g. for pedagogical purposes. When teaching, I often need to explain to my students how a single prediction is obtained from a model, in particular when explaining how to interpret interactions. Thanks for this package.
Maybe some code could be helpful. I have tried the following.
betas <- function(object, newdata) {
  # Design matrix for newdata, built from the model terms without the response
  tt <- terms(object)
  Terms <- delete.response(tt)
  mm <- model.matrix(Terms, newdata)
  ass <- attr(mm, "assign")  # term index of each column (0 = intercept)
  tl <- attr(Terms, "term.labels")
  co <- coef(object)
  # Scale each column by its coefficient (co * mm would recycle co down the
  # columns and is only right when newdata has a single row)
  pred <- sweep(mm, 2, co, "*")
  ret <- matrix(NA_real_, nrow = nrow(newdata), ncol = length(tl))
  colnames(ret) <- tl
  rownames(ret) <- rownames(newdata)
  # Sum the columns that belong to each term
  for (i in seq_along(tl)) {
    ret[, i] <- rowSums(pred[, ass == i, drop = FALSE], na.rm = TRUE)
  }
  attr(ret, "constant") <- rowSums(pred[, ass == 0, drop = FALSE], na.rm = TRUE)
  ret
}
At the beginning of Would you consider adding such options?
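As a quick sanity check of the idea above, the following sketch computes uncentered per-term contributions for a model with an interaction and confirms that they add up to the ordinary prediction. The mtcars data and the formula mpg ~ wt * hp are illustrative assumptions, not taken from this thread.

```r
# Illustrative model with an interaction; mtcars is an assumption, not the issue's data.
fit <- lm(mpg ~ wt * hp, data = mtcars)
obs <- mtcars[1:3, ]

# Design matrix for the new observations (no response column needed).
Terms <- delete.response(terms(fit))
mm <- model.matrix(Terms, obs)

# Multiply every column by its coefficient: entry [i, j] is beta_j * x_ij.
pred <- sweep(mm, 2, coef(fit), "*")

# Collapse columns that belong to the same model term (0 is the intercept).
ass <- attr(mm, "assign")
labels <- attr(Terms, "term.labels")
by_term <- sapply(seq_along(labels),
                  function(i) rowSums(pred[, ass == i, drop = FALSE]))
colnames(by_term) <- labels
intercept <- rowSums(pred[, ass == 0, drop = FALSE])

# Intercept plus the per-term pieces reproduces the ordinary prediction.
total <- intercept + rowSums(by_term)
print(cbind(intercept, by_term, total))
```

For teaching, the printed row for one observation reads directly as intercept + wt part + hp part + wt:hp part = prediction.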
I have prepared a pull request, just in case.
Thanks, merged.
Thanks.
Hi, I don't understand how the broken function calculates the coefficients (or is something off?).
In the lm function this is my test result:
Call:
lm(formula = TotalCharges ~ ., data = data_in_test)

Residuals:
     Min       1Q   Median       3Q      Max
-1943.33  -453.71   -94.64   490.26  1887.26

Coefficients:
                 Estimate Std. Error t value Pr(>|t|)
(Intercept)    -2162.4583    21.9717 -98.420  < 2e-16 ***
MonthlyCharges    36.1234     0.3080 117.301  < 2e-16 ***
tenure            65.3606     0.3683 177.476  < 2e-16 ***
SeniorCitizen    -86.7050    24.3449  -3.562 0.000371 ***
Test user:
-2162.4583 + (data_in_test[analysed_user,]$MonthlyCharges * 36.1234) +
  data_in_test[analysed_user,]$tenure * 65.3606 +
  data_in_test[analysed_user,]$SeniorCitizen * (-86.7050)
[1] 721.2045
While you get: (you can see that the intercept is different)
Obviously, one would expect the contributions in a waterfall plot to be simply Y = intercept + beta*value ... etc., taken from the summary output?
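One way to see where a different intercept can come from is sketched below. It assumes the plot is based on the centered terms that predict(..., type = "terms") returns, which is what the discussion in this thread suggests but is not confirmed here; mtcars again stands in for data_in_test.

```r
# Illustrative data: mtcars stands in for the data used in this issue (an assumption).
fit <- lm(mpg ~ wt + hp, data = mtcars)
obs <- mtcars[1, ]
b <- coef(fit)

# Plain decomposition from summary(): intercept + beta * value.
plain <- b["(Intercept)"] + b["wt"] * obs$wt + b["hp"] * obs$hp

# Centered decomposition returned by predict(type = "terms").
terms_out <- predict(fit, newdata = obs, type = "terms")
baseline <- attr(terms_out, "constant")

# The centered baseline is the prediction at the predictor means,
# i.e. intercept + sum(beta * mean(x)), so it differs from the raw intercept.
baseline_by_hand <- b["(Intercept)"] +
  b["wt"] * mean(mtcars$wt) + b["hp"] * mean(mtcars$hp)

centered_total <- baseline + sum(terms_out)
print(c(intercept = unname(b["(Intercept)"]), baseline = baseline))
print(c(plain = unname(plain), centered = centered_total))
```

Both decompositions give the same prediction; only the split between the baseline and the per-variable pieces changes.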