Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

excessive runtime of coef.mboost #114

Open
bastistician opened this issue Aug 12, 2021 · 2 comments
Open

excessive runtime of coef.mboost #114

bastistician opened this issue Aug 12, 2021 · 2 comments

Comments

@bastistician
Copy link
Contributor

Profiling coef.mboost for a colleague, I have noticed that it seems to waste time to "check if base-learner has coefficients" here:

if(any(sapply(cf, is.null))){

I don't know what the original intention was (@hofnerb), but given that cf is a numeric matrix, I think that condition will always be FALSE.

In the particular application, the base learner was a Markov random field over 327 regions. Here are the runtimes I get for evaluating the above condition as a function of mstop:

for (mstop in 10^(2:5)) {
  cat(mstop, ": ")
  cf <- matrix(0, 327, mstop)  # the actual values are irrelevant for this runtime assessment
  cat(system.time(any(sapply(cf, is.null)))["elapsed"], "s\n")
}
#> 100 : 0.009 s
#> 1000 : 0.109 s
#> 10000 : 1.094 s
#> 1e+05 : 20.04 s

Maybe that condition should be fixed to really "check if base-learner has coefficients". Otherwise the ret <- NULL case could simply be dropped. It would fail anyway when trying to set names on NULL later on.

bastistician added a commit to bastistician/mboost that referenced this issue Aug 12, 2021
@hofnerb
Copy link
Member

hofnerb commented Aug 13, 2021

The intention is that not all baselearners have coefficients. If I am remembering correctly the only BL without coefficients is btree, though:

library("mboost")
mod <- mboost(dist ~ btree(speed), data = cars)
coef(mod)
# $`btree(speed)`
# NULL
# 
# attr(,"offset")
# [1] 42.98

I just checked your fix propsal. It breaks with the following error

Fehler in base::rowSums(x, na.rm = na.rm, dims = dims, ...) : 
  'x' muss numerisch sein

We hence cannot simply remove the check. I am wondering a bit, though, why we have the sapply bit. Are there lists of coefficients or does (is.null(cf)) not work for specific coefficient objects we might have?

@bastistician
Copy link
Contributor Author

Ok, I see. Maybe the check could be done earlier and more efficiently? Something like bastistician@50a0392 ? Of course, being an mboost noob, I have no idea whether checking the first element for being NULL is sufficient to conclude that the base learner has no coefs.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants