ranef in predictInterval extremely slow #119

humana · 2021-08-19T09:33:33Z

Hi,

I am having an issue where the predictInterval is extremely slow, taking about 70 seconds to execute. After some profiling I've discovered that the ranef() calls take up 99% the compute time.

To test this, what I did was copy the predictInterval function and the helpers methods, precompute the randef, and pass them in to a modified version of predictInterval, e.g.

ranef_cached <- ranef(model)
ranef_condvar_cached <- ranef(model, condVar = TRUE)
myPredictInterval(model, X, ranef_cached, ranef_condvar_cached, n.sims=1000, which="full")

Then it basically runs under a second.

From what I can see in the code, there is really no reason to have to re-extract the random effects over and over (it's used in about 3 places) since the model doesn't change. But the actual problem is that the ranef() call is just extremely slow. I created a ticket on the lme4 project as well lme4/lme4#637. I'm using the latest version of lme4.

The text was updated successfully, but these errors were encountered:

bbolker · 2021-08-21T01:56:48Z

Commented in linked issue: there isn't anything to do about lme4:::ranef.merMod. There is some relatively low-hanging fruit in merTools; (1) don't call ranef() more than necessary. (2) merTools implements its own version of mkNewReTrms, which doesn't set condVar=FALSE when calling ranef() (!) (The default for ranef used to be condVar=FALSE; this changed in version 1.1-20 (2019-02-04).) (What does the merTools version of mkNewReTrms do differently? Could this functionality be incorporated upstream to avoid these kind of issues ... ? Or is it just a matter of exporting mkNewReTrm so it can be used by merTools ... ??

jknowles · 2021-09-27T18:52:53Z

Thank you both. I am not on GitHub nearly as much these days so I am just seeing this now. It is certainly true that the repeated ranef() calls are a bad idea! The function was such an improvement over the alternatives at the time that 70 seconds didn't seem extremely slow to me, so I never looked into it. So thank you @humana for narrowing it down and doing the work to profile the issue 🙏

This is really helpful and I see the PR from @bbolker

I unfortunately have limited time to maintain the package, but as I noted in #118 there is a need for a refactor of predictInterval() and now with what I know from this and what I know from support requests from the field I think it is possible to make the function modular and less impenetrable, which would make it easier for everyone to have a good sense of how it works moving forward.

jknowles · 2022-05-26T16:44:24Z

There are two minor test failures caused by #120 in test-predict.R that I am working on before I am able to publish this to CRAN. There are also some new warnings about matrix types that I need to review as well that appeared under R 4.2.0 on Windows.

bbolker · 2022-05-26T17:36:46Z

I can try to take a look if you want (hopefully the fixes are fairly simple once you dig in and look at the failures ... ??)

jknowles · 2022-05-26T19:51:13Z

I will update and tag you @bbolker if I can't get a line on it - I have a little gap in my schedule today to look at it and it's a good chance to get refamiliar with the code.

jknowles · 2022-05-26T21:02:23Z

I think I have fixed it with 3abd5fc

There was one place the sparse matrix wasn't made dense for an edge case with nested random effects. It was easy to adapt the code you used above.

The other issues you raised are important. mkNewReTrms is in this package solely to avoid CRAN issues with calling a non-exported function from lme4. If it were exported by lme4 the code here could/should be adapted to call it. I don't think I did anything special to it.

I like the idea of ensuring it is consistent with lme4 to avoid more moving parts than are here.

I do also think the merPredict function should be split into 3 sub functions that handle each part of the simulation independently (fixed, random, model variance) which would make it easier for someone to tinker with making one part more efficient, without getting lost in pages of nested for statements. If I get time to push for a 1.0 of the package, that would be a milestone that indicated 1.0 was reached.

bbolker mentioned this issue Aug 22, 2021

minimize use of ranef(); try to keep model matrices sparse #120

Merged

jknowles closed this as completed in #120 May 26, 2022

jknowles reopened this May 26, 2022

jknowles closed this as completed May 26, 2022

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

ranef in predictInterval extremely slow #119

ranef in predictInterval extremely slow #119

humana commented Aug 19, 2021

bbolker commented Aug 21, 2021

jknowles commented Sep 27, 2021 •

edited

jknowles commented May 26, 2022

bbolker commented May 26, 2022

jknowles commented May 26, 2022

jknowles commented May 26, 2022

ranef in predictInterval extremely slow #119

ranef in predictInterval extremely slow #119

Comments

humana commented Aug 19, 2021

bbolker commented Aug 21, 2021

jknowles commented Sep 27, 2021 • edited

jknowles commented May 26, 2022

bbolker commented May 26, 2022

jknowles commented May 26, 2022

jknowles commented May 26, 2022

jknowles commented Sep 27, 2021 •

edited