Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ranef in predictInterval extremely slow #119

Closed
humana opened this issue Aug 19, 2021 · 6 comments · Fixed by #120
Closed

ranef in predictInterval extremely slow #119

humana opened this issue Aug 19, 2021 · 6 comments · Fixed by #120

Comments

@humana
Copy link

humana commented Aug 19, 2021

Hi,

I am having an issue where the predictInterval is extremely slow, taking about 70 seconds to execute. After some profiling I've discovered that the ranef() calls take up 99% the compute time.

To test this, what I did was copy the predictInterval function and the helpers methods, precompute the randef, and pass them in to a modified version of predictInterval, e.g.

ranef_cached <- ranef(model)
ranef_condvar_cached <- ranef(model, condVar = TRUE)
myPredictInterval(model, X, ranef_cached, ranef_condvar_cached, n.sims=1000, which="full")

Then it basically runs under a second.

From what I can see in the code, there is really no reason to have to re-extract the random effects over and over (it's used in about 3 places) since the model doesn't change. But the actual problem is that the ranef() call is just extremely slow. I created a ticket on the lme4 project as well lme4/lme4#637. I'm using the latest version of lme4.

@bbolker
Copy link
Contributor

bbolker commented Aug 21, 2021

Commented in linked issue: there isn't anything to do about lme4:::ranef.merMod. There is some relatively low-hanging fruit in merTools; (1) don't call ranef() more than necessary. (2) merTools implements its own version of mkNewReTrms, which doesn't set condVar=FALSE when calling ranef() (!) (The default for ranef used to be condVar=FALSE; this changed in version 1.1-20 (2019-02-04).) (What does the merTools version of mkNewReTrms do differently? Could this functionality be incorporated upstream to avoid these kind of issues ... ? Or is it just a matter of exporting mkNewReTrm so it can be used by merTools ... ??

@jknowles
Copy link
Owner

jknowles commented Sep 27, 2021

Thank you both. I am not on GitHub nearly as much these days so I am just seeing this now. It is certainly true that the repeated ranef() calls are a bad idea! The function was such an improvement over the alternatives at the time that 70 seconds didn't seem extremely slow to me, so I never looked into it. So thank you @humana for narrowing it down and doing the work to profile the issue 🙏

This is really helpful and I see the PR from @bbolker

I unfortunately have limited time to maintain the package, but as I noted in #118 there is a need for a refactor of predictInterval() and now with what I know from this and what I know from support requests from the field I think it is possible to make the function modular and less impenetrable, which would make it easier for everyone to have a good sense of how it works moving forward.

@jknowles
Copy link
Owner

There are two minor test failures caused by #120 in test-predict.R that I am working on before I am able to publish this to CRAN. There are also some new warnings about matrix types that I need to review as well that appeared under R 4.2.0 on Windows.

@jknowles jknowles reopened this May 26, 2022
@bbolker
Copy link
Contributor

bbolker commented May 26, 2022

I can try to take a look if you want (hopefully the fixes are fairly simple once you dig in and look at the failures ... ??)

@jknowles
Copy link
Owner

I will update and tag you @bbolker if I can't get a line on it - I have a little gap in my schedule today to look at it and it's a good chance to get refamiliar with the code.

@jknowles
Copy link
Owner

I think I have fixed it with 3abd5fc

There was one place the sparse matrix wasn't made dense for an edge case with nested random effects. It was easy to adapt the code you used above.

The other issues you raised are important. mkNewReTrms is in this package solely to avoid CRAN issues with calling a non-exported function from lme4. If it were exported by lme4 the code here could/should be adapted to call it. I don't think I did anything special to it.

I like the idea of ensuring it is consistent with lme4 to avoid more moving parts than are here.

I do also think the merPredict function should be split into 3 sub functions that handle each part of the simulation independently (fixed, random, model variance) which would make it easier for someone to tinker with making one part more efficient, without getting lost in pages of nested for statements. If I get time to push for a 1.0 of the package, that would be a milestone that indicated 1.0 was reached.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants