New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Refactor predictInterval #106
Comments
Hi @jknowles, I have my own version of predictInterval that I butchered so I could use it for online predictions behind an API, so I needed it to be very fast < 1 second for each call. Would the above predict_setup() be doing all the calls to ranef() and storing those results, so the actual predictInterval never has to call those again? That's where I found most/all of the time is being spent. E.g. I. have something like this in my "predict_setup()":
And I pass that cache_obj back into predictInterval and wherever it was calling randef() I just made it reference the cached version. Could I make one additional suggestion: To allow the "levels" parameter to be a vector, so you can have different intervals for each row. This would also make it work the same as predict.lm and other predict functions in R, where the level can be given as a vector parameter. I have done this to my own version of predictInterval, it requires about 10 lines of code. So basically predictInterval(model, newdata, level = c(0.8, 0.9, 0.7)) or more likely to be used as.. predictInterval(model, newdata, level = newdata$level) E.g.:
|
This is great - caching the random effects is definitely what I had in mind. Depending on the size of the model we may even be able to get more speed caching the simulated matrix instead of generating it each times in I like the idea of making the intervals a vector. The original use case for Thanks for the suggestions! |
The time has come to refactor
predictInterval
. It always should have been multiple functions since each piece of the prediction interval can be sampled independently. To help make future maintenance easier and unit testing more straightforward, the function should be broken up into the following components:predict_setup()
- does the work of getting the merMod pieces needed to make intervalssample_re()
- creates the random component of the prediction intervalssample_fe()
- creates the fixed component of the prediction intervalssample_model()
- gets the model fit error (if user requests) component of the prediction intervalsThen
predictInterval
is simply a function that calls these functions and then combines the results into different configurations depending on what the user requests.I don't think the sub-functions should be exported,
predictInterval()
should always be the function the user interacts with even when they only request part of the prediction interval. But, it will make unit testing easier moving forward and should make it easier to adapt to changes inlme4
if they come up or to incorporate new model types like in issue #101What say you @carlbfrederick - does this sound like a good plan? Any feeling on names/conventions/structure? This is mostly off the top of my head.
The text was updated successfully, but these errors were encountered: