Memory issue from mable and possibly model definition #369

Open
FinYang opened this issue Oct 14, 2022 · 2 comments
Comments

@FinYang
FinYang commented Oct 14, 2022

This is more of a question than an issue if this behaviour is expected:

library(fable)
#> Loading required package: fabletools
library(tsibble)
#> 
#> Attaching package: 'tsibble'
#> The following objects are masked from 'package:base':
#> 
#>     intersect, setdiff, union
mdl <- model(tourism, SNAIVE(Trips))
lobstr::obj_size(tourism)
#> 1.11 MB
lobstr::obj_size(mdl)
#> 3.65 MB

Created on 2022-10-14 with reprex v2.0.2
The mable is more than three times the size of the data itself, even though I'm only using SNAIVE. Isn't this a little big?
The model definitions themselves are quite big, so maybe this is normal:

lobstr::obj_size(fable::SNAIVE())
#> 1.15 MB
lobstr::obj_size(fable::ARIMA())
#> 1.57 MB
lobstr::obj_size(fable::ETS())
#> 1.25 MB

Created on 2022-10-14 with reprex v2.0.2

But the real issue is that the size scales substantially with the data:

> mdl <- model(df, SNAIVE(value))
> lobstr::obj_size(df)
68.25 MB
> lobstr::obj_size(mdl)
147.51 MB

It looks like the model definition is stored separately for every key group?

> obj_size(mdl$`SNAIVE(value)`[[1]])
1.82 MB
> obj_size(mdl$`SNAIVE(value)`[[1]]$model)
1.81 MB
> mdl$`SNAIVE(value)`[[1]]$model
<RW model definition>

With my much larger dataset, RAM usage quickly explodes. I tried to find out why, but the model definition is an R6 object with many environments linking to each other, and I couldn't dig deep enough to see what the model stores that makes it so large. All I can tell is that the model definition looks like a collection of harmless functions. Is there any way to avoid this?

@mitchelloharawild
Member

The objects might appear large in lobstr since they contain references to functions in package environments (for example, the model training function). Memory efficiency is, however, a common issue and something that needs to be investigated closely.
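
To illustrate how measuring objects one at a time can overstate memory use, here is a minimal sketch using plain closures rather than fable objects. The names make_fns and big are made up for the example. When two objects reference the same environment, measuring each separately counts that environment twice, while passing both to a single lobstr::obj_size() call counts it once.

```r
library(lobstr)

# Hypothetical helper: two closures that capture the same enclosing
# environment, which holds one large vector.
make_fns <- function() {
  big <- runif(1e5)  # roughly 800 kB of doubles
  list(
    f = function() mean(big),
    g = function() sum(big)
  )
}
fns <- make_fns()

# Measured separately, each closure's size includes the captured vector.
size_f <- obj_size(fns$f)
size_g <- obj_size(fns$g)

# Measured together, the shared environment is only counted once.
size_both <- obj_size(fns$f, fns$g)

print(size_f)
print(size_g)
print(size_both)
```

Applied to a mable, the same comparison (one fitted model versus two fitted models passed together to obj_size()) would indicate whether the large per-model figure is mostly shared or genuinely duplicated for each key.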

@mitchelloharawild
Member

Could you check some sizes of the model internals for your df and mdl example? I'm curious about the size of the model and transformation relative to the fit.
fit should contain things specific to each model, while model and transformation contain environments which may appear large.
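
A rough sketch of how the requested breakdown could be produced, using the tourism example from the original post since df is not available here. It assumes a fitted model in a mable column is a list-like object whose top-level components include fit, model and transformation as described above; the code simply measures whatever components are present rather than relying on specific names.

```r
library(fable)
library(tsibble)
library(lobstr)

mdl <- model(tourism, SNAIVE(Trips))

# First fitted model in the mable column, accessed as in the original post.
m1 <- mdl$`SNAIVE(Trips)`[[1]]

# Size in bytes of each top-level component, measured separately.
# Components that share environments will each include the shared part.
sizes <- vapply(
  unclass(m1),
  function(x) as.numeric(obj_size(x)),
  numeric(1)
)
print(sizes)

# Size of the single model versus the whole mable, for comparison.
print(obj_size(m1))
print(obj_size(mdl))
```

Because each component is measured on its own, the individual figures may add up to more than obj_size(m1) if the components reference the same environments.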
