Background / motivation
In the beginning, sim analytics plots used matplotlib. It didn't save state at all. What you saw was the plots built from in-memory data.
Then as part of the matplotlib -> streamlit work #749, the plot data for each sim iteration got stored (as pickles).
It was a big upside: greatly helped UX to have persistent storage of plots.
But there was a downside: a single run can take 10 GB+ of disk space. That's acceptable for one run, but multiple runs can quickly chew up all available storage.
Example from a 2h run:
...
Towards a solution
The way that the plots are constructed, they only need the most recent state. The most recent state holds data from all past iterations. Therefore the sim doesn't need past states.
This means that when a new state file is stored, the previous state's pickle file (.pkl) can be deleted. (But be sure that the new file is stored first.)
TODOs / DoD
In sim_engine: once a new state is stored to disk, delete all previous states' pickle files (.pkl).
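A minimal sketch of the store-then-delete order described above, not the actual sim_engine code. The function name `save_state_and_prune`, the `state_{n}.pkl` file naming, and the directory layout are all assumptions made for illustration. The new file is written under a temporary name and renamed into place, and only then are the older .pkl files removed.

```python
import pickle
from pathlib import Path


def save_state_and_prune(state: object, state_dir: Path, iter_number: int) -> Path:
    """Store the new state, then delete every older .pkl file in state_dir.

    Hypothetical helper: the name, arguments, and file naming are assumptions.
    """
    state_dir.mkdir(parents=True, exist_ok=True)
    new_path = state_dir / f"state_{iter_number}.pkl"

    # Write under a temporary name first, then rename, so a crash mid-write
    # never leaves a partial file under the final name.
    tmp_path = state_dir / (new_path.name + ".tmp")
    with open(tmp_path, "wb") as f:
        pickle.dump(state, f)
    tmp_path.replace(new_path)

    # The new state is safely on disk; only now remove the older pickles.
    for old_path in state_dir.glob("*.pkl"):
        if old_path != new_path:
            old_path.unlink()

    return new_path
```

Called once per iteration, this keeps exactly one .pkl file in the directory, so disk use stays at the size of the latest state rather than growing with every iteration.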
trentmc changed the title from "[Sim, Analytics] Sim plots use massive storage; so only keep most recent state" to "[Sim plots] Sim plots use massive storage; so only keep most recent state" on May 11, 2024
@trentmc the previous states are used when browsing through with the slider, after finalising the state. I do agree: not all the pickles are needed. We have two options right now:
Map previous states in the slider based on the most recent state.
Remove the slider altogether. It was originally added in order to experiment with some streamlit view components, to see if they are instantaneous enough. Then it got ported into Dash, due to inertia + checking to see if components can be used similarly. But that timeline slider is virtually useless for time-based graphs, and I've seen that few people even know it exists, because they don't wait for the sim's final state.
What I will do:
See if I can easily keep previous state management based on the most recent state.
If it takes too much time and effort, I suggest we reconsider the slider and just remove it. There are plenty more cool things to do.
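A rough sketch of what the first option could look like, assuming the most recent state stores one value per iteration for each plotted series. The dict-of-lists layout, the series names, and the function name `state_at_iteration` are all hypothetical; the real state object may be shaped differently.

```python
from typing import Dict, List


def state_at_iteration(
    latest: Dict[str, List[float]], iter_number: int
) -> Dict[str, List[float]]:
    """Rebuild the state as it looked after iteration iter_number (0-based).

    Assumes each series in the latest state has one entry per iteration,
    so an earlier state is just a prefix of each series.
    """
    return {name: values[: iter_number + 1] for name, values in latest.items()}


# Usage: the slider position picks the iteration to display.
latest_state = {"profit": [1.0, 2.5, 2.0], "accuracy": [0.5, 0.6, 0.7]}
slider_view = state_at_iteration(latest_state, 1)
```

If the real state holds this kind of per-iteration history, the slider would only need the latest pickle; if it holds derived values that cannot be truncated, this approach does not apply.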