ExtModel reads some forcings multiple times #354
Comments
A comment was made that the bc file is probably read only once and used multiple times. However, if I run this script on an ext file with one forcing block (e.g. only Waal), it takes 4.8 seconds, while it takes 13-15 seconds with three blocks (Waal/Lek/Maas). I therefore expect that the file is in fact read multiple times.
I verified that the file is indeed loaded three separate times.
Edge cases for caching --> problem with bigger files:
To do:
Other possibility when the issue becomes too complex:
Prototype
I made a simple prototype which caches the data from the file in a dictionary keyed on the filepath.
Object to hold the checksum and data and to verify if the checksum has changed.
Base singleton class, to make the cacher a singleton.
Cache class.
Class to calculate the checksum.
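The pieces listed above (a singleton base, a cache class, an object holding checksum and data, and a checksum calculation) could be sketched roughly as follows. This is not the code from the prototype: the names `Singleton`, `CachedFile`, `FileLoadCache` and `file_checksum`, the `parse` callback and the choice of SHA-256 are all assumptions made for illustration.

```python
import hashlib
import threading
from dataclasses import dataclass
from pathlib import Path
from typing import Any, Callable, Dict


class Singleton(type):
    """Metaclass (hypothetical) that hands out one shared instance per class."""

    _instances: Dict[type, Any] = {}
    _lock = threading.Lock()

    def __call__(cls, *args, **kwargs):
        with cls._lock:
            if cls not in cls._instances:
                cls._instances[cls] = super().__call__(*args, **kwargs)
        return cls._instances[cls]


def file_checksum(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


@dataclass
class CachedFile:
    """Parsed data plus the checksum of the file it was parsed from."""

    checksum: str
    data: Any


class FileLoadCache(metaclass=Singleton):
    """Cache of parsed file contents, keyed on the resolved file path."""

    def __init__(self) -> None:
        self._entries: Dict[Path, CachedFile] = {}

    def load(self, path: Path, parse: Callable[[Path], Any]) -> Any:
        key = Path(path).resolve()
        checksum = file_checksum(key)
        entry = self._entries.get(key)
        if entry is None or entry.checksum != checksum:
            # First request, or the file changed on disk: parse again.
            entry = CachedFile(checksum=checksum, data=parse(key))
            self._entries[key] = entry
        return entry.data

    def clear(self) -> None:
        """Drop all cached entries (assumed helper, e.g. for tests)."""
        self._entries.clear()
```

In this sketch the checksum is recomputed on every request, so the raw bytes are still read each time; only the parsing step is skipped when the file is unchanged. The cached data also stays in memory for as long as the cache lives.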
The ParsableFileModel has its load updated like this.
External packages which can be used
Caching
Python packages which can be used for caching are described below.
cachetools
diskcache
Tools for profiling with Caching
memory_profiler
timeit
cProfile & profile
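As a rough illustration of how the standard-library tools in this list can be applied, the sketch below times a single run with timeit and collects a function-level breakdown with cProfile. The function `load_model` is a hypothetical stand-in for the actual model load, not part of this project; memory_profiler is left out because it is a third-party package.

```python
import cProfile
import io
import pstats
import timeit


def load_model() -> int:
    """Hypothetical stand-in for the expensive model load being measured."""
    return sum(i * i for i in range(200_000))


# Wall-clock time of a single run, as reported by timeit.
elapsed = timeit.timeit(load_model, number=1)
print(f"Execution time: {elapsed} seconds")

# Function-level breakdown with cProfile, sorted by cumulative time.
profiler = cProfile.Profile()
profiler.enable()
load_model()
profiler.disable()

report = io.StringIO()
pstats.Stats(profiler, stream=report).sort_stats("cumulative").print_stats(10)
print(report.getvalue())
```

Running the same measurement with and without caching on the same input files keeps the two timings comparable.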
Profiling
Profiling without prototype
memory_profiler
timeit
Execution time: 160.72932409999885 seconds
Profiling with prototype
memory_profiler
timeit
Execution time: 153.78531450000082 seconds
Conclusion Profiling
The above profilings will be compared and a conclusion will be written down based on the profiles if possible.
memory_profiler
The caching prototype uses more memory, since the data from the files is cached.
timeit
Approx. 7 seconds gained for approx. 1.5 GB of extra memory usage in this specific case.
Other notable findings
In the prototype I implemented the caching on the
The
Updated the Caching in filemodel in the following way:
timeit gave the following runtime:
making a time difference without this change of approx. 25 seconds.
Profiling with caching
profiling without caching
Noteworthy testing outcome
I tried loading an ext file which tries to load the same file 11 times. With the caching disabled, the loading takes approx. 33 seconds and uses ~660 MB of memory.
I tried testing with a few self-made bc files of sizes from 300 MB to 1.4 GB.
Update on testing: after implementation I tested again on the PR and on the master.
Main:
PR:
Differences Main and PR:
The files used for this testing might not be the same as in the previous tests.
Test files:
Points for discussion in the next refinement
Requirements
Discussed with @veenstrajelmer: the following requirements apply to this issue and should be tackled. Other points can be thought about and implemented in a follow-up issue when needed.
Test case to take into account:
Describe the bug
When reading in an ExtModel, I see some duplicate forcings.
The ext file contains:
The ExtModel has five 'forcings', but the first three bc models each contain three forcing objects, which are the same for all three bc models:
I guess this happens because the bc file contains data for three rivers, which are coupled to one pli in three blocks. However, I would say this is not expected behaviour.
To Reproduce
The dependency dfm_tools is optional but it helps with plotting.
Expected behavior
I would expect the forcing data to be present once instead of, in this case, three times. I think that in the kernel the pli name decides which forcing block is read from the ext file.
Additional information
Version info (please complete the following information):