rio.to_raster() and unmanaged memory problems with dask #677
Unanswered
martin-git asked this question in Q&A
Replies: 1 comment
-
I don't know how the data are stored, but have you tried a different chunk size along the band dimension? The current chunking might be triggering a rechunking operation. I usually use a chunk size of 1 along the band dimension.
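For illustration, a chunk size of 1 along the band dimension could be requested when opening the file, roughly as in the sketch below. The helper names `band_chunks` and `open_lazily` and the tile size of 2048 are assumptions made for this example, not something taken from the original post; `rioxarray.open_rasterio` does accept a `chunks` dictionary keyed by the `band`, `y` and `x` dimensions.

```python
def band_chunks(tile=2048):
    """Chunk spec with one band per chunk and square spatial tiles.

    The tile size is an illustrative default, not a recommendation.
    """
    return {"band": 1, "y": tile, "x": tile}


def open_lazily(path, tile=2048):
    """Open a raster as a dask-backed DataArray using the chunk spec above."""
    # Imported here so band_chunks() can be used without rioxarray installed.
    import rioxarray

    return rioxarray.open_rasterio(path, chunks=band_chunks(tile))
```

Whether this helps probably depends on how the GeoTIFF is tiled on disk, so it may be worth trying a few spatial tile sizes.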
-
I am trying to do simple processing on very large GeoTIFFs (32 GB+) using dask.distributed multiprocessing, but I am running into serious problems with unmanaged memory filling up RAM.
What I notice is that, no matter how I set it up (number of workers, number of threads, memory limits, chunk sizes, ...), each worker starts to fill up 60% of the available RAM on the machine with unmanaged memory. Setting a memory_limit below 32 GB only leads to memory warnings and crashes. Decreasing the memory target fraction for dask has no effect either.
So far the only working approach I have found is to set n_workers=1, so that the 32 GB of unmanaged memory fits into RAM.
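As a rough sketch of that single-worker workaround (not the original code from this post), the cluster could be set up as follows. The helper names `single_worker_kwargs` and `start_client`, the thread count, and the default memory limit are assumptions chosen for illustration; `n_workers`, `threads_per_worker` and `memory_limit` are regular `LocalCluster` keyword arguments.

```python
def single_worker_kwargs(memory_limit="32GB", threads=4):
    """Keyword arguments for a one-process LocalCluster (illustrative values)."""
    return {
        "n_workers": 1,                 # a single worker process
        "threads_per_worker": threads,  # parallelism comes from threads only
        "memory_limit": memory_limit,   # per-worker limit
    }


def start_client(**overrides):
    """Start a local cluster with the defaults above, allowing overrides."""
    # Imported here so the kwargs helper works without dask.distributed installed.
    from dask.distributed import Client, LocalCluster

    cluster = LocalCluster(**{**single_worker_kwargs(), **overrides})
    return Client(cluster)
```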
The issue sounds a lot like what was discussed here:
https://github.com/dask/distributed/issues/6232
Here is a small snippet of the code:
Typical situation during processing:
Is this a known issue, or am I overlooking something?