Rather than provisioning larger machines, would it be possible to aggregate already-provisioned resources/Workers so they act as a single, larger virtual Worker? High-resource tasks could then be allocated to this virtual Worker. The idea is to save costs on cloud resources while increasing the efficiency of the compute cluster.
The motivation behind this post comes from working with large amounts of Earth observation (EO) data, specifically writing large rasters to remote cloud storage. Having more, smaller Workers lets developers process more tasks at a given time, but some parts of the workflow (such as writing large rasters) demand much more RAM and often kill Workers.
I could imagine the API looking something like this:
```python
agg_worker_client = dask.aggregate_workers(N)  # Groups N Workers into a single logical compute unit
agg_worker_client.run(some_delayed_function, func_arg1, func_arg2, ...)  # Submit tasks to the aggregated Worker
agg_worker_client.release()  # Releases the group back to separate Workers
```
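To make the proposal concrete, here is a toy model of the semantics in plain Python. Everything in it (the `AggregatedWorker` class, the `memory` bookkeeping, the `run`/`release` methods) is hypothetical and is not real Dask API; it only illustrates how a group of small Workers could advertise pooled resources as one virtual Worker.

```python
class AggregatedWorker:
    """Toy model of the proposed API: pool N Workers' resources
    into one logical Worker. Hypothetical, not real Dask code."""

    def __init__(self, workers):
        # Each worker is modelled as a dict with a "memory" field (GB).
        self.workers = workers

    @property
    def memory(self):
        # The virtual Worker advertises the combined memory of its members,
        # so the scheduler could route a high-RAM task to it.
        return sum(w["memory"] for w in self.workers)

    def run(self, func, *args):
        # A high-memory task runs as a single unit against pooled resources.
        return func(*args)

    def release(self):
        # Hand the members back to the cluster as individual Workers.
        members, self.workers = self.workers, []
        return members


# Usage: three 4 GB Workers act as one 12 GB virtual Worker.
agg = AggregatedWorker([{"memory": 4}, {"memory": 4}, {"memory": 4}])
print(agg.memory)               # 12
print(agg.run(sum, [1, 2, 3]))  # 6
print(len(agg.release()))       # 3
```

A related workaround that exists today is Dask's worker resources (e.g. starting some workers with `--resources "MEM=1"` and annotating the raster-writing tasks to require that resource), which routes heavy tasks to bigger Workers but does not pool several small ones.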
Thoughts?