Add content on Dask task graph and debugging #41

Open
DamienIrving opened this issue May 19, 2021 · 1 comment

@DamienIrving
Collaborator

At my 2021 Dask Summit presentation on teaching Dask to atmosphere and ocean scientists, it was suggested that the lesson could cover the Dask task graph, along with debugging and best practices for finding pain points.

It was suggested that this PyData talk might be useful:
https://www.youtube.com/watch?v=JoK8V2eWFPE
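
As a starting point for the task graph content, here is a minimal sketch of inspecting a graph before computing anything. The array shape, chunking, and the `anomaly` calculation are made-up stand-ins for a real dataset, not taken from the lesson or the talk.

```python
import dask.array as da

# Hypothetical stand-in for a gridded dataset: four chunks of 250 x 1000 values.
x = da.ones((1000, 1000), chunks=(250, 1000))

# Building the expression is lazy: no values are computed here,
# Dask only records the tasks it would need to run.
anomaly = x - x.mean(axis=0)

# Every Dask collection exposes its task graph.
graph = anomaly.__dask_graph__()
n_tasks = len(graph)              # total number of tasks in the graph
layer_names = list(graph.layers)  # one layer per high-level operation

print(f"{n_tasks} tasks in {len(layer_names)} layers")
for name in layer_names:
    print("  ", name)
```

If the optional graphviz dependency is installed, `anomaly.visualize(filename="graph.png")` draws the same graph as an image, which is usually easier to read than the task count for small examples.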

@DamienIrving
Collaborator Author

On the debugging side of things, it would be worth adding the progress bar to the lesson:

import dask.diagnostics
dask.diagnostics.ProgressBar().register()

In order to do this we'd need to explain the difference between a local (or single-machine; default) scheduler and a distributed scheduler, because the tools you use for profiling are different for each. I think this distinction is well worth explaining.
https://docs.dask.org/en/stable/diagnostics-local.html
https://docs.dask.org/en/stable/scheduling.html
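
To make the scheduler distinction concrete, here is a rough sketch of running the same computation on two of the local schedulers. The array is a made-up example. The sketch assumes no `dask.distributed` client is active, since the `dask.diagnostics` tools only apply to the local schedulers.

```python
import dask
import dask.array as da
from dask.diagnostics import ProgressBar

# Hypothetical workload: sum the integers 0..999999 in ten chunks.
x = da.arange(1_000_000, chunks=100_000)
total = x.sum()

# Local threaded scheduler (the default for Dask arrays), with a
# progress bar shown only for the computation inside the block.
with ProgressBar():
    threaded = total.compute(scheduler="threads")

# Local synchronous scheduler: everything runs in the calling thread,
# which makes tracebacks and debuggers such as pdb easier to use.
with dask.config.set(scheduler="synchronous"):
    serial = total.compute()

print(threaded, serial)
```

Using `ProgressBar()` as a context manager limits it to one block of code, whereas `.register()` turns it on for every later computation in the session. With a distributed scheduler, progress is typically followed through the dashboard or `dask.distributed.progress` instead.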

This script also shows how to use the resource profiler:
https://github.com/climate-resilient-enterprise/workflows/blob/master/cmdline_programs/return_period.py
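
The following is not taken from the linked script. It is a minimal sketch of how the local-scheduler profilers in `dask.diagnostics` can be combined, assuming `psutil` is installed (`ResourceProfiler` needs it). The helper name `profile_compute` and the example array are hypothetical.

```python
import dask.array as da
from dask.diagnostics import Profiler, ResourceProfiler


def profile_compute(collection, dt=0.25):
    """Compute a Dask collection on the local threaded scheduler while profiling it."""
    # Profiler records one entry per executed task; ResourceProfiler samples
    # CPU and memory use roughly every `dt` seconds from a background process.
    with Profiler() as prof, ResourceProfiler(dt=dt) as rprof:
        result = collection.compute(scheduler="threads")
    return result, prof.results, rprof.results


if __name__ == "__main__":
    # Hypothetical workload standing in for a real analysis.
    x = da.ones((2000, 2000), chunks=(500, 500))
    result, tasks, samples = profile_compute((x * 2).sum())
    print(f"result={result}, {len(tasks)} tasks, {len(samples)} resource samples")
```

The `if __name__ == "__main__":` guard matters in a script because the resource profiler starts a separate process. A very short computation may finish before many resource samples are collected, so a smaller `dt` can help when experimenting.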

@DamienIrving DamienIrving self-assigned this Jul 13, 2022