Add getting started doc page #11093
Are you familiar with https://docs.dask.org/en/stable/10-minutes-to-dask.html ?
Yes, but that was too much (and unnecessary) information for someone who just wants to start playing around with Dask.
I don't disagree with your critique; however, I think that both pages are trying to serve the same role, and I'd rather that we have exactly one of them. I suggest that we find some way to improve the old page, or bring this one up to a state where it could comfortably replace it. One thing I like about the 10-minutes page is that it's not dask-dataframe specific. This new attempt talks a lot about pandas and not at all about other APIs, which I would prefer to avoid.
This isn't ready yet and definitely needs more content, but serving 5 different use-cases at the same time is one of the issues with the other page. Someone who hasn't heard about Dask yet is best served with a clear-cut intro that holds their hand for the first few steps. We probably want something similar on another page where we can send users who don't care about DataFrames. I wouldn't want to remove the other page completely, but rather have it offer information for the second 10 minutes, basically. But again, the PR isn't ready yet.
I agree that this is a challenge.
Not if those first few steps don't lead in the direction that they care about. I think that this is harmful in those cases. If I'm coming to Dask for general-purpose parallel computing, and I see that the quickstart is all about pandas, then there's a non-trivial possibility that I move on. I acknowledge that solving both of these problems simultaneously is hard.
I'm not sure I understand what this would be. A second quickstart?
I'll hold off. Maybe I'm just communicating early that I'll be -1 on a quickstart that strongly emphasizes dataframes at the expense of other APIs.
Just thinking out loud here, but another solution would be to have a dataframes-specific quickstart in the dataframes section. Then different external resources could link to that page, rather than to a single overall dask quickstart.
FWIW I was thinking something similar. Could imagine a Dask Array quickstart as well.
Unit Test Results
See test report for an extended history of previous test failures. This is useful for diagnosing flaky tests.
15 files ±0, 15 suites ±0, 3h 24m 19s ⏱️ -3m 11s
Results for commit d9c6a9d. ± Comparison against base commit efb4a62.
pre-commit run --all-files