Add getting started doc page #11093

Draft: 3 commits to be merged into base branch main

phofl
Collaborator

@phofl phofl commented May 3, 2024

  • Closes #xxxx
  • Tests added / passed
  • Passes pre-commit run --all-files

@phofl phofl marked this pull request as draft May 3, 2024 13:59
@mrocklin
Member

mrocklin commented May 3, 2024

Are you familiar with https://docs.dask.org/en/stable/10-minutes-to-dask.html ?

@phofl
Collaborator Author

phofl commented May 3, 2024

> Are you familiar with https://docs.dask.org/en/stable/10-minutes-to-dask.html ?

Yes, but that was too much (and unnecessary) information for someone who wants to start playing around with Dask.
It's not really a quick start, more a resource for what comes after this section.

@mrocklin
Member

mrocklin commented May 3, 2024

> Yes, but that was too much (and unnecessary) information for someone who wants to start playing around with Dask.
> It's not really a quick start, more a resource for what comes after this section.

I don't disagree with your critique; however, I think that both pages are trying to serve the same role, and I'd rather that we have exactly one of them. I suggest that we find some way to improve the old page, or bring this one up to a state where it could comfortably replace it.

One thing I like about the 10-minutes page is that it's not dask-dataframe specific. This new attempt talks a lot about pandas and not at all about other APIs, which I would prefer to avoid.

@phofl
Collaborator Author

phofl commented May 3, 2024

This isn't ready yet and definitely needs more content, but serving 5 different use-cases at the same time is one of the issues with the other page. Someone who hasn't heard about Dask yet is best served with a clear-cut intro that holds their hand for the first few steps.

We probably want something similar on another page where we can send users who don't care about DataFrames.

I wouldn't want to remove the other page completely; it would basically offer information for the second 10 minutes. But again, the PR isn't ready yet.

@mrocklin
Member

mrocklin commented May 3, 2024

> but serving 5 different use-cases at the same time is one of the issues with the other page

I agree that this is a challenge.

> intro that holds their hand for the first few steps.

Not if those first few steps don't lead in the direction that they care about. I think that this is harmful in those cases. If I'm coming to Dask for general purpose parallel computing, and I see that the quickstart is all about pandas, then there's a non-trivial possibility that I move on.

I acknowledge that solving both of these problems simultaneously is hard.

> We probably want something similar on another page where we can send users who don't care about DataFrames

I'm not sure I understand what this would be. A second quickstart?

> the PR isn't ready yet

I'll hold off. Maybe I'm just communicating early that I'll be -1 on a quickstart that strongly emphasizes dataframes at the expense of other APIs.

@mrocklin
Member

mrocklin commented May 3, 2024

Just thinking out loud here, but another solution would be to have a dataframes-specific quickstart in the dataframes section. Then different external resources could link to that page, rather than to a single overall dask quickstart.

@jrbourbeau
Member

> another solution would be to have a dataframes-specific quickstart in the dataframes section

FWIW I was thinking something similar. Could imagine a Dask Array quickstart as well.

Contributor

github-actions bot commented May 3, 2024

Unit Test Results

See test report for an extended history of previous test failures. This is useful for diagnosing flaky tests.

     15 files  ±0       15 suites  ±0   3h 24m 19s ⏱️ - 3m 11s
 13 121 tests ±0   12 190 ✅ +1     931 💤 ±0  0 ❌  - 1 
162 468 runs  ±0  142 410 ✅  - 2  20 058 💤 +3  0 ❌  - 1 

Results for commit d9c6a9d. ± Comparison against base commit efb4a62.

3 participants