Part 3: Clustering #14

Open
emilygrabowski opened this issue Jan 7, 2021 · 0 comments

  1. The classification and clustering sections introduce far fewer algorithms than the regression section. For consistency, I would suggest including more example algorithms in the first and third notebooks, even if they are only there as a reference for later.
  2. Dataset: I would encourage using an imported dataset rather than made-up dots or random shapes, and using the same dataset for both types of clustering. The dataset from the classification notebook would work well here, although sticking to 2-D for clustering makes a lot of sense. Using a real dataset would also let us emphasize that the user has to interpret the results of a clustering algorithm.
  3. I would also suggest ordering the notebooks so that classification and clustering are next to each other, since together they illustrate the difference between supervised and unsupervised learning.
  4. For both sections, I would include information on evaluating cluster fit with metrics, and on how those metrics can inform parameter selection.
  5. For the challenge section, I think the solutions are already included in this notebook.
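
To make points 2 and 4 a bit more concrete, here is a rough sketch of what the notebook could do. It is only an illustration under my own assumptions: it uses scikit-learn, the iris dataset as a stand-in for whatever dataset the classification notebook ends up with, k-means as the clustering algorithm, and the silhouette score as the fit metric. None of these choices come from the current notebooks.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import load_iris
from sklearn.metrics import silhouette_score

# Assumption: iris is a placeholder for the dataset shared with the
# classification notebook. Keep only two features so the data stays 2-D.
X = load_iris().data[:, 2:4]  # petal length and petal width

# Fit k-means for several candidate cluster counts and record the
# silhouette score of each fit (ranges from -1 to 1, higher is better).
scores = {}
for k in range(2, 7):
    labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(X)
    scores[k] = silhouette_score(X, labels)

# Choose the cluster count with the highest silhouette score.
best_k = max(scores, key=scores.get)

for k, score in scores.items():
    print(f"k={k}: silhouette={score:.3f}")
print("best k by silhouette:", best_k)
```

The same loop works for other parameters or other clustering algorithms, and the metric's pick can then be compared against the known species labels to show why the user still has to interpret the result.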