Part 3: Clustering #14

emilygrabowski · 2021-01-07T22:14:23Z

I've noticed that far fewer algorithms are introduced in the classification and clustering sections than in the regression section. For consistency, I would suggest including more example algorithms in the first and third notebooks (even if they are only there as a reference for later).
Dataset: I would encourage using an imported dataset (rather than made-up dots or random shapes), and using the same dataset for both types of clustering. The dataset used for classification would work well here (although for clustering, sticking to 2-D makes a lot of sense!). Using an imported dataset would also allow us to emphasize how the user has to interpret the results of a clustering algorithm.
I would also suggest ordering the notebooks so that classification and clustering are next to each other, since together they illustrate the difference between supervised and unsupervised learning.
For both sections, I would include information on evaluating cluster fit with metrics, and on how those metrics can inform parameter selection.
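As a rough illustration of the kind of thing I mean, here is a minimal sketch. It assumes scikit-learn, its bundled iris dataset restricted to two features, k-means, and the silhouette score as the fit metric; none of these are necessarily what the notebooks use or should use.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import load_iris
from sklearn.metrics import silhouette_score

# Assumption: iris stands in for whatever imported dataset the notebooks adopt.
# Keep only petal length and petal width so the data stays 2-D.
X = load_iris().data[:, 2:4]

# Fit k-means for several candidate cluster counts and score each fit.
scores = {}
for k in range(2, 7):
    labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(X)
    # Silhouette score lies in [-1, 1]; higher means tighter, better separated clusters.
    scores[k] = silhouette_score(X, labels)

# The candidate with the highest score is one reasonable choice for n_clusters.
best_k = max(scores, key=scores.get)

for k, score in scores.items():
    print(f"k={k}: silhouette={score:.3f}")
print("best k by silhouette:", best_k)
```

The metric only suggests a value; the user would still need to check whether the resulting clusters make sense for the data, which ties back to the interpretation point above.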
For the challenge section, I think that solutions are already included in this notebook.
