howardyclo/kmeans-dbscan-tutorial

kmeans-dbscan-tutorial

A clustering tutorial with scikit-learn for beginners.

Contents

  1. Introduction to k-means, k-means++ and DBSCAN (Density-Based Spatial Clustering of Applications with Noise).

  2. Explore common drawbacks of k-means and present solutions for each:

  • It needs the right number of clusters to be chosen in advance.
  • It cannot handle noisy data and outliers.
  • It cannot handle non-spherical data.

  3. Introduction to supervised (homogeneity, completeness) and unsupervised (Silhouette Coefficient) methods for measuring cluster quality (part of section 2).

  4. Two simple exercises (k-means & DBSCAN) along with the tutorial.
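
As a quick taste of the topics listed above, here is a minimal sketch with scikit-learn that runs k-means (with k-means++ seeding) and DBSCAN on non-spherical data and scores both with homogeneity, completeness and the Silhouette Coefficient. It is not taken from the tutorial notebooks: the `make_moons` toy dataset, the `eps` and `min_samples` values, and the helper name `evaluate` are all assumptions chosen for illustration.

```python
from sklearn.cluster import DBSCAN, KMeans
from sklearn.datasets import make_moons
from sklearn.metrics import completeness_score, homogeneity_score, silhouette_score

# Toy data (an assumption for illustration): two interleaving half-moons.
X, y_true = make_moons(n_samples=300, noise=0.05, random_state=0)

# k-means with k-means++ seeding; the cluster count must be supplied up front.
kmeans_labels = KMeans(
    n_clusters=2, init="k-means++", n_init=10, random_state=0
).fit_predict(X)

# DBSCAN groups points by density; eps and min_samples are hand-picked here.
# Points it cannot assign to any cluster receive the label -1 (noise).
dbscan_labels = DBSCAN(eps=0.2, min_samples=5).fit_predict(X)


def evaluate(labels):
    """Return (homogeneity, completeness, silhouette) for one labelling."""
    clustered = labels != -1  # leave DBSCAN noise points out of the silhouette
    n_clusters = len(set(labels[clustered]))
    # The Silhouette Coefficient is only defined for two or more clusters.
    if n_clusters >= 2:
        sil = silhouette_score(X[clustered], labels[clustered])
    else:
        sil = float("nan")
    # Homogeneity and completeness are supervised: they need the true labels.
    return (
        homogeneity_score(y_true, labels),
        completeness_score(y_true, labels),
        sil,
    )


for name, labels in [("k-means", kmeans_labels), ("DBSCAN", dbscan_labels)]:
    h, c, s = evaluate(labels)
    print(f"{name:8s} homogeneity={h:.2f} completeness={c:.2f} silhouette={s:.2f}")
```

On data like this, DBSCAN would be expected to follow the two moons while k-means cuts across them with a straight boundary. The Silhouette Coefficient can still favour the k-means result, because it rewards compact, convex clusters, which is one reason to look at more than one quality measure.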

Get Started

  • Please refer to the slides in slides/ or view them on Google Drive; both Chinese and English versions are available.
  • Code is in tutorial_and_labs/; each .ipynb has a corresponding .html.
