K_Medoid_Clustering

In this project I revisit clustering, one of my favourite analytic methods, to explore and analyse a real-world dataset that includes a mix of categorical and numerical features. This required a different approach from the classical K-means algorithm, which cannot be applied directly to categorical data.

Instead, I used the K-medoids algorithm, commonly implemented as PAM (Partitioning Around Medoids), which has the advantage of working with dissimilarity measures other than purely numerical distances and therefore lends itself well to analysing mixed-type data.
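To illustrate why K-medoids suits mixed-type data, here is a minimal sketch in plain Python. It assumes a Gower-style dissimilarity (range-scaled difference for numeric columns, simple mismatch for categorical ones); this README does not state which measure the analysis used. The toy records, the function names and the naive "first k rows" initialisation are all hypothetical, and the loop is the simple alternating variant of K-medoids rather than the full PAM build-and-swap procedure.

```python
from itertools import combinations


def gower_distance(a, b, numeric_ranges):
    """Gower-style dissimilarity between two records of equal length.

    numeric_ranges[i] is the range (max - min) of column i if it is numeric,
    or None if column i is categorical.
    """
    total = 0.0
    for x, y, rng in zip(a, b, numeric_ranges):
        if rng is None:
            total += 0.0 if x == y else 1.0   # categorical: simple mismatch
        elif rng > 0:
            total += abs(x - y) / rng         # numeric: range-scaled difference
    return total / len(a)


def distance_matrix(rows, numeric_ranges):
    """Symmetric pairwise dissimilarity matrix as a list of lists."""
    n = len(rows)
    dist = [[0.0] * n for _ in range(n)]
    for i, j in combinations(range(n), 2):
        dist[i][j] = dist[j][i] = gower_distance(rows[i], rows[j], numeric_ranges)
    return dist


def assign(dist, medoids):
    """Label every point with the position of its nearest medoid."""
    return [min(range(len(medoids)), key=lambda c: dist[i][medoids[c]])
            for i in range(len(dist))]


def k_medoids(dist, k, max_iter=100):
    """Alternating K-medoids on a precomputed dissimilarity matrix."""
    medoids = list(range(k))  # naive initialisation (assumption): first k rows
    for _ in range(max_iter):
        labels = assign(dist, medoids)
        new_medoids = []
        for c in range(k):
            members = [i for i, lab in enumerate(labels) if lab == c]
            if not members:                   # keep the old medoid if a cluster empties
                new_medoids.append(medoids[c])
                continue
            # The medoid is the member with the smallest total distance to the rest.
            new_medoids.append(min(members, key=lambda m: sum(dist[m][j] for j in members)))
        if new_medoids == medoids:
            break
        medoids = new_medoids
    return medoids, assign(dist, medoids)


# Hypothetical toy records: (age, plan, channel)
rows = [
    (22, "basic", "web"),
    (25, "basic", "web"),
    (24, "basic", "app"),
    (58, "premium", "branch"),
    (61, "premium", "branch"),
    (63, "premium", "phone"),
]
ages = [r[0] for r in rows]
numeric_ranges = [max(ages) - min(ages), None, None]

dist = distance_matrix(rows, numeric_ranges)
medoids, labels = k_medoids(dist, k=2)
print("medoid rows:", [rows[m] for m in medoids])
print("labels:", labels)
```

Because each cluster centre is an actual record rather than an average, the medoids can be read directly as "typical" members of each segment, which is not possible with K-means centroids on categorical columns.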

The silhouette coefficient helped to establish the optimal number of clusters, whilst t-SNE (t-distributed stochastic neighbour embedding), a dimensionality reduction technique akin to Principal Component Analysis and UMAP, revealed good separation between clusters as well as closeness of elements within clusters, confirming the relevance of the segmentation.
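As a rough illustration of how the silhouette coefficient can guide the choice of the number of clusters, the sketch below computes the mean silhouette width from a precomputed distance matrix and compares a few candidate partitions. The one-dimensional toy points and the candidate labelings are hypothetical and are not taken from the project's data; in practice the labelings would come from running the clustering once per candidate k.

```python
def mean_silhouette(dist, labels):
    """Mean silhouette width for a labeling, given a pairwise distance matrix.

    Requires at least two clusters. Singleton clusters score 0 by convention.
    """
    clusters = {}
    for i, lab in enumerate(labels):
        clusters.setdefault(lab, []).append(i)

    scores = []
    for i, lab in enumerate(labels):
        own = clusters[lab]
        if len(own) == 1:
            scores.append(0.0)
            continue
        # a: mean distance to the other members of the point's own cluster
        a = sum(dist[i][j] for j in own if j != i) / (len(own) - 1)
        # b: mean distance to the members of the nearest other cluster
        b = min(
            sum(dist[i][j] for j in members) / len(members)
            for other, members in clusters.items()
            if other != lab
        )
        denom = max(a, b)
        scores.append((b - a) / denom if denom > 0 else 0.0)
    return sum(scores) / len(scores)


# Hypothetical toy data: three visibly separate groups on a line.
points = [1.0, 1.2, 0.9, 5.0, 5.1, 4.9, 9.0, 9.2, 8.8]
dist = [[abs(p - q) for q in points] for p in points]

# Hypothetical candidate partitions for k = 2, 3 and 4.
candidates = {
    2: [0, 0, 0, 1, 1, 1, 1, 1, 1],
    3: [0, 0, 0, 1, 1, 1, 2, 2, 2],
    4: [0, 0, 0, 1, 1, 1, 2, 2, 3],
}

scores = {k: mean_silhouette(dist, labels) for k, labels in candidates.items()}
best_k = max(scores, key=scores.get)
for k in sorted(scores):
    print(f"k={k}: mean silhouette = {scores[k]:.3f}")
print("best k:", best_k)
```

Values close to 1 indicate points that sit much nearer to their own cluster than to any other, so the candidate k with the highest mean silhouette is a reasonable default choice, ideally cross-checked against a visual projection such as t-SNE.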

Finally, I condensed the insights generated from the analysis into a number of actionable, data-driven recommendations that, applied correctly, could help improve product sign-ups.

DiegoUsaiUK/K_Medoid_Clustering
