Year 1 Data Science (HVE) course assignment (2022): cluster the data, make a dashboard with some exploratory plots

juliazubko/clustering_dashplotly

clustering_dashplotly

Clusters the data and makes a dashboard with some basic plots

[UMAP, DBSCAN, Agglomerative, Dash, Plotly]

(Year 1 Data Science(HVE) course assignment 2022)

(Refactoring in progress: updating outdated logic for efficiency, breaking up a god object, replacing inefficient loops in data processing, improving modularity, etc.)

main

  • Takes in pre-processed data (no NaNs, encoded);

  • Scales the data;

  • Makes 2D UMAP embedding;

  • Performs DBSCAN and AgglomerativeClusterer hyperparameter tuning (for-loops);

  • Runs DBSCAN and AgglomerativeClusterer on the data, appends obtained cluster labels to the original dataframe;

  • Plots the results (basic Dash Plotly dashboard)

    • 3 exploratory scatterplots (UMAP data embedding, DBSCAN clustering results on the embedding, Agglomerative clustering results on the embedding) with some clustering evaluation metrics displayed (Silhouette, Davies-Bouldin, Calinski-Harabasz)
    • 2 callback-driven plots: a bar chart and a donut chart (each displays the feature distribution for the chosen algorithm and the chosen cluster)
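
The clustering steps listed above can be sketched roughly as follows. This is a minimal illustration, not the code in this repository: the function names (`tune_dbscan`, `tune_agglomerative`, `evaluate`, `cluster_dataframe`), the column names, the parameter grids, and the synthetic demo data are all assumptions made for the example. PCA stands in for UMAP so the sketch depends only on scikit-learn and pandas; with umap-learn installed, `umap.UMAP(n_components=2).fit_transform(X_scaled)` would take its place.

```python
import pandas as pd
from sklearn.cluster import DBSCAN, AgglomerativeClustering
from sklearn.datasets import make_blobs
from sklearn.decomposition import PCA
from sklearn.metrics import (
    calinski_harabasz_score,
    davies_bouldin_score,
    silhouette_score,
)
from sklearn.preprocessing import StandardScaler


def tune_dbscan(X, eps_grid, min_samples_grid):
    """Grid-search DBSCAN with nested for-loops, scored by silhouette."""
    best_score, best_params = -1.0, None
    for eps in eps_grid:
        for min_samples in min_samples_grid:
            labels = DBSCAN(eps=eps, min_samples=min_samples).fit_predict(X)
            keep = labels != -1  # DBSCAN marks noise points with -1
            if len(set(labels[keep])) < 2:
                continue  # silhouette needs at least two clusters
            # Scoring only non-noise points is a simplification: it can
            # favour settings that discard many points as noise.
            score = silhouette_score(X[keep], labels[keep])
            if score > best_score:
                best_score = score
                best_params = {"eps": eps, "min_samples": min_samples}
    return best_params


def tune_agglomerative(X, n_clusters_grid):
    """Pick the number of clusters with the highest silhouette score."""
    best_score, best_n = -1.0, None
    for n in n_clusters_grid:
        labels = AgglomerativeClustering(n_clusters=n).fit_predict(X)
        score = silhouette_score(X, labels)
        if score > best_score:
            best_score, best_n = score, n
    return best_n


def evaluate(X, labels):
    """Compute the three metrics, ignoring DBSCAN noise points (-1)."""
    keep = labels != -1
    X_kept, labels_kept = X[keep], labels[keep]
    return {
        "silhouette": silhouette_score(X_kept, labels_kept),
        "davies_bouldin": davies_bouldin_score(X_kept, labels_kept),
        "calinski_harabasz": calinski_harabasz_score(X_kept, labels_kept),
    }


def cluster_dataframe(df):
    """Scale, embed in 2D, tune, cluster, and append labels to a copy of df."""
    X_scaled = StandardScaler().fit_transform(df)
    # Stand-in for the UMAP embedding (assumption, see the text above).
    embedding = PCA(n_components=2).fit_transform(X_scaled)

    # The grids below are arbitrary example values, not tuned settings.
    db_params = tune_dbscan(embedding, [0.2, 0.3, 0.5], [5, 10])
    if db_params is None:
        raise ValueError("no DBSCAN setting produced at least two clusters")
    n_clusters = tune_agglomerative(embedding, range(2, 7))

    out = df.copy()
    out["dbscan_label"] = DBSCAN(**db_params).fit_predict(embedding)
    out["agglo_label"] = AgglomerativeClustering(
        n_clusters=n_clusters
    ).fit_predict(embedding)

    metrics = {
        "dbscan": evaluate(embedding, out["dbscan_label"].to_numpy()),
        "agglomerative": evaluate(embedding, out["agglo_label"].to_numpy()),
    }
    return out, embedding, metrics


# Synthetic, already-numeric data standing in for the pre-processed input.
X_demo, _ = make_blobs(n_samples=300, centers=3, n_features=5, random_state=0)
demo_df = pd.DataFrame(X_demo, columns=[f"f{i}" for i in range(5)])
labelled_df, embedding_2d, scores = cluster_dataframe(demo_df)
print(labelled_df[["dbscan_label", "agglo_label"]].head())
print(scores)
```

The returned dataframe keeps the original features and gains one label column per algorithm, which is the shape a dashboard callback would need in order to filter by algorithm and cluster.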
