hiroyuki-kasai/SSPW-kmeans


SSPW k-means: Sparse simplex projection-based Wasserstein k-means


Authors: Hiroyuki Kasai and Takumi Fukunaga

Last page update: June 08, 2021

Latest version: 1.0.0 (see Release notes for more info)


Introduction

This repository contains the code for sparse simplex projection-based Wasserstein k-means (SSPW k-means), a faster Wasserstein k-means algorithm for histogram data that reduces the number of Wasserstein distance computations and exploits sparse simplex projection. We shrink the data samples, the centroids, and the ground cost matrix, which considerably reduces the computation needed to solve the optimal transport problems without loss of clustering quality. Furthermore, SSPW k-means dynamically reduces the computational complexity by removing lower-valued data samples and harnessing sparse simplex projection, while keeping the degradation of clustering quality small.
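
To make the idea of sparse simplex projection concrete, here is a minimal MATLAB sketch. It is not taken from the files in this repository. It assumes one common formulation: keep the kappa largest bins of a histogram, project them back onto the probability simplex, and then restrict the ground cost matrix to the surviving bins. The toy histogram `h`, the sparsity level `kappa`, and the squared-distance cost `C` are all hypothetical choices made for illustration.

```matlab
% Hypothetical toy input: a 6-bin histogram and an assumed sparsity level.
h     = [0.05; 0.40; 0.02; 0.30; 0.03; 0.20];
kappa = 3;                               % number of bins to keep (assumption)

% Step 1: select the kappa largest entries of h.
[~, order] = sort(h, 'descend');
support    = sort(order(1:kappa));       % indices of the kept bins
v          = h(support);

% Step 2: Euclidean projection of v onto the probability simplex
% (sort-and-threshold method).
u   = sort(v, 'descend');
css = cumsum(u);
k   = (1:numel(u))';
rho = find(u - (css - 1) ./ k > 0, 1, 'last');
tau = (css(rho) - 1) / rho;
v   = max(v - tau, 0);

% Sparse histogram of the same length as h, nonzero only on the support.
h_sparse          = zeros(size(h));
h_sparse(support) = v;

% Step 3: shrink a (hypothetical) ground cost matrix to the kept bins.
% Only rows for the sample's surviving bins are needed when transporting
% to a dense centroid, so the transport problem gets smaller.
[I, J]  = meshgrid(1:numel(h));
C       = (I - J).^2;                    % assumed squared-distance cost
C_small = C(support, :);                 % kappa-by-6 instead of 6-by-6
```

In this toy case `h_sparse` has three nonzero bins that sum to one, and `C_small` has three rows instead of six. If the centroid were sparsified in the same way, the columns of the cost matrix could be restricted to the centroid's support as well.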


Paper

T. Fukunaga and H. Kasai, "Wasserstein k-means with sparse simplex projection," ICPR2020. Publisher's site, arXiv.


Folders and files

./                      - Top directory.
./README.md             - This readme file.
./run_me_first.m        - The script that you need to run first.
./demo.m                - A demonstration script. 
|algorithms             - Contains the implementation file of the proposed SSPW k-means.
|tools                  - Contains some files for execution.
|datasets               - Contains some datasets.

Getting started

Run run_me_first for path configurations.

%% First run the setup script
run_me_first; 

Demonstration

Run demo for a demonstration.

%% Execute a demonstration script.
demo; 

Notes

  • Some parts of the code are borrowed from the following work:

    • Staib, Matthew and Jegelka, Stefanie, "Wasserstein k-means++ for Cloud Regime Histogram Clustering," Proceedings of the Seventh International Workshop on Climate Informatics: CI 2017, 2017, Code.

Problems or questions

If you have any problems or questions, please contact the author: Hiroyuki Kasai (email: hiroyuki dot kasai at waseda dot jp)


Release Notes

  • Version 1.0.0 (June 08, 2021)
    • Initial version.