K-Means Clustering

This is a Python implementation of the K-Means clustering algorithm using K-Means++ centroid initialization strategy.

K-Means clustering is a form of unsupervised learning where you have data and know its attributes but not its classification. It relies on the fact that similar data will stick together to form clusters, and the KM algorithm is meant to identify these clusters.

Here's an example visualization:

Dependencies

pandas
matplotlib
Python (3.6+)
GNU/Linux

Execution

You can clone these files to your computer with the below:

$ git clone https://github.com/stratzilla/k-means-clustering

Execute the script as the below:

$ ./k-means.py <File> <Type> <K> <Epochs>

Where the following are the arguments:

 <File> -- the data to perform k-means clustering upon
 <Type> -- the type of metric space to use:
        -- 1 - Euclidean Metric
        -- 2 - Manhattan Metric
        -- 3 - Chebyshev Metric
 <K> -- k-parameter, or how many clusters [2, 25]
 <Epochs> -- how many epochs for k-means to perform [1, 100]

Each epoch frame will be saved as an image in ./plots/. This directory will be made if it does not exist.

Using no arguments will remind the user of this. If the arguments are appropriate, the output to console will be the Dunn Index of the clustering.

Data

In /data are a few data sets as found here, each with a nomimal k=15.

Tutorial

I wrote a tutorial to go alongside this repository: here!

Name		Name	Last commit message	Last commit date
Latest commit History 4 Commits
data		data
images		images
.gitattributes		.gitattributes
README.md		README.md
k_means.py		k_means.py

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

data

data

images

images

.gitattributes

.gitattributes

README.md

README.md

k_means.py

k_means.py

Repository files navigation

K-Means Clustering

Dependencies

Execution

Data

Tutorial

About

Releases

Packages

Languages

stratzilla/k-means-clustering

Folders and files

Latest commit

History

Repository files navigation

K-Means Clustering

Dependencies

Execution

Data

Tutorial

About

Topics

Resources

Stars

Watchers

Forks

Languages