How do I get the cluster number to which each sample belongs? #64

Open
sumanth-bmsce opened this issue Mar 28, 2020 · 11 comments

@sumanth-bmsce

Hi,

Is there any way to get the cluster number to which each data point in the dataset belongs?

@JustGlowing
Owner

Hi, since this is a recurring question I decided to add an example here: https://github.com/JustGlowing/minisom/blob/master/examples/Clustering.ipynb
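
For readers who cannot open the notebook, the basic idea can be sketched roughly as follows: treat every neuron as one cluster and turn the grid coordinates returned by `som.winner` into a single integer. This is a minimal sketch and not the notebook's code. `cluster_indices` and `StubSom` are hypothetical names introduced here; the only assumption about the library is that `winner(x)` returns the `(row, column)` coordinates of the winning neuron.

```python
def cluster_indices(som, data, som_shape):
    """Map each sample to one integer cluster id, one id per neuron."""
    n_rows, n_cols = som_shape
    labels = []
    for x in data:
        i, j = som.winner(x)  # grid coordinates of the winning neuron
        labels.append(int(i) * n_cols + int(j))  # row-major flat index
    return labels


class StubSom:
    """Hypothetical stand-in for a trained SOM, used only to run the sketch."""

    def __init__(self, weights):
        self.weights = weights  # weights[i][j] is the weight vector of neuron (i, j)

    def winner(self, x):
        # Return the coordinates of the neuron closest to x (squared Euclidean distance).
        best, best_dist = None, float("inf")
        for i, row in enumerate(self.weights):
            for j, w in enumerate(row):
                dist = sum((a - b) ** 2 for a, b in zip(x, w))
                if dist < best_dist:
                    best, best_dist = (i, j), dist
        return best


# A 1x3 map whose neurons sit at 0, 5 and 10 on a line.
som = StubSom([[[0.0], [5.0], [10.0]]])
data = [[0.2], [4.9], [9.7], [0.1]]
labels = cluster_indices(som, data, som_shape=(1, 3))
print(labels)  # [0, 1, 2, 0]
```

With a real trained map, the stub would be replaced by the trained SOM object and `som_shape` by the grid size used to build it.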

@sumanth-bmsce
Author

Hi,

Thanks for quickly answering my question. Depending on the dataset information, i.e. the number of clusters and the number of data points in each class, can we calculate the centroids of the coordinates returned by som.winner, use them to find the cluster number, and then calculate the clustering accuracy?
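
One possible way to score cluster numbers against known classes is sketched below. It does not use centroids as proposed in the comment above; instead it swaps in a simpler majority-vote rule, where each cluster is labelled with its most common true class and the fraction of correctly labelled samples is reported. The function name `majority_vote_accuracy` and the toy data are assumptions made for illustration.

```python
from collections import Counter, defaultdict


def majority_vote_accuracy(cluster_ids, true_labels):
    """Label each cluster with its most common true class, then score it."""
    members = defaultdict(list)
    for cluster, label in zip(cluster_ids, true_labels):
        members[cluster].append(label)
    # Most frequent true label inside each cluster.
    cluster_to_label = {
        cluster: Counter(labels).most_common(1)[0][0]
        for cluster, labels in members.items()
    }
    hits = sum(cluster_to_label[c] == y for c, y in zip(cluster_ids, true_labels))
    return hits / len(true_labels)


# Toy data: six samples, three clusters, three classes.
cluster_ids = [0, 0, 0, 1, 1, 2]
true_labels = ["a", "a", "b", "b", "b", "c"]
accuracy = majority_vote_accuracy(cluster_ids, true_labels)
print(accuracy)  # 5 of 6 samples match their cluster's majority class
```

Note that this score tends to rise as the number of clusters grows, so it is most meaningful when the number of clusters is close to the number of classes.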

@JustGlowing
Owner

JustGlowing commented Mar 28, 2020 via email

@sumanth-bmsce
Author

Could I contribute code to this repo that computes the cluster numbers and the clustering accuracy in the way I described above?

@JustGlowing
Owner

JustGlowing commented Mar 28, 2020 via email

@JustGlowing
Owner

PS: I updated the example so that the centroids are also plotted.

@JustGlowing changed the title from "Get the cluster number to which each datapoint belongs" to "How do I get the cluster number to which each sample belongs?" on May 27, 2020
@JustGlowing reopened this on May 27, 2020
@Edmond-Lee-Zse-Wong

Hi,
My original dataset has no labels. Is there any way to assign every data point to a cluster after SOM training?
PS: I use a 9×9 SOM, and the distance map shows that the dataset is roughly divided into 2 clusters.
Thanks!

@JustGlowing
Owner

Hi @Edmond-Lee-Zse-Wong, check the example linked above.

@Edmond-Lee-Zse-Wong

Hi
I checked it last night, but it doesn't solve my problem. As you can see, 11 winner neurons are activated, but my dataset cannot plausibly be divided into 11 clusters, since it is a simple two-dimensional dataset (236 points). The SOM size was calculated with an empirical formula: size = 5√N, where N is the number of data points. Is the only solution to reduce the size of the SOM?
Figure_1
Figure_2
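
As a worked check of the sizing rule quoted above, the sketch below computes the number of neurons suggested by 5√N for N = 236 and the side of a square grid that holds them. Rounding the side up to the next integer is an assumption made here, and `som_side_from_heuristic` is a hypothetical helper name.

```python
import math


def som_side_from_heuristic(n_samples):
    """Side of a square map with roughly 5 * sqrt(N) neurons."""
    n_neurons = 5 * math.sqrt(n_samples)  # about 76.8 for N = 236
    return math.ceil(math.sqrt(n_neurons))  # round the side up (assumption)


side = som_side_from_heuristic(236)
print(side, side * side)  # 9 81
```

This is consistent with the 9×9 map mentioned earlier in the thread: the rule sizes the map for topology preservation, not for having one neuron per cluster.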

@JustGlowing
Owner

It's likely that some clusters are spread across different neurons. The SOM size is not adequate for the task that you are trying to perform.
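
One rough way to check for this kind of spread is to count how many samples each winning neuron attracts. The sketch below assumes `winners` is a list of grid coordinates such as those returned by `som.winner` for each sample; the coordinates shown are made up for illustration, and `neuron_hit_counts` is a hypothetical helper name.

```python
from collections import Counter


def neuron_hit_counts(winners):
    """Count how many samples each winning neuron attracted."""
    return Counter(tuple(w) for w in winners)


# Hypothetical winner coordinates for six samples on a 9x9 map.
winners = [(0, 0), (0, 1), (0, 0), (8, 8), (8, 7), (8, 8)]
counts = neuron_hit_counts(winners)
print(len(counts))  # number of distinct neurons that won at least once
print(counts)
```

If the active neurons fall into a few groups of adjacent grid positions, as in this toy data, each group may correspond to a single underlying cluster split over several neurons.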

@Edmond-Lee-Zse-Wong

Yes, I think so. Maybe I should reduce the size to 2 or 3.
Thanks for your help!
