
Do you have any suggestions on setting the max number of iteration in training som? #66

Open
zhenzonglei opened this issue Apr 15, 2020 · 18 comments

@zhenzonglei

Hi,
I just found an inconsistency in the documented verbose output of the train method.
Line 347 states that if verbose is true, the status of the training will be printed at each iteration, but in line 361 the status is only printed after all iterations. I guess the code between lines 361-363 should be indented.
Thanks

@zhenzonglei zhenzonglei changed the title verbose output for train methods verbose output for the train method Apr 15, 2020
@JustGlowing
Owner

JustGlowing commented Apr 15, 2020 via email

@zhenzonglei
Author

Got it! Thanks very much.
BTW, do you have any suggestions on setting the max number of iterations when training a som? For example, if I have 10,000 samples, what is a reasonable number of iterations?
Thanks again

@JustGlowing
Owner

Hi again, the number of iterations required for convergence depends on many factors. The main ones are the size of the som and the shape of the data. The only way to know whether you have reached convergence is to look at the learning curve and check if it has reached a plateau (see the Iris example).

If you have a 100-by-100 som, start with 10000 iterations so that each sample is observed at least once, and check the results. Increase the number of iterations if you think the error is still on a downward trajectory.
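
For reference, here is a minimal sketch of plotting such a learning curve, along the lines of the BasicUsage example (the map size, sigma, and the stand-in data are arbitrary choices for illustration):

```python
import numpy as np
import matplotlib.pyplot as plt
from minisom import MiniSom

data = np.random.rand(10000, 4)   # stand-in for your real samples

som = MiniSom(10, 10, data.shape[1], sigma=1.0, learning_rate=0.5)
som.random_weights_init(data)

max_iter = 10000
q_error = []
for i in range(max_iter):
    rand_i = np.random.randint(len(data))   # one random sample per iteration
    som.update(data[rand_i], som.winner(data[rand_i]), i, max_iter)
    if i % 100 == 0:                        # evaluating every step is slow
        q_error.append(som.quantization_error(data))

plt.plot(np.arange(0, max_iter, 100), q_error)
plt.xlabel('iteration')
plt.ylabel('quantization error')
plt.show()
```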

@JustGlowing JustGlowing changed the title verbose output for the train method number of iterations Apr 16, 2020
@JustGlowing JustGlowing reopened this Apr 16, 2020
@JustGlowing JustGlowing changed the title number of iterations Do you have any suggestions on setting the max number of iteration in training som? May 27, 2020
@V-for-Vaggelis
Contributor

Let me extend this question a little further, with some emphasis on the topographic error. I have a dataset with around 360 rows and small correlations between the features. After plotting the learning curves as in the Iris example, I noticed that the quantization error indeed decreases and reaches a plateau, while the topographic error fluctuates but tends to become stable as well. The problem is that it fluctuates around 0.8, which is too large. Since the t.e. is an indication of how representative the SOM is, I believe this is an important issue.

The question is whether there is a parameter that, if properly tuned, can decrease the t.e., or if it is inevitable to get a non-representative SOM for low-correlated data?

@JustGlowing
Owner

hi @V-for-Vaggelis,

Have you tried inspecting the results visually? You want to check that the u-matrix (that you can get with the method distance_map()) is smooth.

You can obtain a smooth mapping no matter how the data is correlated.
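
A minimal sketch of that visual check, assuming matplotlib and stand-in data (map size and training length are arbitrary here):

```python
import numpy as np
import matplotlib.pyplot as plt
from minisom import MiniSom

data = np.random.rand(360, 5)        # stand-in for the real dataset
som = MiniSom(8, 8, data.shape[1], sigma=1.5, learning_rate=0.5)
som.train_random(data, 5000)

# distance_map() returns, for each neuron, the normalised mean distance
# to its neighbours; a smooth u-matrix has no abrupt jumps in this plot
plt.pcolor(som.distance_map().T, cmap='bone_r')
plt.colorbar(label='mean inter-neuron distance')
plt.show()
```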

@JustGlowing
Owner

@V-for-Vaggelis also, to really understand if the som has converged you can check the weights step by step and stop when they don't change anymore (||W_i - W_{i-1}|| < epsilon).
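
One possible sketch of that stopping rule, assuming `get_weights()` and an arbitrary tolerance; checking periodically rather than after every single-sample update gives a more meaningful measure of the change:

```python
import numpy as np
from minisom import MiniSom

data = np.random.rand(360, 5)        # stand-in data
som = MiniSom(8, 8, data.shape[1], sigma=1.5, learning_rate=0.5)
som.random_weights_init(data)

epsilon = 1e-5                       # illustrative tolerance
max_iter = 100000
w_prev = som.get_weights().copy()
for i in range(max_iter):
    rand_i = np.random.randint(len(data))
    som.update(data[rand_i], som.winner(data[rand_i]), i, max_iter)
    if i % 1000 == 999:              # test ||W_i - W_{i-1}|| < epsilon periodically
        w = som.get_weights()
        if np.linalg.norm(w - w_prev) < epsilon:
            print(f'weights stable after {i + 1} iterations')
            break
        w_prev = w.copy()
```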

@V-for-Vaggelis
Contributor

@JustGlowing A weird thing happened. I updated minisom and it would not print the t.e. anymore, so I printed it myself and got 0.09 for the same data. Could it be a bug you had fixed? I also got the distance map as you advised. In general it has a smooth behavior, but there is a small red area (large distances). I guess it means this small area of the grid can't be trusted to draw conclusions.

Also, another thing: is there a paper I can refer to in my thesis for minisom, or should I just link to the repo?

@JustGlowing
Owner

@V-for-Vaggelis there was a bug fix released in December related to the quantization error.

Can you please cite MiniSom as follows:

G. Vettigli, "MiniSom: minimalistic and NumPy-based implementation of the Self Organizing Map". Available: https://github.com/JustGlowing/minisom.

@Yifeng-J

Hi, I am using MiniSom to cluster data, and I find it very convenient, so thanks for your contributions. However, I am confused about how to properly select the initial parameters, e.g. sigma, learning rate, and max_iteration. In this issue you said "The only way to know if you reached convergence is to look at the learning curve and check if it reached a plateau", but which indicator should I use to plot the learning curve: the quantization error?

And finally, I want to know if there is any way to get the cluster number to which each datapoint in the dataset belongs. In the Clustering example you set each neuron as a cluster, but that does not work properly in my experiment.

Thanks.

@JustGlowing
Owner

hi @Yifeng-J,

Here's an example of how to plot the learning curve: https://github.com/JustGlowing/minisom/blob/master/examples/BasicUsage.ipynb

I'd recommend using the quantization error unless you're trying to optimize your own custom metric.

Regarding the cluster index, the example you pointed out shows the most convenient way to solve the issue. However, you can do more complex things, like grouping different neurons and assigning the cluster index according to that.
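
A minimal sketch of deriving a cluster index from each sample's winning neuron, along the lines of the Clustering example (map size and data are stand-ins):

```python
import numpy as np
from minisom import MiniSom

data = np.random.rand(200, 4)        # stand-in data
som = MiniSom(3, 3, data.shape[1], sigma=1.0, learning_rate=0.5)
som.train_random(data, 5000)

# each sample's best matching unit gives its (row, col) on the map;
# flattening the coordinates yields one cluster index per neuron
winner_coordinates = np.array([som.winner(x) for x in data]).T
cluster_index = np.ravel_multi_index(winner_coordinates, (3, 3))
print(cluster_index[:10])
```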

@Yifeng-J

@JustGlowing Ok, thanks for your answer. I will try some other methods to solve the cluster index problem. I hope you can give me some suggestions on how to choose the initial parameters, because I can't find any information about how to choose them properly on the Internet.

@JustGlowing
Owner

I'd suggest you to start with the default parameters and plot the results as shown in the documentation. Then you can tweak the parameters. You'll get a grasp once you try a couple of edge cases (e.g. setting sigma too high or too low). Remember that there's no optimal set of parameters, but you can find a set that is good enough for you.
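
As a small illustration of trying such edge cases, one could compare the quantization error over a few arbitrary sigma values (everything here is a stand-in):

```python
import numpy as np
from minisom import MiniSom

data = np.random.rand(300, 4)            # stand-in data
for sigma in (0.5, 1.0, 2.5):            # deliberately low / default / high
    som = MiniSom(7, 7, data.shape[1], sigma=sigma, learning_rate=0.5)
    som.train_random(data, 5000)
    print(f'sigma={sigma}: quantization error={som.quantization_error(data):.4f}')
```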

@Yifeng-J

@JustGlowing Ok, I get it. Thank you very much!

@atheeraa

Hello guys, I'm trying to use minisom for clustering 16-dimensional embeddings with 7 classes, and I'm not sure how to set the size of the map.

If, for example, I set it to 7*7 I'd get 49 clusters, and 3*3 = 9 clusters.

I read your rule of thumb, but it doesn't work for me, because I'd have to set it to 16*16 and by doing so I'd get 256 clusters!

Would appreciate the help.

@JustGlowing
Owner

Hi @atheeraa , you have to set input_len to 16 and create a map of size 3x3. This will give you 9 clusters and you can merge two of the closest clusters to get the 8 that you need.
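
A sketch of that setup; the merging step is not a MiniSom feature, just one hypothetical way to fuse the two clusters whose weight vectors are closest:

```python
import numpy as np
from minisom import MiniSom

embeddings = np.random.rand(500, 16)     # stand-in for the 16-dim embeddings
som = MiniSom(3, 3, 16, sigma=1.0, learning_rate=0.5)
som.train_random(embeddings, 10000)

coords = np.array([som.winner(e) for e in embeddings]).T
labels = np.ravel_multi_index(coords, (3, 3))      # 9 initial clusters

# merge the two neurons whose weight vectors are closest -> 8 clusters
w = som.get_weights().reshape(9, 16)
d = np.linalg.norm(w[:, None] - w[None, :], axis=-1)
np.fill_diagonal(d, np.inf)
a, b = np.unravel_index(np.argmin(d), d.shape)
labels[labels == b] = a
```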

@atheeraa

Thank you for your reply!
I have another question regarding the visualization. I have 16-dimensional embeddings; how do you suggest I plot the map? Following the clustering example you provided, I can see that I can change the x and y of the scatter function, but I don't know how to show the whole data at once.

Again, thank you for your replies, I appreciate your help.

@JustGlowing
Owner

hi again @atheeraa, you want to have a look at this example: https://github.com/JustGlowing/minisom/blob/master/examples/BasicUsage.ipynb
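
In the spirit of that notebook, one common sketch is to scatter every sample at its winning neuron on top of the u-matrix, so all 16 dimensions are shown at once through the map position (the labels, jitter, and colours below are illustrative):

```python
import numpy as np
import matplotlib.pyplot as plt
from minisom import MiniSom

embeddings = np.random.rand(500, 16)      # stand-in data
targets = np.random.randint(0, 7, 500)    # stand-in class labels
som = MiniSom(3, 3, 16, sigma=1.0, learning_rate=0.5)
som.train_random(embeddings, 10000)

plt.pcolor(som.distance_map().T, cmap='bone_r')   # u-matrix as background
for e, t in zip(embeddings, targets):
    x, y = som.winner(e)
    # add jitter so samples mapped to the same neuron remain visible
    plt.scatter(x + 0.5 + np.random.uniform(-0.3, 0.3),
                y + 0.5 + np.random.uniform(-0.3, 0.3),
                color=plt.cm.tab10(t), s=10)
plt.colorbar(label='mean inter-neuron distance')
plt.show()
```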

@Overture-Y

Thank you so much for your wonderful work. I'm trying to use minisom for a clustering task, but in the Clustering example som.winner processes the data one sample at a time, which takes a lot of time when the input is large. If the input's shape is (m, n), how can I process the whole array without a for loop? Thank you.
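
A hedged sketch of computing all the winners at once with NumPy broadcasting instead of a Python loop, using `get_weights()` (for very large m, process the data in chunks to limit the memory used by the intermediate distance array):

```python
import numpy as np
from minisom import MiniSom

data = np.random.rand(10000, 8)          # stand-in (m, n) input
som = MiniSom(10, 10, data.shape[1])
som.train_random(data, 1000)

w = som.get_weights()                    # shape (10, 10, n)
flat_w = w.reshape(-1, w.shape[-1])      # (100, n)
# distance of every sample to every neuron, computed in one shot
d = np.linalg.norm(data[:, None, :] - flat_w[None, :, :], axis=-1)
rows, cols = np.unravel_index(d.argmin(axis=1), w.shape[:2])  # all winners
```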
