This issue was moved to a discussion.
You can continue the conversation there. Go to discussion →
Reproduce detectCommunity #1219
Comments
@ychen983384 Have you tried turning off parallelism? PLM(G, par="none")
@clstaudt Yes, with parallelism the order of execution (that is, the order in which nodes move from one cluster to another) is not deterministic. Without parallelism you get the same result for consecutive runs:
The output is
@ychen983384 As @fabratu has explained, the parallelism makes it nondeterministic. Turning off parallelism for your graph kind of defeats the purpose of using this particular algorithm. We never considered determinism to be an important property when building it.
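For anyone who wants to check the single-threaded behaviour described above on their own graph, here is a minimal sketch. The helper names (canonical_labels, same_partition, plm_is_reproducible) are hypothetical and not part of NetworKit. Only PLM(G, par="none"), run(), getPartition() and getVector() are NetworKit calls. Community ids can differ between runs even when the grouping is the same, so the comparison relabels communities by order of first appearance before testing for equality.

```python
from typing import Hashable, List, Sequence


def canonical_labels(labels: Sequence[Hashable]) -> List[int]:
    """Relabel communities as 0, 1, 2, ... in order of first appearance."""
    mapping = {}
    out = []
    for lab in labels:
        if lab not in mapping:
            mapping[lab] = len(mapping)
        out.append(mapping[lab])
    return out


def same_partition(a: Sequence[Hashable], b: Sequence[Hashable]) -> bool:
    """True if both label vectors group the nodes identically, ignoring label names."""
    return len(a) == len(b) and canonical_labels(a) == canonical_labels(b)


def plm_is_reproducible(G, runs: int = 3, **plm_kwargs) -> bool:
    """Run PLM several times with par="none" and compare the partitions.

    Assumption: G is a networkit.Graph; plm_kwargs are passed through to PLM
    (for example refine=True, gamma=0.5 as in the original report).
    """
    import networkit as nk  # imported lazily; only this function needs NetworKit

    vectors = []
    for _ in range(runs):
        algo = nk.community.PLM(G, par="none", **plm_kwargs)
        algo.run()
        vectors.append(algo.getPartition().getVector())
    return all(same_partition(vectors[0], v) for v in vectors[1:])
```

A call such as plm_is_reproducible(GData, refine=True, gamma=0.5) mirrors the parameters from the report. Based on the explanation in this thread it should return True with par="none", at the cost of running on a single thread.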
@clstaudt @fabratu Thank you both for the clear explanation and the great tools you developed. The scientific community and journals have been placing more and more emphasis on the reproducibility of both computational and experimental research, which might make determinism more important. With determinism, the parallel tools you have developed, such as PLM and parLeiden, would be more attractive to the scientific community, since huge data sets come from single-cell multi-omics technology. Is it feasible to implement these community detection algorithms (such as Louvain and Leiden) with both parallelism and determinism, or is it theoretically impossible?
Repeated runs on the same graph generate different results. Is there a way to make this reproducible on the same graph?
first run
Communities = nk.community.detectCommunities(GData, algo=nk.community.PLM(G=GData,refine=True, gamma=0.5))
Communities detected in 1.21094 [s]
solution properties:
communities 10
min community size 12562
max community size 41579
avg. community size 24544.8
imbalance 1.69399
edge cut 404216
edge cut (portion) 0.054895
modularity 0.801823
second run
Communities = nk.community.detectCommunities(GData, algo=nk.community.PLM(G=GData,refine=True, gamma=0.5))
Communities detected in 1.18332 [s]
solution properties:
communities 11
min community size 13536
max community size 43623
avg. community size 22313.5
imbalance 1.95496
edge cut 426229
edge cut (portion) 0.0578845
modularity 0.808974