java.lang.ArrayIndexOutOfBoundsException when using -q Modularity #19

Open
ghost opened this issue Feb 11, 2021 · 6 comments

ghost commented Feb 11, 2021

The following command
java -cp ~/Bioinformatics/networkanalysis/build/libs/networkanalysis-1.1.0-5-ga3f342d.jar nl.cwts.networkanalysis.run.RunNetworkClustering -q CPM -m 1 -w -o /tmp/edges_clustering.txt edges.txt
runs just fine. However, if the -q option is set to "Modularity", I get the following crash:

RunNetworkClustering version 1.1.0
By Vincent Traag, Ludo Waltman, and Nees Jan van Eck
Centre for Science and Technology Studies (CWTS), Leiden University
Reading edge list from 'edges.txt'.
Reading edge list took 0s.
Network consists of 18386 nodes and 423926 edges with a total edge weight of 24546.50103795767.
Using singleton initial clustering.
Running Leiden algorithm.
Quality function: Modularity
Resolution parameter: 1.0
Minimum cluster size: 1
Number of random starts: 1
Number of iterations: 10
Randomness parameter: 0.01
Random number generator seed: random
Exception in thread "main" java.lang.ArrayIndexOutOfBoundsException: Index 5 out of bounds for length 5
	at nl.cwts.networkanalysis.LocalMergingAlgorithm.findClustering(LocalMergingAlgorithm.java:190)
	at nl.cwts.networkanalysis.LeidenAlgorithm.improveClusteringOneIteration(LeidenAlgorithm.java:228)
	at nl.cwts.networkanalysis.LeidenAlgorithm.improveClusteringOneIteration(LeidenAlgorithm.java:276)
	at nl.cwts.networkanalysis.IterativeCPMClusteringAlgorithm.improveClustering(IterativeCPMClusteringAlgorithm.java:91)
	at nl.cwts.networkanalysis.run.RunNetworkClustering.main(RunNetworkClustering.java:413)

I've attached the input file here
edges.txt
I'm not sure what property of this input file would violate the preconditions of the software. The nodes are 0-indexed numbers, there are no duplicate edges, the left-hand node ID is always less than the right-hand node ID, etc.

Contributor

vtraag commented Feb 11, 2021

The problem seems to be caused by the edges with zero weight. For the time being, you can circumvent the problem by simply removing those edges. This is safe to do, since zero-weight edges have no effect on the result anyway.

Nonetheless, this is a bug that should be corrected. We will provide a fix at some later time.
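
As a minimal sketch of the workaround above, the snippet below drops zero-weight edges from an edge list before it is passed to RunNetworkClustering. It assumes a tab-separated file whose third column is the edge weight; the class name `ZeroWeightEdgeFilter` and its methods are hypothetical and not part of networkanalysis.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical helper, not part of the networkanalysis package.
final class ZeroWeightEdgeFilter {

    // Returns false only when the third tab-separated column parses to zero.
    // Lines with fewer than three columns are kept (assumed to be unweighted).
    static boolean hasNonZeroWeight(String line) {
        String[] columns = line.split("\t");
        if (columns.length < 3) {
            return true;
        }
        return Double.parseDouble(columns[2].trim()) != 0.0;
    }

    // Keeps every line whose weight is non-zero, preserving the input order.
    static List<String> filter(List<String> lines) {
        return lines.stream()
                .filter(ZeroWeightEdgeFilter::hasNonZeroWeight)
                .collect(Collectors.toList());
    }

    // Usage: java ZeroWeightEdgeFilter <input edge list> <output edge list>
    public static void main(String[] args) throws IOException {
        Path input = Paths.get(args[0]);
        Path output = Paths.get(args[1]);
        Files.write(output, filter(Files.readAllLines(input)));
    }
}
```

The filtered file can then be used in place of the original input in the command from the report.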

@vtraag vtraag self-assigned this Feb 11, 2021
@vtraag vtraag added the bug Something isn't working label Feb 11, 2021
Contributor

vtraag commented Feb 11, 2021

In particular, the root cause of the problem is that we check whether a neighboring cluster is already added in these conditionals:

We should probably use a boolean array isClusterAdded instead of relying on the edge weight. I think this is the most robust way to address the issue. @neesjanvaneck, what do you think?
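
To illustrate the idea, here is a simplified sketch and not the actual code from LocalMergingAlgorithm. It assumes that neighboring clusters are collected into a fixed-size buffer and that "accumulated weight is still zero" serves as the test for "not yet added". Apart from `isClusterAdded`, all names are hypothetical.

```java
import java.util.Arrays;

// Simplified illustration only; names and structure are assumptions.
final class NeighboringClusterSketch {

    // Weight-based check: a cluster is treated as new while its accumulated
    // weight is zero. Zero-weight edges never change that, so the same
    // cluster can be recorded repeatedly and overflow the buffer.
    static int[] collectByWeight(int[] neighborCluster, double[] edgeWeight, int nClusters) {
        double[] weightPerCluster = new double[nClusters];
        int[] found = new int[nClusters];
        int nFound = 0;
        for (int k = 0; k < neighborCluster.length; k++) {
            int c = neighborCluster[k];
            if (weightPerCluster[c] == 0) {
                found[nFound] = c; // may throw ArrayIndexOutOfBoundsException
                nFound++;
            }
            weightPerCluster[c] += edgeWeight[k];
        }
        return Arrays.copyOf(found, nFound);
    }

    // Flag-based check: membership is tracked independently of the weights,
    // so each cluster is recorded at most once.
    static int[] collectByFlag(int[] neighborCluster, double[] edgeWeight, int nClusters) {
        double[] weightPerCluster = new double[nClusters];
        boolean[] isClusterAdded = new boolean[nClusters];
        int[] found = new int[nClusters];
        int nFound = 0;
        for (int k = 0; k < neighborCluster.length; k++) {
            int c = neighborCluster[k];
            if (!isClusterAdded[c]) {
                isClusterAdded[c] = true;
                found[nFound] = c;
                nFound++;
            }
            weightPerCluster[c] += edgeWeight[k];
        }
        return Arrays.copyOf(found, nFound);
    }
}
```

With three edges into cluster 0, two of them with weight zero, and a buffer of length 2, the weight-based variant writes past the end of the buffer, while the flag-based variant records cluster 0 once.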

Author

ghost commented Feb 11, 2021

Ah, I see. Makes perfect sense. I didn't intend to have any edges with zero weight. I'm happy to check for that condition and remove those edges, which are meaningless. Would you like me to close this issue (as far as I'm concerned, it's already resolved)?

Contributor

vtraag commented Feb 11, 2021

No problem; I imagine you did not check for this condition before running the algorithm. Let's leave the issue open, though, as the program shouldn't crash on this input.

Author

ghost commented Feb 11, 2021

I suppose a smaller change would simply be to filter out edges with zero weight in the code that reads the edges from the file. This way, you can enforce your own precondition, and leave the downstream code unchanged. The advantage is that it would avoid adding another variable for book-keeping.

Contributor

vtraag commented Feb 12, 2021

Yes, we would probably do that as well. However, the program still should not crash, even if a zero-weight edge is somehow included.
