The calculated metric for the quality of the overlapping community segmentation results is incorrect #1024

Open
destiny1009 opened this issue Jan 8, 2023 · 4 comments


destiny1009 commented Jan 8, 2023

First, calculating the overlapping NMI value for a simple community division returns 0, although the correct result is not 0.

Second, when using the F1-score metric, I could not get a result; the program simply quit. Did I make a mistake in how I used it?

Third, regarding overlapping communities: only the graph.cover type can be used to calculate the overlap-related metrics. However, a graph.cover cannot be obtained by reading a file; the only option is to write code by hand that reads the file and adds the nodes one by one. Could this part be optimized?


@destiny1009 destiny1009 changed the title Calculation of overlapping communities reveals incorrect metrics The calculated metric for the quality of the overlapping community segmentation results is incorrect Jan 8, 2023
@destiny1009
Author

Can anyone help explain this?

Member

fabratu commented Jan 17, 2023

Hi,

sorry for the delayed response. Concerning your first question: manually setting up a cover involves some steps that are not really intuitive (we should improve the documentation in this regard). The easiest way is to create a singleton clustering, remove the nodes from their singleton subsets (if that's applicable for your example), and then add them to the subsets you want.

Sample code:

import networkit as nk

# Path graph on 5 nodes (node IDs 0..4)
G = nk.Graph(5)
G.addEdge(0, 1)
G.addEdge(1, 2)
G.addEdge(2, 3)
G.addEdge(3, 4)

# First cover: start from singletons, where node s-1 sits in subset s
f1 = nk.Cover(n=5)
f1.allToSingletons()
print(f1.getSubsetIds())
# Empty every singleton subset so that the subset IDs can be reused
for s in range(1, f1.numberOfSubsets() + 1):
    f1.removeFromSubset(s, s - 1)
# Arguments are (subset ID, node ID)
f1.addToSubset(1, 1)
f1.addToSubset(1, 3)
f1.addToSubset(1, 4)
f1.addToSubset(2, 2)
f1.addToSubset(2, 3)
f1.addToSubset(3, 1)
f1.addToSubset(3, 2)

print("f1 1: ", f1.getMembers(1))
print("f1 2: ", f1.getMembers(2))
print("f1 3: ", f1.getMembers(3))

# Second cover, built the same way
f2 = nk.Cover(n=5)
f2.allToSingletons()
for s in range(1, f2.numberOfSubsets() + 1):
    f2.removeFromSubset(s, s - 1)
f2.addToSubset(1, 1)
f2.addToSubset(1, 2)
f2.addToSubset(1, 3)
f2.addToSubset(2, 3)
f2.addToSubset(2, 4)

print("f2 1: ", f2.getMembers(1))
print("f2 2: ", f2.getMembers(2))

# Dissimilarity between the two covers, normalized by the maximum
nmi = nk.community.OverlappingNMIDistance(normalization=nk.community.Normalization.Max)
print(nmi.getDissimilarity(G, f1, f2))

Output:

{1, 2, 3, 4, 5}
f1 1:  {1, 3, 4}
f1 2:  {2, 3}
f1 3:  {1, 2}
f2 1:  {1, 2, 3}
f2 2:  {3, 4}

0.6395516101947395

Note that cover subset IDs start at 1, while node IDs start at 0.
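To avoid typing the addToSubset calls one by one, the (subset ID, node ID) pairs can be generated from a plain list of node sets. The following is only a sketch in pure Python: the helper name subset_assignments is hypothetical and not part of NetworKit, and it assumes the subsets have already been emptied as in the sample code above. Subset IDs are numbered from 1 to match the note on ID conventions.

```python
def subset_assignments(communities):
    """Yield (subset_id, node_id) pairs for a list of node sets.

    Subset IDs are numbered from 1; node IDs are used unchanged.
    """
    for subset_id, members in enumerate(communities, start=1):
        for node in sorted(members):
            yield subset_id, node


# The three subsets of f1 from the sample code
f1_sets = [{1, 3, 4}, {2, 3}, {1, 2}]
pairs = list(subset_assignments(f1_sets))

# Each pair could then be passed on as cover.addToSubset(subset_id, node)
for subset_id, node in pairs:
    print(subset_id, node)
```

The generated pairs are the same arguments, in the same order, as the seven addToSubset calls used for f1 in the sample.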

Member

fabratu commented Jan 17, 2023

Concerning your second question:

There is indeed a bug in the code, and getValues() is the culprit. As a workaround, you can run the algorithm and iterate over the nodes in the graph to extract the values one at a time with getValue().

sim = nk.community.CoverF1Similarity(G,f1,f2)
sim.run()
for i in G.iterNodes():
    print(sim.getValue(i))

Output:

0.0
0.8
0.8
0.8
0.0
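The numbers above can be cross-checked by hand. The sketch below is pure Python and rests on an assumption about the metric, not on the NetworKit source: each subset of the first cover is scored by its best F1 match among the subsets of the second cover, with F1 = 2|A ∩ B| / (|A| + |B|). The function names are hypothetical.

```python
def f1_score(a, b):
    """F1 overlap of two node sets: 2|A ∩ B| / (|A| + |B|)."""
    if not a and not b:
        return 0.0
    return 2 * len(a & b) / (len(a) + len(b))


def best_match_f1(cover, reference):
    """Best F1 match in `reference` for every subset of `cover`."""
    return [max(f1_score(s, r) for r in reference) for s in cover]


# Subsets of f1 and f2 from the sample code (subset IDs 1, 2, 3 and 1, 2)
f1_sets = [{1, 3, 4}, {2, 3}, {1, 2}]
f2_sets = [{1, 2, 3}, {3, 4}]

scores = best_match_f1(f1_sets, f2_sets)
print(scores)
```

Under this assumption all three subsets of f1 score 0.8. That is consistent with the three 0.8 entries in the output and suggests that the index passed to getValue() may be interpreted as a subset ID (1 to 3) rather than a node ID, with 0.0 returned for IDs 0 and 4; this reading is an inference from the numbers only.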

@fabratu fabratu self-assigned this Jan 17, 2023
@fabratu fabratu added the bug label Jan 17, 2023
Member

fabratu commented Jan 17, 2023

Concerning your third question:

You can use the PartitionWriter/CoverWriter (or their respective readers) to read or write Covers/Partitions. See the documentation here: https://networkit.github.io/dev-docs/python_api/graphio.html?highlight=cover#networkit.graphio.CoverWriter
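For comparison, the manual approach mentioned in the question can be kept short. The sketch below is pure Python and assumes a simple text layout with one community per line and node IDs separated by whitespace; this layout is an assumption for illustration and is not necessarily the format the NetworKit readers expect. The function name read_communities is hypothetical.

```python
import io


def read_communities(lines):
    """Parse one community per line into a list of node-ID sets.

    Blank lines and lines starting with '#' are skipped.
    """
    communities = []
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        communities.append({int(token) for token in line.split()})
    return communities


# In-memory stand-in for a file; open("some_file.txt") would work the same way
sample = io.StringIO("# f1 from the sample code\n1 3 4\n2 3\n\n1 2\n")
communities = read_communities(sample)
print(communities)
```

The resulting list of sets can then be turned into a cover with the addToSubset procedure from the first answer.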
