Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RFC: Graph: Support 32-bit node IDs #427

Open
wants to merge 2 commits into
base: master
Choose a base branch
from

Conversation

avdgrinten
Copy link
Contributor

This PR allows NetworKit configurations that store adjacency lists consisting of 32-bit node IDs. In the best case, this should roughly half the memory required for NetworKit::Graph objects, at the cost of supporting fewer nodes.

Because we expect some code in NetworKit to depend on 64-bit nodes, we do not change the node data type - this type is still a uint64_t. Instead, we change the std::vectors that store the adjacency lists to use a 32-bit integer and convert at the NetworKit::Graph boundary.

This requires some changes in the GraphBuilder (and in Curveball's clone of GraphBuilder). One thing to note is that GraphBuilder::swapAdjacency is much more expensive when 32-bit node IDs are used; however, I do not think that this is a bottleneck for existing algorithms (if you disagree, please comment!).

Furthermore, the resulting NetworKit library is not ABI compatible with the default build. Users of the library are currently expected to pass -DNETWORKIT_U32_NODES when compiling programs that link against a 32-bit node ID build. This mechanism should probably be revisited (possibly in an update of this PR): instead of forcing the user to pass such a definition, we can install a configuration header for NetworKit.

@manpen
Copy link
Contributor

manpen commented Oct 9, 2019

No review, but an observation: Did you consider how none interacts with this? While I do not fully understand the circumstances under which none finds its way into outEdges, there a lot of checks in Graph.cpp for it. I believe this might break.

@avdgrinten
Copy link
Contributor Author

Good objection. I did not consider none. I will review the checks in Graph.cpp. It might make sense to add an assertion that prevents none from being added to the graph.

@avdgrinten
Copy link
Contributor Author

I took a quick look at the checks. none is mainly used as a return value in functions like indexInInEdgeArray. There are one or two checks that test for none in neighbor arrays, however, I do not think that none can ever be added to a graph. In particular, this would break the assert(exists[u]) / assert(exists[v]) assertions in addEdge. I will remove the unnecessary checks and add explicit assertions for none.

@avdgrinten
Copy link
Contributor Author

This PR needs some comments on the general approach and whether it should be adapted or not. Regarding the motivation for this PR: In an application, I had to fit graphs with ~3B edges (+ other overhead) into the 96 GiB RAM available to our compute nodes - this was not possible with 64 bit node IDs. Loading larger graphs might also be interesting to users who run NetworKit on their laptops (with less total memory available). Besides allowing the storage of larger graphs, this PR might also give a small performance boost for certain algorithms to due increased locality and reducing the use of memory bandwidth. The only downside is that this feature requires a recompilation but I do not see a way to avoid this without impact efficiency.

@fabratu
Copy link
Member

fabratu commented Apr 30, 2020

I like the idea to make it possible to use or develop on smaller machines. For consumer-based workstations (aka the affordable ones) 128 GiB RAM is currently the normal cap, for notebooks normally 64 GiB. Also common clusters like the SuperMUC-NG use 96 GiB per node. Don't know if that's realistic, but I can see a use-case here.

Also adaption seems not to expensive, needs update of the readme/documentation though.

@fabratu fabratu changed the base branch from Dev to master May 29, 2020 16:54
fabratu added a commit to fabratu/networkit that referenced this pull request Nov 28, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants