RFC: Graph: Support 32-bit node IDs #427
No review, but an observation: Did you consider how |
Good objection. I did not consider |
I took a quick look at the checks. |
This PR needs some comments on the general approach and on whether it should be adopted or not. Regarding the motivation for this PR: In an application, I had to fit graphs with ~3B edges (plus other overhead) into the 96 GiB RAM available to our compute nodes - this was not possible with 64-bit node IDs. Loading larger graphs might also be interesting to users who run NetworKit on their laptops (with less total memory available). Besides allowing the storage of larger graphs, this PR might also give a small performance boost for certain algorithms due to increased locality and reduced use of memory bandwidth. The only downside is that this feature requires a recompilation, but I do not see a way to avoid this without impacting efficiency.
I like the idea of making it possible to use or develop on smaller machines. For consumer workstations (i.e., the affordable ones) 128 GiB RAM is currently the normal cap, for notebooks normally 64 GiB. Common clusters like SuperMUC-NG also use 96 GiB per node. I don't know if that's realistic, but I can see a use case here. Adoption also does not seem too expensive, though the README/documentation needs an update.
This PR allows NetworKit configurations that store adjacency lists consisting of 32-bit node IDs. In the best case, this should roughly halve the memory required for `NetworKit::Graph` objects, at the cost of supporting fewer nodes.

Because we expect some code in NetworKit to depend on 64-bit nodes, we do not change the `node` data type - this type is still a `uint64_t`. Instead, we change the `std::vector`s that store the adjacency lists to use a 32-bit integer and convert at the `NetworKit::Graph` boundary.

This requires some changes in the GraphBuilder (and in Curveball's clone of GraphBuilder). One thing to note is that `GraphBuilder::swapAdjacency` is much more expensive when 32-bit node IDs are used; however, I do not think that this is a bottleneck for existing algorithms (if you disagree, please comment!).

Furthermore, the resulting NetworKit library is not ABI compatible with the default build. Users of the library are currently expected to pass `-DNETWORKIT_U32_NODES` when compiling programs that link against a 32-bit node ID build. This mechanism should probably be revisited (possibly in an update of this PR): instead of forcing the user to pass such a definition, we can install a configuration header for NetworKit.
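To illustrate the idea of keeping a 64-bit `node` type while storing narrower IDs, here is a minimal, self-contained sketch. It is not the code from this PR: the class `AdjacencySketch`, its member functions, and the alias `DefaultAdjacency` are hypothetical names chosen for illustration, and the range check on insertion is an assumption about how out-of-range IDs could be handled. Only the `NETWORKIT_U32_NODES` definition is taken from the description above.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <limits>
#include <stdexcept>
#include <vector>

// The public node type stays 64-bit, as in the PR description.
using node = std::uint64_t;

// Hypothetical container: adjacency lists hold StoredId values and
// convert to/from `node` at the class boundary.
template <typename StoredId>
class AdjacencySketch {
public:
    explicit AdjacencySketch(std::size_t n) : adj_(n) {}

    // Adds an undirected edge; both endpoints are narrowed before storing.
    void addEdge(node u, node v) {
        const StoredId su = narrow(u);
        const StoredId sv = narrow(v);
        adj_.at(u).push_back(sv);
        adj_.at(v).push_back(su);
    }

    // Widens the stored ID back to the public 64-bit type.
    node neighborAt(node u, std::size_t i) const {
        return static_cast<node>(adj_.at(u).at(i));
    }

    std::size_t degree(node u) const { return adj_.at(u).size(); }

    static constexpr std::size_t bytesPerStoredId() { return sizeof(StoredId); }

private:
    // Assumption: IDs that do not fit are rejected instead of truncated.
    static StoredId narrow(node v) {
        if (v > std::numeric_limits<StoredId>::max())
            throw std::out_of_range("node ID does not fit into the stored ID type");
        return static_cast<StoredId>(v);
    }

    std::vector<std::vector<StoredId>> adj_;
};

// A configuration header installed with the library could set this
// definition, so that users do not have to pass it by hand.
#ifdef NETWORKIT_U32_NODES
using DefaultAdjacency = AdjacencySketch<std::uint32_t>;
#else
using DefaultAdjacency = AdjacencySketch<std::uint64_t>;
#endif
```

With a 32-bit stored type, each adjacency entry takes 4 bytes instead of 8, and the largest representable ID is 2^32 - 1. Because the element type of the vectors differs between the two configurations, object layouts differ as well, which is why a program and the library it links against have to agree on the definition.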