Consistency check for partition boundaries failed #34

Open
JohnMalliotakis opened this issue Aug 10, 2022 · 0 comments
Hi, I'm trying to run PageRank across two MPI (MPICH v4.0.2) hosts, with two NUMA nodes each. The input graph is very large (>4B vertices), so I converted Gemini to use uint64_t vertex IDs. Everything seems to work fine until the locality chunking phase and the subsequent computation of partition offsets.

At that point Gemini fails the assertion at line 854 of core/graph.hpp, and after adding some debug prints I can see that the two machines have indeed computed different partition offsets for NUMA node 1.

I was able to avoid this failure by adding an extra MPI_Allreduce call, using MPI_MAX, before the existing MPI_Allreduce that sets up the global_partition_offset array. The extra reduction stores the maximum computed partition offsets directly into the local partition offset array first, so every host then publishes the same boundaries. However, I am not sure whether this is entirely correct.
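
For reference, here is a minimal sketch of what that extra reduction looks like. The variable names (`local_partition_offset`, the array length) are my own placeholders, not Gemini's exact identifiers in core/graph.hpp; only MPI_Allreduce, MPI_IN_PLACE, MPI_UINT64_T, and MPI_MAX are real MPI APIs:

```cpp
// Hedged sketch of the workaround described above; the variable names and
// array length are assumptions, not Gemini's actual code in core/graph.hpp.
#include <mpi.h>
#include <cstdint>
#include <vector>

// Called after each host has computed its candidate partition offsets, but
// before the existing MPI_Allreduce that fills global_partition_offset.
void reconcile_partition_offsets(std::vector<uint64_t> &local_partition_offset) {
  // Element-wise MPI_MAX across all ranks: every host ends up holding the
  // same (maximum) boundary for each NUMA-node partition, so the later
  // consistency assertion compares identical values on every machine.
  MPI_Allreduce(MPI_IN_PLACE, local_partition_offset.data(),
                static_cast<int>(local_partition_offset.size()),
                MPI_UINT64_T, MPI_MAX, MPI_COMM_WORLD);
}
```

Note that this only forces the hosts to agree on the largest offset computed anywhere, which is why I am unsure whether it is semantically correct or merely makes the assertion pass.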

Any ideas on a possible cause for the issue? And is my workaround correct?

Note: I have forked the repo so you can take a look at my modifications.
