LBVP gets stuck at the build_solver stage #284

Open
csskene opened this issue Feb 29, 2024 · 1 comment
csskene commented Feb 29, 2024

Hi
I've encountered an issue when solving an LBVP to find the vector potential from a magnetic field. The attached script reproduces it: at a small resolution, the code hangs at the build_solver stage once the number of processes is increased to 4. I've also attached debug logs, which show that process 0 stalls at this step:

2024-02-29 13:35:17,943 transforms 0/4 DEBUG :: Building FFTW FFT plan for (dtype, gshape, axis) = (<class 'numpy.float64'>, (3, 12, 2, 6), 1)

and similarly for processes 1 to 3.
This resolution is way too small for this problem, but a similar error occurs at higher resolutions on a cluster.
Best,
Calum

B_lbvp.txt
dedalus_p0.log
dedalus_p1.log
dedalus_p2.log
dedalus_p3.log

bpbrown commented May 9, 2024

@csskene I've downloaded your script and can run it on 1, 2, or 3 cores but not on 4, as you described. This looks a lot like a race condition to me, where some cores are not participating in a global operation.

In particular, if you add these lines:

from mpi4py import MPI  # only needed if the script does not already import MPI
rank = MPI.COMM_WORLD.rank
print(f"rank {rank:d}, g:{B['g'].shape}, c:{B['c'].shape}")

and run on 4 cores, you'll see:

rank 3, g:(3, 12, 0, 6), c:(3, 0, 5, 6)
rank 0, g:(3, 12, 2, 6), c:(3, 2, 5, 6)
rank 2, g:(3, 12, 2, 6), c:(3, 2, 5, 6)
rank 1, g:(3, 12, 2, 6), c:(3, 2, 5, 6)

so rank 3 holds no data: its local shape has size zero in grid space (along theta) and in coefficient space (along the m's).
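One way to see how a rank can end up empty is a minimal sketch of a contiguous block split. It assumes each rank gets a block of `ceil(global_size / num_ranks)` entries; that rule is an assumption for illustration, not confirmed here as Dedalus's actual distribution logic. The helper `block_sizes` is hypothetical, and the global size of 6 is inferred from the local theta sizes printed above (2 + 2 + 2 + 0).

```python
import math

def block_sizes(global_size, num_ranks):
    """Local sizes when an axis is split into contiguous blocks
    of ceil(global_size / num_ranks) entries (assumed rule)."""
    block = math.ceil(global_size / num_ranks)
    sizes = []
    for rank in range(num_ranks):
        # Clamp both ends so trailing ranks get a short or empty block.
        start = min(rank * block, global_size)
        end = min(start + block, global_size)
        sizes.append(end - start)
    return sizes

# Global size 6 is inferred from the printed local shapes, not from the script.
for num_ranks in (1, 2, 3, 4):
    sizes = block_sizes(6, num_ranks)
    empty = [r for r, s in enumerate(sizes) if s == 0]
    print(f"{num_ranks} ranks: sizes={sizes}, empty ranks={empty}")
```

Under this assumed rule, 1, 2, and 3 ranks all receive data, while 4 ranks leave rank 3 with nothing, which would be consistent with the shapes printed above.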

Let me look into this a bit more and get back to you.

@bpbrown bpbrown self-assigned this May 9, 2024