Issues for using MPI in GUI interactive mode #182

Open
rcoreilly opened this issue Mar 26, 2023 · 1 comment

@rcoreilly

MPI depends entirely on each proc executing the same sequence of AllGather etc. calls at the same times. If any node doesn't, everything just waits and then probably times out with an error. When running a fixed -nogui run, there is no problem here.
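
To make the lockstep requirement concrete, here is a rough Python sketch that simulates it with threads rather than real MPI. Everything in it is an illustrative assumption: `run_ranks` is a made-up helper, and a `threading.Barrier` stands in for a collective such as AllGather, in the sense that no rank gets past a call until every rank has made it.

```python
import threading


def run_ranks(n_ranks, calls_per_rank, timeout=0.5):
    """Simulate ranks where rank r makes calls_per_rank[r] collective calls.

    Hypothetical helper: the barrier is only a stand-in for an MPI
    collective, not the real thing.
    """
    barrier = threading.Barrier(n_ranks)
    results = ["ok"] * n_ranks

    def rank_main(rank):
        for _ in range(calls_per_rank[rank]):
            try:
                # Blocks until all ranks arrive, or gives up after `timeout`.
                barrier.wait(timeout=timeout)
            except threading.BrokenBarrierError:
                results[rank] = "timeout"
                return

    threads = [threading.Thread(target=rank_main, args=(r,))
               for r in range(n_ranks)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


if __name__ == "__main__":
    print(run_ranks(3, [3, 3, 3]))  # every rank makes the same calls
    print(run_ranks(3, [3, 3, 2]))  # rank 2 skips the last call
```

When every rank makes the same number of calls, all of them finish. When one rank skips a call, the remaining ranks sit in the last call until the timeout fires, which is the failure mode described above.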

But when running interactively, each node needs to get the user's commands to start, stop, step, Init, etc., so they can all stay in sync. Thus, we need an additional outer loop of communication where the proc > 0 nodes wait for commands and then run them, all the while checking to see if a stop command has come in.

Probably this should be done using something other than MPI, because it needs to be non-blocking and more dynamic. Someone with appropriate network communication knowledge should probably take this on.

@siboehm

siboehm commented Mar 27, 2023

I wouldn't use a different protocol, mostly because if we just use MPI for everything we only have to do the MPI_World setup once. With a different protocol it'll get complicated once we have cross-machine MPI with ssh setups etc.

Can't we just put an MPI.BCast from the root node (where the GUI runs) to all other procs into the GUI loop, which tells the other procs about the current user input (start, stop, etc.)? It should really be blocking, else you'll run into the same issues with timed-out AllReduces. A blocking call will add ~10μs of latency, which is small enough not to be noticeable.
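
A minimal sketch of that idea, again simulated in Python with threads instead of real MPI so it runs anywhere. `SimBcast` and `run_session` are hypothetical names for illustration only. In real code the `bcast` call would be the MPI broadcast on the world communicator, and the command list would come from GUI events on the root.

```python
import threading


class SimBcast:
    """Toy blocking broadcast between threads (stand-in for an MPI Bcast)."""

    def __init__(self, n_ranks):
        self._barrier = threading.Barrier(n_ranks)
        self._value = None

    def bcast(self, value, rank, root=0):
        if rank == root:
            self._value = value
        self._barrier.wait()  # nobody reads until root has published
        out = self._value
        self._barrier.wait()  # root cannot overwrite until all have read
        return out


def run_session(n_ranks, user_commands):
    """Run one loop per rank; root feeds commands, all ranks execute them."""
    comm = SimBcast(n_ranks)
    logs = [[] for _ in range(n_ranks)]

    def rank_main(rank):
        pending = iter(user_commands)  # only the root consumes this
        while True:
            # Root picks the next user command; other ranks pass None.
            cmd = next(pending, "quit") if rank == 0 else None
            cmd = comm.bcast(cmd, rank)  # blocking: all ranks get same cmd
            if cmd == "quit":
                break
            logs[rank].append(cmd)  # where start/stop/step would actually run

    threads = [threading.Thread(target=rank_main, args=(r,))
               for r in range(n_ranks)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return logs


if __name__ == "__main__":
    for rank, log in enumerate(run_session(4, ["init", "step", "stop"])):
        print(rank, log)
```

Because every rank calls `bcast` once per loop iteration, the broadcast is itself just another collective in the shared call sequence, so it cannot drift out of step with the AllReduce calls. For a stop to be noticed mid-run, the same broadcast would presumably have to happen once per step inside the run loop as well.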
