Process pinning with MPIPoolExecutor #476
-
Hello! Question: what's the recommended way of pinning the worker processes when using an MPIPoolExecutor?

Context: my user runs a Python application that calls into a function provided by my library. The user has no visibility into MPI whatsoever; they just launch the app as a regular Python program. Underneath, I tried setting an env var, but that didn't get me the pinning I was after.
And lastly, based on what I've read, including the documentation, MPI_Info doesn't appear to be the right mechanism to achieve this, though I may be wrong. So basically I don't know how to do this, or whether it's a current limitation of mpi4py (in which case I might be willing to contribute, depending on the amount of effort required), or something else.

Sort-of related question: the process that spawns the MPI pool at some point just sits waiting for the workers' results; does that wait keep a CPU core busy?

Thanks a lot for reaching the bottom of this issue. I really appreciate your invaluable work for the entire open-source / computational-science community (I've been using mpi4py for > 10 years).
-
You should look at the documentation of the backend MPI library to see whether it supports passing pinning hints to spawned processes, for example via the info argument of MPI_Comm_spawn, which mpi4py exposes.
Isn't it just great to hide so much mess and nonsense from users? 🤣

Why would you even consider that the limitation is in mpi4py? How would you achieve pinning in a pure-C application that uses MPI spawning?

Ask the Open MPI folks, please. I've spent hours and hours over the last couple of years setting up automated testing to report regressions to them, and despite some improvements, the whole spawn thing is still unreliable, currently broken in the v5.0.x branch. I have 99.99% certainty that any issue you may experience is not mpi4py's fault, and unless proven otherwise, there's nothing I can do.

If I had coded things the obvious MPI way, then given how MPI implementations work by default, the waiting would have been a 100% CPU busy wait. MPI implementations do not care about burning your CPU; they care about beating each other in latency contests. However, burning your CPU is not what happens by default when using mpi4py.futures. If you let me get technical: what mpi4py does is recurrent MPI polling with exponential backoff (details in the source code).
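That polling strategy can be sketched as follows. This is only a minimal illustration of the idea, not mpi4py's actual implementation; the function name and the constants are made up:

```python
import time

def wait_with_backoff(probe, delay_max=0.001, delay_ratio=2.0):
    """Poll probe() until it returns True, sleeping between polls.

    The sleep starts at zero and grows geometrically up to delay_max,
    so a message that is already waiting is picked up almost instantly,
    while a long wait converges to cheap, infrequent polling instead of
    a 100% CPU busy spin.
    """
    delay = 0.0
    while not probe():
        time.sleep(delay)
        # grow the sleep geometrically, capped at delay_max
        delay = min(delay * delay_ratio or 1e-6, delay_max)
    return True
```

In mpi4py's case, `probe()` stands for a nonblocking MPI message probe; here it is just any callable so the backoff logic is visible on its own.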
-
I know right! 😂 😂

Honestly, I don't know; I just admit my ignorance 😬 This is the first time I've tried the dynamic spawning mechanism, and the boundary between the underlying MPI and mpi4py is still a bit blurred to me :) I'll get in touch with the Open MPI team to see what they have to say. If you don't mind leaving this discussion open for the time being, I'll come back with an explanation for future readers. Thanks for the great reply!
-
@mrogowski Did we ever figure out a way to pin spawned processes when you were benchmarking for our mpi4py.futures paper?
-
I found what I needed, tested it, and it works. According to the Open MPI documentation, I can pass bind_to and map_by as MPI_Info keys: https://www.open-mpi.org/doc/v4.1/man3/MPI_Comm_spawn.3.php. So all I had to do was pass those two keys when creating the pool, and it seems like I'm getting exactly what I needed. Thanks!
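For future readers, a minimal sketch of that approach. The helper names and the hint values here are mine, not from the thread; bind_to and map_by are Open MPI spawn info keys per the man page above, and mpi4py.futures passes extra MPIPoolExecutor keyword arguments along to the spawn call as info keys:

```python
def pinning_info(bind_to="core", map_by="socket"):
    # Open MPI MPI_Comm_spawn info keys controlling worker placement
    # (see the MPI_Comm_spawn man page); the values are illustrative,
    # pick ones that match your node topology.
    return {"bind_to": bind_to, "map_by": map_by}

def make_pinned_pool(max_workers=4, **hints):
    # mpi4py.futures forwards unrecognized MPIPoolExecutor keyword
    # arguments to the spawn call as info keys, so the pinning hints
    # simply ride along with pool creation.
    from mpi4py.futures import MPIPoolExecutor  # needs a working MPI
    return MPIPoolExecutor(max_workers=max_workers, **pinning_info(**hints))
```

Note that these info keys are Open MPI-specific; other MPI implementations may use different keys or none at all, so check your backend's documentation.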
-
Also, about your earlier comment:

I think I kinda disagree here with what you might have wanted to do. The exponential-backoff waiting (rather than a busy wait) is actually quite sensible, at least for the use I'm making of this mechanism, which is admittedly oriented towards data parallelism rather than task parallelism. In my case, which might slightly bend the original intent of the MPIPoolExecutor, I want all of the physical cores at my disposal for a significant amount of time to execute a data-parallel numerical simulation, so I'm happy the master process doesn't get too much in the way.

Perhaps this could be made controllable with an option to specify the polling time (defaulting to the current behavior if left unspecified, and behaving like a busy wait if set to 0).

Anyway, I'm guilty of diverging here :) Again, thanks a lot for the prompt answer. Really appreciated it 👍