Background information
What version of Open MPI are you using? (e.g., v4.1.6, v5.0.1, git branch name and hash, etc.)
It should be the Open MPI version bundled with the OpenFOAM installation (a 4.x version), but I also built the latest version from your website (version 5).
I probably have both versions, 4 and 5, installed.
Describe how Open MPI was installed (e.g., from a source/distribution tarball, from a git clone, from an operating system distribution package, etc.)
I installed the latest version of OpenFOAM (2312), which comes with Open MPI version 4.
I also built the latest version 5 from your website.
If you are building/installing from a git clone, please copy-n-paste the output from git submodule status.
Please describe the system on which you are running
Operating system/version: Ubuntu 22.04
Computer hardware: ASUS KRPA-U16 motherboard and AMD EPYC 7713P CPU on both machines
Network type: I am using one Ethernet port to connect the two machines
Details of the problem
I have two machines, A and B, with identical hardware and software. SSH between them and sharing a folder (located on A) both seem to work without problems.
I can routinely run an example (such as hello_c.c from the examples folder) or an OpenFOAM simulation in parallel on the 64 cores of a single machine with a command such as: ...$ mpirun -np 64 ./hello . This works on either machine, A or B.
If I try to run on both machines, for example with ...$ mpirun --hostfile /etc/hosts -np 168 ./hello , the terminal hangs and no output is shown (no error messages either).
I am attaching some of my system's configuration files and the final part of the strace output for the command ...$ strace mpirun --hostfile /etc/hosts -np 128 ./hello
Thanks for the reply, but I am not sure I understand what you mean I should change.
In the past, using an etc.hosts file like the ones I attached in my previous message, I managed to get Open MPI to run correctly on two nodes.
Although I did not mention it, this time I have also tried using a "machines" text file listing nodes A and B (either just the names, or the IPs and the names). If I list only A and run Open MPI with 64 cores, $ mpirun --hostfile machines -np 64 ./hello , it works. If I change the name to B, or list both A and B (changing -np to 128), and launch mpirun from A, it does not work. It just hangs, as in the other case (/etc/hosts).
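For reference, a minimal sketch of what a two-node "machines" hostfile might look like in Open MPI's `hostname slots=N` format. The names A and B are the placeholders used in this thread, and `slots=64` is an assumption based on the 64 cores per machine mentioned above; the final mpirun line is only printed, not executed.

```shell
# Sketch: write an Open MPI hostfile listing both nodes.
# "A" and "B" are placeholders from this thread; use real host names or IPs.
# slots=64 is an assumption (one slot per core on each machine).
cat > machines <<'EOF'
A slots=64
B slots=64
EOF

# Sum the slots over all hostfile lines to get the total rank count.
total_slots=$(awk -F'slots=' '{ sum += $2 } END { print sum }' machines)
echo "total slots: ${total_slots}"

# The launch command this hostfile is meant for (printed, not run here).
echo "mpirun --hostfile machines -np ${total_slots} ./hello"
```

Unlike /etc/hosts, a file like this contains only the nodes that should run ranks, so entries such as localhost do not end up in the host list.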
Make sure there is no firewall between the two hosts (passwordless SSH is necessary but not sufficient). With Open MPI v4, try restricting communication to known interfaces, for example mpirun --mca btl_tcp_if_include eth0 --mca oob_tcp_if_include eth0 ...
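As a rough sketch, the interface restriction can be combined with a hostfile as follows. The helper `build_mpirun_cmd` is hypothetical, and `eth0`, `machines`, and `128` are placeholder values taken from this thread; the actual interface name on each machine can be listed with `ip -brief addr show` and may well differ from `eth0`.

```shell
# Hypothetical helper: print an mpirun command restricted to one interface.
# Arguments: interface name, hostfile path, number of ranks.
build_mpirun_cmd() {
    iface="$1"
    hostfile="$2"
    np="$3"
    # Both the TCP BTL (MPI traffic) and the OOB channel (runtime traffic)
    # are limited to the same interface, as suggested above for Open MPI v4.
    echo "mpirun --mca btl_tcp_if_include ${iface} --mca oob_tcp_if_include ${iface} --hostfile ${hostfile} -np ${np} ./hello"
}

# Placeholder values; replace eth0 with the interface that links A and B.
cmd=$(build_mpirun_cmd eth0 machines 128)
echo "$cmd"
```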
documents.zip