Failed job while using mpi parallel #1170
Comments
It seems that IMAIN is currently set to ISTANDARD_OUTPUT. You can change that by modifying the file setup/constants.h.in, from
! uncomment this to write to standard output (i.e. to the screen)
integer, parameter :: IMAIN = ISTANDARD_OUTPUT
! uncomment this to write messages to a text file
! integer, parameter :: IMAIN = 42
to
! uncomment this to write to standard output (i.e. to the screen)
! integer, parameter :: IMAIN = ISTANDARD_OUTPUT
! uncomment this to write messages to a text file
integer, parameter :: IMAIN = 42
Do not forget to configure and compile after changing the constants.h.in file.
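As a convenience, the edit-and-rebuild steps above can be scripted. This is only a sketch: the sed patterns assume the two IMAIN lines appear in constants.h.in exactly as shown above, and the compiler names are examples, not the required ones.

```shell
# Run from the top of the SPECFEM2D source tree.
# Swap which IMAIN definition is active in setup/constants.h.in:
sed -i \
    -e 's|^integer, parameter :: IMAIN = ISTANDARD_OUTPUT|! integer, parameter :: IMAIN = ISTANDARD_OUTPUT|' \
    -e 's|^! integer, parameter :: IMAIN = 42|integer, parameter :: IMAIN = 42|' \
    setup/constants.h.in

# Reconfigure and recompile so the change takes effect
# (gfortran/mpif90 are example compilers; use your own):
./configure FC=gfortran MPIFC=mpif90
make all
```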
Best,
Hom Nath
…________________________________
From: name_phobic ***@***.***>
Sent: Tuesday, December 6, 2022 4:40 AM
To: SPECFEM/specfem2d ***@***.***>
Cc: Subscribed ***@***.***>
Subject: [SPECFEM/specfem2d] Failed job while using mpi parallel (Issue #1170)
When changing the "NUMBER_OF_SIMULTANEOUS_RUNS = 3" (in the Par_file) and running using mpi with 3 processes, the job was failing with an error message:
"must not have IMAIN == ISTANDARD_OUTPUT when NUMBER_OF_SIMULTANEOUS_RUNS > 1 otherwise output to screen is mingled. Change this in specfem/setup/constant.h.in and recompile.
Error detected, aborting MPI... proc 0"
This was performed using an HPC with 3 nodes.
I want to generate the results for 39 source positions simultaneously. Kindly help me figure this out.
Hello Sir
Thank you very much for responding quickly. I also want to understand what changes I need to make, and in which file, in order to run the same model for different source locations in parallel using MPI. It would be great if you could help me out. Thank you very much.
Make sure that the path to the MPI headers is included in LD_INCLUDE_PATH, or set the MPI include path directly in the configure command using the MPI_INC variable. For example:
./configure CC=icc CXX=icpc FC=ifort MPIFC=mpiifort MPI_INC=/scinet/intel/psxe/2020u4/compilers_and_libraries_2020.4.304/linux/mpi/intel64/include
Your compilers and paths may differ.
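If you are not sure where your MPI installation keeps its headers, one way to locate the directory to pass as MPI_INC is sketched below. The search roots and compiler names are only examples; adjust them to your system.

```shell
# If an MPI compiler wrapper is on your PATH, it can often report its
# include flags directly (look for a -I/path/to/include entry):
mpicc -show 2>/dev/null || true

# Otherwise, search likely prefixes for mpi.h and take its directory:
MPI_INC=$(dirname "$(find /usr /opt -name mpi.h 2>/dev/null | head -n 1)")
echo "MPI_INC = $MPI_INC"

# Then pass it to configure (compiler names are examples):
./configure FC=gfortran MPIFC=mpif90 MPI_INC="$MPI_INC"
```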
Best,
Hom Nath
…________________________________
From: name_phobic ***@***.***>
Sent: Tuesday, December 6, 2022 12:00 PM
To: SPECFEM/specfem2d ***@***.***>
Cc: Hom Nath Gharti ***@***.***>; Comment ***@***.***>
Subject: Re: [SPECFEM/specfem2d] Failed job while using mpi parallel (Issue #1170)
Hello Sir
Thank you very much for responding quickly.
I tried running the solver after making the changes you suggested, and I got the following error:
"configure: error: MPI header not found; try setting MPI_INC."
Also, I want to understand what changes I need to make, and in which file, in order to run the same model for different source locations in parallel using MPI.
It would be great if you could help me out.
Thank you very much.
Hello Sir
I have updated the path now. I am getting the following error:
"Error: the number of MPI processes 1 is not a multiple of NUMBER_OF_SIMULTANEOUS_RUNS = 2
MPI_ABORT was invoked on rank 0 in communicator MPI_COMM_WORLD"
I will be really grateful if you could help me figure this out. Thank you very much.
You should run specfem2d using the total number of cores = NUMBER_OF_SIMULTANEOUS_RUNS * NPROC:
mpiexec -n 8 ./bin/xspecfem2D
You should also create two directories, run0001 and run0002, where you should put your input files.
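As a sketch of that setup, assuming NUMBER_OF_SIMULTANEOUS_RUNS = 2 and NPROC = 4 (so 8 MPI processes in total), the directory layout and launch command could look like the following; for the original goal of 39 source positions, setting NRUNS=39 would create run0001 through run0039 in the same way.

```shell
NRUNS=2   # NUMBER_OF_SIMULTANEOUS_RUNS in the Par_file
NPROC=4   # processes per run, also set in the Par_file

# One directory per simultaneous run, each with its own input files:
for i in $(seq 1 "$NRUNS"); do
  mkdir -p "$(printf 'run%04d' "$i")/DATA" "$(printf 'run%04d' "$i")/OUTPUT_FILES"
done

# Launch with the total number of processes (runs * processes per run):
mpiexec -n $((NRUNS * NPROC)) ./bin/xspecfem2D
```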
…________________________________
From: name_phobic ***@***.***>
Sent: Wednesday, December 7, 2022 9:25 AM
To: SPECFEM/specfem2d ***@***.***>
Cc: Hom Nath Gharti ***@***.***>; Comment ***@***.***>
Subject: Re: [SPECFEM/specfem2d] Failed job while using mpi parallel (Issue #1170)
Hello Sir
I have updated the path now.
After setting "NUMBER_OF_SIMULTANEOUS_RUNS = 2" and "NPROC = 4" in the Par_file (since NPROC is required to be a multiple of NUMBER_OF_SIMULTANEOUS_RUNS),
I am getting the following error:
Error: the number of MPI processes 1 is not a multiple of NUMBER_OF_SIMULTANEOUS_RUNS = 2
the number of MPI processes is not a multiple of NUMBER_OF_SIMULTANEOUS_RUNS. Make sure you call meshfem2D with mpirun.
Error detected, aborting MPI... proc 0
MPI_ABORT was invoked on rank 0 in communicator MPI_COMM_WORLD
with errorcode 30.
________________________________
I will be really grateful if you could help me figure this out.
Thank you very much.
@homnath, I followed what you have said so far. I am running two simulations (NUMBER_OF_SIMULTANEOUS_RUNS = 2 & NPROC = 4). I have also created the run0001 and run0002 directories, both containing DATA and OUTPUT_FILES directories. Then this is what I got when I ran
I looked at the /specfem2d/src/specfem2D/save_model_files.f90 code (is this the correct file to look at?). On line 125, it says: There are a bunch of lines with the mentioned statement in the comment above, but could you please give me some hint/info on how to proceed and potentially run my simulations simultaneously? Thanks very much,