Parallel post-processing with CGNS : performance issues #746
Replies: 3 comments 9 replies
-
If you have 50000 cores, does that mean you will be writing 50000 CGNS Zones? If that is the case, then it is expected that the CGNS library will perform badly when creating all the Zones. As for keeping the file open during the simulation, I will let @brtnfld answer, because I tend to write each time step to a separate file, so I have no feedback on writing all time steps of a huge computation to the same file. How do you handle corruption in this case?
-
Hello Mickael, and thanks for your reply!
For the moment, I only write via the CGNS library; I am not using the reading API yet ...
Say I have n procs:
1- I loop over the n procs and write only the metadata (so all procs write the same information).
2- Now each proc writes its own data.
So, if I understand correctly, you are telling me that calling cg_zone_write and cg_grid_write in the loop is bad practice, and that they should be called once (for all processors) with the global nb_elem and nb_som?
Thanks
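If the single-zone layout asked about above is used, each proc needs to know which slice of the global zone it owns. The following is only a minimal sketch of that bookkeeping, assuming the entries are numbered contiguously by rank; `RankRange` and `partition_single_zone` are hypothetical names, not part of CGNS. In an MPI code each rank would more likely get just its own offset (for example with `MPI_Exscan`), and the resulting 1-based inclusive range is the kind of `rmin`/`rmax` pair that parallel CGNS calls such as `cgp_coord_write_data` take.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Inclusive, 1-based index range owned by one rank inside a single global zone.
// (Hypothetical helper type, not a CGNS structure.)
struct RankRange {
    long long start;  // first global index written by this rank (1-based)
    long long end;    // last global index written by this rank (inclusive)
};

// Given the number of local entries on every rank, fill `ranges` with the
// contiguous slice of each rank and return the global size.
// Assumption: entries are numbered rank after rank, with no gaps or overlap.
// A rank with zero entries gets an empty range (end == start - 1).
long long partition_single_zone(const std::vector<long long>& local_counts,
                                std::vector<RankRange>& ranges) {
    ranges.clear();
    ranges.reserve(local_counts.size());
    long long offset = 0;  // number of entries owned by lower ranks
    for (long long n : local_counts) {
        assert(n >= 0);
        ranges.push_back({offset + 1, offset + n});
        offset += n;
    }
    return offset;  // global size, to be written once in the zone metadata
}
```

With local counts {3, 5, 2}, the global size is 10 and the ranks own [1, 3], [4, 8] and [9, 10]. The zone metadata would then be written once with the global size, rather than once per proc.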
-
@brtnfld: Would it be possible to write a list of Zone_t nodes with a single call at the HDF5 level to get better performance?
-
Dear all,
I recently started using CGNS as a post-processing format for our CFD code.
I am really happy with CGNS; it is really well documented and clear to use. However, my objective is to find the best open-source library for writing the results of a simulation to a single file in an efficient, parallel way.
After finishing my C++ class and running a simple weak-scaling test of my new post-processing class that uses CGNS, I noted the following: the time cost increases with the number of procs. Here is the graph:
I don't know if this is expected or typical for CGNS, but I was expecting the cost to stay roughly constant ... so what will happen if I run my code on 50000 MPI cores? It will be too costly!
Can anyone give me some advice? Have you noticed the same issue, or have you managed to obtain efficient parallel data writes with CGNS?
Here are some more details about my code:
Thanks
Regards