How does one even use PCGNS? #709

ferranj2 · 2023-05-12T05:46:07Z

ferranj2
May 12, 2023

Greetings.

I am trying to develop a CFD code with some colleagues and we looked to CGNS for a file exchange format.
The documentation for sequential CGNS and the few examples available online were enough to get us started for serial I/O.

However, when it comes to the parallel version of the library (PCGNS), the examples are remarkably scanty. The official website provides two versions (one Fortran, the other C) of essentially the same structured, single zone write. The next best thing I found online was a set of slides by a fellow named Hauser, but following his instructions produces buggy code (segmentation faults mostly).

My current objective is to write a mesh that is distributed over multiple MPI processes in parallel to .cgns format across many Zones. I understand that composing the file structure (i.e., the nodes) is not inherently a parallel/concurrent activity. So that I will relegate to a single process. However, once the structure is set, I should be able to concurrently tell a bunch of MPI processes to write their data (coordinates, flow solution) simultaneously. This I cannot seem to do.

The best I could come up with was to have one process (rank-0) open a file in sequential mode with cg_open(cgns_filename, CG_MODE_WRITE, &fn), then the same process writes the base node and the zones. Via MPI_Send and MPI_Recv, the other processes let rank-0 know how the Zones should be written. So far, rank-0 creates the zones and places the GridCoordinates node under each Zone. After rank-0 does this, it closes the file and then all processes open the file with cgp_file_open with CG_MODE_MODIFY.

The idea being that now that the file is constructed, all processes can reference their respective zone and write their coordinates. I attempted this with cgp_coord_write but I get segmentation faults (SIGSEGV). I ran valgrind to attempt to determine where they occur, but it is unable to trace the SIGSEGV to CGNS's API's and blames the underlying HDF5 APIs instead.

Any help, insight, or links to better PCGNS resources would be greatly appreciated.

Here pseudo-code of what I have right now:

*Get rank and comm_size*

int B;
int Z;
int G;
int fn;
int topology_dim = *some val*;
int coordinate_dim = *some val*;
cgsize_t size[3];
size[0] = *# vertices in rank's partition*;
size[1] = *# cells in rank's partition*;
size[0] = 0; //I cannot sort by boundary vertices.

if(rank == 0){
    cg_open(cgns_filename, CG_MODE_WRITE, &fn)
    cg_base_write(fn, "Base", topology_dim, coordinate_dim, &B);
    cg_goto(fn, B, "end");
    if(multizone){
      /* Rank-0 creates its own Zone */
      cg_zone_write(fn, B, "Zone0", size, CGNS_ENUMV(Unstructured), &Z);
      cg_grid_write(fn, B, Z, "GridCoordinates",&G);
      cgsize_t size_other[3];//
      /* Rank-0 begins to create the Zones for the other ranks */
      for(int rr = 1; rr < comm_size; rr++){
        MPI_Recv(&size_other, 3, CGNS_CGSIZE_T_TO_MPI_DATATYPE, rr,0, comm, MPI_STATUS_IGNORE)
        memset(zonename, 0, sizeof(zonename));
        sprintf(zonename,"Zone%d",rr);
        cg_zone_write(fn, B, zonename, size_other, CGNS_ENUMV(Unstructured), &Z);
        MPI_Send(&zonename, 10,MPI_CHAR, rr,0, comm);
        /* Create the Nodes to store the coordinates. */
        cg_grid_write(fn, B, Z, "GridCoordinates",&G);
      }//end "rr".
    }else{
     //WIP
    }//end if (multizone)
    cg_close(fn);//Rank-0 closes the file it opened in serial mode.
  }else{//The actions that the other ranks do to assist rank-0.
    if(multizone){
      /* Send/receive information about the assigned Zone_t */
      MPI_Send(&size, 3,CGNS_CGSIZE_T_TO_MPI_DATATYPE, 0,0, comm);//NOTE: MACRO USAGE FOR MPI_DATATYPE
      MPI_Recv(&zonename, 10, MPI_CHAR,0,0, comm, MPI_STATUS_IGNORE))
    }else{
        //WIP.
    }//end if (multizone)

  cgp_mpi_comm(comm);
  cgp_pio_mode(CGP_INDEPENDENT;
  cgp_open(cgns_filename, CG_MODE_MODIFY, &fn);
  cg_goto(fn, B, "end");
  Z = (int)rank + 1;
  cgp_coord_write(fn, B, Z, RealDouble, "GridCoordinatesA", &C); //<- This causes SIGSEGV and Valgrind blames HDF5

mennodeij · 2023-05-12T07:17:15Z

mennodeij
May 12, 2023

As an example of writing polyhedra in parallel, I came up with the code below a while ago. It has some assumptions, e.g. no shared vertices or faces between ranks, which is unlikely in decomposed CFD grids. You would have to do more bookkeeping to get that correct I think.

Maybe the example is useful to get started with parallel cgns? For starters, you don't need to first open the file serially, write the bases and zones, then re-open it in parallel mode with cgp_open.

Maybe it can be added to the examples if it is indeed useful.

/*
  Parallel writing of polyhedral cells to CGNS file.
  This requires parallel CGNS and HDF5 libraries. 
  
  HDF5 v1.12.1 (latest). 
  configure with:
  ./configure --prefix=<path/to/install> --enable-fortran --enable-fortran2003 --enable-hl --enable-parallel --enable-shared
  make && make install
  
  CGNS 4.2.0 (latest). 
  configure with:
  mkdir build; cd build
  export HDF5_ROOT=<path/to/install>
  cmake </path/to/CGNS/source> -DCMAKE_INSTALL_PREFIX=<path/to/install> \
                               -DCGNS_ENABLE_PARALLEL=1 \
                               -DCGNS_BUILD_CGNSTOOLS=1 \
                               -DCGNS_ENABLE_FORTRAN=1 \
                               -DCGNS_ENABLE_HDF5=1 \
                               -DHDF5_NEED_MPI=1 \
                               -DCGNS_BUILD_SHARED=1 \
                               -DCGNS_USE_SHARED=1 \
                               -DCGNS_BUILD_CGNSTOOLS=1 \
                               -DCMAKE_Fortran_FLAGS=-fPIC
  make && make install

*/

#include <assert.h>
#include <math.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#ifdef _WIN32
#include <io.h>
#define unlink _unlink
#else
#include <unistd.h>
#endif

#include "mpi.h"
#include "pcgnslib.h"

void callCGNS(int error) {
  if (error != CG_OK)
    cgp_error_exit();
}

int main(int argc, char **argv) {
  // set up MPI
  int comm_size;
  int comm_rank;
  int ierr = 0;

  // parallel/serial read communicators
  MPI_Comm comm_parallel = MPI_COMM_WORLD;
  MPI_Comm comm_serial = MPI_COMM_SELF;

  MPI_Init(NULL, NULL);
  MPI_Comm_size(comm_parallel, &comm_size);
  MPI_Comm_rank(comm_parallel, &comm_rank);

  printf("hello from %d/%d\n", comm_rank, comm_size);

  // Every process has 10 vertices to build three polyhedra
  // 1) a central cube, and 2/3) two pyramids sticking out
  // above and below the cube
#define NNODES 10
  double xs[NNODES] = {0, 1, 1, 0, 0, 1, 1, 0, 0.5,  0.5};
  double ys[NNODES] = {0, 0, 1, 1, 0, 0, 1, 1, 0.5,  0.5};
  double zs[NNODES] = {0, 0, 0, 0, 1, 1, 1, 1, 2.0, -1.0};

#define FACEARRAY_SIZE 48
  cgsize_t faces[FACEARRAY_SIZE] = {
    1, 2, 3, 4, // cube bottom
    5, 6, 7, 8, // cube top
    1, 2, 6, 5, // cube front
    2, 3, 7, 6, // cube right
    3, 4, 8, 7, // cube back
    1, 4, 8, 5, // cube left
    5, 6, 9,    // top pyramid
    6, 7, 9,    // top pyramid
    7, 8, 9,    // top pyramid
    8, 5, 9,    // top pyramid
    1, 2, 10,   // bottom pyramid
    2, 3, 10,   // bottom pyramid
    3, 4, 10,   // bottom pyramid
    4, 1, 10    // bottom pyramid
  };

#define NFACES 14
  cgsize_t face_offsets[1+NFACES] = {
    0, // initial offset is zero
    4, 8, 12, 16, 20, 24, // cube
    27, 30, 33, 36,       // top pyramid
    39, 42, 45, 48        // bottom pyramid
  };

#define CELLARRAY_SIZE 16
  cgsize_t cells[CELLARRAY_SIZE] = {
    1,  2,  3,  4,  5, 6, //cube
    2,  7,  8,  9, 10,   // top pyramid
    1, 11, 12, 13, 14 // bottom pyramid
  };

#define NCELLS 3
  cgsize_t cell_offsets[1+NCELLS] = {
    0,
    6, 11, 16
  };

  // move each grid away from the others by shifting 4 units in Z-direction
  for(int i = 0; i < NNODES; ++i){
    zs[i] = zs[i] + 4*comm_rank;
  }

  // there are no shared faces between grids on different ranks
  for(int i = 0; i < FACEARRAY_SIZE; ++i) {
    faces[i] = faces[i] + comm_rank * NNODES;
  }
  for(int i = 0; i < CELLARRAY_SIZE; ++i) {
    cells[i] = cells[i] + comm_rank * NFACES;
  }
  for(int i = 0; i <= NFACES; ++i)
  {
    face_offsets[i] = face_offsets[i] + comm_rank*FACEARRAY_SIZE;
  }
  for(int i = 0; i <= NCELLS; ++i)
  {
    cell_offsets[i] = cell_offsets[i] + comm_rank*CELLARRAY_SIZE;
  }

  int F, B, Z, Cx, Cy, Cz, S, E;
  // open the file
  callCGNS(cg_configure(CG_CONFIG_COMPRESS, 0));
  callCGNS(cgp_mpi_comm(comm_parallel));
  callCGNS(cgp_open("./par-polyhedra.cgns", CG_MODE_WRITE, &F));
  
  // write the base
  int cell_dim = 3;
  int phys_dim = 3;
  callCGNS(cg_base_write(F, "Base_Volume_Elements", cell_dim, phys_dim, &B));

  // write the zone
  cgsize_t zoneSize[9];
  zoneSize[0] = NNODES * comm_size;
  zoneSize[1] = NCELLS * comm_size;
  zoneSize[2] = 0;
  callCGNS(cg_zone_write(F, B, "Zone_Interior", zoneSize, Unstructured, &Z));

  // start writing coordinates
  callCGNS(cgp_coord_write(F, B, Z, RealDouble, "CoordinateX", &Cx));
  callCGNS(cgp_coord_write(F, B, Z, RealDouble, "CoordinateY", &Cy));
  callCGNS(cgp_coord_write(F, B, Z, RealDouble, "CoordinateZ", &Cz));
  
  // write coordinate data
  cgsize_t start = 1;
  cgsize_t end   = NNODES;
  start += comm_rank*NNODES;
  end   += comm_rank*NNODES;
  callCGNS(cgp_coord_write_data(F, B, Z, Cx, &start, &end, xs));
  callCGNS(cgp_coord_write_data(F, B, Z, Cy, &start, &end, ys));
  callCGNS(cgp_coord_write_data(F, B, Z, Cz, &start, &end, zs));

  // start writing the NGON_n faces section
  start = 1;
  end   = comm_size*NFACES;
  cgsize_t offsetsTotalSize = comm_size * FACEARRAY_SIZE;
  callCGNS(cgp_poly_section_write(F, B, Z, "Elements 2D", NGON_n, start, end,
                                  offsetsTotalSize, 0, &E));
  // write the faces data of this process
  start = 1 + comm_rank *NFACES;
  end   = (1+ comm_rank)*NFACES;
  callCGNS(cgp_poly_elements_write_data(F, B, Z, E, start, end, faces, face_offsets));

  // start writing the NFACE_n cells section
  start = 1;
  end = comm_size*NCELLS;
  offsetsTotalSize = comm_size * CELLARRAY_SIZE;
  callCGNS(cgp_poly_section_write(F, B, Z, "Elements 3D", NFACE_n, start, end,
                                  offsetsTotalSize, 0, &E));
  
  // write the cells data of this process
  start = 1 + comm_rank *NCELLS;
  end   = (1+ comm_rank)*NCELLS;
  callCGNS(cgp_poly_elements_write_data(F, B, Z, E, start, end, cells, cell_offsets));

  // close the file and finalize MPI
  callCGNS(cgp_close(F));
  MPI_Finalize();
  return 0;
}

1 reply

ferranj2 May 12, 2023
Author

Thank you for the reply and for the sample script.
Unfortunately, this didn't help.

Does your code work for single-zone files only?
I'm stuck trying to get the processes to write multiple zones simultaneously.
If so, then I indeed have to do some sequential I/O on rank-0 prior to using PCGNS.

When I run the following:

*Get MPI comm & rank*

const char* cgns_filename = "test.cgns";
int F, B, Z;
int topology_dim = 3;
int coordinate_dim = 3;
int vertices = *some val*;
int dom_elem = *some_val*;
cgsize_t size[3];
size[0] = (cgsize_t)vertices;
size[1] = (cgsize_t)dom_elem;
size[2] = 0;

char zonename[10];
memset(zonename, 0, sizeof(zonename));
sprintf(zonename,"Zone%d",rank);

cg_configure(CG_CONFIG_COMPRESS, 0);
cgp_mpi_comm(comm); //comm = MPI_COMM_WORLD
cgp_open(cgns_filename, CG_MODE_WRITE, &F)
cg_base_write(F,"Base",topology_dim, coordinate_dim,&B);
cg_zone_write(fn, B, zonename, size, Unstructured, &Z);
cgp_close(F);

The output CGNS file has a single Zone even though all MPI processes call cg_zone_write with their own custom Zone names.

Since, apparently, there is no minimum reproducible example for writing a MULTIZONE .cgns file in parallel, I need to know whether the parallel write is possible only to a single zone. Or, whether it is possible to create a MULTIZONE .cgns file sequentially (meaning, Bases, Zones, Sections, Coordinates, Boundary Conditions, etc but not the DataArray_t nodes), and then in parallel have all ranks make their respective edits.

gsjaardema · 2023-05-12T13:00:50Z

gsjaardema
May 12, 2023
Collaborator

I don't have a standalone snippet of code to show, but the IOSS library had CGNS input and output in both parallel and serial. It may provide some insights on parallel CGNS output. See https://github.com/sandialabs/seacas/tree/master/packages/seacas/libraries/ioss/src/cgns. The Iocgns_ParallelDatabase.C and Iocgns_DecompositionData.C have most of the parallel read/write calls. It won't be too easy to understand, but it does work and seems to scale fairly well. Apologies for the lack of documentation.

0 replies

mennodeij · 2023-05-15T06:21:07Z

mennodeij
May 15, 2023

The idea being that now that the file is constructed, all processes can reference their respective zone and write their coordinates.

I missed out on this in the first read, sorry.

You could let each process write their own CGNS file, and create a single "whole grid" CGNS file that contains file links to each part. It does mean that you will have a lot of files, but you don't need to use any parallel CNGS routines, and it will scale as well as your storage can process the data.
The idea, I think, of parallel CGNS is to have multiple process all write to one file that is then usable by any number of processes, i.e. contains the complete grid in a single zone.

1 reply

ferranj2 Jun 6, 2023
Author

Hi!

Sorry for not replying.
For my application, I can easily produce these sequential CGNS files.
However, instead of "links", could they be merged somehow into a single file?
I wouldn't want these links to be more than just a stepping stone.

If CGNS doesn't offer this capability, then perhaps the merging could be done via HDF5?

adrianjhpc · 2023-06-06T13:57:12Z

adrianjhpc
Jun 6, 2023

I do parallel CGNS with a single file and each process writes a separate zone. Works fine for me. However, the flow to do it is somewhat counter intuitive. I'm using Fortran, the same should work fine for C with the equivalent calls:

      call cgp_pio_mode_f(CGP_INDEPENDENT, ierr)
      if(ierr .ne. CG_OK) then
         write(*,*) 'cgp_pio_mode_f error writeparallelcgnsheader'
         call cg_error_print_f()
         call abortmpi()
      end if

      call cgp_open_f(flowtec,CG_MODE_WRITE,fid(n),ierr)
      if(ierr .ne. CG_OK) then
         write(*,*) 'cg_open_f error'
         call cg_error_print_f()
      end if
      call cg_base_write_f(fid(n),'gridbase',3,3,basenum,ierr)
      if(ierr .ne. CG_OK) then
         write(*,*) 'cg_base_write_f error'
         call cg_error_print_f()
      end if

!     AJ Write all the blocks as zones.
      do i = 1,nblocks
         blocknum = i

         imax = g_i_imax(blocknum,nl)
         jmax = g_j_jmax(blocknum,nl)
         kmax = g_k_kmax(blocknum,nl)

         sizes(1,1) = imax+1+1
         sizes(2,1) = jmax+1+1
         sizes(3,1) = kmax+1+1

         sizes(1,2) = imax+1
         sizes(2,2) = jmax+1
         sizes(3,2) = kmax+1

         sizes(1,3) = 0
         sizes(2,3) = 0
         sizes(3,3) = 0
         write(zonename, "(A5,I6)") "block",blocknum
         call cg_zone_write_f(fid(n),basenum,zonename,sizes,
     &        Structured,blocknum,ierr)
         if(ierr .ne. CG_OK) then
            write(*,*)
     &           'cg_zone_write_f error in writeparallelcgnsheader'
            call cg_error_print_f()
         end if
         call cg_sol_write_f(fid(n),basenum,blocknum,'FlowSolution1',
     &        CellCenter,solnum,ierr)
         if(ierr .ne. CG_OK) then
            write(*,*)
     &           'cg_sol_write_f error in writeparallelcgnsheader'
            call cg_error_print_f()
         end if
         call cgp_coord_write_f(fid(n),basenum,blocknum,RealDouble,
     &        'CoordinateX',xnum,ierr)
         if(ierr .ne. CG_OK) then
            write(*,*) 'X cgp_coord_write_f error'
            call cg_error_print_f()
         end if
         call cgp_coord_write_f(fid(n),basenum,blocknum,RealDouble,
     &        'CoordinateY',ynum,ierr)
         if(ierr .ne. CG_OK) then
            write(*,*) 'Y cgp_coord_write_f error'
            call cg_error_print_f()
         end if
         call cgp_coord_write_f(fid(n),basenum,blocknum,RealDouble,
     &        'CoordinateZ',znum,ierr)
         if(ierr .ne. CG_OK) then
            write(*,*) 'Z cgp_coord_write_f error'
            call cg_error_print_f()
         end if
         call cgp_field_write_f(fid(n),basenum,blocknum,solnum,
     &        RealDouble,'Density',flownum,ierr)
         if(ierr .ne. CG_OK) then
            write(*,*) 'cgp_field_write_f density error'
            call cg_error_print_f()
         end if
         call cgp_field_write_f(fid(n),basenum,blocknum,solnum,
     &        RealDouble,'VelocityX',flownum,ierr)
         if(ierr .ne. CG_OK) then
            write(*,*) 'cgp_field_write_f velocityX error'
            call cg_error_print_f()
         end if
         call cgp_field_write_f(fid(n),basenum,blocknum,solnum,
     &        RealDouble,'VelocityY',flownum,ierr)
         if(ierr .ne. CG_OK) then
            write(*,*) 'cgp_field_write_f velocityY error'
            call cg_error_print_f()
         end if
         call cgp_field_write_f(fid(n),basenum,blocknum,solnum,
     &        RealDouble,'VelocityZ',flownum,ierr)

etc....

The above code, run by all processes, sets up the file structure/metadata. The counter intuitive part is all processes write out all the metadata for all zones. Strange, but seems to be required. (happy to be told otherwise, but that was what was required to get it working for me).

Then I close the file (I don't think this is strictly necessary, but at one point I thought it the safest thing to do), reopen it and then write the data. The data can then be written in parallel, and separately, with code like this:

            xnum = 1
            call cgp_coord_write_data_f(fid(n),basenum,zonenum,xnum,
     &           minrange,maxrange,localx,ierr)
            if(ierr .ne. CG_OK) then
               write(*,*) 'X cgp_coord_write_data_f error'
               call cg_error_print_f()
            end if
            ynum = 2
            call cgp_coord_write_data_f(fid(n),basenum,zonenum,ynum,
     &           minrange,maxrange,localy,ierr)
            if(ierr .ne. CG_OK) then
               write(*,*) 'Y cgp_coord_write_data_f error'
               call cg_error_print_f()
            end if
            znum = 3
            call cgp_coord_write_data_f(fid(n),basenum,zonenum,znum,
     &           minrange,maxrange,localz,ierr)
            if(ierr .ne. CG_OK) then
               write(*,*) 'Z cgp_coord_write_data_f error'
               call cg_error_print_f()
            end if
            solnum = 1
            flownum = 1
            call cgp_field_write_data_f(fid(n),basenum,zonenum,solnum,
     &           flownum,minrange,maxrange,density,ierr)
            if(ierr .ne. CG_OK) then
               write(*,*) 'cgp_field_write_data_f density error'
               call cg_error_print_f()
            end if
            flownum = flownum + 1
            call cgp_field_write_data_f(fid(n),basenum,zonenum,solnum,
     &           flownum,minrange,maxrange,velocityX,ierr)
            if(ierr .ne. CG_OK) then
               write(*,*) 'cgp_field_write_data_f velocityX error'
               call cg_error_print_f()
            end if
            flownum = flownum + 1
            call cgp_field_write_data_f(fid(n),basenum,zonenum,solnum,
     &           flownum,minrange,maxrange,velocityY,ierr)
            if(ierr .ne. CG_OK) then
               write(*,*) 'cgp_field_write_data_f velocityY error'
               call cg_error_print_f()
            end if

etc...

Where each process is writing its own zones independently by specifying zonenum. If you have the same number of zones across processes you could enable CGP_COLLECTIVE.

It took me a while to figure out the specific control flow needed to keep CGNS happy, but the above works for me now.

Happy to provide more details if it helps.

2 replies

ferranj2 Jun 6, 2023
Author

Thank you for the input.

I was actually doing the exact same thing (in principle) to get mine started.
My grids, however, are all Unstructured and I need to write-out the GridConnectivity_t nodes.

Instead of having all my processors write-out the meta data, I just have Process-0 do it by itself.
I get Process-0 to know how to size everything by having it MPI_Recv information from other processes.
Once Process-0 is done, it closes the file.

Then, I reopen the file, but, not as a CGNS file, rather, as an HDF5 file.
Then, I am able to populate the CGNS file in parallel by using HDF5 routines.
My approach will obviously not work for CGNS files based on the ADF format.
So, far it seems to be mostly working.
Once I have it working, I will see about posting a reproducible example.

MicK7 Jun 13, 2023
Maintainer

Maybe to write connectivities, you could try cgp_section_write or cgp_poly_section_write to directly create the node in parallel mode. Sizing everything will still be needed but it would prevent a close/open cycle.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

How does one even use PCGNS? #709

{{title}}

Replies: 4 comments 4 replies

{{title}}

{{editor}}'s edit

{{editor}}'s edit

{{title}}

{{title}}

{{title}}

{{title}}

{{title}}

{{editor}}'s edit

{{editor}}'s edit

{{title}}

{{title}}

Select a reply

How does one even use PCGNS? #709

ferranj2 May 12, 2023

Replies: 4 comments · 4 replies

mennodeij May 12, 2023

ferranj2 May 12, 2023 Author

gsjaardema May 12, 2023 Collaborator

mennodeij May 15, 2023

ferranj2 Jun 6, 2023 Author

adrianjhpc Jun 6, 2023

ferranj2 Jun 6, 2023 Author

MicK7 Jun 13, 2023 Maintainer

ferranj2
May 12, 2023

Replies: 4 comments 4 replies

mennodeij
May 12, 2023

ferranj2 May 12, 2023
Author

gsjaardema
May 12, 2023
Collaborator

mennodeij
May 15, 2023

ferranj2 Jun 6, 2023
Author

adrianjhpc
Jun 6, 2023

ferranj2 Jun 6, 2023
Author

MicK7 Jun 13, 2023
Maintainer