Ken's PRs #376

Open
5 of 10 tasks
cwsmith opened this issue Oct 14, 2022 · 12 comments

cwsmith commented Oct 14, 2022

cwsmith commented Nov 4, 2022

meeting notes 11/4/2022

  • for a (partial) summary of how parallel mesh construction works see Construct a distributed parallel PUMI mesh from own file format #245
  • the current HEAD of develop has added multi-topology support via repeated calls to assemble followed by finalise; a usage sketch follows this list (see

    core/apf/apfConvert.h

    Lines 34 to 67 in 8311337

    /** \brief assemble a mixed-cell-type mesh from just a connectivity array
    \details construct is now split into two functions,
    assemble and finalise. The premise of assemble being
    that it is called multiple times for a given cell type,
    across several different cell types in the input mesh. */
    void assemble(Mesh2* m, const int* conn, int nelem, int etype,
        GlobalToVert& globalToVert);
    /** \brief finalise construction of a mixed-cell-type mesh from just a connectivity array
    \details construct is now split into two functions,
    assemble and finalise. Once the mixed cell type mesh
    is assembled finalise should be called. Doing it this
    way provides non-breaking changes for current users of
    construct, which now just calls assemble and finalise. */
    void finalise(Mesh2* m, GlobalToVert& globalToVert);
    /** \brief construct a mesh from just a connectivity array
    \details this function is here to interface with very
    simple mesh formats. Given a set of elements described
    only in terms of the ordered global ids of their vertices,
    this function builds a reasonable apf::Mesh2 structure
    and as a side effect returns a map from global ids
    to local vertices. This functions assumes a uniform
    cell type. Use a combination of assemble and finalise for
    meshes loaded with mixed cell types.
    This is a fully scalable parallel mesh construction
    algorithm, no processor incurs memory or runtime costs
    proportional to the global mesh size.
    Note that all vertices will have zero coordinates, so
    it is often good to use apf::setCoords after this. */
    void construct(Mesh2* m, const int* conn, int nelem, int etype,
        GlobalToVert& globalToVert);
    )
  • the branch that exists for matchedNodeElementReader has one call to the construct API which takes in a numElements*maxDownwardVerts array for element-to-vertex connectivity (where vertices are defined by their global vertex id)
  • the mesh generator that feeds matchedNodeElementReader already produces elements in groups by topological type and can/will be changed to create files that have one array per type (numElements(elmType)*numDownwardVerts(elmType))
    • a test input for this will be created for the next meeting on 11/11
  • with the mesh generator change the modifications to construct will be abandoned and assemble will be called instead
    • the MeshInfo struct that handles these arrays in the matchedNodeElementReader.cpp driver will have to be modified to support split topology arrays
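
To make the repeated-assemble pattern concrete, here is a minimal sketch of building a two-topology (hex + wedge) mesh with the declarations quoted above. The function name, connectivity arrays, counts, and null-model setup are placeholders for illustration, not the matchedNodeElementReader implementation; it assumes MPI/PCU are already initialized by the calling driver.

    // Illustrative only: one assemble() call per topology, then a single finalise().
    #include <apf.h>
    #include <apfConvert.h>
    #include <apfMDS.h>
    #include <apfMesh2.h>
    #include <gmi.h>
    #include <gmi_null.h>

    apf::Mesh2* buildMixedMesh(const int* hexToVtx, int numHex,
                               const int* wedgeToVtx, int numWedge,
                               const double* vtxCoords, int numLocalVerts) {
      gmi_register_null();
      apf::Mesh2* m = apf::makeEmptyMdsMesh(gmi_load(".null"), 3, false);
      apf::GlobalToVert globalToVert;
      // one assemble call per cell type present in the input
      apf::assemble(m, hexToVtx, numHex, apf::Mesh::HEX, globalToVert);
      apf::assemble(m, wedgeToVtx, numWedge, apf::Mesh::PRISM, globalToVert);
      // finalise once, after every topology has been assembled
      apf::finalise(m, globalToVert);
      // vertices are created with zero coordinates, so set them afterwards
      apf::setCoords(m, vtxCoords, numLocalVerts, globalToVert);
      apf::deriveMdsModel(m);
      m->acceptChanges();
      apf::verify(m);
      return m;
    }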

cwsmith commented Nov 18, 2022

meeting notes 11/18/2022

  • multi topo file format
    • coordinate file does not change
    • element file
      • header per block/topology:
        • num verts per element
        • num elements
  • source of matching info
    • if the mesh generator does not provide per-entity matching info (remote copies), then a serial code (ideally in PUMI) should derive the entity level matching info from model attributes specified on model edges/faces
  • matching support in chef
    • ideally, a model attribute per face/edge pair defines whether matching is enabled on that pair
      • if a specific pair is enabled, then the mesh entities that are matched must have remote copy information for their matches
      • currently in chef matching is on (for all model face/edge pairs) if the mesh contains matching info and has to be disabled explicitly (which again, turns all model face/edge pair matching off)
      • in the current code, changing the default to 'off' would be helpful
  • extrusion mesh generation with matching
    • two cases, info is the same, but mesh generation and structure differs
      • perfect matching in a simple extrusion across one pair of geometric model faces
      • 2d root plane, any number of points along z, with each layer having varying z depths

cwsmith commented Dec 2, 2022

meeting notes 12/2/2022

  • CWS will assume the files exist for reading in the multi-topo format discussed last time and write a draft of the reading logic for it
  • KJ will prepare an example multi-topo formatted file

cwsmith commented Dec 9, 2022

meeting notes 12/9/2022

  • PR #380 (matchedNodeElementReader and serial mesh support for > 2B entities) created; it compiles, but the multi-topo reader is a WIP
  • proposed file format for element-to-vertex multi-topology meshes
    • vertex coordinate file reading does not change
    • part ids start at 0
    • each part gets a file containing one rectangular array per topology present on that part; each array provides element-to-vertexGlobalId connectivity, in the order listed in that part's section of the header file
    • one header file, laid out as follows (a parsing sketch follows the format):
1 1  No idea why we write this
3    also not sure what this is used for
<numelTotal>  <maxNodesPerElement>
Part1
    <numel_topo_1>   <NodesInElementTopo1>
    <numel_topo_2>   <NodesInElementTopo2>
    … for as many topos as are in Part 1
Repeat the above block for each part.
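
A rough sketch of reading this proposed header follows. Beyond what the notes above state, it assumes the first two lines are simply skipped, that a literal "PartN" token starts each part's section, and that a part's rows run until the next "Part" token or end of file; the struct and function names here are made up for illustration only.

    #include <fstream>
    #include <string>
    #include <vector>

    struct TopoBlock { long numElements; int nodesPerElement; };
    struct PartHeader { std::vector<TopoBlock> blocks; };

    std::vector<PartHeader> readMultiTopoHeader(const std::string& path) {
      std::ifstream in(path);
      std::string line;
      std::getline(in, line);  // "1 1" line (purpose unclear per the notes above)
      std::getline(in, line);  // "3" line (purpose unclear per the notes above)
      long numelTotal = 0;
      int maxNodesPerElement = 0;
      in >> numelTotal >> maxNodesPerElement;
      std::vector<PartHeader> parts;
      std::string tok;
      while (in >> tok) {
        if (tok.rfind("Part", 0) == 0) { // a "PartN" label starts a new part section
          parts.push_back(PartHeader());
        } else if (!parts.empty()) {     // a "<numel> <nodesPerElement>" row for the current part
          TopoBlock b;
          b.numElements = std::stol(tok);
          in >> b.nodesPerElement;
          parts.back().blocks.push_back(b);
        }
      }
      return parts;
    }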

cwsmith commented Jan 6, 2023

meeting notes 1/6/2023

  • found latest branch for chef reduction (across ALCF, NASA, colorado viz nodes) and added the branch to the OP list
  • added 4 part test case for matchedNodeElementReader to pumi-meshes SCOREC/pumi-meshes@7733c3e
    • mixed topo
    • in new file format
    • 4 parts: 0,1 - all hex, 2 - mixed, 3 - all wedges

cwsmith commented Jan 13, 2023

meeting notes 1/13/2023

$ mpirun -np 4 ./test/matchedNodeElmReader   \
  /space/cwsmith/core/pumi-meshes/matchedNodeElementReader/geom3D.cnndt \
  /space/cwsmith/core/pumi-meshes/matchedNodeElementReader/geom3D.coord \
  NULL \
  /space/cwsmith/core/pumi-meshes/matchedNodeElementReader/geom3D.class \
  NULL \
  NULL \
  /space/cwsmith/core/pumi-meshes/matchedNodeElementReader/geom3DHead.cnn  \
  foo.dmg foo.smb

numVerts 352068
2 46941 8
2 18222 6
1 65037 8
3 153102 6
0 64704 8
isMatched 0
CR1 mysize=0 
CR5: self=3,myOffset=0,quotient=0 
0 after residence 
2 after residence 
3 after residence 
1 after residence 
0 done inside remotes 
1 done inside remotes 
2 done inside remotes 
3 done inside remotes 

===================================================================================
=   BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
=   PID 28879 RUNNING AT cranium.scorec.rpi.edu
=   EXIT CODE: 136
=   CLEANING UP REMAINING PROCESSES
=   YOU CAN IGNORE THE BELOW CLEANUP MESSAGES
===================================================================================
YOUR APPLICATION TERMINATED WITH THE EXIT STRING: Floating point exception (signal 8)
This typically refers to a problem with your application.
Please see the FAQ page for debugging suggestions
cwsmith@cranium: /space/cwsmith/buildPumiOptonSimonOmegaoff $ 

cwsmith commented Jan 20, 2023

meeting notes 1/20/2023

  • Fixed the FPE bug with cacacac
  • The run now fails in mesh verification ('verify'). See the output below.
  • @KennethEJansen Does the "setCoords int overflow of:" output below mean we are overflowing? (see the note after the log)
cwsmith@cranium: /space/cwsmith/buildPumiOptonSimonOmegaoff $ ../runMner.sh 
numVerts 352068
3 153102 6
1 65037 8
0 64704 8
2 46941 8
2 18222 6
isMatched 0
constructResidence: self=0,gid=0,ifirst=0  
constructResidence: self=1,gid=0,ifirst=0  
constructResidence: self=2,gid=91452,ifirst=0  
constructResidence: self=3,gid=488,ifirst=0  
constructResidence: self=0,gid=0,ifirst=0,max=352067  
constructResidence: self=1,gid=0,ifirst=0,max=352067  
constructResidence: self=2,gid=91452,ifirst=0,max=352067  
constructResidence: self=3,gid=488,ifirst=0,max=352067  
CR1 mysize=88017 
CR5: self=3,myOffset=264051,quotient=88017 
2 after residence 
0 after residence 
1 after residence 
3 after residence 
0 done inside remotes 
1 done inside remotes 
2 done inside remotes 
3 done inside remotes 
setCoords int overflow of: self=0,mySize=88017,total=352068, n=88017,to=0, quotient=88017, remainder=0 start=0, peers=4, sizeToSend=2112408, nverts=88017 
setCoords int overflow of: self=1,mySize=88017,total=352068, n=88017,to=1, quotient=88017, remainder=0 start=88017, peers=4, sizeToSend=2112408, nverts=88017 
setCoords int overflow of: self=2,mySize=88017,total=352068, n=88017,to=2, quotient=88017, remainder=0 start=176034, peers=4, sizeToSend=2112408, nverts=88017 
setCoords int overflow of: self=3,mySize=88017,total=352068, n=88017,to=3, quotient=88017, remainder=0 start=264051, peers=4, sizeToSend=2112408, nverts=88017 
fathers2D not requested 
seconds to create mesh 1.044
APF FAILED: apf::Verify: edge with 2 adjacent faces
centroid: (0.684542, 0.00348426, 0.0833333)
based on the following:
 - edge is classified on a model region
we would expect the adjacent face count to be at least 3

APF FAILED: apf::Verify: edge with 2 adjacent faces
centroid: (0.53322, -0.00227891, 0.0833333)
based on the following:
 - edge is classified on a model region
we would expect the adjacent face count to be at least 3

APF FAILED: apf::Verify: edge with 2 adjacent faces
centroid: (0.684311, 0.000954821, 0.0833333)
based on the following:
 - edge is classified on a model region
we would expect the adjacent face count to be at least 3

APF FAILED: apf::Verify: edge with 2 adjacent faces
centroid: (0.538636, -0.0196729, 0.0833333)
based on the following:
 - edge is classified on a model region
we would expect the adjacent face count to be at least 3


===================================================================================
=   BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
=   PID 32670 RUNNING AT cranium.scorec.rpi.edu
=   EXIT CODE: 134
=   CLEANING UP REMAINING PROCESSES
=   YOU CAN IGNORE THE BELOW CLEANUP MESSAGES
===================================================================================
YOUR APPLICATION TERMINATED WITH THE EXIT STRING: Aborted (signal 6)
This typically refers to a problem with your application.
Please see the FAQ page for debugging suggestions

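On the overflow question above: the "setCoords int overflow of:" lines appear to report per-destination send sizes in bytes, consistent with sizeToSend = nverts * 3 doubles * 8 bytes (88017 * 24 = 2112408 here), which is well below the signed 32-bit limit, so this particular run does not look like it is overflowing. A standalone, hypothetical check of when such a product would overflow (not PUMI code) is:

    // Hypothetical standalone check (not PUMI code): does a send size computed as
    // nverts * 3 coordinates * sizeof(double) bytes still fit in a signed 32-bit int?
    // With nverts = 88017 this gives 2112408 bytes, matching the sizeToSend above.
    #include <cstdint>
    #include <cstdio>
    #include <limits>

    bool sendSizeFitsInt(std::int64_t nverts) {
      const std::int64_t bytes = nverts * 3 * static_cast<std::int64_t>(sizeof(double));
      return bytes <= std::numeric_limits<int>::max();
    }

    int main() {
      std::printf("%d\n", sendSizeFitsInt(88017));     // 1: 2112408 bytes, fits
      std::printf("%d\n", sendSizeFitsInt(352068));    // 1: the run's total vertex count, still fits
      std::printf("%d\n", sendSizeFitsInt(100000000)); // 0: ~2.4e9 bytes would overflow a 32-bit int
      return 0;
    }
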
cwsmith commented Feb 4, 2023

meeting notes 2/3/2023

cwsmith commented Feb 17, 2023

meeting notes 2/17/2023

  • CGNS merged; there is one CI failure that needs to be resolved
  • moved chef part count reduction up to the next PR
  • Ken will provide a test that doesn't exit cleanly
    • on colorado: /projects/tools/SCOREC-core/core/pumi-meshes/phasta/4-1-Chef-Tet-Part/4-4-Chef-Part-ts20/run
    • changed split factor from 1 to -2 in adapt.inp
    • fails on a free (we assume a double free)
  • CWS will rebase off develop and take a look at the failure

cwsmith commented Feb 24, 2023

meeting notes 2/24/2023

cwsmith commented Mar 17, 2023

meeting notes 3/17/2023

cwsmith commented Jun 30, 2023

meeting notes 6/30/2023
