Set default allocator id to device instead of unified for CUDA and HIP execution policies #1316

bmhan12 · 2024-04-10T20:37:53Z

This PR:

- Makes device the default Umpire allocator id, instead of unified, for CUDA and HIP execution policies.
- Adjusts unit tests and examples to reflect device memory as the new default.
  - Tests and examples that did not work with the device allocator id use the unified id instead.

Follow-up work is required to revisit tests and examples that still require unified memory, and adjust the source implementation.
These include:

- quest DistributedClosestPoint
- quest SignedDistance
- quest MeshTester
- quest IntersectionShaper
- spin ImplicitGrid (getCandidates() function call)
- mint for_all_cells / for_all_nodes / for_all_faces (???)

kennyweiss

Thanks @bmhan12 !

src/axom/core/tests/core_execution_for_all.hpp

src/axom/mint/tests/mint_execution_cell_traversals.cpp

kennyweiss · 2024-04-12T22:20:59Z

src/axom/primal/tests/primal_boundingbox.cpp

@@ -40,7 +40,10 @@ void check_bb_policy()
      box[i].isValid();
    });

-  EXPECT_EQ(box[0], BoundingBoxType(PointType(0.0), PointType(10.0)));
+  BoundingBoxType box_host;


Minor: Would this code be simpler if box used a StackArray instead of allocating and deallocating one bounding box?

kennyweiss · 2024-04-12T22:23:38Z

src/axom/primal/tests/primal_clip.cpp

@@ -329,6 +329,12 @@ void unit_check_poly_clip()
  //      |
  // In addition, vertices 0 and 3 should be marked as clipped.

+  const int current_allocator = axom::getDefaultAllocatorID();


Similarly here:
Do we have a better pattern for allocating, deallocating, and moving a single object between host and device, e.g. using a StackArray?

(Not required for this PR, but if we have a cleaner way of doing this, we should prefer to use it in the unit tests)

bmhan12 · 2024-05-07

> Do we have a better pattern for allocating, deallocating, and moving a single object between host and device, e.g. using a StackArray?

Not that I am aware of. The best I have found is either using axom::allocate & axom::copy, or using axom::Array constructors to move between host and device.
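A minimal sketch of the first option mentioned above: allocate a single object in the device space, let the kernel write to it, then copy it into a host-side variable before checking it. The `sketch::` functions and the `Box` type are hypothetical stand-ins backed by ordinary heap memory so that the example compiles on its own. They are not Axom's API; in the actual tests their roles would presumably be played by axom::allocate, axom::copy, and a device execution policy.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>

// Hypothetical stand-ins (NOT Axom's API). In this host-only sketch both
// memory spaces are ordinary heap memory.
namespace sketch
{
enum class Space { Host, Device };

template <typename T>
T* allocate(std::size_t n, Space /*space*/)
{
  return static_cast<T*>(std::malloc(n * sizeof(T)));
}

inline void copy(void* dst, const void* src, std::size_t nbytes)
{
  std::memcpy(dst, src, nbytes);
}

template <typename T>
void deallocate(T*& ptr)
{
  std::free(ptr);
  ptr = nullptr;
}
}  // namespace sketch

// Trivially copyable stand-in for a bounding box
struct Box
{
  double lo;
  double hi;
};

inline Box compute_box_on_device()
{
  Box* box_device = sketch::allocate<Box>(1, sketch::Space::Device);
  assert(box_device != nullptr);

  // Stand-in for the body of a device kernel writing its result
  box_device[0] = Box {0.0, 10.0};

  // Stage the result on the host: with device memory as the default,
  // host code should not read box_device directly.
  Box box_host;
  sketch::copy(&box_host, box_device, sizeof(Box));

  sketch::deallocate(box_device);
  return box_host;
}
```

The host-side copy is what the `box_host` variable in the primal_boundingbox.cpp hunk appears to be for: the comparison runs against the staged copy rather than the device allocation.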

kennyweiss · 2024-04-12T22:26:38Z

src/axom/primal/tests/primal_clip.cpp

+      tet[0] = TetrahedronType(PointType {1, 0, 0},
+                               PointType {1, 1, 0},
+                               PointType {0, 1, 0},
+                               PointType {1, 0, 1});
+
+      hex[0] = HexahedronType(PointType {0, 0, 0},
+                              PointType {1, 0, 0},
+                              PointType {1, 1, 0},
+                              PointType {0, 1, 0},
+                              PointType {0, 0, 1},
+                              PointType {1, 0, 1},
+                              PointType {1, 1, 1},
+                              PointType {0, 1, 1});


Does this change require setDefaultAllocatorID() to have been set above? Or does it work because it's constructed within the appropriate ExecPolicy?

If it's the former, is there a way to make this work without setting the default allocator?

bmhan12 · 2024-05-07

> Does this change require setDefaultAllocatorID() to have been set above? Or does it work because it's constructed within the appropriate ExecPolicy?

The latter, and it works because the hexahedron constructor is host-device decorated.
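To illustrate the point, here is a small sketch of a shape whose constructor is host-device decorated and is therefore callable from inside a kernel body, independent of any default allocator setting. `SKETCH_HOST_DEVICE`, `Point3`, `Tet`, and `for_all_seq` are hypothetical names invented for this example (a sequential loop stands in for a policy-templated for_all); none of them are Axom identifiers.

```cpp
#include <cassert>
#include <cstdlib>
#include <new>

// Hypothetical decoration macro: expands to the CUDA/HIP qualifiers when a
// device compiler is in use, and to nothing in a plain host build.
#if defined(__CUDACC__) || defined(__HIPCC__)
  #define SKETCH_HOST_DEVICE __host__ __device__
#else
  #define SKETCH_HOST_DEVICE
#endif

struct Point3
{
  double x, y, z;
  SKETCH_HOST_DEVICE Point3(double x_ = 0.0, double y_ = 0.0, double z_ = 0.0)
    : x(x_), y(y_), z(z_)
  { }
};

struct Tet
{
  Point3 v[4];
  // Decorated constructor: usable from both host code and kernel bodies
  SKETCH_HOST_DEVICE Tet(const Point3& a,
                         const Point3& b,
                         const Point3& c,
                         const Point3& d)
    : v {a, b, c, d}
  { }
};

// Sequential stand-in for a policy-templated for_all
template <typename Kernel>
void for_all_seq(int n, Kernel&& kernel)
{
  for(int i = 0; i < n; ++i)
  {
    kernel(i);
  }
}

inline Tet build_tet_in_kernel()
{
  // Raw storage that the kernel fills in (heap memory in this host sketch)
  Tet* tet = static_cast<Tet*>(std::malloc(sizeof(Tet)));
  assert(tet != nullptr);

  // The shape is constructed inside the kernel body, in the kernel's space
  for_all_seq(1, [=](int i) {
    new(&tet[i]) Tet(Point3 {1.0, 0.0, 0.0},
                     Point3 {1.0, 1.0, 0.0},
                     Point3 {0.0, 1.0, 0.0},
                     Point3 {1.0, 0.0, 1.0});
  });

  Tet result = *tet;  // copy out before releasing the storage
  std::free(tet);
  return result;
}
```

The vertex coordinates match the tetrahedron in the hunk above. In a real device build the final copy out of `tet` would also have to go through a device-to-host copy.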

kennyweiss · 2024-04-12T22:28:26Z

src/axom/primal/tests/primal_clip.cpp

+  // Shapes on host
+  TetrahedronType tet_host;
+  OctahedronType oct_host;
+  PolyhedronType res_host;


Likely outside the scope of this PR, but we should probably have a standardized convention for labeling variables as host, device, or view, and use it consistently within axom and its tests and examples.

Perhaps we should create a separate issue to track this?

bmhan12 · 2024-05-07

Opened issue #1333 to track this.

src/axom/primal/tests/primal_polyhedron.cpp

src/axom/quest/IntersectionShaper.hpp

src/axom/spin/tests/spin_implicit_grid.cpp

bmhan12 added the Core, Quest, Primal, and GPU labels Apr 10, 2024

kennyweiss approved these changes Apr 12, 2024


rhornung67 requested review from white238, rhornung67 and agcapps April 29, 2024 21:10

bmhan12 force-pushed the feature/han12/device_default branch from 7336064 to 73342b1 Compare May 7, 2024 21:26

bmhan12 mentioned this pull request May 7, 2024

Naming convention for host, device, and view variables #1333

Open

bmhan12 added 20 commits May 13, 2024 10:23

Set default allocator to Device instead of Unified for CUDA and HIP

3e2ce63

Fix core_serial_test

d430445

Fix core acceleration example

3f4cf9a

Fix primal_boundingbox_test

ec93558

Fix primal_clip_test

4877020

Simplify primal_boundingbox_test change

76a4f98

Fix primal_numeric_array, primal_point, primal_polyhedron tests

65ef23f

Fix primal_vector and primal_zip tests

d75038a

Fix mint_execution_XXX_traversal tests (use Uniform policy)

53ea696

Use unified memory for device insert core array for all example

842316b

Fix implicit grid tests - unified memory needed for getCandidates() call

49b9e09

Tentatively spin bvh tests working

4e4e870

Use unified memory for distributed closest point

002aa57

Fix signed distance tests

711fd4b

Fix quest discretize tests

79c41ca

Guard unified memory init in core array for all

1dda84b

Cleanup spin_bvh warnings

67394a2

spin implicit grid tests - generalize allocator ids on host

9db7e75

quest discretize tests - generalize allocator ids on host

20d8ce9

Use Unified Memory for IntersectionShaper

a7e9126

bmhan12 added 12 commits May 13, 2024 10:23

Use unified memory for mesh_tester executable

a46c63d

spin_bvh_test - loop should run on host, some cleanup

cfec6f8

WIP - fixing signed distance on HIP

5434d3d

Use unified for signed distance tests

9e4fb92

Generalize allocator ids for non-raja, non-umpire build

012aaee

Handle +raja~umpire failure case

ec67172

One more missing fix

1417bff

Add comment of unified id usage where preferred instead of mandatory

7c4e777

Use explicit allocators for clarity

b278315

Helper method for unified mint tests

076eda6

clarify variable name change

f56435c

Use helper function for IntersectionShaper

835c22e

bmhan12 force-pushed the feature/han12/device_default branch from 73342b1 to 835c22e Compare May 13, 2024 20:02

bmhan12 mentioned this pull request May 13, 2024

Address outstanding tests and examples that rely on unified memory #1339

Open
