New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Set default allocator id to device instead of unified for CUDA and HIP execution policies #1316
base: develop
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @bmhan12 !
@@ -40,7 +40,10 @@ void check_bb_policy() | |||
box[i].isValid(); | |||
}); | |||
|
|||
EXPECT_EQ(box[0], BoundingBoxType(PointType(0.0), PointType(10.0))); | |||
BoundingBoxType box_host; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Minor: Would this code be simpler if box
used a StackArray
instead of allocating and deallocating one bounding box?
@@ -329,6 +329,12 @@ void unit_check_poly_clip() | |||
// | | |||
// In addition, vertices 0 and 3 should be marked as clipped. | |||
|
|||
const int current_allocator = axom::getDefaultAllocatorID(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Similarly here:
Do we have a better pattern for allocating/deallocating moving a single object between host and device, e.g. using a StackArray
?
(Not required for this PR, but if we have a cleaner way of doing this, we should prefer to use it in the unit tests)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we have a better pattern for allocating/deallocating moving a single object between host and device, e.g. using a StackArray ?
Not that I am aware of. The best I have found is either using axom::allocate
& axom::copy
, or using axom::Array
constructors to move between host and device.
tet[0] = TetrahedronType(PointType {1, 0, 0}, | ||
PointType {1, 1, 0}, | ||
PointType {0, 1, 0}, | ||
PointType {1, 0, 1}); | ||
|
||
hex[0] = HexahedronType(PointType {0, 0, 0}, | ||
PointType {1, 0, 0}, | ||
PointType {1, 1, 0}, | ||
PointType {0, 1, 0}, | ||
PointType {0, 0, 1}, | ||
PointType {1, 0, 1}, | ||
PointType {1, 1, 1}, | ||
PointType {0, 1, 1}); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this change require setDefaultAllocatorID()
to have been set above? Or does it work because its constructed within the appropriate ExecPolicy
?
If it's the former, is there a way to make this work without setting the default allocator?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this change require setDefaultAllocatorID() to have been set above? Or does it work because its constructed within the appropriate ExecPolicy ?
The latter, and it works because the hexahedron constructor is host-device decorated.
// Shapes on host | ||
TetrahedronType tet_host; | ||
OctahedronType oct_host; | ||
PolyhedronType res_host; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Likely outside the scope of this PR, but we should probably have a standardized convention for labeling variables as host, device and views and use them consistently within axom and its tests and examples.
Perhaps we should create a separate issue to track this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Issue #1333 opened to track this issue.
7336064
to
73342b1
Compare
73342b1
to
835c22e
Compare
This PR:
Follow-up work is required to revisit tests and examples that still require unified memory, and adjust the source implementation.
These include: