-
Notifications
You must be signed in to change notification settings - Fork 22
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[WIP] Task/rhornung67/device numeric limits #1196
Draft
rhornung67
wants to merge
7
commits into
develop
Choose a base branch
from
task/rhornung67/device-numeric-limits
base: develop
Could not load branches
Branch not found: {{ refName }}
Could not load tags
Nothing to show
Are you sure you want to change the base?
Some commits from the old base branch may be removed from the timeline,
and old review comments may become outdated.
Draft
Changes from all commits
Commits
Show all changes
7 commits
Select commit
Hold shift + click to select a range
87c4d77
Add "wrapper" capability to access std::numeric_limits functionality
rhornung67 9ad3a06
add basic tests for numeric_limits host and device operations
rhornung67 b255612
Add some comments to tests
rhornung67 f401e6e
Run clang format
rhornung67 90aed51
Convert core tests to use axom::numeric_limits
rhornung67 ec4e42f
Revert type change
rhornung67 7aadf4b
Experiment with test for device code
rhornung67 File filter
Filter by extension
Conversations
Failed to load comments.
Jump to
Jump to file
Failed to load files.
Diff view
Diff view
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,38 @@ | ||
// Copyright (c) 2017-2023, Lawrence Livermore National Security, LLC and | ||
// other Axom Project Developers. See the top-level LICENSE file for details. | ||
// | ||
// SPDX-License-Identifier: (BSD-3-Clause) | ||
|
||
/*! | ||
* | ||
* \file NumericLimits.hpp | ||
* | ||
* \brief Header file containing portability layer for std::numeric_limits | ||
* capabilities | ||
* | ||
*/ | ||
|
||
#ifndef AXOM_NUMERICLIMITS_HPP_ | ||
#define AXOM_NUMERICLIMITS_HPP_ | ||
|
||
#include "axom/config.hpp" // for compile-time definitions | ||
|
||
#include <limits> | ||
|
||
#if defined(AXOM_USE_CUDA) | ||
#include <cuda/std/limits> | ||
#endif | ||
|
||
namespace axom | ||
{ | ||
#if defined(AXOM_USE_CUDA) && defined(AXOM_DEVICE_CODE) | ||
template <typename T> | ||
using numeric_limits = cuda::std::numeric_limits<T>; | ||
#else | ||
template <typename T> | ||
using numeric_limits = std::numeric_limits<T>; | ||
#endif | ||
|
||
} // namespace axom | ||
|
||
#endif // AXOM_NUMERICLIMITS_HPP_ |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,170 @@ | ||
// Copyright (c) 2017-2023, Lawrence Livermore National Security, LLC and | ||
// other Axom Project Developers. See the top-level LICENSE file for details. | ||
// | ||
// SPDX-License-Identifier: (BSD-3-Clause) | ||
|
||
#include "axom/config.hpp" // for compile time definitions | ||
|
||
#include "axom/core/NumericLimits.hpp" | ||
|
||
// for gtest macros | ||
#include "gtest/gtest.h" | ||
|
||
//------------------------------------------------------------------------------ | ||
// UNIT TESTS | ||
//------------------------------------------------------------------------------ | ||
|
||
//------------------------------------------------------------------------------ | ||
TEST(core_NumericLimits, check_CPU) | ||
{ | ||
// | ||
// Tests to compare axom::numeric_limits to std::numeric_limits | ||
// to ensure that Axom type aliasing is correct. | ||
// | ||
EXPECT_TRUE(axom::numeric_limits<int>::lowest() == | ||
std::numeric_limits<int>::lowest()); | ||
EXPECT_TRUE(axom::numeric_limits<int>::min() == std::numeric_limits<int>::min()); | ||
EXPECT_TRUE(axom::numeric_limits<int>::max() == std::numeric_limits<int>::max()); | ||
EXPECT_TRUE(axom::numeric_limits<int>::is_signed == | ||
std::numeric_limits<int>::is_signed); | ||
|
||
EXPECT_TRUE(axom::numeric_limits<float>::lowest() == | ||
std::numeric_limits<float>::lowest()); | ||
EXPECT_TRUE(axom::numeric_limits<float>::min() == | ||
std::numeric_limits<float>::min()); | ||
EXPECT_TRUE(axom::numeric_limits<float>::max() == | ||
std::numeric_limits<float>::max()); | ||
|
||
EXPECT_TRUE(axom::numeric_limits<double>::lowest() == | ||
std::numeric_limits<double>::lowest()); | ||
EXPECT_TRUE(axom::numeric_limits<double>::min() == | ||
std::numeric_limits<double>::min()); | ||
EXPECT_TRUE(axom::numeric_limits<double>::max() == | ||
std::numeric_limits<double>::max()); | ||
} | ||
|
||
//------------------------------------------------------------------------------ | ||
#if defined(AXOM_USE_CUDA) | ||
// | ||
// Tests to ensure axom::numeric_limits type alias does the correct thing | ||
// in host and CUDA device code. | ||
// | ||
|
||
// | ||
// Simple device kernel | ||
// | ||
__global__ void cuda_kernel(int* a, size_t* b, float* c, double* d) | ||
{ | ||
a[0] = axom::numeric_limits<int>::min(); | ||
b[0] = axom::numeric_limits<size_t>::max(); | ||
c[0] = axom::numeric_limits<float>::lowest(); | ||
d[0] = axom::numeric_limits<double>::max(); | ||
} | ||
|
||
TEST(core_NumericLimits, check_CUDA) | ||
{ | ||
// | ||
// Device memory allocation and initialiation for a few different types. | ||
// | ||
int* a; | ||
(void)cudaMalloc(&a, sizeof(int)); | ||
(void)cudaMemset(a, 0, sizeof(int)); | ||
|
||
size_t* b; | ||
(void)cudaMalloc(&b, sizeof(size_t)); | ||
(void)cudaMemset(b, 0, sizeof(size_t)); | ||
|
||
float* c; | ||
(void)cudaMalloc(&c, sizeof(float)); | ||
(void)cudaMemset(c, 0, sizeof(float)); | ||
|
||
double* d; | ||
(void)cudaMalloc(&d, sizeof(double)); | ||
(void)cudaMemset(d, 0, sizeof(double)); | ||
|
||
// | ||
// Set values in device code. | ||
// | ||
cuda_kernel<<<1, 1>>>(a, b, c, d); | ||
|
||
// | ||
// Copy device values back to host and compare with expectations.... | ||
// | ||
int ha; | ||
size_t hb; | ||
float hc; | ||
double hd; | ||
(void)cudaMemcpy(&ha, a, sizeof(int), cudaMemcpyDeviceToHost); | ||
(void)cudaMemcpy(&hb, b, sizeof(size_t), cudaMemcpyDeviceToHost); | ||
(void)cudaMemcpy(&hc, c, sizeof(float), cudaMemcpyDeviceToHost); | ||
(void)cudaMemcpy(&hd, d, sizeof(double), cudaMemcpyDeviceToHost); | ||
|
||
EXPECT_TRUE(ha == axom::numeric_limits<int>::min()); | ||
EXPECT_TRUE(hb == axom::numeric_limits<size_t>::max()); | ||
EXPECT_TRUE(hc == axom::numeric_limits<float>::lowest()); | ||
EXPECT_TRUE(hd == axom::numeric_limits<double>::max()); | ||
} | ||
#endif | ||
|
||
//------------------------------------------------------------------------------ | ||
#if defined(AXOM_USE_HIP) | ||
// | ||
// Tests to ensure axom::numeric_limits type alias does the correct thing | ||
// in host and CUDA device code. | ||
// | ||
|
||
// | ||
// Simple device kernel | ||
// | ||
__global__ void hip_kernel(int* a, size_t* b, float* c, double* d) | ||
{ | ||
a[0] = axom::numeric_limits<int>::min(); | ||
b[0] = axom::numeric_limits<size_t>::max(); | ||
c[0] = axom::numeric_limits<float>::lowest(); | ||
d[0] = axom::numeric_limits<double>::max(); | ||
} | ||
|
||
TEST(core_NumericLimits, check_HIP) | ||
{ | ||
// | ||
// Device memory allocation and initialiation for a few different types. | ||
// | ||
int* a; | ||
(void)hipMalloc(&a, sizeof(int)); | ||
(void)hipMemset(a, 0, sizeof(int)); | ||
|
||
size_t* b; | ||
(void)hipMalloc(&b, sizeof(size_t)); | ||
(void)hipMemset(b, 0, sizeof(size_t)); | ||
|
||
float* c; | ||
(void)hipMalloc(&c, sizeof(float)); | ||
(void)hipMemset(c, 0, sizeof(float)); | ||
|
||
double* d; | ||
(void)hipMalloc(&d, sizeof(double)); | ||
(void)hipMemset(d, 0, sizeof(double)); | ||
|
||
// | ||
// Set values in device code. | ||
// | ||
hip_kernel<<<1, 1>>>(a, b, c, d); | ||
|
||
// | ||
// Copy device values back to host and compare with expectations.... | ||
// | ||
int ha; | ||
size_t hb; | ||
float hc; | ||
double hd; | ||
(void)hipMemcpy(&ha, a, sizeof(int), hipMemcpyDeviceToHost); | ||
(void)hipMemcpy(&hb, b, sizeof(size_t), hipMemcpyDeviceToHost); | ||
(void)hipMemcpy(&hc, c, sizeof(float), hipMemcpyDeviceToHost); | ||
(void)hipMemcpy(&hd, d, sizeof(double), hipMemcpyDeviceToHost); | ||
|
||
EXPECT_TRUE(ha == axom::numeric_limits<int>::min()); | ||
EXPECT_TRUE(hb == axom::numeric_limits<size_t>::max()); | ||
EXPECT_TRUE(hc == axom::numeric_limits<float>::lowest()); | ||
EXPECT_TRUE(hd == axom::numeric_limits<double>::max()); | ||
} | ||
#endif |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would it make sense to remove
if defined(AXOM_DEVICE_CODE)
and just have theAXOM_USE_CUDA
guard? It's my impression thatcuda::std
should work on both the host and device.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If it does, that would be preferable. I need to look into that. @kennyweiss discussed this PR yesterday. That resulted in some concerns about several things in the code that we should discuss as a team. Unfortunately, that will have to wait for a couple of weeks as next Monday is a LLNL holiday and NECDC is the week after that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@publixsubfan
cuda::std
does work in device and host code. I ran into some issues with some Axom tests wherecuda::std::numeric_limits
does not supportlong double
. The intent of my change was to usestd::numeric_limits
in host code for all builds. However, it's not clear to me that we need to supportlong double
. long double is automatically converted to double in device code and attempting to pass long double data between host and device code is problematic.