Summary
oneDNN validation for Nvidia backend hits a correctness issue on forward linear resampling for specific shapes under benchdnn.
Version
Latest master.
Environment
Hardware:
NVIDIA A100 80GB PCIe
(A10 should also work for most cases).
Software
SYCL Compiler with Nvidia support.
Any version that compiles without issues, preferably no later than April.
[Optional] TBB
Any version.
[Optional] OpenCL CPU
Latest version is preferable.
Optional means that the CPU backend can be enabled if the dependency is satisfied; otherwise, it should be switched off.
Steps to reproduce
Build
mkdir -p build
cd build
cmake .. -DCMAKE_BUILD_TYPE=release -DDNNL_CPU_RUNTIME=DPCPP -DDNNL_GPU_RUNTIME=DPCPP -DDNNL_GPU_VENDOR=NVIDIA -DONEDNN_BUILD_GRAPH=OFF
(Use -DCMAKE_BUILD_TYPE=debug for a debug build, and -DDNNL_CPU_RUNTIME=NONE to switch the CPU backend off.)
cmake --build . --target benchdnn
Run
<env_vars> ./build/tests/benchdnn/benchdnn --resampling --engine=gpu --alg=linear ic32iw151ow300
For a full suite validation, use --batch=test_resampling_gpu instead of a specific test case.
Helper env vars:
CUDA_LOGINFO_DBG=1 CUDA_LOGDEST_DBG=stdout -- enables cuda API dump
CUDNN_LOGINFO_DBG=1 CUDNN_LOGDEST_DBG=stdout -- enables cudnn API dump
DNNL_VERBOSE=all (or desired level) -- enables oneDNN execution information
Helper tips:
benchdnn supports verbosity through -vX. Most info is available at v6. It's possible to dump destination with -v99 when really needed.
benchdnn documentation is here: https://github.com/oneapi-src/oneDNN/tree/master/tests/benchdnn (scroll down). Reorder doc and others may be found through links.
The benchdnn binary also supports the --help option, which suggests using --bnorm --help to dump all supported options.
Observed behavior
Failures are reproducible within a single run; there are 55 failures of a similar nature in total.
The nature of the mismatch is unknown. Since both diff and rdiff are high for fp32, the cause needs to be investigated. My only hypothesis is that the original grid is not aligned between oneDNN and cuDNN. To check this, I suggest filling the resampling source tensor with a single value and observing how that affects the output. Further investigation can build on the findings from this experiment.
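To illustrate why a single-value fill is a useful probe, here is a minimal standalone sketch in plain Python. It is not taken from oneDNN or cuDNN. It compares two common coordinate conventions for 1-D linear resampling, "half-pixel centers" and "align corners", on the widths from the failing case (iw=151, ow=300). Which convention each library actually uses here is an assumption to be verified, and the helper names are hypothetical.

```python
def linear_resample_1d(src, ow, align_corners):
    """Resample a 1-D list to `ow` points with linear interpolation.

    Hypothetical helper: not the oneDNN or cuDNN implementation.
    """
    iw = len(src)
    dst = []
    for o in range(ow):
        if align_corners:
            # First and last source/destination points coincide.
            x = o * (iw - 1) / (ow - 1) if ow > 1 else 0.0
        else:
            # Half-pixel centers: sample at the center of each output pixel.
            x = (o + 0.5) * iw / ow - 0.5
        x = min(max(x, 0.0), iw - 1.0)  # clamp to the valid source range
        lo = int(x)
        hi = min(lo + 1, iw - 1)
        w = x - lo
        dst.append((1.0 - w) * src[lo] + w * src[hi])
    return dst


def max_abs_diff(a, b):
    return max(abs(x - y) for x, y in zip(a, b))


IW, OW = 151, 300  # widths from the ic32iw151ow300 test case

ramp = [float(i) for i in range(IW)]  # varying fill exposes grid misalignment
const = [3.0] * IW                    # single-value fill hides it

ramp_diff = max_abs_diff(linear_resample_1d(ramp, OW, False),
                         linear_resample_1d(ramp, OW, True))
const_diff = max_abs_diff(linear_resample_1d(const, OW, False),
                          linear_resample_1d(const, OW, True))

print("ramp fill, max abs diff:", ramp_diff)
print("const fill, max abs diff:", const_diff)
```

If the real mismatch disappears with a single-value fill but persists with varying data, that is consistent with a grid (coordinate mapping) difference. If it persists even with a single-value fill, the cause is more likely elsewhere, for example in weight normalization or boundary handling.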
Expected behavior
The issue does not appear in single-run validation or under the full batch.