Unsupported SM: 0x601 failure of TensorRT 10.0.1 on 1080ti #3826

Open
ZanderFoster opened this issue Apr 25, 2024 · 2 comments

Comments

@ZanderFoster

ZanderFoster commented Apr 25, 2024

Description

UserWarning: Plan failed with a cudnnException: CUDNN_BACKEND_EXECUTION_PLAN_DESCRIPTOR: cudnnFinalize Descriptor Failed cudnn_status: CUDNN_STATUS_NOT_SUPPORTED (Triggered internally at ..\aten\src\ATen\native\cudnn\Conv_v8.cpp:919.)
return F.conv2d(input, weight, bias, self.stride,
Model summary (fused): 168 layers, 3006233 parameters, 0 gradients, 8.1 GFLOPs

PyTorch: starting from 'Models\Realtime\v8n_v2.pt' with input shape (1, 3, 416, 416) BCHW and output shape(s) (1, 7, 3549) (5.9 MB)

ONNX: starting export with onnx 1.16.0 opset 17...
ONNX: simplifying with onnxsim 0.4.36...
ONNX: export success ✅ 1.1s, saved as 'Models\Realtime\v8n_v2.onnx' (11.6 MB)

TensorRT: starting export with TensorRT 10.0.1...
[04/25/2024-15:52:53] [TRT] [I] [MemUsageChange] Init CUDA: CPU +3, GPU +0, now: CPU 12063, GPU 1801 (MiB)
[04/25/2024-15:52:54] [TRT] [I] [MemUsageChange] Init builder kernel library: CPU +214, GPU +0, now: CPU 12464, GPU 1801 (MiB)
[04/25/2024-15:52:54] [TRT] [W] CUDA lazy loading is not enabled. Enabling it can significantly reduce device memory usage and speed up TensorRT initialization. See "Lazy Loading" section of CUDA documentation https://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html#lazy-loading
[04/25/2024-15:52:54] [TRT] [I] ----------------------------------------------------------------
[04/25/2024-15:52:54] [TRT] [I] Input filename: Models\Realtime\v8n_v2.onnx
[04/25/2024-15:52:54] [TRT] [I] ONNX IR version: 0.0.8
[04/25/2024-15:52:54] [TRT] [I] Opset version: 17
[04/25/2024-15:52:54] [TRT] [I] Producer name: pytorch
[04/25/2024-15:52:54] [TRT] [I] Producer version: 2.3.0
[04/25/2024-15:52:54] [TRT] [I] Domain:
[04/25/2024-15:52:54] [TRT] [I] Model version: 0
[04/25/2024-15:52:54] [TRT] [I] Doc string:
[04/25/2024-15:52:54] [TRT] [I] ----------------------------------------------------------------
TensorRT: input "images" with shape(1, 3, 416, 416) DataType.FLOAT
TensorRT: output "output0" with shape(1, 7, 3549) DataType.FLOAT
TensorRT: building FP32 engine as Models\Realtime\v8n_v2.engine
[04/25/2024-15:52:54] [TRT] [I] BuilderFlag::kTF32 is set but hardware does not support TF32. Disabling TF32.
[04/25/2024-15:52:54] [TRT] [I] BuilderFlag::kTF32 is set but hardware does not support TF32. Disabling TF32.
[04/25/2024-15:52:54] [TRT] [I] Local timing cache in use. Profiling results in this builder pass will not be stored.
[04/25/2024-15:52:54] [TRT] [E] 1: Unsupported SM: 0x601
[04/25/2024-15:52:54] [TRT] [E] 1: [caskUtils.cpp::nvinfer1::rt::task::trtSmToCask::193] Error Code 1: Internal Error (Unsupported SM: 0x601)

line 722, in export_engine
with build(network, config) as engine, open(f, "wb") as t:
TypeError: 'NoneType' object does not support the context manager protocol

TensorRT Version: 10.0.1

NVIDIA GPU: 1080ti

NVIDIA Driver Version: whatever installs with cuda

CUDA Version: 11.8

CUDNN Version: 8.5.0

Operating System: windows

Python Version (if applicable): 3.11
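
Editor's note: the `TypeError` at the end of the log above is a follow-on symptom. `build(network, config)` returned `None` after TensorRT logged the "Unsupported SM" error, and the `with` statement then failed on that `None`. A minimal sketch of a guard that surfaces the real failure is below. `build_or_raise` and `failing_build` are hypothetical names used only for illustration; they are not part of TensorRT or of the export script in the traceback.

```python
def build_or_raise(build, network, config):
    """Call a builder function and fail loudly if it returns None."""
    engine = build(network, config)
    if engine is None:
        # The real cause (here "Unsupported SM: 0x601") is in the [TRT] [E] log lines.
        raise RuntimeError(
            "TensorRT engine build failed; check the [TRT] [E] lines in the log"
        )
    return engine


# Hypothetical stand-in for the real builder call, which always fails here.
def failing_build(network, config):
    return None


try:
    build_or_raise(failing_build, network=None, config=None)
except RuntimeError as err:
    print(err)
```

With a guard like this, the export would stop with a message pointing at the TensorRT error log instead of the context manager `TypeError`.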

@lix19937

lix19937 commented Apr 27, 2024

[04/25/2024-15:52:54] [TRT] [E] 1: Unsupported SM: 0x601
[04/25/2024-15:52:54] [TRT] [E] 1: [caskUtils.cpp::nvinfer1::rt::task::trtSmToCask::193] Error Code 1: Internal Error (Unsupported SM: 0x601)

'Unsupported SM' means that TensorRT 10.0.1 doesn't support the GTX 1080 Ti's SM 6.1 (Pascal architecture). You may downgrade TensorRT to version 9.1.0 or 8.5.

There was an up to 28% performance regression compared to TensorRT 8.5 on Transformer networks in FP16 precision on NVIDIA Volta GPUs, and up to 85% performance regression on NVIDIA Pascal GPUs. Disabling the kDISABLE_EXTERNAL_TACTIC_SOURCES_FOR_CORE_0805 preview flag was a workaround. This issue has been fixed.

Ref: https://docs.nvidia.com/deeplearning/tensorrt/release-notes/index.html#rel-8-6-1
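
Editor's note: the reading of `0x601` as SM 6.1 can be sketched in a few lines. The encoding used here (major version above the low byte, minor version in the low byte) is an assumption that matches the interpretation given in this comment; it is not taken from TensorRT documentation. `decode_sm` and `is_supported` are hypothetical helpers, and the `(7, 0)` minimum is a placeholder, so take the real value from the support matrix of the TensorRT release you install.

```python
def decode_sm(code: int) -> tuple[int, int]:
    """Split an SM code such as 0x601 into (major, minor)."""
    # Assumed layout: major version in the bits above the low byte, minor in the low byte.
    return code >> 8, code & 0xFF


def is_supported(capability: tuple[int, int], minimum: tuple[int, int]) -> bool:
    """Compare a (major, minor) compute capability against a required minimum."""
    return tuple(capability) >= tuple(minimum)


major, minor = decode_sm(0x601)
print(f"SM {major}.{minor}")  # prints "SM 6.1"

# (7, 0) is a placeholder minimum, not a value stated in this issue.
print(is_supported((major, minor), minimum=(7, 0)))  # prints "False"
```

On a machine with PyTorch installed, `torch.cuda.get_device_capability()` returns the same kind of `(major, minor)` tuple for the active GPU, so a check like this could run before the export starts.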

@zerollzeng zerollzeng self-assigned this Apr 28, 2024
@zerollzeng zerollzeng added the triaged Issue has been triaged by maintainers label Apr 28, 2024
@zerollzeng
Collaborator

Checking internally.
