TRTEXEC Failure when trying to build TensorRT engine from ONNX Model; Error from `graphShapeAnalyzer.cpp` #3846

timf34 · 2024-05-07T14:58:35Z

Error portion of logs:

 Error[4]: [graphShapeAnalyzer.cpp::nvinfer1::builder::`anonymous-namespace'::ShapeAnalyzerImpl::analyzeShapes::2084] Error Code 4: Miscellaneous (ITopKLayer /TopK: /TopK: K exceeds the maximum value allowed (3840).)
[05/07/2024-15:50:04] [E] Engine could not be created from network
[05/07/2024-15:50:04] [E] Building engine failed
[05/07/2024-15:50:04] [E] Failed to create engine from model or file.
[05/07/2024-15:50:04] [E] Engine set up failed
&&&& FAILED TensorRT.trtexec [TensorRT v100001] # C:\Program Files\TensorRT-10.0.1.6\bin\trtexec.exe --onnx=C:\Users\timf3\PycharmProjects\BallNet\footandball_model.onnx --minShapes=input:1x3x1080x1920 --optShapes=input:1x3x1080x1920 --maxShapes=input:1x3x1080x1920 --fp16 --saveEngine=resnet_engine.trt

I am running this on the most recent version of TensorRT (10.0.1), using an NVIDIA GeForce RTX 3050 Ti Laptop GPU on a Windows 11 laptop.

Here is the output from nvcc --version:

PS C:\Program Files\TensorRT-10.0.1.6\bin> nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2022 NVIDIA Corporation
Built on Wed_Sep_21_10:41:10_Pacific_Daylight_Time_2022
Cuda compilation tools, release 11.8, V11.8.89
Build cuda_11.8.r11.8/compiler.31833905_0

Here are the full logs:

PS C:\Program Files\TensorRT-10.0.1.6\bin> ./trtexec.exe --onnx=C:\Users\timf3\PycharmProjects\BallNet\footandball_model.onnx --minShapes=input:1x3x1080x1920 --optShapes=input:1x3x1080x1920 --maxShapes=input:1x3x1080x1920 --fp16 --saveEngine=resnet_engine.trt
&&&& RUNNING TensorRT.trtexec [TensorRT v100001] # C:\Program Files\TensorRT-10.0.1.6\bin\trtexec.exe --onnx=C:\Users\timf3\PycharmProjects\BallNet\footandball_model.onnx --minShapes=input:1x3x1080x1920 --optShapes=input:1x3x1080x1920 --maxShapes=input:1x3x1080x1920 --fp16 --saveEngine=resnet_engine.trt
[05/07/2024-15:49:51] [I] === Model Options ===
[05/07/2024-15:49:51] [I] Format: ONNX
[05/07/2024-15:49:51] [I] Model: C:\Users\timf3\PycharmProjects\BallNet\footandball_model.onnx
[05/07/2024-15:49:51] [I] Output:
[05/07/2024-15:49:51] [I] === Build Options ===
[05/07/2024-15:49:51] [I] Memory Pools: workspace: default, dlaSRAM: default, dlaLocalDRAM: default, dlaGlobalDRAM: default, tacticSharedMem: default
[05/07/2024-15:49:51] [I] avgTiming: 8
[05/07/2024-15:49:51] [I] Precision: FP32+FP16
[05/07/2024-15:49:51] [I] LayerPrecisions:
[05/07/2024-15:49:51] [I] Layer Device Types:
[05/07/2024-15:49:51] [I] Calibration:
[05/07/2024-15:49:51] [I] Refit: Disabled
[05/07/2024-15:49:51] [I] Strip weights: Disabled
[05/07/2024-15:49:51] [I] Version Compatible: Disabled
[05/07/2024-15:49:51] [I] ONNX Plugin InstanceNorm: Disabled
[05/07/2024-15:49:51] [I] TensorRT runtime: full
[05/07/2024-15:49:51] [I] Lean DLL Path:
[05/07/2024-15:49:51] [I] Tempfile Controls: { in_memory: allow, temporary: allow }
[05/07/2024-15:49:51] [I] Exclude Lean Runtime: Disabled
[05/07/2024-15:49:51] [I] Sparsity: Disabled
[05/07/2024-15:49:51] [I] Safe mode: Disabled
[05/07/2024-15:49:51] [I] Build DLA standalone loadable: Disabled
[05/07/2024-15:49:51] [I] Allow GPU fallback for DLA: Disabled
[05/07/2024-15:49:51] [I] DirectIO mode: Disabled
[05/07/2024-15:49:51] [I] Restricted mode: Disabled
[05/07/2024-15:49:51] [I] Skip inference: Disabled
[05/07/2024-15:49:51] [I] Save engine: resnet_engine.trt
[05/07/2024-15:49:51] [I] Load engine:
[05/07/2024-15:49:51] [I] Profiling verbosity: 0
[05/07/2024-15:49:51] [I] Tactic sources: Using default tactic sources
[05/07/2024-15:49:51] [I] timingCacheMode: local
[05/07/2024-15:49:51] [I] timingCacheFile:
[05/07/2024-15:49:51] [I] Enable Compilation Cache: Enabled
[05/07/2024-15:49:51] [I] errorOnTimingCacheMiss: Disabled
[05/07/2024-15:49:51] [I] Preview Features: Use default preview flags.
[05/07/2024-15:49:51] [I] MaxAuxStreams: -1
[05/07/2024-15:49:51] [I] BuilderOptimizationLevel: -1
[05/07/2024-15:49:51] [I] Calibration Profile Index: 0
[05/07/2024-15:49:51] [I] Weight Streaming: Disabled
[05/07/2024-15:49:51] [I] Debug Tensors:
[05/07/2024-15:49:51] [I] Input(s)s format: fp32:CHW
[05/07/2024-15:49:51] [I] Output(s)s format: fp32:CHW
[05/07/2024-15:49:51] [I] Input build shape (profile 0): input=1x3x1080x1920+1x3x1080x1920+1x3x1080x1920
[05/07/2024-15:49:51] [I] Input calibration shapes: model
[05/07/2024-15:49:51] [I] === System Options ===
[05/07/2024-15:49:51] [I] Device: 0
[05/07/2024-15:49:51] [I] DLACore:
[05/07/2024-15:49:51] [I] Plugins:
[05/07/2024-15:49:51] [I] setPluginsToSerialize:
[05/07/2024-15:49:51] [I] dynamicPlugins:
[05/07/2024-15:49:51] [I] ignoreParsedPluginLibs: 0
[05/07/2024-15:49:51] [I]
[05/07/2024-15:49:51] [I] === Inference Options ===
[05/07/2024-15:49:51] [I] Batch: Explicit
[05/07/2024-15:49:51] [I] Input inference shape : input=1x3x1080x1920
[05/07/2024-15:49:51] [I] Iterations: 10
[05/07/2024-15:49:51] [I] Duration: 3s (+ 200ms warm up)
[05/07/2024-15:49:51] [I] Sleep time: 0ms
[05/07/2024-15:49:51] [I] Idle time: 0ms
[05/07/2024-15:49:51] [I] Inference Streams: 1
[05/07/2024-15:49:51] [I] ExposeDMA: Disabled
[05/07/2024-15:49:51] [I] Data transfers: Enabled
[05/07/2024-15:49:51] [I] Spin-wait: Disabled
[05/07/2024-15:49:51] [I] Multithreading: Disabled
[05/07/2024-15:49:51] [I] CUDA Graph: Disabled
[05/07/2024-15:49:51] [I] Separate profiling: Disabled
[05/07/2024-15:49:51] [I] Time Deserialize: Disabled
[05/07/2024-15:49:51] [I] Time Refit: Disabled
[05/07/2024-15:49:51] [I] NVTX verbosity: 0
[05/07/2024-15:49:51] [I] Persistent Cache Ratio: 0
[05/07/2024-15:49:51] [I] Optimization Profile Index: 0
[05/07/2024-15:49:51] [I] Weight Streaming Budget: Disabled
[05/07/2024-15:49:51] [I] Inputs:
[05/07/2024-15:49:51] [I] Debug Tensor Save Destinations:
[05/07/2024-15:49:51] [I] === Reporting Options ===
[05/07/2024-15:49:51] [I] Verbose: Disabled
[05/07/2024-15:49:51] [I] Averages: 10 inferences
[05/07/2024-15:49:51] [I] Percentiles: 90,95,99
[05/07/2024-15:49:51] [I] Dump refittable layers:Disabled
[05/07/2024-15:49:51] [I] Dump output: Disabled
[05/07/2024-15:49:51] [I] Profile: Disabled
[05/07/2024-15:49:51] [I] Export timing to JSON file:
[05/07/2024-15:49:51] [I] Export output to JSON file:
[05/07/2024-15:49:51] [I] Export profile to JSON file:
[05/07/2024-15:49:51] [I]
[05/07/2024-15:49:51] [I] === Device Information ===
[05/07/2024-15:49:51] [I] Available Devices:
[05/07/2024-15:49:51] [I]   Device 0: "NVIDIA GeForce RTX 3050 Ti Laptop GPU" UUID: GPU-173983d4-c1c9-ad3a-5330-e883c4542db5
[05/07/2024-15:49:51] [I] Selected Device: NVIDIA GeForce RTX 3050 Ti Laptop GPU
[05/07/2024-15:49:51] [I] Selected Device ID: 0
[05/07/2024-15:49:51] [I] Selected Device UUID: GPU-173983d4-c1c9-ad3a-5330-e883c4542db5
[05/07/2024-15:49:51] [I] Compute Capability: 8.6
[05/07/2024-15:49:51] [I] SMs: 20
[05/07/2024-15:49:51] [I] Device Global Memory: 4095 MiB
[05/07/2024-15:49:51] [I] Shared Memory per SM: 100 KiB
[05/07/2024-15:49:51] [I] Memory Bus Width: 128 bits (ECC disabled)
[05/07/2024-15:49:51] [I] Application Compute Clock Rate: 1.035 GHz
[05/07/2024-15:49:51] [I] Application Memory Clock Rate: 5.501 GHz
[05/07/2024-15:49:51] [I]
[05/07/2024-15:49:51] [I] Note: The application clock rates do not reflect the actual clock rates that the GPU is currently running at.
[05/07/2024-15:49:51] [I]
[05/07/2024-15:49:51] [I] TensorRT version: 10.0.1
[05/07/2024-15:49:51] [I] Loading standard plugins
[05/07/2024-15:49:51] [I] [TRT] [MemUsageChange] Init CUDA: CPU +92, GPU +0, now: CPU 22297, GPU 792 (MiB)
[05/07/2024-15:50:04] [I] [TRT] [MemUsageChange] Init builder kernel library: CPU +2601, GPU +310, now: CPU 25624, GPU 1102 (MiB)
[05/07/2024-15:50:04] [I] Start parsing network model.
[05/07/2024-15:50:04] [I] [TRT] ----------------------------------------------------------------
[05/07/2024-15:50:04] [I] [TRT] Input filename:   C:\Users\timf3\PycharmProjects\BallNet\footandball_model.onnx
[05/07/2024-15:50:04] [I] [TRT] ONNX IR version:  0.0.6
[05/07/2024-15:50:04] [I] [TRT] Opset version:    11
[05/07/2024-15:50:04] [I] [TRT] Producer name:    pytorch
[05/07/2024-15:50:04] [I] [TRT] Producer version: 2.0.1
[05/07/2024-15:50:04] [I] [TRT] Domain:
[05/07/2024-15:50:04] [I] [TRT] Model version:    0
[05/07/2024-15:50:04] [I] [TRT] Doc string:
[05/07/2024-15:50:04] [I] [TRT] ----------------------------------------------------------------
[05/07/2024-15:50:04] [W] [TRT] ModelImporter.cpp:680: Make sure output 614 has Int64 binding.
[05/07/2024-15:50:04] [I] Finished parsing network model. Parse time: 0.0570069
[05/07/2024-15:50:04] [I] Set shape of input tensor input for optimization profile 0 to: MIN=1x3x1080x1920 OPT=1x3x1080x1920 MAX=1x3x1080x1920
[05/07/2024-15:50:04] [E] Error[4]: [graphShapeAnalyzer.cpp::nvinfer1::builder::`anonymous-namespace'::ShapeAnalyzerImpl::analyzeShapes::2084] Error Code 4: Miscellaneous (ITopKLayer /TopK: /TopK: K exceeds the maximum value allowed (3840).)
[05/07/2024-15:50:04] [E] Engine could not be created from network
[05/07/2024-15:50:04] [E] Building engine failed
[05/07/2024-15:50:04] [E] Failed to create engine from model or file.
[05/07/2024-15:50:04] [E] Engine set up failed
&&&& FAILED TensorRT.trtexec [TensorRT v100001] # C:\Program Files\TensorRT-10.0.1.6\bin\trtexec.exe --onnx=C:\Users\timf3\PycharmProjects\BallNet\footandball_model.onnx --minShapes=input:1x3x1080x1920 --optShapes=input:1x3x1080x1920 --maxShapes=input:1x3x1080x1920 --fp16 --saveEngine=resnet_engine.trt

I'm not sure how to fix this or how to debug it. I have found the TopK layer using Netron, but I can't work out which part of my network it corresponds to (I am using a custom CNN architecture).
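One way to tie the TopK node back to the network is to list the nodes that feed it and the nodes that consume its outputs, since their names usually reflect the PyTorch module path. The following is a minimal sketch, not a definitive tool: `describe_topk_nodes` and `describe_onnx_file` are hypothetical helper names, the demo graph and its tensor names are made up for illustration, and loading a real file assumes the `onnx` package is installed.

```python
from types import SimpleNamespace


def describe_topk_nodes(nodes):
    """For each TopK node, list the nodes that feed it and the nodes that use its outputs."""
    nodes = list(nodes)
    # Map every tensor name to the node that produces it.
    producer = {out: n for n in nodes for out in n.output}
    report = []
    for n in nodes:
        if n.op_type != "TopK":
            continue
        fed_by = [
            f"{producer[i].op_type}:{producer[i].name}"
            for i in n.input
            if i in producer
        ]
        used_by = [
            f"{m.op_type}:{m.name}"
            for m in nodes
            if any(o in m.input for o in n.output)
        ]
        report.append({"name": n.name, "fed_by": fed_by, "used_by": used_by})
    return report


def describe_onnx_file(path):
    """Load an ONNX file and describe its TopK nodes (requires the `onnx` package)."""
    import onnx  # imported lazily so describe_topk_nodes works without it

    model = onnx.load(path)
    return describe_topk_nodes(model.graph.node)


# Stand-in graph with made-up node and tensor names, only to show the report format.
demo = [
    SimpleNamespace(op_type="Reshape", name="/Reshape", input=["conf"], output=["flat"]),
    SimpleNamespace(op_type="Constant", name="/Constant", input=[], output=["k"]),
    SimpleNamespace(op_type="TopK", name="/TopK", input=["flat", "k"], output=["vals", "idx"]),
    SimpleNamespace(op_type="Gather", name="/Gather", input=["boxes", "idx"], output=["out"]),
]
print(describe_topk_nodes(demo))
```

On the real model this would be called as `describe_onnx_file("footandball_model.onnx")`. The neighbouring node names should point to the `topk` call in the PyTorch code that produced them.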


lix19937 · 2024-05-10T16:13:26Z

/TopK: K exceeds the maximum value allowed (3840)

The TopK layer's K is greater than 3840, so the build fails.

zerollzeng · 2024-05-12T07:15:12Z

It's a known limitation, and we are actively working on removing it.
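Until the limitation is lifted, one possible workaround (an assumption on my part, not something confirmed in this thread) is to cap K at the reported limit before exporting the model, for example `torch.topk(scores, min(k, 3840))` in the PyTorch code. Whether dropping the lower-ranked candidates is acceptable depends on the model. The pure-Python sketch below only illustrates the capping logic; `capped_topk` and `TRT_TOPK_MAX_K` are hypothetical names.

```python
import heapq

# Limit taken from the error message: "K exceeds the maximum value allowed (3840)".
TRT_TOPK_MAX_K = 3840


def capped_topk(scores, k, max_k=TRT_TOPK_MAX_K):
    """Return up to min(k, max_k) (score, index) pairs, highest score first."""
    k = min(k, max_k, len(scores))
    return heapq.nlargest(k, ((s, i) for i, s in enumerate(scores)))


# Requesting 4000 candidates from 5000 scores yields only 3840 after capping.
top = capped_topk(list(range(5000)), 4000)
print(len(top), top[0])
```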

zerollzeng self-assigned this May 12, 2024

zerollzeng added the "triaged" label (Issue has been triaged by maintainers) May 12, 2024
