Inconsistent results between TensorRT and ONNX #3850

Open
Sukeysun opened this issue May 9, 2024 · 2 comments
Labels: triaged (Issue has been triaged by maintainers)

Sukeysun commented May 9, 2024

Description

When converting MediaPipe's 'pose_detection' ONNX model to TensorRT, I observed a significant loss of accuracy: the converted engine does not reproduce the original model's outputs within the expected precision.

Environment

TensorRT Version: 8.6.1

CUDA Version: 11.8

Relevant Files

Model link:
https://storage.googleapis.com/ailia-models/blazepose-fullbody/pose_detection.onnx

Steps To Reproduce

  1. Download the ONNX model from the provided link.
  2. Run the following Polygraphy script to compare the ONNX Runtime output with the TensorRT output:
from polygraphy.backend.onnx import BytesFromOnnx, ModifyOutputs as ModifyOnnxOutputs, OnnxFromPath
from polygraphy.backend.onnxrt import OnnxrtRunner, SessionFromOnnx
from polygraphy.backend.trt import EngineBytesFromNetwork, EngineFromBytes, ModifyNetworkOutputs, NetworkFromOnnxPath, TrtRunner
from polygraphy.comparator import Comparator, CompareFunc
from polygraphy.exception import PolygraphyException
import numpy as np

input_data_path = './test.npy'
onnx_path = './pose_detection.onnx'
save_inputs_path = './inputs.json'
onnx_output_path ='./onnx_outputs.json'
# Data Loader: feed the saved test input, adding a batch dimension.
# Note: the feed-dict key ("input_name") does not match the model's actual
# input name ("input_1"), and the saved array is uint8 rather than float32;
# Polygraphy warns about both and casts the buffer (see the log below).
data_loader = []
np_array = np.load(input_data_path)[np.newaxis, :]
data_loader.append({"input_name": np_array})
# Loaders
parse_network_from_onnx = NetworkFromOnnxPath(onnx_path)
set_network_outputs = ModifyNetworkOutputs(parse_network_from_onnx)
build_engine = EngineBytesFromNetwork(set_network_outputs)
deserialize_engine = EngineFromBytes(build_engine)
load_onnx = OnnxFromPath(onnx_path)
modify_outputs = ModifyOnnxOutputs(load_onnx)
serialize_onnx = BytesFromOnnx(modify_outputs)
build_onnxrt_session = SessionFromOnnx(serialize_onnx)

# Runners
runners = [
    TrtRunner(deserialize_engine),
    OnnxrtRunner(build_onnxrt_session),
]

# Runner Execution
results = Comparator.run(runners, data_loader=data_loader, save_inputs_path=save_inputs_path)

# Save results
results.save(onnx_output_path)

success = True
# Accuracy Comparison
compare_func = CompareFunc.simple(rtol={'': 1e-04}, atol={'': 1e-04})
success &= bool(Comparator.compare_accuracy(results, compare_func=compare_func))

# Report Results
if not success:
    raise PolygraphyException('FAILED')
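
For reference, roughly the same comparison can be driven from the Polygraphy CLI. This is only a sketch: it uses Polygraphy's default (random) input data rather than test.npy, so the exact numbers will differ, but the tolerances match the script above.

polygraphy run pose_detection.onnx --trt --onnxrt \
    --atol 1e-4 --rtol 1e-4 \
    --save-inputs inputs.json --save-outputs onnx_outputs.json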

The full Polygraphy output is as follows:


[I] trt-runner-N0-05/09/24-17:56:13     | Activating and starting inference
[W] onnx2trt_utils.cpp:374: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.
[I] Configuring with profiles:[
        Profile 0:
            {input_1 [min=[1, 224, 224, 3], opt=[1, 224, 224, 3], max=[1, 224, 224, 3]]}
    ]
[I] Building engine with configuration:
    Flags                  | []
    Engine Capability      | EngineCapability.DEFAULT
    Memory Pools           | [WORKSPACE: 16070.94 MiB, TACTIC_DRAM: 16070.94 MiB]
    Tactic Sources         | [CUBLAS, CUBLAS_LT, CUDNN, EDGE_MASK_CONVOLUTIONS, JIT_CONVOLUTIONS]
    Profiling Verbosity    | ProfilingVerbosity.DETAILED
    Preview Features       | [FASTER_DYNAMIC_SHAPES_0805, DISABLE_EXTERNAL_TACTIC_SOURCES_FOR_CORE_0805]
[I] Finished engine building in 24.815 seconds
[I] Saving inference input data to ./inputs.json
[W] Input tensor: input_1 | Buffer name (input_name) does not match expected input name (input_1).
[W] Input tensor: input_1 | Buffer dtype (uint8) does not match expected input dtype (float32), attempting to cast. 
[I] trt-runner-N0-05/09/24-17:56:13    
    ---- Inference Input(s) ----
    {input_1 [dtype=float32, shape=(1, 224, 224, 3)]}
[I] trt-runner-N0-05/09/24-17:56:13    
    ---- Inference Output(s) ----
    {Identity [dtype=float32, shape=(1, 2254, 12)],
     Identity_1 [dtype=float32, shape=(1, 2254, 1)]}
[I] trt-runner-N0-05/09/24-17:56:13     | Completed 1 iteration(s) in 2.487 ms | Average inference time: 2.487 ms.
[I] onnxrt-runner-N0-05/09/24-17:56:13  | Activating and starting inference
[I] Loading model: /home/ai-server/Downloads/pose_detection.onnx
[I] Creating ONNX-Runtime Inference Session with providers: ['CPUExecutionProvider']
[W] Input tensor: input_1 | Buffer name (input_name) does not match expected input name (input_1).
[W] Input tensor: input_1 | Buffer dtype (uint8) does not match expected input dtype (float32), attempting to cast. 
[I] onnxrt-runner-N0-05/09/24-17:56:13 
    ---- Inference Input(s) ----
    {input_1 [dtype=float32, shape=(1, 224, 224, 3)]}
[I] onnxrt-runner-N0-05/09/24-17:56:13 
    ---- Inference Output(s) ----
    {Identity [dtype=float32, shape=(1, 2254, 12)],
     Identity_1 [dtype=float32, shape=(1, 2254, 1)]}
[I] onnxrt-runner-N0-05/09/24-17:56:13  | Completed 1 iteration(s) in 3.367 ms | Average inference time: 3.367 ms.
[I] Saving inference results to ./onnx_outputs.json
[I] Accuracy Comparison | trt-runner-N0-05/09/24-17:56:13 vs. onnxrt-runner-N0-05/09/24-17:56:13
[I]     Comparing Output: 'Identity' (dtype=float32, shape=(1, 2254, 12)) with 'Identity' (dtype=float32, shape=(1, 2254, 12))
[I]         Tolerance: [abs=0.0001, rel=0.0001] | Checking elemwise error
[I]         trt-runner-N0-05/09/24-17:56:13: Identity | Stats: mean=25.786, std-dev=105.89, var=11212, median=11.683, min=-725.47 at (0, 2068, 11), max=889.33 at (0, 2198, 2), avg-magnitude=53.075
[I]             ---- Histogram ----
                Bin Range      |  Num Elems | Visualization
                (-725 , -564 ) |         17 | 
                (-564 , -403 ) |         21 | 
                (-403 , -241 ) |        106 | 
                (-241 , -79.6) |       1045 | #
                (-79.6, 81.9 ) |      23028 | ########################################
                (81.9 , 243  ) |       2159 | ###
                (243  , 405  ) |        199 | 
                (405  , 566  ) |        150 | 
                (566  , 728  ) |        183 | 
                (728  , 889  ) |        140 | 
[I]         onnxrt-runner-N0-05/09/24-17:56:13: Identity | Stats: mean=25.786, std-dev=105.89, var=11212, median=11.683, min=-725.47 at (0, 2068, 11), max=889.33 at (0, 2198, 2), avg-magnitude=53.075
[I]             ---- Histogram ----
                Bin Range      |  Num Elems | Visualization
                (-725 , -564 ) |         17 | 
                (-564 , -403 ) |         21 | 
                (-403 , -241 ) |        106 | 
                (-241 , -79.6) |       1045 | #
                (-79.6, 81.9 ) |      23028 | ########################################
                (81.9 , 243  ) |       2159 | ###
                (243  , 405  ) |        199 | 
                (405  , 566  ) |        150 | 
                (566  , 728  ) |        183 | 
                (728  , 889  ) |        140 | 
[I]         Error Metrics: Identity
[I]             Minimum Required Tolerance: elemwise error | [abs=0.0016785] OR [rel=0.022601] (requirements may be lower if both abs/rel tolerances are set)
[I]             Absolute Difference | Stats: mean=5.8989e-05, std-dev=9.3465e-05, var=8.7357e-09, median=2.6703e-05, min=0 at (0, 0, 8), max=0.0016785 at (0, 1643, 2), avg-magnitude=5.8989e-05
[I]                 ---- Histogram ----
                    Bin Range            |  Num Elems | Visualization
                    (0       , 0.000168) |      24712 | ########################################
                    (0.000168, 0.000336) |       1723 | ##
                    (0.000336, 0.000504) |        416 | 
                    (0.000504, 0.000671) |        118 | 
                    (0.000671, 0.000839) |         41 | 
                    (0.000839, 0.00101 ) |         25 | 
                    (0.00101 , 0.00117 ) |          5 | 
                    (0.00117 , 0.00134 ) |          3 | 
                    (0.00134 , 0.00151 ) |          2 | 
                    (0.00151 , 0.00168 ) |          3 | 
[I]             Relative Difference | Stats: mean=8.9918e-06, std-dev=0.00021124, var=4.4621e-08, median=1.0973e-06, min=0 at (0, 0, 8), max=0.022601 at (0, 120, 0), avg-magnitude=8.9918e-06
[I]                 ---- Histogram ----
                    Bin Range          |  Num Elems | Visualization
                    (0      , 0.00226) |      27038 | ########################################
                    (0.00226, 0.00452) |          5 | 
                    (0.00452, 0.00678) |          2 | 
                    (0.00678, 0.00904) |          0 | 
                    (0.00904, 0.0113 ) |          1 | 
                    (0.0113 , 0.0136 ) |          0 | 
                    (0.0136 , 0.0158 ) |          0 | 
                    (0.0158 , 0.0181 ) |          0 | 
                    (0.0181 , 0.0203 ) |          0 | 
                    (0.0203 , 0.0226 ) |          2 | 
[E]         FAILED | Output: 'Identity' | Difference exceeds tolerance (rel=0.0001, abs=0.0001)
[I]     Comparing Output: 'Identity_1' (dtype=float32, shape=(1, 2254, 1)) with 'Identity_1' (dtype=float32, shape=(1, 2254, 1))
[I]         Tolerance: [abs=0.0001, rel=0.0001] | Checking elemwise error
[I]         trt-runner-N0-05/09/24-17:56:13: Identity_1 | Stats: mean=-555.49, std-dev=1091, var=1.1904e+06, median=-80.048, min=-4176.4 at (0, 2120, 0), max=-1.0668 at (0, 1632, 0), avg-magnitude=555.49
[I]         onnxrt-runner-N0-05/09/24-17:56:13: Identity_1 | Stats: mean=-555.49, std-dev=1091, var=1.1904e+06, median=-80.048, min=-4176.4 at (0, 2120, 0), max=-1.0667 at (0, 1632, 0), avg-magnitude=555.49
[I]         Error Metrics: Identity_1
[I]             Minimum Required Tolerance: elemwise error | [abs=0.0070801] OR [rel=2.9114e-05] (requirements may be lower if both abs/rel tolerances are set)
[I]             Absolute Difference | Stats: mean=0.00019007, std-dev=0.00042142, var=1.7759e-07, median=4.9591e-05, min=0 at (0, 0, 0), max=0.0070801 at (0, 1642, 0), avg-magnitude=0.00019007
[I]             Relative Difference | Stats: mean=1.2865e-06, std-dev=2.2594e-06, var=5.1048e-12, median=5.3635e-07, min=0 at (0, 0, 0), max=2.9114e-05 at (0, 651, 0), avg-magnitude=1.2865e-06
[I]         PASSED | Output: 'Identity_1' | Difference is within tolerance (rel=0.0001, abs=0.0001)
[E]     FAILED | Mismatched outputs: ['Identity']
[E] Accuracy Summary | trt-runner-N0-05/09/24-17:56:13 vs. onnxrt-runner-N0-05/09/24-17:56:13 | Passed: 0/1 iterations | Pass Rate: 0.0%
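
Reading the summary above: with the elemwise check, the 'Minimum Required Tolerance' line for 'Identity' indicates the comparison would pass at roughly atol >= 0.0016785 or rtol >= 0.022601 (or lower values if both are raised together), whereas the configured atol = rtol = 1e-4 is exceeded, hence the FAILED result for that output.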
lix19937 commented May 10, 2024

[I] Absolute Difference | Stats: mean=5.8989e-05, std-dev=9.3465e-05, var=8.7357e-09, median=2.6703e-05, min=0 at (0, 0, 8), max=0.0016785 at (0, 1643, 2), avg-magnitude=5.8989e-05

max=0.0016785

You can refer to https://github.com/NVIDIA/TensorRT/tree/main/tools/Polygraphy/examples/cli/debug/02_reducing_failing_onnx_models

polygraphy run pose_detection.onnx --onnxrt \
    --save-inputs inputs.json \
    --onnx-outputs mark all --save-outputs layerwise_golden.json

polygraphy run pose_detection.onnx --trt \
    --validate --trt-outputs mark all --save-outputs trt_out.json

Then compare the outputs layer by layer.
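A possible way to do that comparison (a sketch based on the linked example; intermediate outputs that TensorRT fuses away may simply be skipped during matching) is to replay the saved ONNX Runtime results as the golden reference while marking all TensorRT outputs:

polygraphy run pose_detection.onnx --trt --trt-outputs mark all \
    --load-inputs inputs.json --load-outputs layerwise_golden.json \
    --atol 1e-4 --rtol 1e-4

The first intermediate output that exceeds tolerance points at the layer where the TensorRT engine begins to diverge.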

zerollzeng added the triaged label May 12, 2024
zerollzeng (Collaborator) commented:
Please also try the latest TRT release.
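
A quick way to re-check against a newer release (a sketch; it assumes a pip-based TensorRT install on Linux, so adjust to your installation method):

python3 -m pip install --upgrade tensorrt
polygraphy run pose_detection.onnx --trt --onnxrt --atol 1e-4 --rtol 1e-4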
