Inconsistent results between TensorRT and ONNX #3850

Open
Sukeysun opened this issue May 9, 2024 · 2 comments
Labels: triaged (Issue has been triaged by maintainers)

Sukeysun commented May 9, 2024

Description

When converting MediaPipe's 'pose_detection' ONNX model to TensorRT, I observed a significant loss of accuracy: the converted engine does not reproduce the original model's outputs within the expected precision.

Environment

TensorRT Version: 8.6.1

CUDA Version: 11.8

Relevant Files

Model link:
https://storage.googleapis.com/ailia-models/blazepose-fullbody/pose_detection.onnx

Steps To Reproduce

  1. Download the ONNX model from the provided link.
  2. Run the following Polygraphy script to compare the ONNX Runtime output with the TensorRT output:
from polygraphy.backend.onnx import BytesFromOnnx, ModifyOutputs as ModifyOnnxOutputs, OnnxFromPath
from polygraphy.backend.onnxrt import OnnxrtRunner, SessionFromOnnx
from polygraphy.backend.trt import EngineBytesFromNetwork, EngineFromBytes, ModifyNetworkOutputs, NetworkFromOnnxPath, TrtRunner
from polygraphy.comparator import Comparator, CompareFunc
from polygraphy.exception import PolygraphyException
import numpy as np

input_data_path = './test.npy'
onnx_path = './pose_detection.onnx'
save_inputs_path = './inputs.json'
onnx_output_path ='./onnx_outputs.json'
# Data Loader: feed the saved test input, adding a batch dimension.
# Note: the feed-dict key ("input_name") does not match the model's actual
# input name ("input_1"), and the saved array is uint8 rather than float32;
# Polygraphy warns about both and casts the buffer (see the log below).
data_loader = []
np_array = np.load(input_data_path)[np.newaxis, :]
data_loader.append({"input_name": np_array})
# Loaders
parse_network_from_onnx = NetworkFromOnnxPath(onnx_path)
set_network_outputs = ModifyNetworkOutputs(parse_network_from_onnx)
build_engine = EngineBytesFromNetwork(set_network_outputs)
deserialize_engine = EngineFromBytes(build_engine)
load_onnx = OnnxFromPath(onnx_path)
modify_outputs = ModifyOnnxOutputs(load_onnx)
serialize_onnx = BytesFromOnnx(modify_outputs)
build_onnxrt_session = SessionFromOnnx(serialize_onnx)

# Runners
runners = [
    TrtRunner(deserialize_engine),
    OnnxrtRunner(build_onnxrt_session),
]

# Runner Execution
results = Comparator.run(runners, data_loader=data_loader, save_inputs_path=save_inputs_path)

# Save results
results.save(onnx_output_path)

success = True
# Accuracy Comparison
compare_func = CompareFunc.simple(rtol={'': 1e-04}, atol={'': 1e-04})
success &= bool(Comparator.compare_accuracy(results, compare_func=compare_func))

# Report Results
if not success:
    raise PolygraphyException('FAILED')
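
For reference, roughly the same comparison can be driven from the Polygraphy CLI. This is only a sketch: it uses Polygraphy's default (random) input data rather than test.npy, so the exact numbers will differ, but the tolerances match the script above.

polygraphy run pose_detection.onnx --trt --onnxrt \
    --atol 1e-4 --rtol 1e-4 \
    --save-inputs inputs.json --save-outputs onnx_outputs.json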

The full Polygraphy output is as follows:


[I] trt-runner-N0-05/09/24-17:56:13     | Activating and starting inference
[W] onnx2trt_utils.cpp:374: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.
[I] Configuring with profiles:[
        Profile 0:
            {input_1 [min=[1, 224, 224, 3], opt=[1, 224, 224, 3], max=[1, 224, 224, 3]]}
    ]
[I] Building engine with configuration:
    Flags                  | []
    Engine Capability      | EngineCapability.DEFAULT
    Memory Pools           | [WORKSPACE: 16070.94 MiB, TACTIC_DRAM: 16070.94 MiB]
    Tactic Sources         | [CUBLAS, CUBLAS_LT, CUDNN, EDGE_MASK_CONVOLUTIONS, JIT_CONVOLUTIONS]
    Profiling Verbosity    | ProfilingVerbosity.DETAILED
    Preview Features       | [FASTER_DYNAMIC_SHAPES_0805, DISABLE_EXTERNAL_TACTIC_SOURCES_FOR_CORE_0805]
[I] Finished engine building in 24.815 seconds
[I] Saving inference input data to ./inputs.json
[W] Input tensor: input_1 | Buffer name (input_name) does not match expected input name (input_1).
[W] Input tensor: input_1 | Buffer dtype (uint8) does not match expected input dtype (float32), attempting to cast. 
[I] trt-runner-N0-05/09/24-17:56:13    
    ---- Inference Input(s) ----
    {input_1 [dtype=float32, shape=(1, 224, 224, 3)]}
[I] trt-runner-N0-05/09/24-17:56:13    
    ---- Inference Output(s) ----
    {Identity [dtype=float32, shape=(1, 2254, 12)],
     Identity_1 [dtype=float32, shape=(1, 2254, 1)]}
[I] trt-runner-N0-05/09/24-17:56:13     | Completed 1 iteration(s) in 2.487 ms | Average inference time: 2.487 ms.
[I] onnxrt-runner-N0-05/09/24-17:56:13  | Activating and starting inference
[I] Loading model: /home/ai-server/Downloads/pose_detection.onnx
[I] Creating ONNX-Runtime Inference Session with providers: ['CPUExecutionProvider']
[W] Input tensor: input_1 | Buffer name (input_name) does not match expected input name (input_1).
[W] Input tensor: input_1 | Buffer dtype (uint8) does not match expected input dtype (float32), attempting to cast. 
[I] onnxrt-runner-N0-05/09/24-17:56:13 
    ---- Inference Input(s) ----
    {input_1 [dtype=float32, shape=(1, 224, 224, 3)]}
[I] onnxrt-runner-N0-05/09/24-17:56:13 
    ---- Inference Output(s) ----
    {Identity [dtype=float32, shape=(1, 2254, 12)],
     Identity_1 [dtype=float32, shape=(1, 2254, 1)]}
[I] onnxrt-runner-N0-05/09/24-17:56:13  | Completed 1 iteration(s) in 3.367 ms | Average inference time: 3.367 ms.
[I] Saving inference results to ./onnx_outputs.json
[I] Accuracy Comparison | trt-runner-N0-05/09/24-17:56:13 vs. onnxrt-runner-N0-05/09/24-17:56:13
[I]     Comparing Output: 'Identity' (dtype=float32, shape=(1, 2254, 12)) with 'Identity' (dtype=float32, shape=(1, 2254, 12))
[I]         Tolerance: [abs=0.0001, rel=0.0001] | Checking elemwise error
[I]         trt-runner-N0-05/09/24-17:56:13: Identity | Stats: mean=25.786, std-dev=105.89, var=11212, median=11.683, min=-725.47 at (0, 2068, 11), max=889.33 at (0, 2198, 2), avg-magnitude=53.075
[I]             ---- Histogram ----
                Bin Range      |  Num Elems | Visualization
                (-725 , -564 ) |         17 | 
                (-564 , -403 ) |         21 | 
                (-403 , -241 ) |        106 | 
                (-241 , -79.6) |       1045 | #
                (-79.6, 81.9 ) |      23028 | ########################################
                (81.9 , 243  ) |       2159 | ###
                (243  , 405  ) |        199 | 
                (405  , 566  ) |        150 | 
                (566  , 728  ) |        183 | 
                (728  , 889  ) |        140 | 
[I]         onnxrt-runner-N0-05/09/24-17:56:13: Identity | Stats: mean=25.786, std-dev=105.89, var=11212, median=11.683, min=-725.47 at (0, 2068, 11), max=889.33 at (0, 2198, 2), avg-magnitude=53.075
[I]             ---- Histogram ----
                Bin Range      |  Num Elems | Visualization
                (-725 , -564 ) |         17 | 
                (-564 , -403 ) |         21 | 
                (-403 , -241 ) |        106 | 
                (-241 , -79.6) |       1045 | #
                (-79.6, 81.9 ) |      23028 | ########################################
                (81.9 , 243  ) |       2159 | ###
                (243  , 405  ) |        199 | 
                (405  , 566  ) |        150 | 
                (566  , 728  ) |        183 | 
                (728  , 889  ) |        140 | 
[I]         Error Metrics: Identity
[I]             Minimum Required Tolerance: elemwise error | [abs=0.0016785] OR [rel=0.022601] (requirements may be lower if both abs/rel tolerances are set)
[I]             Absolute Difference | Stats: mean=5.8989e-05, std-dev=9.3465e-05, var=8.7357e-09, median=2.6703e-05, min=0 at (0, 0, 8), max=0.0016785 at (0, 1643, 2), avg-magnitude=5.8989e-05
[I]                 ---- Histogram ----
                    Bin Range            |  Num Elems | Visualization
                    (0       , 0.000168) |      24712 | ########################################
                    (0.000168, 0.000336) |       1723 | ##
                    (0.000336, 0.000504) |        416 | 
                    (0.000504, 0.000671) |        118 | 
                    (0.000671, 0.000839) |         41 | 
                    (0.000839, 0.00101 ) |         25 | 
                    (0.00101 , 0.00117 ) |          5 | 
                    (0.00117 , 0.00134 ) |          3 | 
                    (0.00134 , 0.00151 ) |          2 | 
                    (0.00151 , 0.00168 ) |          3 | 
[I]             Relative Difference | Stats: mean=8.9918e-06, std-dev=0.00021124, var=4.4621e-08, median=1.0973e-06, min=0 at (0, 0, 8), max=0.022601 at (0, 120, 0), avg-magnitude=8.9918e-06
[I]                 ---- Histogram ----
                    Bin Range          |  Num Elems | Visualization
                    (0      , 0.00226) |      27038 | ########################################
                    (0.00226, 0.00452) |          5 | 
                    (0.00452, 0.00678) |          2 | 
                    (0.00678, 0.00904) |          0 | 
                    (0.00904, 0.0113 ) |          1 | 
                    (0.0113 , 0.0136 ) |          0 | 
                    (0.0136 , 0.0158 ) |          0 | 
                    (0.0158 , 0.0181 ) |          0 | 
                    (0.0181 , 0.0203 ) |          0 | 
                    (0.0203 , 0.0226 ) |          2 | 
[E]         FAILED | Output: 'Identity' | Difference exceeds tolerance (rel=0.0001, abs=0.0001)
[I]     Comparing Output: 'Identity_1' (dtype=float32, shape=(1, 2254, 1)) with 'Identity_1' (dtype=float32, shape=(1, 2254, 1))
[I]         Tolerance: [abs=0.0001, rel=0.0001] | Checking elemwise error
[I]         trt-runner-N0-05/09/24-17:56:13: Identity_1 | Stats: mean=-555.49, std-dev=1091, var=1.1904e+06, median=-80.048, min=-4176.4 at (0, 2120, 0), max=-1.0668 at (0, 1632, 0), avg-magnitude=555.49
[I]         onnxrt-runner-N0-05/09/24-17:56:13: Identity_1 | Stats: mean=-555.49, std-dev=1091, var=1.1904e+06, median=-80.048, min=-4176.4 at (0, 2120, 0), max=-1.0667 at (0, 1632, 0), avg-magnitude=555.49
[I]         Error Metrics: Identity_1
[I]             Minimum Required Tolerance: elemwise error | [abs=0.0070801] OR [rel=2.9114e-05] (requirements may be lower if both abs/rel tolerances are set)
[I]             Absolute Difference | Stats: mean=0.00019007, std-dev=0.00042142, var=1.7759e-07, median=4.9591e-05, min=0 at (0, 0, 0), max=0.0070801 at (0, 1642, 0), avg-magnitude=0.00019007
[I]             Relative Difference | Stats: mean=1.2865e-06, std-dev=2.2594e-06, var=5.1048e-12, median=5.3635e-07, min=0 at (0, 0, 0), max=2.9114e-05 at (0, 651, 0), avg-magnitude=1.2865e-06
[I]         PASSED | Output: 'Identity_1' | Difference is within tolerance (rel=0.0001, abs=0.0001)
[E]     FAILED | Mismatched outputs: ['Identity']
[E] Accuracy Summary | trt-runner-N0-05/09/24-17:56:13 vs. onnxrt-runner-N0-05/09/24-17:56:13 | Passed: 0/1 iterations | Pass Rate: 0.0%
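
Reading the summary above: with the elemwise check, the 'Minimum Required Tolerance' line for 'Identity' indicates the comparison would pass at roughly atol >= 0.0016785 or rtol >= 0.022601 (or lower values if both are raised together), whereas the configured atol = rtol = 1e-4 is exceeded, hence the FAILED result for that output.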
lix19937 commented May 10, 2024

[I] Absolute Difference | Stats: mean=5.8989e-05, std-dev=9.3465e-05, var=8.7357e-09, median=2.6703e-05, min=0 at (0, 0, 8), max=0.0016785 at (0, 1643, 2), avg-magnitude=5.8989e-05

max=0.0016785

You can refer to https://github.com/NVIDIA/TensorRT/tree/main/tools/Polygraphy/examples/cli/debug/02_reducing_failing_onnx_models

polygraphy run pose_detection.onnx --onnxrt \
    --save-inputs inputs.json \
    --onnx-outputs mark all --save-outputs layerwise_golden.json

polygraphy run pose_detection.onnx --trt \
    --validate --trt-outputs mark all --save-outputs trt_out.json

Then compare the outputs layer by layer.
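A possible way to do that comparison (a sketch based on the linked example; intermediate outputs that TensorRT fuses away may simply be skipped during matching) is to replay the saved ONNX Runtime results as the golden reference while marking all TensorRT outputs:

polygraphy run pose_detection.onnx --trt --trt-outputs mark all \
    --load-inputs inputs.json --load-outputs layerwise_golden.json \
    --atol 1e-4 --rtol 1e-4

The first intermediate output that exceeds tolerance points at the layer where the TensorRT engine begins to diverge.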

zerollzeng added the triaged label May 12, 2024
zerollzeng (Collaborator) commented:
Please also try the latest TRT release.
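
A quick way to re-check against a newer release (a sketch; it assumes a pip-based TensorRT install on Linux, so adjust to your installation method):

python3 -m pip install --upgrade tensorrt
polygraphy run pose_detection.onnx --trt --onnxrt --atol 1e-4 --rtol 1e-4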
