
RetinaFace, INT8 calibration, TensorRT 8.6.1 error: [pluginV2Runner.cpp::execute::265] Error Code 2: Internal Error (Assertion status == kSTATUS_SUCCESS failed.) #1456

Closed
ohadjerci opened this issue Mar 14, 2024 · 8 comments


@ohadjerci

Env

  • Docker: nvcr.io/nvidia/tensorrt:24.01-py3
  • GPU: GeForce RTX 2060
  • OS: Ubuntu 18.04
  • CUDA 12.0
  • TensorRT 8.6.1.6-1

About this repo

repo wang-xinyu/tensorrtx/retinaface
model retinaface

Hello,

The FP16 engine builds and runs, but with worse accuracy and with some warnings on TensorRT 8.6.1.6:

  • [W] [TRT] - 100 weights are affected by this issue: Detected subnormal FP16 values.
  • [W] [TRT] - 73 weights are affected by this issue: Detected values less than smallest positive FP16 subnormal value and converted them to the FP16 minimum subnormalized value.
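
For context on those warnings: they mean some trained weights fall below FP16's representable range. The smallest positive normal FP16 value is 2⁻¹⁴ (about 6.1e-5) and the smallest subnormal is 2⁻²⁴ (about 6.0e-8); anything smaller gets flushed to zero or rounded to that minimum subnormal, which can hurt accuracy. A small numpy sketch of the thresholds (illustrative only, not part of the repo):

```python
import numpy as np

# Smallest positive *normal* FP16 value: 2**-14 (about 6.1e-5)
tiny = np.finfo(np.float16).tiny
assert tiny == 2.0 ** -14

# Smallest positive *subnormal* FP16 value: 2**-24 (about 6.0e-8)
smallest_subnormal = np.float16(2.0 ** -24)

# A weight far below the subnormal range is flushed to zero in FP16...
assert np.float16(1e-9) == 0.0

# ...while one above half the smallest subnormal rounds up to it, matching
# TRT's "converted them to the FP16 minimum subnormalized value" warning
assert np.float16(4e-8) == smallest_subnormal

print("FP16 normal range starts at", float(tiny))
```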

On the other hand, INT8 calibration with TensorRT 8.6.1.6 leads to errors:

  • [E] [TRT] 2: Assertion scales.size() == 1 failed.
  • [E] [TRT] 2: [pluginV2Runner.cpp::getInputHostScale::88] Error Code 2: Internal Error (Assertion scales.size() == 1 failed. )
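
My reading of this message (an assumption, not an official explanation): IPluginV2-based plugins only support per-tensor INT8 I/O, i.e. exactly one quantization scale per input tensor, and `getInputHostScale` asserts `scales.size() == 1` when it cannot find that single scale for the plugin's input, e.g. because calibration produced no dynamic range for that tensor. A minimal numpy sketch of what a single per-tensor scale means (illustrative only):

```python
import numpy as np

def quantize_per_tensor(x, amax):
    """Symmetric INT8 quantization with ONE scale for the whole tensor,
    the only scheme a per-tensor-quantized plugin input can consume."""
    scale = amax / 127.0                                    # single per-tensor scale
    q = np.clip(np.round(x / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    # Recover approximate FP32 values from the INT8 tensor
    return q.astype(np.float32) * scale

x = np.array([-0.5, 0.0, 0.25, 1.0], dtype=np.float32)
q, s = quantize_per_tensor(x, amax=np.abs(x).max())
x_hat = dequantize(q, s)
print(q, x_hat)
```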

First, I tried to generate an engine from a model trained in half precision. Second, I reorganized the code like the yolov9/yolov7 examples, but without success.

I’m hoping someone can tell me more about this error message, or point me to documents that explain it. Does it mean that the build failed while processing a plugin, or because of the interpolation scale?

Any suggestion is highly appreciated. Thanks in advance.

@wang-xinyu
Owner

What did you mean by this? 【The FP16 engine is working but with less performance】

You mean the latency is higher than int8? So that you want to use int8?

@ohadjerci
Author

ohadjerci commented Mar 15, 2024

> What did you mean by this? 【The FP16 engine is working but with less performance】

The engine is built, but with a higher false-negative (FN) rate.

> You mean the latency is higher than int8? So that you want to use int8?

The error "Assertion status == kSTATUS_SUCCESS failed" prevents the INT8 engine from being built at all.

@wang-xinyu
Owner

> The engine is built but with higher FN

If the fp16 accuracy is not as expected, then it's meaningless to try int8.

Can you try FP32? And also try a lower version of TensorRT, e.g. 8.4.

@ohadjerci
Author

ohadjerci commented Mar 15, 2024

> Can you try FP32? And also try a lower version of TensorRT, e.g. 8.4.

Yes, INT8 calibration for RetinaFace works with previous TensorRT versions, but I would like to know the reason for the error "Assertion scales.size() == 1 failed" with TensorRT 8.6.1. I tried yolov7 and yolov9 with the same TensorRT version and INT8 calibration works there.

@ohadjerci
Author

One more piece of information: we can no longer use the RetinaFace code on Ubuntu 22.04, because only TensorRT 8.6.1 is compatible there.

@ohadjerci
Author

To reproduce the error, just use the Docker image "nvcr.io/nvidia/tensorrt:24.01-py3", install OpenCV, and launch INT8 calibration from https://github.com/wang-xinyu/tensorrtx/tree/master/retinaface

@wang-xinyu
Owner

Hi @ohadjerci
The code was developed on TRT 7.x. I guess there are some operations/layers which are deprecated in TRT 8.6.
As of now, we don't have plans to upgrade the code to support TRT 8.6.
It would be great if you could debug and solve the issue.

@ohadjerci
Author

No, the issue is not related to deprecated operations.
The solution is to retrain with bf16 or fp16 precision and replace the decode plugin with CPU code.
Also, I recommend using the ONNX parser with TRT 10.
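
For anyone following this route: replacing the decode plugin with host-side code means implementing the standard RetinaFace box decode on the network's raw loc output. A minimal numpy sketch, assuming the usual Pytorch_Retinaface conventions (center-size priors, variances 0.1 and 0.2); the function and variable names here are illustrative, not the repo's API:

```python
import numpy as np

def decode_boxes(loc, priors, variances=(0.1, 0.2)):
    """Decode regression offsets back to boxes on the CPU.
    loc:    (N, 4) network output, (dx, dy, dw, dh)
    priors: (N, 4) anchors in center-size form (cx, cy, w, h), normalized
    Returns (N, 4) boxes as (x1, y1, x2, y2)."""
    boxes = np.concatenate((
        priors[:, :2] + loc[:, :2] * variances[0] * priors[:, 2:],
        priors[:, 2:] * np.exp(loc[:, 2:] * variances[1])), axis=1)
    boxes[:, :2] -= boxes[:, 2:] / 2   # center -> top-left corner
    boxes[:, 2:] += boxes[:, :2]       # size   -> bottom-right corner
    return boxes

# With zero offsets, the decoded box is just the prior in corner form
priors = np.array([[0.5, 0.5, 0.2, 0.2]], dtype=np.float32)
loc = np.zeros((1, 4), dtype=np.float32)
print(decode_boxes(loc, priors))  # approx. [[0.4 0.4 0.6 0.6]]
```

The same pattern extends to the landmark offsets; after decoding, confidence filtering and NMS also run on the CPU, so the engine only contains plain TensorRT layers.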
