
A PTQ tflite model fails to pass benchmark test #95

Open
liamsun2019 opened this issue Jul 12, 2022 · 6 comments
Labels
bug (Something isn't working) · work/x-small (work that can be done within 3 hours)

Comments

@liamsun2019

My use case:
I apply post-training quantization to a .pth model and then convert it to tflite. The generated tflite model fails the benchmark test with the following error message:
STARTING!
Log parameter values verbosely: [0]
Graph: [out/ptq_model.tflite]
Loaded model out/ptq_model.tflite
ERROR: tensorflow/lite/kernels/concatenation.cc:179 t->params.scale != output->params.scale (3 != -657359264)
ERROR: Node number 154 (CONCATENATION) failed to prepare.
Failed to allocate tensors!
Benchmarking failed.

Please refer to the attachment. Thanks.
test.zip
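
The same failure can be reproduced without the benchmark binary, since allocate_tensors() runs the same kernel Prepare step shown in the log. A minimal sketch using the stock TF Lite Python API:

import tensorflow as tf

# allocate_tensors() runs each kernel's Prepare step, so it hits the same
# CONCATENATION scale check that the benchmark tool reports above.
interpreter = tf.lite.Interpreter(model_path='out/ptq_model.tflite')
interpreter.allocate_tensors()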

@liamsun2019 (Author)

My quantization strategy:
quantizer = PostQuantizer(model, dummy_input, work_dir='out', config={'force_overwrite': True, 'rewrite_graph': True, 'is_input_quantized': None, 'asymmetric': False, 'per_tensor': False})
...
converter = TFLiteConverter(ptq_model, dummy_input, tflite_path='out/ptq_model.tflite', strict_symmetric_check=False, quantize_target_type='int8')
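
For context, a minimal sketch of the full flow around these two lines, based on TinyNeuralNetwork's PTQ example; `model`, `dummy_input`, and `calib_loader` are placeholders (the calibration loop is the part elided above), and the convert step uses stock PyTorch PTQ:

import torch
from tinynn.graph.quantization.quantizer import PostQuantizer
from tinynn.converter import TFLiteConverter

quantizer = PostQuantizer(model, dummy_input, work_dir='out',
                          config={'force_overwrite': True, 'rewrite_graph': True,
                                  'is_input_quantized': None,
                                  'asymmetric': False, 'per_tensor': False})
ptq_model = quantizer.quantize()

# Calibration: feed representative data so the observers record ranges.
with torch.no_grad():
    for data in calib_loader:
        ptq_model(data)

# Convert the observed model into a real quantized model, then export.
ptq_model = torch.quantization.convert(ptq_model.eval())
converter = TFLiteConverter(ptq_model, dummy_input,
                            tflite_path='out/ptq_model.tflite',
                            strict_symmetric_check=False,
                            quantize_target_type='int8')
converter.convert()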

@liamsun2019 (Author)

The following strategy works:
quantizer = PostQuantizer(model, dummy_input, work_dir='out', config={'force_overwrite': True, 'rewrite_graph': True, 'is_input_quantized': None, 'asymmetric': True, 'per_tensor': True})
...
converter = TFLiteConverter(ptq_model, dummy_input, tflite_path='out/ptq_model.tflite', strict_symmetric_check=False, quantize_target_type='uint8')

@liamsun2019
Copy link
Author

It looks like int8 per-channel quantization may trigger this error.

peterjc123 added the bug label on Jul 13, 2022
@peterjc123 (Collaborator)

The following pattern in your model is the root cause of the problem.

A = sigmoid(X)
B = cat(A, Y)

The output tensor of the sigmoid op has fixed quantization parameters. There are several ways to fix this.

  1. Unify the quantization parameters of (Y, B) with those of A, and also disable the observers on those tensors.
  2. Insert a requantization after A, so that we have

A = sigmoid(X)
A_ = requantize(A)
B = cat(A_, Y)

Then we can unify the quantization parameters of (A_, Y, B), just as we usually do.
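
To make option 2 concrete, a minimal sketch of the requantization idea in plain PyTorch; the tensors and the qparams of Y below are made-up illustrative values:

import torch

# A = sigmoid(X): for int8, the output qparams are fixed by sigmoid's
# (0, 1) output range, e.g. scale = 1/256, zero_point = -128 in TFLite.
a = torch.quantize_per_tensor(torch.sigmoid(torch.randn(8)),
                              scale=1 / 256, zero_point=-128, dtype=torch.qint8)
# Y carries whatever qparams its observer recorded (illustrative values):
y = torch.quantize_per_tensor(torch.randn(8),
                              scale=0.05, zero_point=10, dtype=torch.qint8)

# A_ = requantize(A): re-express A with the shared qparams so that every
# input of cat (and its output) agrees, which is exactly what TFLite's
# CONCATENATION kernel checks at prepare time.
a_ = torch.quantize_per_tensor(a.dequantize(),
                               scale=y.q_scale(), zero_point=y.q_zero_point(),
                               dtype=torch.qint8)
b = torch.cat([a_, y])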

@peterjc123 (Collaborator)

Or you may just skip the quantization for this kind of pattern, which seems to be the simplest solution.

@peterjc123 (Collaborator)

  1. Unify the quantization parameters of (Y, B) with those of A, and also disable the observers on those tensors.

This is simpler, I guess. We will try to fix it this way.

peterjc123 added the work/x-small label on Sep 20, 2023