AIMET vs SNPE quantization #2854

Open
Piotr94 opened this issue Apr 4, 2024 · 3 comments

Comments

Piotr94 commented Apr 4, 2024

I would like to ask some questions about the difference between AIMET and SNPE quantization.

I am attempting to quantize a video denoising model.

I started with SNPE and used the following commands:

snpe-onnx-to-dlc -i MODEL_NAME.onnx -o MODEL_NAME.dlc
snpe-dlc-quantize --input_dlc MODEL_NAME.dlc --input_list Inputlist.txt --use_enhanced_quantizer --use_adjusted_weights_quantizer --axis_quant --output_dlc Quant_MODEL_NAME.dlc --enable_htp --htp_socs sm8550
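Here Inputlist.txt is a plain text file listing one preprocessed raw input tensor per line, along these lines (file names are illustrative):

noisy_input_1.raw
noisy_input_2.raw
noisy_input_3.raw
noisy_input_4.raw
noisy_input_5.raw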

For calibration, I used 5 inputs with different levels of noise. Then I tested the quantized model on exactly the same inputs using the command below:
snpe-net-run --container Quant_MODEL_NAME.dlc --input_list Inputlist.txt
The quality is lower, but the difference is acceptable. On average, there is a drop from 37.77 dB to 37.17 dB.

To have more control over quantization, I wanted to use AIMET. I used the code below to obtain the quantized model:

import onnx
import numpy as np
from aimet_onnx.batch_norm_fold import fold_all_batch_norms_to_weight
from aimet_onnx.quantsim import QuantizationSimModel
from aimet_common.defs import QuantScheme

onnx_model = onnx.load("MODEL_NAME.onnx")

# Dummy input for the model's single input tensor (named '0');
# note it is constructed here but never actually passed to QuantizationSimModel
input_shape = (1, 13, 320, 320)
dummy_data = np.random.randn(*input_shape).astype(np.float32)
dummy_input = {'0': dummy_data}

# Fold batch norms into the preceding layers' weights before simulating quantization
_ = fold_all_batch_norms_to_weight(onnx_model)

# 8-bit weights and activations, TF-enhanced (SQNR-based) scheme, asymmetric encodings
sim = QuantizationSimModel(onnx_model, quant_scheme=QuantScheme.post_training_tf_enhanced,
                           rounding_mode='nearest', default_param_bw=8, default_activation_bw=8,
                           use_symmetric_encodings=False, use_cuda=True)

Then, for calibration, I used the same 5 inputs as for SNPE:

def pass_calibration_data(session, args):
    # LabeledDatasetWrapper is a user-defined dataset yielding (noisy_input, clean_target) pairs
    eval_dataset = LabeledDatasetWrapper()
    for input_data, _ in eval_dataset:
        input_dict = {'0': input_data[None, :]}  # add batch dimension
        session.run(None, input_dict)

# Run the calibration data through the model to compute quantization encodings
sim.compute_encodings(pass_calibration_data, None)

Later, I evaluated the quantized model on the same 5 inputs, but the obtained results were very poor, with an average PSNR of 25 dB. For evaluation, I used the following code:

import torch

eval_dataset = LabeledDatasetWrapper()
for n, (input_data, clean) in enumerate(eval_dataset):
    input_dict = {'0': input_data[None, :]}  # add batch dimension
    # Run inference through the quantization-simulated session
    outputs = sim.session.run(None, input_dict)
    print(psnr(torch.tensor(outputs[0]), torch.tensor(clean)))
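For reference, psnr here is a small user-defined helper of the usual form (a minimal sketch, assuming images normalized to [0, 1]):

import torch

def psnr(pred: torch.Tensor, target: torch.Tensor, max_val: float = 1.0) -> torch.Tensor:
    # Peak signal-to-noise ratio in dB: 10 * log10(MAX^2 / MSE)
    mse = torch.mean((pred - target) ** 2)
    return 10 * torch.log10(max_val ** 2 / mse)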

I presume that 5 inputs are not enough, but the size of the gap between the SNPE and AIMET results is very confusing to me. Could you tell me what I can do to obtain the same results with AIMET as with SNPE quantization? Are there any mistakes in my current AIMET approach?

quic-mangal (Contributor) commented Apr 15, 2024

@Piotr94, AIMET does not support adjusted_weights_quantizer, so could you disable it in SNPE as well? Also, are you using per-channel quantization in AIMET? It is enabled in your SNPE command (--axis_quant).
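That is, the same snpe-dlc-quantize call as in your original post, with --use_adjusted_weights_quantizer removed:

snpe-dlc-quantize --input_dlc MODEL_NAME.dlc --input_list Inputlist.txt --use_enhanced_quantizer --axis_quant --output_dlc Quant_MODEL_NAME.dlc --enable_htp --htp_socs sm8550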

> I presume that 5 inputs are not enough, but the difference between the obtained results is very confusing for me.

Yes, 5 is a very small dataset.

Piotr94 (Author) commented Apr 26, 2024

Thanks for your answer. Indeed, I didn't use per-channel quantization in AIMET, but I don't see how I could set it. In the documentation for QuantizationSimModel (link) I couldn't find such an option. Can you tell me where I can find it / how I can enable it?

quic-mangal (Contributor) commented:

Yes, it looks like it is not documented. For per-channel quantization you can find an example here:
https://github.com/quic/aimet/blob/develop/NightlyTests/torch/test_quantize_resnet18.py

You need to change the config file that is passed to QuantSim, as shown in
save_config_file_for_per_channel_quantization
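Roughly, the idea is to enable per_channel_quantization in the defaults section of the quantsim config JSON and pass that file via QuantSim's config_file argument. A minimal sketch, modeled on AIMET's default per-channel config (verify the exact schema against your AIMET version):

import json

# Quantsim config with per-channel quantization enabled in the defaults
per_channel_config = {
    "defaults": {
        "ops": {"is_output_quantized": "True"},
        "params": {"is_quantized": "True"},
        "per_channel_quantization": "True",
    },
    "params": {},
    "op_type": {},
    "supergroups": [],
    "model_input": {},
    "model_output": {},
}

with open("per_channel_config.json", "w") as f:
    json.dump(per_channel_config, f)

# Then pass the config file when constructing the sim
sim = QuantizationSimModel(onnx_model, quant_scheme=QuantScheme.post_training_tf_enhanced,
                           rounding_mode='nearest', default_param_bw=8, default_activation_bw=8,
                           use_symmetric_encodings=False, use_cuda=True,
                           config_file="per_channel_config.json")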
