Pytorch quantization bias is not quantised on aarch64 #1864
Hi @renato-arantes, do you mean quantized to …? The example …
Hi @shu1chen, your answer is not related to my question, which is about PyTorch, and not about an example that you say I mentioned but that I did not. Maybe you are answering another question here by mistake? Cheers,
Hi @renato-arantes, the second "here" is the same example in the source code that I referred to.
@renato-arantes From the oneDNN perspective, the datatype of the bias is user-defined. Internally, it can be upconverted (first link). Now, why would PyTorch not quantize the bias? That is a question to ask the PyTorch maintainers, but in general there is little reason to quantize the bias tensor, as it is small compared to layer weights and activations. Adding @milpuz01, @snadampal, @malfet for more comments.
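To make the rescaling argument concrete, here is a small NumPy sketch (plain arithmetic, not oneDNN API) of why keeping the bias in `f32` is harmless: an int8 kernel accumulates in `s32`, and adding the `f32` bias after rescaling the accumulator is equivalent, up to rounding, to pre-quantizing the bias to `s32` with scale `s_x * s_w`. All scales and values below are made up for illustration:

```python
# Minimal sketch: f32 bias added after rescaling vs. bias pre-quantized
# to s32 with scale s_x * s_w. Both yield the same result up to rounding.
import numpy as np

s_x, s_w = 0.02, 0.005  # assumed activation and weight scales
x_q = np.random.randint(-128, 128, size=4, dtype=np.int32)  # quantized input
w_q = np.random.randint(-128, 128, size=4, dtype=np.int32)  # quantized weight
bias_f32 = np.float32(0.3)

acc_s32 = np.dot(x_q, w_q)  # integer accumulator, as in int8 kernels

# Option A: keep the bias in f32, add it after dequantizing the accumulator.
y_a = acc_s32 * (s_x * s_w) + bias_f32

# Option B: pre-quantize the bias to s32 with scale s_x * s_w, add in s32.
bias_s32 = np.int32(np.round(bias_f32 / (s_x * s_w)))
y_b = (acc_s32 + bias_s32) * (s_x * s_w)

print(y_a, y_b)  # nearly identical; the difference is bias rounding error
```

This equivalence is why the bias datatype can be left to the user: either formulation gives the same result up to rounding, so skipping bias quantization costs essentially nothing.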
Hi all,
I created a simple PyTorch image classification example that correctly classifies a sample image with 76% accuracy. I then applied static quantization to the model, and it continued to correctly classify the sample with 73% accuracy, while the size of the model, as expected, dropped from 102.5 MB to 25.6 MB after quantization. However, when analyzing the `DNNL_VERBOSE` output from the model run, I could see that the bias is `f32`, so it is NOT being quantized or converted to `s32`. Is there a special reason not to quantize the bias?
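For reference, here is a minimal, self-contained sketch of the eager-mode static-quantization workflow described above; `TinyNet` is a hypothetical stand-in for the actual classification model, and the random tensors stand in for real calibration data. After `convert`, the weights are `qint8` while the bias remains `float32`, which is consistent with the `DNNL_VERBOSE` observation:

```python
# Sketch of PyTorch eager-mode static quantization; TinyNet is a
# hypothetical placeholder model, calibration data is random.
import torch
import torch.nn as nn

class TinyNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.quant = torch.ao.quantization.QuantStub()
        self.conv = nn.Conv2d(3, 8, 3, bias=True)
        self.dequant = torch.ao.quantization.DeQuantStub()

    def forward(self, x):
        return self.dequant(self.conv(self.quant(x)))

torch.backends.quantized.engine = "qnnpack"  # int8 engine used on aarch64
m = TinyNet().eval()
m.qconfig = torch.ao.quantization.get_default_qconfig("qnnpack")

prepared = torch.ao.quantization.prepare(m)
with torch.no_grad():
    prepared(torch.randn(1, 3, 32, 32))  # calibration pass to collect scales
q = torch.ao.quantization.convert(prepared)

print(q.conv.weight().dtype)  # torch.qint8
print(q.conv.bias().dtype)    # torch.float32 -- the bias is not quantized
print(q(torch.randn(1, 3, 32, 32)).shape)  # the quantized model runs
```

Running a model like this under `DNNL_VERBOSE=1` on aarch64 should show the bias passed to oneDNN as `f32`, matching what is reported above.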
By inspecting here I can see that the bias is by default `f32`, but in the documentation here the quantized bias is `s32`.
Here is a `DNNL_VERBOSE` output sample: