QuantReLU scale factors #859

Closed
phixerino opened this issue Feb 16, 2024 · 2 comments
@phixerino

Follow-up to #791 (I didn't respond in time before the issue was closed, but thank you for the response).

So is scaling_per_output_channel in QuantReLU True or False by default? And when it's False, is there one scale factor per tensor?

And is the choice of scaling_per_output_channel=True after the first layer in MobileNetV1 versus scaling_per_output_channel=False in ProxylessNAS Mobile14 arbitrary?

Also, a related question: what is the role of per_channel_broadcastable_shape and scaling_stats_permute_dims? I couldn't find anything about them, and they appear in some examples but not in others.

@Giuseppe5
Collaborator

> So is scaling_per_output_channel in QuantReLU by default True, or False? And when it's False, then there is one scale factor per tensor?

The default quantizer used by QuantReLU has scaling_per_output_channel set to False.
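To make the distinction concrete, here is a minimal sketch of per-tensor versus per-channel scale factors for unsigned (post-ReLU) activations. It is plain Python, not Brevitas code: the max-based statistic, the `(batch, channel, feature)` layout, and the helper names `per_tensor_scale`, `per_channel_scales`, and `quantize` are all assumptions made for illustration, and the real quantizer may derive its scale differently.

```python
# Illustrative only: plain Python, not the Brevitas implementation.
BIT_WIDTH = 8
QMAX = 2 ** BIT_WIDTH - 1  # unsigned range, since ReLU outputs are non-negative

# Toy post-ReLU activations with an assumed layout (batch, channel, feature).
acts = [
    [[0.0, 1.0], [0.0, 10.0]],
    [[2.0, 0.5], [20.0, 5.0]],
]


def per_tensor_scale(x):
    """One scale for the whole tensor, taken from the global maximum."""
    return max(v for sample in x for ch in sample for v in ch) / QMAX


def per_channel_scales(x):
    """One scale per channel, taken from each channel's own maximum."""
    n_channels = len(x[0])
    return [
        max(v for sample in x for v in sample[c]) / QMAX
        for c in range(n_channels)
    ]


def quantize(v, scale):
    """Round to the integer grid, clamp to [0, QMAX], then map back to float."""
    q = min(max(round(v / scale), 0), QMAX)
    return q * scale


tensor_scale = per_tensor_scale(acts)      # a single number
channel_scales = per_channel_scales(acts)  # one number per channel

# Channel 0 has a small range, so its own scale represents 0.5 more finely
# than the shared per-tensor scale, which is dominated by channel 1.
err_per_channel = abs(quantize(0.5, channel_scales[0]) - 0.5)
err_per_tensor = abs(quantize(0.5, tensor_scale) - 0.5)
print(tensor_scale, channel_scales, err_per_channel, err_per_tensor)
```

In this toy case the per-channel scale gives a smaller rounding error for the low-range channel, at the cost of storing and applying one scale per channel, which is one of the trade-offs a hardware target may or may not tolerate.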

> And the reasoning behind having scaling_per_output_channel=True after the first layer in MobileNetV1 and scaling_per_output_channel=False in ProxylessNAS Mobile14 is arbitrary?

These choices were made by weighing trade-offs such as accuracy and hardware constraints, and those constraints can differ between users.

> Also related question, what is the role of per_channel_broadcastable_shape and scaling_stats_permute_dims? I couldn't find anything about it and it is in some examples and in some it's not.

We are going to expand our notebooks with some explanations about this.
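Until those notebooks land, the following sketch shows one plausible reading of the two parameters; it is not confirmed anywhere in this thread. The assumption is that per_channel_broadcastable_shape is the shape given to the per-channel scale so that it broadcasts against the activation tensor, and that scaling_stats_permute_dims is a permutation that moves the channel dimension first so statistics can be gathered per channel. The NCHW shape and the helpers `permuted_shape` and `broadcasts_to` are hypothetical and written in plain Python.

```python
# Illustrative only: shape arithmetic in plain Python, not Brevitas code.

def permuted_shape(shape, dims):
    """Shape that results from permuting the axes of `shape` by `dims`."""
    return tuple(shape[d] for d in dims)


def broadcasts_to(param_shape, tensor_shape):
    """True if `param_shape` broadcasts against `tensor_shape`
    under the usual NumPy/PyTorch rules (aligned from the last axis)."""
    if len(param_shape) > len(tensor_shape):
        return False
    for p, t in zip(reversed(param_shape), reversed(tensor_shape)):
        if p != 1 and p != t:
            return False
    return True


# Assumed activation layout: (batch, channels, height, width).
act_shape = (8, 16, 32, 32)
n_channels = act_shape[1]

# Assumed meaning of scaling_stats_permute_dims: put channels first, so the
# tensor can be flattened to (channels, everything_else) for per-channel stats.
stats_permute_dims = (1, 0, 2, 3)
stats_shape = permuted_shape(act_shape, stats_permute_dims)
flattened = (stats_shape[0], stats_shape[1] * stats_shape[2] * stats_shape[3])

# Assumed meaning of per_channel_broadcastable_shape: the scale keeps the
# channel axis and uses size 1 everywhere else.
scale_shape = (1, n_channels, 1, 1)

print(stats_shape, flattened)
print(broadcasts_to(scale_shape, act_shape))    # scale lines up with channels
print(broadcasts_to((n_channels,), act_shape))  # a flat vector would not
```

Under this reading, a layer with a different activation layout would need different values for both parameters, which may be why they appear only in the examples that use per-channel activation scaling.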

@Giuseppe5
Collaborator

We are working on a PR (#867) to expand the explanation of per-channel scale factors for activations.

I am closing this issue, but feel free to re-open it if you have further questions.
