Per tensor quantization in smoothquant #1689

Closed
chensterliu opened this issue Mar 22, 2024 · 2 comments
@chensterliu

Hello community,
I've tried the SmoothQuant flow on an OPT-125m model with the default settings. Unsurprisingly, the activations are quantized per tensor and the weights per channel. According to the following table from the SmoothQuant paper, weights can also be quantized per tensor (SmoothQuant-O3). Is it possible to apply SmoothQuant this way by setting the QuantizeConfig or some other option? I really need per_tensor quantization on both activations and weights due to a limitation of my hardware. Thanks!

[Image: table from the SmoothQuant paper]

@yintong-lu
Contributor

Hi Chen, thanks for your response.
Currently, weights can only be quantized per-channel in INC SmoothQuant. Please refer to SmoothQuant_doc for more details on our implementation.
Thanks!
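
For readers comparing the two weight granularities discussed in this thread, here is a minimal pure-Python sketch of symmetric int8 quantization with one scale per tensor versus one scale per output channel. It is only an illustration and not INC code: the function names (`quantize_per_tensor`, `quantize_per_channel`, `max_abs_error`) and the sample weights are hypothetical, and it assumes each row of the matrix is one output channel.

```python
# Illustrative sketch only; these helpers are hypothetical and not part of INC.

def _scale(max_abs, qmax):
    # Avoid division by zero for an all-zero tensor or channel.
    return max_abs / qmax if max_abs > 0 else 1.0


def _quantize_row(row, scale, qmax):
    # Round to the nearest integer level and clamp to the symmetric range.
    return [max(-qmax, min(qmax, round(v / scale))) for v in row]


def quantize_per_tensor(weights, num_bits=8):
    """Symmetric quantization with a single scale for the whole matrix."""
    qmax = 2 ** (num_bits - 1) - 1  # 127 for int8
    max_abs = max(abs(v) for row in weights for v in row)
    scale = _scale(max_abs, qmax)
    return [_quantize_row(row, scale, qmax) for row in weights], scale


def quantize_per_channel(weights, num_bits=8):
    """Symmetric quantization with one scale per row (assumed output channel)."""
    qmax = 2 ** (num_bits - 1) - 1
    scales = [_scale(max(abs(v) for v in row), qmax) for row in weights]
    q = [_quantize_row(row, s, qmax) for row, s in zip(weights, scales)]
    return q, scales


def max_abs_error(row, q_row, scale):
    """Largest absolute difference between a row and its dequantized values."""
    return max(abs(v - q * scale) for v, q in zip(row, q_row))


# Two channels with very different magnitudes (made-up values).
weights = [[0.01, -0.02, 0.015],
           [1.0, -2.0, 1.5]]

q_t, s_t = quantize_per_tensor(weights)
q_c, s_c = quantize_per_channel(weights)

# Reconstruction error of the small-magnitude channel under each scheme.
err_tensor = max_abs_error(weights[0], q_t[0], s_t)
err_channel = max_abs_error(weights[0], q_c[0], s_c[0])
print("per-tensor error on channel 0: ", err_tensor)
print("per-channel error on channel 0:", err_channel)
```

With a single shared scale, the large channel sets the step size, so the small channel is represented by only a few integer levels. That is the accuracy trade-off to keep in mind when hardware requires per-tensor weights.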

@chensterliu
Author

I see, thanks for your reply!

@thuang6 thuang6 closed this as completed May 20, 2024