
Set the default scale_dtype to FP16 #104

Closed
wenhuach21 opened this issue May 6, 2024 · 1 comment
Labels
enhancement New feature or request

Comments

@wenhuach21
Contributor

There is no need to use an FP32 scale for packing with the AutoGPTQ Triton backend. We can instead make FP16 the default scale dtype. However, accuracy still needs to be validated for some models.
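To illustrate why an FP16 scale is usually sufficient, here is a minimal, hypothetical sketch (not AutoRound's or AutoGPTQ's actual code) of per-group quantization where the scale is stored in a configurable dtype. Casting the scale from FP32 to FP16 perturbs it by at most ~0.05%, which is tiny compared with the rounding error of 4-bit quantization itself — hence the suggestion to default to FP16, while still validating accuracy per model.

```python
import numpy as np

def quantize_with_scale_dtype(weights, num_bits=4, group_size=128,
                              scale_dtype=np.float16):
    """Hypothetical per-group asymmetric quantization helper.

    The per-group scale is stored in `scale_dtype` (FP16 vs FP32) so the
    effect on reconstruction error can be compared directly.
    """
    qmax = (1 << num_bits) - 1
    w = weights.reshape(-1, group_size)
    wmin = w.min(axis=1, keepdims=True)
    wmax = w.max(axis=1, keepdims=True)
    # Cast the scale to the requested storage dtype (this is the only
    # difference between the FP16 and FP32 configurations).
    scale = ((wmax - wmin) / qmax).astype(scale_dtype)
    scale_f32 = scale.astype(np.float32)
    zero_point = np.clip(np.round(-wmin / scale_f32), 0, qmax)
    q = np.clip(np.round(w / scale_f32 + zero_point), 0, qmax)
    dequant = (q - zero_point) * scale_f32
    return q.astype(np.uint8), scale, dequant.reshape(weights.shape)

rng = np.random.default_rng(0)
w = rng.standard_normal((4, 128)).astype(np.float32)
_, s16, dq16 = quantize_with_scale_dtype(w, scale_dtype=np.float16)
_, s32, dq32 = quantize_with_scale_dtype(w, scale_dtype=np.float32)
err16 = float(np.abs(dq16 - w).mean())
err32 = float(np.abs(dq32 - w).mean())
```

In this toy comparison the mean reconstruction error with FP16 scales is essentially identical to the FP32 case; the final accuracy check on real models mentioned above remains the authoritative test.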

@wenhuach21 wenhuach21 added the enhancement New feature or request label May 6, 2024
@wenhuach21
Contributor Author

aligned
