There's no need to use an FP32 scale for packing with the AutoGPTQ Triton backend. We can set FP16 as the default scale dtype instead. However, accuracy should still be validated for some models.
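A minimal sketch of the trade-off being discussed, not AutoGPTQ's actual packing code: it compares the round-trip error of symmetric 4-bit group quantization when the per-group scale is stored in FP32 versus FP16. The tensor shape, group size, and helper name are illustrative assumptions.

```python
# Illustrative sketch only -- not AutoGPTQ's packing implementation.
import torch

def quant_dequant(w: torch.Tensor, group_size: int,
                  scale_dtype: torch.dtype) -> torch.Tensor:
    """Symmetric 4-bit group quant/dequant with the scale stored in scale_dtype."""
    w_groups = w.reshape(-1, group_size)
    # Per-group scale for the int4 range [-8, 7]; the cast decides
    # the precision at which the scale is actually stored.
    scale = (w_groups.abs().amax(dim=1, keepdim=True) / 7.0).to(scale_dtype)
    q = torch.clamp(torch.round(w_groups / scale.float()), -8, 7)
    return (q * scale.float()).reshape(w.shape)

w = torch.randn(4096, 4096)
for dtype in (torch.float32, torch.float16):
    err = (w - quant_dequant(w, group_size=128, scale_dtype=dtype)).abs().max().item()
    print(f"scale dtype {dtype}: max abs round-trip error {err:.6f}")
```

For typical weight magnitudes the extra error from an FP16 scale is small, which is why FP16 is a plausible default, but the gap is model-dependent, hence the note about validating accuracy.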