Any plan to support gptq? #39

gaoxiao · 2023-03-31T03:39:43Z

Hi there,
Do you have any plans to support 4-bit quantization like GPTQ? https://github.com/qwopqwop200/GPTQ-for-LLaMa

deep-diver · 2023-03-31T04:01:06Z

Not yet. I have not looked into GPTQ, so I currently don't know how to add it.

Originalimoc · 2023-04-03T11:45:51Z

A 30B model, even at 4-bit, uses less than 20 GB of VRAM and should perform better than 13B(?), so it sounds like it would be worth supporting.
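
As a rough back-of-the-envelope check on the VRAM figure, here is a minimal sketch that estimates memory for the weights alone. It assumes nominal parameter counts (30e9 and 13e9) and ignores activations, the KV cache, and any per-group quantization metadata, so real usage will be somewhat higher. The helper name `weight_memory_gb` is hypothetical and not part of any library.

```python
def weight_memory_gb(n_params: float, bits_per_weight: int) -> float:
    """Approximate memory for the weights alone, in GB (10**9 bytes).

    Assumption: every parameter is stored at exactly `bits_per_weight` bits,
    with no overhead for quantization scales, activations, or KV cache.
    """
    return n_params * bits_per_weight / 8 / 1e9


if __name__ == "__main__":
    # Nominal sizes; actual checkpoints may have slightly different counts.
    print(f"30B @ 4-bit:   {weight_memory_gb(30e9, 4):.1f} GB")
    print(f"13B @ float16: {weight_memory_gb(13e9, 16):.1f} GB")
```

Under these assumptions the 4-bit 30B weights come out smaller than the float16 13B weights, which is consistent with the "less than 20 GB" figure once some runtime overhead is added.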

deep-diver · 2023-04-03T11:51:39Z

Yeah, but I usually run it in float16 mode, which is faster.

