Are there some resources that explain how the quantized parameters are structured in a GGUF file?
We are interested in porting HQQ-quantized models to the GGUF format, but to do that we need to know exactly how the quantized data is stored.
We basically need to know:
- the bit-packing logic
- the axis along which quantization is done
- the group sizes associated with the different quant types
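To make the question concrete, here is a sketch of our current understanding of one format, ggml's `Q4_0` block (32 elements per block, one fp16 scale, 4-bit codes with an implicit +8 offset, two elements packed per byte with element `j` in the low nibble and element `j + 16` in the high nibble). This is an assumption pieced together from reading `ggml`'s quantization code, not an authoritative spec, and the other quant types presumably differ:

```python
import struct

QK4_0 = 32  # assumed elements per Q4_0 block

def quantize_block_q4_0(xs):
    """Pack 32 floats into one Q4_0 block: fp16 scale + 16 nibble-packed bytes."""
    assert len(xs) == QK4_0
    # scale chosen so the value with the largest magnitude maps to -8
    amax_val = max(xs, key=abs)
    d = amax_val / -8 if amax_val != 0 else 0.0
    inv_d = 1.0 / d if d else 0.0
    # 4-bit codes with an implicit +8 offset, clamped to the 0..15 range
    qs = [max(0, min(15, int(x * inv_d + 8.5))) for x in xs]
    # element j -> low nibble, element j + 16 -> high nibble of byte j
    packed = bytes(qs[j] | (qs[j + 16] << 4) for j in range(QK4_0 // 2))
    return struct.pack("<e", d) + packed  # 2 + 16 = 18 bytes per block

def dequantize_block_q4_0(blk):
    """Recover 32 approximate floats from one 18-byte Q4_0 block."""
    (d,) = struct.unpack("<e", blk[:2])
    out = [0.0] * QK4_0
    for j, byte in enumerate(blk[2:]):
        out[j] = ((byte & 0x0F) - 8) * d
        out[j + 16] = ((byte >> 4) - 8) * d
    return out
```

If this is roughly right, a pointer to where the per-type block layouts and group sizes are defined would let us verify the rest ourselves.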
Thanks!