
several questions #26

Open
wesleysanjose opened this issue May 3, 2023 · 1 comment

Comments

@wesleysanjose

I have only 16 GB of RAM, so I tried the local-memory parameter. The model loaded and I saw conversion start, but at the end it still says "Killed". I see a 20 GB model file was generated. Is that considered a success?

I was also trying to convert the fine-tuned BLOOM model (https://huggingface.co/BelleGroup/BELLE-7B-2M/tree/main). It was fine-tuned from the 7B model, but it looks like it was saved in fp32 instead of fp16, so it is double the size. Do I need to supply any additional parameter when converting it to ggml? I ask because after the conversion the output is nonsense and weird characters.

Or should I convert their GPTQ 8-bit quantized model instead?
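For what it's worth, you can confirm whether a checkpoint was saved in fp32 by looking at the `torch_dtype` field in its `config.json`. A minimal sketch of that check; the `config_text` here is a made-up stand-in, not the actual BELLE-7B-2M config file:

```python
import json

# Illustrative snippet of a Hugging Face config.json -- inspect the real file
# in the model repo; the fields below are placeholders, not copied from it.
config_text = '{"model_type": "bloom", "torch_dtype": "float32"}'
config = json.loads(config_text)

# An fp32 checkpoint is roughly twice the size of an fp16 checkpoint with the
# same parameter count, which matches the "double sized" observation above.
dtype = config.get("torch_dtype", "unknown")
print(dtype)
```

If it reports `float32`, downcasting to fp16 before (or during) ggml conversion would roughly halve the on-disk size and the conversion's working set.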

@wesleysanjose
Author

transformer.h.21.input_layernorm.bias -> layers.21.attention_norm.bias
layers.21.attention_norm.bias 1 (4096,)
transformer.h.21.self_attention.query_key_value.weight -> layers.21.attention.query_key_value.weight
Killed
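In case it helps anyone hitting the same OOM kill: the usual way around it is to convert one chunk of weights at a time instead of materialising the whole fp32 state dict in RAM. A self-contained sketch of that chunked-downcast pattern using memory-mapped files (toy sizes; this is not the converter's actual code):

```python
import os
import tempfile

import numpy as np

# Write a toy fp32 "weights" file, then downcast it to fp16 chunk by chunk.
# Because both files are memory-mapped, peak RAM stays near the chunk size
# rather than the full file size.
tmp = tempfile.mkdtemp()
src_path = os.path.join(tmp, "weights_fp32.bin")
dst_path = os.path.join(tmp, "weights_fp16.bin")

n = 1_000_000
np.arange(n, dtype=np.float32).tofile(src_path)

src = np.memmap(src_path, dtype=np.float32, mode="r")
dst = np.memmap(dst_path, dtype=np.float16, mode="w+", shape=src.shape)

chunk = 65_536
for i in range(0, n, chunk):
    # Only this slice is resident while it is converted and written back.
    dst[i:i + chunk] = src[i:i + chunk].astype(np.float16)
dst.flush()

# fp16 output is half the byte size of the fp32 input.
print(os.path.getsize(dst_path) * 2 == os.path.getsize(src_path))
```

A log that dies on a `query_key_value.weight` tensor, as above, is consistent with the process being killed while a single large tensor plus the accumulated output was in memory, so streaming per tensor (or per chunk) is the lever to pull.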
