
Quantizing and running inference on bloom-176B required some changes #21

Open · wants to merge 1 commit into main

Conversation

@barsuna commented Apr 2, 2023

- Most of the issues stem from the fact that the 250880x14336 embedding layer holds more elements than a signed 32-bit integer can represent (see the overflow sketch after this list).
- This affects main, quantize, and the ggml code itself.
- A second issue is that main estimates the amount of required memory on the low side.
- That is not properly fixed here; I have simply added 5 GB for the weights and doubled the size of the context used for model evaluation (see the second sketch below).

Being far from proficient in C++, I expect these changes will need to be cleaned up by someone experienced with ggml and C++.
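As a minimal sketch of the overflow described above (not the actual patch; the variable names are illustrative), the 32-bit product of the BLOOM vocabulary size and embedding width wraps around, so element counts and byte sizes have to be computed in 64-bit types:

```c
#include <stdint.h>
#include <stdio.h>

int main(void) {
    const int32_t n_vocab = 250880;  /* BLOOM vocabulary size      */
    const int32_t n_embd  = 14336;   /* BLOOM-176B embedding width */

    /* Broken: 250880 * 14336 = 3,596,615,680 exceeds INT32_MAX (2,147,483,647).
     * The 32-bit multiplication overflows (undefined behaviour in C; it wraps
     * to a negative value on typical targets). */
    int32_t n_elements_bad = n_vocab * n_embd;

    /* Fixed: promote to 64-bit before multiplying, and keep element counts
     * and byte sizes in 64-bit types from here on. */
    int64_t n_elements = (int64_t) n_vocab * n_embd;

    printf("32-bit element count: %d\n", n_elements_bad);
    printf("64-bit element count: %lld\n", (long long) n_elements);
    return 0;
}
```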
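For the memory estimate, a rough sketch of the workaround (the function and variable names here are hypothetical, not the ones in the repository): add a fixed margin on top of the estimated weight size and double the evaluation context buffer.

```c
#include <stddef.h>

/* Returns an adjusted context size: the estimated weight size plus a fixed
 * 5 GB safety margin, plus twice the original evaluation buffer.
 * Assumes a 64-bit build so the sizes fit in size_t. */
size_t adjust_ctx_size(size_t estimated_weights, size_t eval_buffer) {
    const size_t weight_margin = 5ull * 1024 * 1024 * 1024; /* extra 5 GB for weights */
    return estimated_weights + weight_margin + 2 * eval_buffer;
}
```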
