
Does it support the new GGMLv3 quantization methods? #286

Open

Exotik850 opened this issue May 29, 2023 · 5 comments

@Exotik850
Tried using the CLI application to see how far it had come since being llama-rs, and noticed that an error popped up when using one of the newer WizardLM uncensored models quantized with the GGMLv3 method:

llm llama chat --model-path .\Wizard-Vicuna-7B-Uncensored.ggmlv3.q5_1.bin
⣾ Loading model...Error:
   0: Could not load model
   1: invalid file format version 3

Backtrace omitted. Run with RUST_BACKTRACE=1 environment variable to display it.
Run with RUST_BACKTRACE=full to include source snippets.

Am I using it the wrong way or is it not supported yet?

@philpax
Collaborator

philpax commented May 29, 2023

Hi there! Yes, it's supported, but only on the latest version (main) - we haven't cut a new release yet. Hope to have that sorted soon!
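For anyone hitting the same error before the next release, building the CLI from the main branch should pick up GGMLv3 support. A minimal sketch, assuming the repository is rustformers/llm and that the `llm` binary builds from the workspace root (the exact repository URL and binary path may differ):

```shell
# Build from the latest main branch; the published release predates
# GGMLv3 support, so released binaries reject version-3 files.
git clone https://github.com/rustformers/llm
cd llm
cargo build --release

# Run the freshly built binary against the GGMLv3 model:
./target/release/llm llama chat --model-path ../Wizard-Vicuna-7B-Uncensored.ggmlv3.q5_1.bin
```

Once a new release is cut, a plain `cargo install` of the published crate should work again without building from source.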

@Exotik850
Author

My apologies, I should've tried the main branch instead of just the release 😅

@philpax
Collaborator

philpax commented May 31, 2023

No worries - I'll keep this up for now and pin it for people's reference until we get it out the door :)

@philpax philpax pinned this issue May 31, 2023
@arctic-hen7

@philpax have you considered making some 0.2.0-beta.1 etc. releases on crates.io? This pattern has worked very well for some of my own projects in the past.

@philpax
Collaborator

philpax commented Aug 21, 2023

Hi there! Yeah, I've considered it, but the main blocker is #221 - I don't want to cut a release whose interface will be radically different in the next one. I'm hoping to have this all closed out within the next week or two, especially with GGUF on the horizon, but I've been quite busy.

Development

No branches or pull requests

3 participants