
Error while generating real quantized weights for VILA #160

Open
ocg2347 opened this issue Mar 15, 2024 · 0 comments
ocg2347 commented Mar 15, 2024

I can successfully run vila-7b, but when I try to generate real-quantized AWQ weights using "vila-7b-w4-g128-v2.pt" from https://huggingface.co/Efficient-Large-Model/VILA-7b-4bit-awq/tree/main, I get the error below. Is anyone else facing this, or has anyone managed to run inference with vila-7b-awq?

Note to developers: https://github.com/mit-han-lab/llm-awq/blob/main/scripts/vila_example.sh states that the AWQ search results are shared, but they do not appear to be available under https://huggingface.co/datasets/mit-han-lab/awq-model-zoo/tree/main.

```
root@e9118846cb22:/llm-awq# python -m awq.entry --model_path vila-7b --w_bit 4 --q_group_size 128 --load_awq VILA-7b-4bit-awq/vila-7b-w4-g128-v2.pt --q_backend real --dump_quant quant_cache/llama-2-7b-chat-w4-g128-awq.pt
Quantization config: {'zero_point': True, 'q_group_size': 128}
* Building model vila-7b
You are using a model of type llava_llama to instantiate a model of type llama. This is not supported for all configurations of models and can yield errors.
Traceback (most recent call last):
  File "/root/.pyenv/versions/3.10.13/lib/python3.10/runpy.py", line 196, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/root/.pyenv/versions/3.10.13/lib/python3.10/runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "/llm-awq/awq/entry.py", line 299, in <module>
    main()
  File "/llm-awq/awq/entry.py", line 239, in main
    model, enc = build_model_and_enc(args.model_path)
  File "/llm-awq/awq/entry.py", line 93, in build_model_and_enc
    enc, model, image_processor, context_len = load_pretrained_model(
  File "/root/.pyenv/versions/3.10.13/lib/python3.10/site-packages/llava/model/builder.py", line 118, in load_pretrained_model
    model = LlavaLlamaForCausalLM.from_pretrained(model_path, config=config, low_cpu_mem_usage=True, **kwargs)
  File "/root/.pyenv/versions/3.10.13/lib/python3.10/site-packages/transformers/modeling_utils.py", line 3462, in from_pretrained
    model = cls(config, *model_args, **model_kwargs)
TypeError: LlavaLlamaForCausalLM.__init__() got an unexpected keyword argument 'use_cache'
```
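For context, the failure mode in the last frame seems to be that `from_pretrained` forwards leftover keyword arguments (here `use_cache`) into the model constructor, and the installed `LlavaLlamaForCausalLM.__init__` does not accept that keyword. Below is a minimal stand-alone sketch of that mechanism; the class and function names are stand-ins I made up for illustration, not the actual llava or transformers code:

```python
# Hypothetical stand-in: a constructor that only accepts `config`,
# like a model class that has not been updated to swallow extra kwargs.
class LlavaLlamaForCausalLMStub:
    def __init__(self, config):
        self.config = config


def from_pretrained_stub(cls, config, **kwargs):
    # Mimics how leftover kwargs can be forwarded straight into cls(...),
    # which is where the unexpected-keyword TypeError is raised.
    return cls(config, **kwargs)


if __name__ == "__main__":
    try:
        from_pretrained_stub(LlavaLlamaForCausalLMStub, config={}, use_cache=True)
    except TypeError as e:
        print(e)  # ... got an unexpected keyword argument 'use_cache'
```

If that is indeed the cause, one possible (untested) workaround would be to drop `use_cache` from the kwargs before the `from_pretrained` call in `llava/model/builder.py`, or to install a llava version whose constructor accepts the extra keyword arguments.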