No such file or directory: "VILA1.5-13b-AWQ/llm/model-00001-of-00006.safetensors" #184

Open
kousun12 opened this issue May 9, 2024 · 7 comments


kousun12 commented May 9, 2024

I've done the following:

Alternatively, one may also skip the quantization process and directly download the quantized VILA-1.5 checkpoints from here. Take VILA-1.5-13B as an example, after running:

cd tinychat
git clone https://huggingface.co/Efficient-Large-Model/VILA1.5-13b-AWQ
One may run:

python vlm_demo_new.py \
    --model-path VILA1.5-13b-AWQ \
    --quant-path VILA1.5-13b-AWQ/llm \ 
    --precision W4A16 \
    --image-file /PATH/TO/INPUT/IMAGE \

from the docs. Then, for some reason, the loader looks under the quant path for the non-quantized sharded safetensors file:

(base) ~/llm-awq/tinychat$ CUDA_VISIBLE_DEVICES=0 python vlm_demo_new.py --model-path VILA1.5-13b-AWQ --quant-path VILA1.5-13b-AWQ/llm --precision W4A16 --image-file ../../docs-fuji-red.jpg
/home/ray/anaconda3/lib/python3.10/site-packages/transformers/utils/generic.py:441: UserWarning: torch.utils._pytree._register_pytree_node is deprecated. Please use torch.utils._pytree.register_pytree_node instead.
  _torch_pytree._register_pytree_node(
/home/ray/anaconda3/lib/python3.10/site-packages/transformers/utils/generic.py:309: UserWarning: torch.utils._pytree._register_pytree_node is deprecated. Please use torch.utils._pytree.register_pytree_node instead.
  _torch_pytree._register_pytree_node(
/home/ray/anaconda3/lib/python3.10/site-packages/torchvision/io/image.py:13: UserWarning: Failed to load image Python extension: '/home/ray/anaconda3/lib/python3.10/site-packages/torchvision/image.so: undefined symbol: _ZN3c1017RegisterOperatorsD1Ev'If you don't plan on using image functionality from `torchvision.io`, you can ignore this warning. Otherwise, there might be something wrong with your environment. Did you have `libjpeg` or `libpng` installed before building `torchvision` from source?
  warn(
/home/ray/llm-awq/tinychat/models/vila_llama.py:31: UserWarning: model_dtype not found in config, defaulting to torch.float16.
  warnings.warn("model_dtype not found in config, defaulting to torch.float16.")
real weight quantization...(init only): 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 40/40 [00:01<00:00, 26.14it/s]

[Warning] The awq quantized checkpoint seems to be in v1 format.
If the model cannot be loaded successfully, please use the latest awq library to re-quantized the model, or repack the current checkpoint with tinychat/offline-weight-repacker.py

Loading checkpoint:   0%|                                                                                                                                                                       | 0/1 [00:00<?, ?it/s]
Traceback (most recent call last):
  File "/home/ray/llm-awq/tinychat/vlm_demo_new.py", line 238, in <module>
    main(args)
  File "/home/ray/llm-awq/tinychat/vlm_demo_new.py", line 93, in main
    model.llm = load_awq_model(model.llm, args.quant_path, 4, 128, args.device)
  File "/home/ray/llm-awq/tinychat/utils/load_quant.py", line 82, in load_awq_model
    model = load_checkpoint_and_dispatch(
  File "/home/ray/anaconda3/lib/python3.10/site-packages/accelerate/big_modeling.py", line 579, in load_checkpoint_and_dispatch
    load_checkpoint_in_model(
  File "/home/ray/anaconda3/lib/python3.10/site-packages/accelerate/utils/modeling.py", line 1568, in load_checkpoint_in_model
    checkpoint = load_state_dict(checkpoint_file, device_map=device_map)
  File "/home/ray/anaconda3/lib/python3.10/site-packages/accelerate/utils/modeling.py", line 1313, in load_state_dict
    with safe_open(checkpoint_file, framework="pt") as f:
FileNotFoundError: No such file or directory: "VILA1.5-13b-AWQ/llm/model-00001-of-00006.safetensors"
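
A quick sanity check is to list the directory the loader is searching. Based on the error above and the follow-up below, the sharded safetensors it asks for are not in the download, while the AWQ .pt checkpoint is (the listing here is inferred from this thread, not from a full listing of the Hugging Face repo):

ls VILA1.5-13b-AWQ/llm/
# present (per this thread): vila-1.5-13b-w4-g128-awq-v2.pt
# missing: model-00001-of-00006.safetensors, which load_checkpoint_and_dispatch asks for
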
kousun12 (Author) commented May 9, 2024

I got a little farther by specifying the actual .pt file:

(base) ray@6c663dea2a49:~/llm-awq/tinychat$ CUDA_VISIBLE_DEVICES=0 python vlm_demo_new.py --model-path VILA1.5-13b-AWQ/ --quant-path VILA1.5-13b-AWQ/llm/vila-1.5-13b-w4-g128-awq-v2.pt --image-file https://media.substrate.run/docs-fuji-red.jpg
/home/ray/anaconda3/lib/python3.10/site-packages/transformers/utils/generic.py:441: UserWarning: torch.utils._pytree._register_pytree_node is deprecated. Please use torch.utils._pytree.register_pytree_node instead.
  _torch_pytree._register_pytree_node(
/home/ray/anaconda3/lib/python3.10/site-packages/transformers/utils/generic.py:309: UserWarning: torch.utils._pytree._register_pytree_node is deprecated. Please use torch.utils._pytree.register_pytree_node instead.
  _torch_pytree._register_pytree_node(
/home/ray/anaconda3/lib/python3.10/site-packages/torchvision/io/image.py:13: UserWarning: Failed to load image Python extension: '/home/ray/anaconda3/lib/python3.10/site-packages/torchvision/image.so: undefined symbol: _ZN3c1017RegisterOperatorsD1Ev'If you don't plan on using image functionality from `torchvision.io`, you can ignore this warning. Otherwise, there might be something wrong with your environment. Did you have `libjpeg` or `libpng` installed before building `torchvision` from source?
  warn(
/home/ray/llm-awq/tinychat/models/vila_llama.py:31: UserWarning: model_dtype not found in config, defaulting to torch.float16.
  warnings.warn("model_dtype not found in config, defaulting to torch.float16.")
real weight quantization...(init only): 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 40/40 [00:01<00:00, 27.16it/s]
Loading checkpoint: 100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 1/1 [00:05<00:00,  5.93s/it]
==================================================
USER: what is this
--------------------------------------------------
ASSISTANT: Traceback (most recent call last):
  File "/home/ray/llm-awq/tinychat/vlm_demo_new.py", line 238, in <module>
    main(args)
  File "/home/ray/llm-awq/tinychat/vlm_demo_new.py", line 184, in main
    outputs = stream_output(output_stream, time_stats)
  File "/home/ray/llm-awq/tinychat/utils/conversation_utils.py", line 83, in stream_output
    for outputs in output_stream:
  File "/home/ray/anaconda3/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 56, in generator_context
    response = gen.send(request)
  File "/home/ray/llm-awq/tinychat/stream_generators/llava_stream_gen.py", line 177, in LlavaStreamGenerator
    out = model(
  File "/home/ray/anaconda3/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1532, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/home/ray/anaconda3/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1541, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/ray/llm-awq/tinychat/models/vila_llama.py", line 91, in forward
    outputs = self.llm.forward(
  File "/home/ray/anaconda3/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "/home/ray/llm-awq/tinychat/models/llama.py", line 332, in forward
    h = self.model(tokens, start_pos, inputs_embeds)
  File "/home/ray/anaconda3/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1532, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/home/ray/anaconda3/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1541, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/ray/anaconda3/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "/home/ray/llm-awq/tinychat/models/llama.py", line 316, in forward
    h = layer(h, start_pos, freqs_cis, mask)
  File "/home/ray/anaconda3/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1532, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/home/ray/anaconda3/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1541, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/ray/llm-awq/tinychat/models/llama.py", line 263, in forward
    h = x + self.self_attn.forward(
RuntimeError: CUDA error: an illegal memory access was encountered
CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
Compile with `TORCH_USE_CUDA_DSA` to enable device-side assertions.
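
Side note: as the error output itself suggests, rerunning with CUDA_LAUNCH_BLOCKING=1 makes kernel launches synchronous, so the traceback points at the kernel that actually faulted rather than a later API call. A sketch of that invocation, with the same arguments as above:

# Same command as above, with synchronous kernel launches for debugging.
CUDA_VISIBLE_DEVICES=0 CUDA_LAUNCH_BLOCKING=1 python vlm_demo_new.py \
    --model-path VILA1.5-13b-AWQ/ \
    --quant-path VILA1.5-13b-AWQ/llm/vila-1.5-13b-w4-g128-awq-v2.pt \
    --image-file https://media.substrate.run/docs-fuji-red.jpg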


hkunzhe commented May 13, 2024

@kousun12, there may be issues with your environment. You can use the following Dockerfile to set up an environment with llm-awq and VILA.

FROM nvidia/cuda:11.8.0-devel-ubuntu22.04

# wget is needed below to fetch the flash-attn wheel
RUN apt-get update && \
    apt-get install -y openssh-server python3-pip vim git tmux wget

# Install VILA firstly
RUN git clone https://github.com/Efficient-Large-Model/VILA.git /root/VILA
WORKDIR /root/VILA
RUN pip install --upgrade pip
RUN pip install torch==2.0.1 torchvision==0.15.2 torchaudio==2.0.2 --index-url https://download.pytorch.org/whl/cu118
RUN wget https://github.com/Dao-AILab/flash-attention/releases/download/v2.4.2/flash_attn-2.4.2+cu118torch2.0cxx11abiFALSE-cp310-cp310-linux_x86_64.whl
RUN pip install flash_attn-2.4.2+cu118torch2.0cxx11abiFALSE-cp310-cp310-linux_x86_64.whl --extra-index-url https://download.pytorch.org/whl/cu118

RUN pip install setuptools_scm --index-url=https://pypi.org/simple
RUN pip install -e . && pip install -e ".[train]"

RUN pip install git+https://github.com/huggingface/transformers@v4.36.2
# Both steps must run in the same RUN layer, otherwise $site_pkg_path is lost
RUN site_pkg_path=$(python3 -c 'import site; print(site.getsitepackages()[0])') && \
    cp -rv ./llava/train/transformers_replace/* $site_pkg_path/transformers/

# Then install llm-awq
RUN git clone https://github.com/mit-han-lab/llm-awq /root/llm-awq
WORKDIR /root/llm-awq
RUN pip install -e .
WORKDIR /root/llm-awq/awq/kernels
# https://github.com/pytorch/extension-cpp/issues/71#issuecomment-1183674660
# TORCH_CUDA_ARCH_LIST=$(python3 -c 'import torch; print(".".join(map(str, torch.cuda.get_device_capability(0))))')
# TORCH_CUDA_ARCH_LIST="8.0+PTX" for A100
# Use ENV rather than `RUN export`; an export does not persist across RUN layers
ENV TORCH_CUDA_ARCH_LIST="8.0+PTX"
RUN python3 setup.py install

RUN pip install opencv-python-headless

RUN rm -rf /var/lib/apt/lists/*
RUN rm -rf /root/.cache
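
To use it, something along these lines should work; the image tag and host checkpoint path are placeholders (not from the original comment), and --gpus all assumes the NVIDIA Container Toolkit is installed on the host:

# Build the image and open a shell in a GPU-enabled container.
docker build -t vila-awq .
docker run --gpus all -it -v /path/to/checkpoints:/root/checkpoints vila-awq bash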

@kousun12 (Author)

A Dockerfile is very helpful, thanks for that. I will give this a try.

@kousun12 (Author)

I'm also running on H100s and have seen issues in the console logs around TORCH_CUDA_ARCH_LIST. Should I be setting that to 9.0?


hkunzhe commented May 13, 2024

> I'm also running on H100s and have seen issues in the console logs around TORCH_CUDA_ARCH_LIST. Should I be setting that to 9.0?

I think so.

@NigelNelson

> I'm also running on H100s and have seen issues in the console logs around TORCH_CUDA_ARCH_LIST. Should I be setting that to 9.0?

You should set it as:

TORCH_CUDA_ARCH_LIST="9.0+PTX"
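
Concretely, with the layout from the Dockerfile above, that means rebuilding the AWQ kernels with the new value (the paths are the ones used in that Dockerfile):

# Rebuild the AWQ CUDA kernels for H100 (compute capability 9.0).
export TORCH_CUDA_ARCH_LIST="9.0+PTX"
cd /root/llm-awq/awq/kernels
python3 setup.py install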


cktlco commented May 15, 2024

Based on the OP's suggestion, I was able to resolve exactly the same issue by specifying the vila-1.5-13b-w4-g128-awq-v2.pt file location (not just the llm/ directory) directly in the --quant-path param. After this, the demo worked as expected.

python vlm_demo_new.py --model-path ../VILA1.5-13b-AWQ --quant-path ../VILA1.5-13b-AWQ/llm/vila-1.5-13b-w4-g128-awq-v2.pt --precision W4A16 --image-file ../VILA/demo_images/av.png
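
For other quantized checkpoints the .pt filename will differ; one way to locate it before filling in --quant-path (the directory name is the one used in this thread):

# Locate the AWQ .pt checkpoint inside the downloaded repo.
find ../VILA1.5-13b-AWQ/llm -name "*.pt"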
