
ValueError: We need an offload_dir to dispatch this model according to this device_map... #79

Pingmin opened this issue Jul 26, 2023 · 0 comments

Pingmin commented Jul 26, 2023

Hi everyone!

I tried running Huatuo-Llama-Med-Chinese today (the full process is described below) and ran into this error:

$ bash scripts/infer.sh 
/usr/lib/python3/dist-packages/requests/__init__.py:87: RequestsDependencyWarning: urllib3 (2.0.4) or chardet (4.0.0) doesn't match a supported version!
  warnings.warn("urllib3 ({}) or chardet ({}) doesn't match a supported "

===================================BUG REPORT===================================
Welcome to bitsandbytes. For bug reports, please submit your error trace to: https://github.com/TimDettmers/bitsandbytes/issues
================================================================================
CUDA SETUP: CUDA runtime path found: /usr/local/cuda/lib64/libcudart.so
CUDA SETUP: Highest compute capability among GPUs detected: 5.0
CUDA SETUP: Detected CUDA version 122
/home/tcmai/.local/lib/python3.10/site-packages/bitsandbytes/cuda_setup/main.py:136: UserWarning: WARNING: Compute capability < 7.5 detected! Only slow 8-bit matmul is supported for your GPU!
  warn(msg)
CUDA SETUP: Loading binary /home/tcmai/.local/lib/python3.10/site-packages/bitsandbytes/libbitsandbytes_cuda122_nocublaslt.so...
/home/tcmai/.local/lib/python3.10/site-packages/bitsandbytes/cextension.py:31: UserWarning: The installed version of bitsandbytes was compiled without GPU support. 8-bit optimizers and GPU quantization are unavailable.
  warn("The installed version of bitsandbytes was compiled without GPU support. "
The tokenizer class you load from this checkpoint is not the same type as the class this function is called from. It may result in unexpected tokenization. 
The tokenizer class you load from this checkpoint is 'LLaMATokenizer'. 
The class this function is called from is 'LlamaTokenizer'.
The model weights are not tied. Please use the `tie_weights` method before using the `infer_auto_device` function.
Loading checkpoint shards: 100%|█████████████████████████████████████| 33/33 [01:52<00:00,  3.40s/it]
using lora ./lora-llama-med
Traceback (most recent call last):
  File "/data/source/medical-llm/Huatuo-Llama-Med-Chinese-git/infer.py", line 125, in <module>
    fire.Fire(main)
  File "/home/tcmai/.local/lib/python3.10/site-packages/fire/core.py", line 141, in Fire
    component_trace = _Fire(component, args, parsed_flag_args, context, name)
  File "/home/tcmai/.local/lib/python3.10/site-packages/fire/core.py", line 475, in _Fire
    component, remaining_args = _CallAndUpdateTrace(
  File "/home/tcmai/.local/lib/python3.10/site-packages/fire/core.py", line 691, in _CallAndUpdateTrace
    component = fn(*varargs, **kwargs)
  File "/data/source/medical-llm/Huatuo-Llama-Med-Chinese-git/infer.py", line 47, in main
    model = PeftModel.from_pretrained(
  File "/home/tcmai/.local/lib/python3.10/site-packages/peft/peft_model.py", line 181, in from_pretrained
    model.load_adapter(model_id, adapter_name, **kwargs)
  File "/home/tcmai/.local/lib/python3.10/site-packages/peft/peft_model.py", line 406, in load_adapter
    dispatch_model(
  File "/home/tcmai/.local/lib/python3.10/site-packages/accelerate/big_modeling.py", line 345, in dispatch_model
    raise ValueError(
ValueError: We need an `offload_dir` to dispatch this model according to this `device_map`, the following submodules need to be offloaded: base_model.model.model.layers.4, base_model.model.model.layers.5, base_model.model.model.layers.6, base_model.model.model.layers.7, base_model.model.model.layers.8, base_model.model.model.layers.9, base_model.model.model.layers.10, base_model.model.model.layers.11, base_model.model.model.layers.12, base_model.model.model.layers.13, base_model.model.model.layers.14, base_model.model.model.layers.15, base_model.model.model.layers.16, base_model.model.model.layers.17, base_model.model.model.layers.18, base_model.model.model.layers.19, base_model.model.model.layers.20, base_model.model.model.layers.21, base_model.model.model.layers.22, base_model.model.model.layers.23, base_model.model.model.layers.24, base_model.model.model.layers.25, base_model.model.model.layers.26, base_model.model.model.layers.27, base_model.model.model.layers.28, base_model.model.model.layers.29, base_model.model.model.layers.30, base_model.model.model.layers.31, base_model.model.model.norm, base_model.model.lm_head.
$
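The failing call is the `PeftModel.from_pretrained` in the repo's infer.py (line 47 in the traceback). For reference, here is a minimal sketch of how I understand an offload directory could be passed through both loads so that accelerate is allowed to spill the listed layers to disk. This is not the repo's exact code; `offload_folder` is the keyword transformers accepts, and I'm assuming recent peft versions forward it to `dispatch_model` as `offload_dir` (the exact name may differ by peft version):

import torch
from transformers import LlamaForCausalLM, LlamaTokenizer
from peft import PeftModel

BASE_MODEL = "decapoda-research/llama-7b-hf"   # base weights from step (3) below
LORA_WEIGHTS = "./lora-llama-med"              # adapter path from the log above

tokenizer = LlamaTokenizer.from_pretrained(BASE_MODEL)
model = LlamaForCausalLM.from_pretrained(
    BASE_MODEL,
    torch_dtype=torch.float16,
    device_map="auto",
    offload_folder="offload",   # lets accelerate offload layers to ./offload
)
model = PeftModel.from_pretrained(
    model,
    LORA_WEIGHTS,
    torch_dtype=torch.float16,
    offload_folder="offload",   # assumption: forwarded to dispatch_model as offload_dir
)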

The overall process was roughly as follows:

(1) Cloned this Huatuo-Llama-Med-Chinese repo, downloaded the four sets of model weights mentioned in the README, and installed the dependencies with pip;
(2) Ran $ bash scripts/infer.sh; following the error message, manually compiled and installed a cuda122 build of bitsandbytes that supports my GPU;
(3) Ran $ bash scripts/infer.sh again; following the error message, cloned the base model weights from https://huggingface.co/decapoda-research/llama-7b-hf;
(4) Ran $ bash scripts/infer.sh once more, and hit the "ValueError: We need an offload_dir to dispatch this model according to this device_map" error shown above.

I'm now stuck at step (4) and don't know what is causing this error. Is it that the recent llama-7b-hf weights are incompatible, or something else?
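One data point that might help with diagnosis: with device_map="auto", accelerate assigns submodules to "disk" once the GPU and CPU memory budgets are exhausted, and dispatch_model raises exactly this ValueError when such entries exist but no offload directory was given. A quick sketch to check where the layers were placed (assuming the base model was loaded with a device_map as in the sketch above):

# Show where accelerate placed each submodule; entries mapped to "disk"
# should match the layers listed in the ValueError.
for name, device in model.hf_device_map.items():
    print(name, "->", device)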

Thanks!
