ImportError: Using bitsandbytes 8-bit quantization requires Accelerate: pip install accelerate and the latest version of bitsandbytes: pip install -i https://pypi.org/simple/ bitsandbytes #30887
Open · AnandUgale opened this issue on May 18, 2024 · 5 comments
The issue seems to be that is_bitsandbytes_available() in import_utils.py returns False when a CUDA device is not available. So one should simply not use the 4/8-bit options at all when the device is CPU, which to be fair makes sense.
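For anyone hitting this on a CPU-only machine, here is a minimal sketch of that guard in plain transformers (my own illustration, not an official pattern; the model name is just an example):
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

# Only request 4-bit loading when a CUDA device is present;
# bitsandbytes quantization is unsupported on CPU.
quantization_config = None
if torch.cuda.is_available():
    quantization_config = BitsAndBytesConfig(
        load_in_4bit=True,
        bnb_4bit_compute_dtype=torch.float16,
    )

model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Meta-Llama-3-8B-Instruct",
    quantization_config=quantization_config,  # None on CPU -> no quantization
)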
Hi! Commit 1872bde should be included in the latest transformers, so whenever you don't have access to a GPU it should error out with a clearer message (I see you are using transformers==4.39.0).
Though I will enhance the error message to point users to the simpler install command pip install -U bitsandbytes.
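After installing, a quick sanity check can tell you which requirement is still unmet (a small sketch using the helper mentioned above):
import torch
from transformers.utils import is_bitsandbytes_available

# In the affected versions this helper also requires a visible CUDA
# device, so print both to see which condition is failing.
print("CUDA available:", torch.cuda.is_available())
print("bitsandbytes usable:", is_bitsandbytes_available())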
System Info
Packages installed with CUDA 11.8:
Who can help?
No response
Information
Tasks
An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
Reproduction
import torch
from llama_index.llms.huggingface import HuggingFaceLLM
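
# NOTE: hf_token (a Hugging Face access token) and stopping_ids are
# assumed to be defined earlier in the script.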
# Optional quantization to 4bit
from transformers import BitsAndBytesConfig
quantization_config = BitsAndBytesConfig(
load_in_4bit=True,
bnb_4bit_compute_dtype=torch.float16,
bnb_4bit_quant_type="nf4",
bnb_4bit_use_double_quant=True,
)
llm = HuggingFaceLLM(
model_name="meta-llama/Meta-Llama-3-8B-Instruct",
model_kwargs={
"token": hf_token,
"torch_dtype": torch.bfloat16, # comment this line and uncomment below to use 4bit
# "quantization_config": quantization_config
},
generate_kwargs={
"do_sample": True,
"temperature": 0.6,
"top_p": 0.9,
},
tokenizer_name="meta-llama/Meta-Llama-3-8B-Instruct",
tokenizer_kwargs={"token": hf_token},
stopping_ids=stopping_ids,
)
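Once both accelerate and bitsandbytes are installed (pip install -U accelerate bitsandbytes) and a CUDA GPU is visible, the 4-bit path above should load; a minimal sketch of that variant (same names as the snippet above, only the kwargs change):
llm = HuggingFaceLLM(
    model_name="meta-llama/Meta-Llama-3-8B-Instruct",
    model_kwargs={
        "token": hf_token,
        "quantization_config": quantization_config,  # 4-bit NF4 config from above
    },
    generate_kwargs={"do_sample": True, "temperature": 0.6, "top_p": 0.9},
    tokenizer_name="meta-llama/Meta-Llama-3-8B-Instruct",
    tokenizer_kwargs={"token": hf_token},
    stopping_ids=stopping_ids,
)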
Expected behavior
Able to run the LLM model without the bitsandbytes ImportError.