Loading checkpoint shards takes too long #251

Open
irjawais opened this issue May 9, 2024 · 2 comments

irjawais commented May 9, 2024

When I load the "meta-llama/Meta-Llama-3-8B-Instruct" model like this:

from transformers import AutoTokenizer, TextStreamer
from intel_extension_for_transformers.transformers import AutoModelForCausalLM

model_name = "meta-llama/Meta-Llama-3-8B-Instruct"  # Hugging Face model_id or local model
tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)
streamer = TextStreamer(tokenizer)
model = AutoModelForCausalLM.from_pretrained(model_name, load_in_4bit=True)

it hangs. The only way to recover is to restart the instance.

Is there a problem with my instance spec?

My instance spec: Ubuntu, 32 GB RAM.


irjawais commented May 9, 2024

warnings.warn(
Loading checkpoint shards: 75%|█████████████████████████████████████████████████████████████████████████████████████████████████████████ | 3/4 [01:53<00:37, 37.72s/it]
Traceback (most recent call last):
  File "", line 1, in
  File "/usr/local/lib/python3.10/dist-packages/intel_extension_for_transformers/transformers/modeling/modeling_auto.py", line 593, in from_pretrained
    model.init( # pylint: disable=E1123
  File "/usr/local/lib/python3.10/dist-packages/neural_speed/__init__.py", line 182, in init
    assert os.path.exists(fp32_bin), "Fail to convert pytorch model"
AssertionError: Fail to convert pytorch model
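
For a rough sense of scale, the weight footprint at different precisions can be estimated with simple arithmetic. The sketch below is only a back-of-the-envelope calculation: the parameter count of 8e9 is a nominal figure taken from the "8B" in the model name, and the idea that the conversion step writes an fp32 intermediate is an assumption based on the `fp32_bin` variable name in the assertion above, not something confirmed in this thread. The helper name `weight_footprint_gib` is made up for illustration.

```python
GIB = 2**30

# Bytes needed per parameter at each precision.
BYTES_PER_PARAM = {"fp32": 4.0, "fp16": 2.0, "int4": 0.5}


def weight_footprint_gib(n_params, dtype):
    """Return the approximate size of the raw weights in GiB.

    Ignores activations, tokenizer data, and any temporary copies
    made while converting, so the real peak can be higher.
    """
    return n_params * BYTES_PER_PARAM[dtype] / GIB


if __name__ == "__main__":
    n_params = 8e9  # nominal count assumed from the "8B" model name
    for dtype in BYTES_PER_PARAM:
        print(f"{dtype}: {weight_footprint_gib(n_params, dtype):.1f} GiB")
```

If the conversion really does hold or write a full fp32 copy, that copy alone would be close to 30 GiB, which leaves very little headroom on a 32 GB machine.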

@intellinjun
Contributor

@irjawais
Can you check the memory usage when converting the model? From your description, it seems that there may be insufficient memory.
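
One way to check this is to sample available memory in a background thread while the model loads. The following is a minimal sketch, assuming a Linux host where `/proc/meminfo` exposes a `MemAvailable` field (true on recent Ubuntu releases). The names `parse_meminfo`, `available_gib`, and `MemoryWatcher` are hypothetical helpers written for this example, not part of any library mentioned in this thread.

```python
import threading


def parse_meminfo(text):
    """Parse the text of /proc/meminfo into a dict of integer values (kB for most keys)."""
    info = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        if not sep:
            continue
        fields = rest.split()
        if fields and fields[0].isdigit():
            info[key.strip()] = int(fields[0])
    return info


def available_gib(path="/proc/meminfo"):
    """Return MemAvailable in GiB (the file reports kB)."""
    with open(path) as f:
        return parse_meminfo(f.read())["MemAvailable"] / 2**20


class MemoryWatcher:
    """Context manager that records the lowest available memory seen."""

    def __init__(self, interval=1.0, read=available_gib, verbose=False):
        self.interval = interval
        self.read = read
        self.verbose = verbose  # print every sample, useful if the machine hangs
        self.min_available = None
        self._stop = threading.Event()
        self._thread = threading.Thread(target=self._run, daemon=True)

    def _sample(self):
        value = self.read()
        if self.verbose:
            print(f"available: {value:.2f} GiB", flush=True)
        if self.min_available is None or value < self.min_available:
            self.min_available = value

    def _run(self):
        # Sample until asked to stop; wait() doubles as an interruptible sleep.
        while not self._stop.is_set():
            self._sample()
            self._stop.wait(self.interval)

    def __enter__(self):
        self._sample()
        self._thread.start()
        return self

    def __exit__(self, *exc):
        self._stop.set()
        self._thread.join()
        self._sample()
        return False
```

To use it, wrap the `AutoModelForCausalLM.from_pretrained(...)` call from the first comment in `with MemoryWatcher(verbose=True) as watcher:` and inspect `watcher.min_available` afterwards. With `verbose=True` the last printed sample is still visible if the instance stops responding; a value that falls toward zero before the hang would be consistent with the machine running out of memory. Sampling from a Python thread can be delayed while native code runs, so the readings are approximate.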
