
ITREX needs modification for the new llama3 prompt format #1507

Open
redhairerINTEL opened this issue Apr 23, 2024 · 3 comments

@redhairerINTEL
New prompt format for llama3
https://llama.meta.com/docs/model-cards-and-prompt-formats/meta-llama-3/
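For context, the model card linked above lays the template out roughly as follows (a sketch of the special-token layout; the link is the authoritative reference):

<|begin_of_text|><|start_header_id|>system<|end_header_id|>

{system prompt}<|eot_id|><|start_header_id|>user<|end_header_id|>

{user message}<|eot_id|><|start_header_id|>assistant<|end_header_id|>

Each turn is wrapped in <|start_header_id|>{role}<|end_header_id|> and closed with <|eot_id|>, which differs from the [INST] format used by Llama 2, so ITREX's prompt handling needs updating.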

@kta-intel
Contributor

@kevinintel

@a32543254
Contributor

a32543254 commented Apr 25, 2024

Here is some sample code if you want to use the Llama 3 template: all you need to do is apply the chat template when building input_ids.

from transformers import AutoTokenizer, TextStreamer
from intel_extension_for_transformers.transformers import AutoModelForCausalLM

model_name = "meta-llama/Meta-Llama-3-8B-Instruct"
tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)
streamer = TextStreamer(tokenizer)

# load_in_4bit applies ITREX 4-bit weight-only quantization on load
model = AutoModelForCausalLM.from_pretrained(model_name, load_in_4bit=True)

messages = [
    {"role": "system", "content": "You are a pirate chatbot who always responds in pirate speak!"},
    {"role": "user", "content": "Who are you?"},
]

# apply_chat_template builds the Llama 3 special-token prompt for us,
# including the trailing assistant header when add_generation_prompt=True
input_ids = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_tensors="pt"
).to(model.device)

outputs = model.generate(input_ids, streamer=streamer)

We will also add this to the docs soon.
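To sanity-check that the template was actually applied, you can decode the tensor back to text and look for the Llama 3 header tokens (a quick check, reusing the tokenizer and input_ids from the snippet above):

# Decode the templated prompt to verify the Llama 3 special tokens are present
print(tokenizer.decode(input_ids[0]))
# Expected to start with: <|begin_of_text|><|start_header_id|>system<|end_header_id|> ...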

@N3RDIUM

N3RDIUM commented Apr 30, 2024

(quoting the sample code from the comment above)

This gives me AssertionError: Fail to convert pytorch model
