Remove unnecessary system role's index check for llama3 #898

musab-mk · 2024-04-29T11:09:24Z

Lllama3 supports system messages in arbitrary indexes

Lllama3 doesnt require this

pytorch-bot · 2024-04-29T11:09:28Z

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/torchtune/898

📄 Preview Python docs built from this PR

Note: Links to docs will display an error until the docs builds have been completed.

This comment was automatically generated by Dr. CI and updates every 15 minutes.

RdoubleA · 2024-04-29T14:54:35Z

Hi @musab-mk, thanks for the PR. I was not aware llama3 supports system prompts outside of the first message, do you have a reference for this?

Modifying validate_messages for all models will mean they all support system messages at arbitrary indices, which isn't true for llama2, mistral, etc.

musabgultekin · 2024-04-29T16:47:17Z

@RdoubleA llama3 doesn't have this type of restriction:
https://github.com/meta-llama/llama3/blob/af6eedf7042fb51d00b2b26d8ef1ceaab73e1670/llama/tokenizer.py#L225
It allows system messages in arbitrary locations.

I was trying to fine-tune for supporting in-chat file uploading support. But I guess I have to maintain a private fork if that's not possible to merge this? Or any other recommendations?

RdoubleA · 2024-04-29T17:22:53Z

@RdoubleA llama3 doesn't have this type of restriction:
https://github.com/meta-llama/llama3/blob/af6eedf7042fb51d00b2b26d8ef1ceaab73e1670/llama/tokenizer.py#L225
It allows system messages in arbitrary locations.

Ok I see, thanks for sharing. You're right that this does not explicitly enforce system prompt as the first message. We are adding this check intentionally.

The purpose of validate_messages in torchtune is to make sure a single sample conversation (the check is done per sample, not on the whole dataset at once) is well-formed since the data and converter to Messages are user-defined, and we want to catch any silent formatting errors that could impact modal quality. From what I've seen, system prompt is always the first message in a conversation, and I'm not sure how the model behavior would change if it was in the middle of a conversation (the official guidance on chat formats from Meta always have system first for llama2 and llama3). If you need to change the system prompt for your data, you could just make it a separate sample.

I was trying to fine-tune for supporting in-chat file uploading support.

I'm curious, does this require you to change the system prompt often?

musabgultekin · 2024-04-29T18:09:43Z

Yes, this requires adding a new system message in the middle of the conversation. For example:

User: Can you plot a graph of my sales data ?
Assistant: Yes, please upload the file and I can analyze and plot a graph for you!
System: User uploaded a file to : "/data/sales_march.csv"
Assistant: tool_call->python ```.....plt.show()````
System: "Image Displayed" OR "Display error for reason X"
Assistant: I created the chart for you and displayed. Please let me know if you have any more requests.

Remove unnecessary check for llama3

566d70a

Lllama3 doesnt require this

facebook-github-bot added the CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. label Apr 29, 2024

musab-mk changed the title ~~Remove unnecessary role check for llama3~~ Remove unnecessary system role's index check for llama3 Apr 29, 2024

Update _utils.py

e51feeb

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Remove unnecessary system role's index check for llama3 #898

Remove unnecessary system role's index check for llama3 #898

musab-mk commented Apr 29, 2024 •

edited

pytorch-bot bot commented Apr 29, 2024

RdoubleA commented Apr 29, 2024

musabgultekin commented Apr 29, 2024 •

edited

RdoubleA commented Apr 29, 2024 •

edited

musabgultekin commented Apr 29, 2024 •

edited

Remove unnecessary system role's index check for llama3 #898

Are you sure you want to change the base?

Remove unnecessary system role's index check for llama3 #898

Conversation

musab-mk commented Apr 29, 2024 • edited

pytorch-bot bot commented Apr 29, 2024

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/torchtune/898

RdoubleA commented Apr 29, 2024

musabgultekin commented Apr 29, 2024 • edited

RdoubleA commented Apr 29, 2024 • edited

musabgultekin commented Apr 29, 2024 • edited

musab-mk commented Apr 29, 2024 •

edited

musabgultekin commented Apr 29, 2024 •

edited

RdoubleA commented Apr 29, 2024 •

edited

musabgultekin commented Apr 29, 2024 •

edited