Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove unnecessary system role's index check for llama3 #898

Open
wants to merge 2 commits into
base: main
Choose a base branch
from

Conversation

musab-mk
Copy link
Contributor

@musab-mk musab-mk commented Apr 29, 2024

Lllama3 supports system messages in arbitrary indexes

Lllama3 doesnt require this
Copy link

pytorch-bot bot commented Apr 29, 2024

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/torchtune/898

Note: Links to docs will display an error until the docs builds have been completed.

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@facebook-github-bot facebook-github-bot added the CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. label Apr 29, 2024
@musab-mk musab-mk changed the title Remove unnecessary role check for llama3 Remove unnecessary system role's index check for llama3 Apr 29, 2024
@RdoubleA
Copy link
Contributor

Hi @musab-mk, thanks for the PR. I was not aware llama3 supports system prompts outside of the first message, do you have a reference for this?

Modifying validate_messages for all models will mean they all support system messages at arbitrary indices, which isn't true for llama2, mistral, etc.

@musabgultekin
Copy link
Contributor

musabgultekin commented Apr 29, 2024

@RdoubleA llama3 doesn't have this type of restriction:
https://github.com/meta-llama/llama3/blob/af6eedf7042fb51d00b2b26d8ef1ceaab73e1670/llama/tokenizer.py#L225
It allows system messages in arbitrary locations.

I was trying to fine-tune for supporting in-chat file uploading support. But I guess I have to maintain a private fork if that's not possible to merge this? Or any other recommendations?

@RdoubleA
Copy link
Contributor

RdoubleA commented Apr 29, 2024

@RdoubleA llama3 doesn't have this type of restriction:
https://github.com/meta-llama/llama3/blob/af6eedf7042fb51d00b2b26d8ef1ceaab73e1670/llama/tokenizer.py#L225
It allows system messages in arbitrary locations.

Ok I see, thanks for sharing. You're right that this does not explicitly enforce system prompt as the first message. We are adding this check intentionally.

The purpose of validate_messages in torchtune is to make sure a single sample conversation (the check is done per sample, not on the whole dataset at once) is well-formed since the data and converter to Messages are user-defined, and we want to catch any silent formatting errors that could impact modal quality. From what I've seen, system prompt is always the first message in a conversation, and I'm not sure how the model behavior would change if it was in the middle of a conversation (the official guidance on chat formats from Meta always have system first for llama2 and llama3). If you need to change the system prompt for your data, you could just make it a separate sample.

I was trying to fine-tune for supporting in-chat file uploading support.

I'm curious, does this require you to change the system prompt often?

@musabgultekin
Copy link
Contributor

musabgultekin commented Apr 29, 2024

Yes, this requires adding a new system message in the middle of the conversation. For example:

User: Can you plot a graph of my sales data ?
Assistant: Yes, please upload the file and I can analyze and plot a graph for you!
System: User uploaded a file to : "/data/sales_march.csv"
Assistant: tool_call->python ```.....plt.show()````
System: "Image Displayed" OR "Display error for reason X"
Assistant: I created the chart for you and displayed. Please let me know if you have any more requests.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants