Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We鈥檒l occasionally send you account related emails.

Already on GitHub? Sign in to your account

[FEATURE] Option to add BOS token #636

Open
psinger opened this issue Mar 12, 2024 · 4 comments
Open

[FEATURE] Option to add BOS token #636

psinger opened this issue Mar 12, 2024 · 4 comments
Labels
type/feature Feature request

Comments

@psinger
Copy link
Collaborator

psinger commented Mar 12, 2024

馃殌 Feature

Similar to EOS token, we should offer an option to add BOS token to the beginning. Might be useful for models like Gemma.

@psinger psinger added the type/feature Feature request label Mar 12, 2024
@pascal-pfeiffer pascal-pfeiffer added the type/good first issue Good for newcomers label Mar 16, 2024
@psinger psinger removed the type/good first issue Good for newcomers label Apr 22, 2024
@tmostak
Copy link

tmostak commented Apr 22, 2024

I believe this is necessary for getting the best results when fine-tuning Llama 3, although there seems to be some confusion (https://huggingface.co/meta-llama/Meta-Llama-3-8B/discussions/9).

@tmostak
Copy link

tmostak commented Apr 29, 2024

Just as a follow-up, I implemented a hacky version of this to help with training Llama 3, and indeed adding BOS tokens to prompts and answers when fine-tuning the Llama 8B base model lowered my loss by a small but significant margin (tried many different seeds to ensure it was reproducible)

@psinger
Copy link
Collaborator Author

psinger commented Apr 30, 2024

You can always just hardcode the bos token string to the prompt separator.

Although I am personally not convinced it can have a big impact for finetuning.

We should still add an option to add it.

@tmostak
Copy link

tmostak commented Apr 30, 2024

Yes good point... still seems desirable to have native support in the UX though.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
type/feature Feature request
Projects
None yet
Development

No branches or pull requests

3 participants