This repository has been archived by the owner on Apr 23, 2024. It is now read-only.

Support custom tokens #97

Open
wants to merge 7 commits into
base: master

Conversation

@9173860 commented Sep 15, 2022

This PR resolves #65 and #44 by implementing the custom tokens feature.

BPE training is unchanged: custom tokens are simply added to the model file, and they are used only during the encode and decode phases.
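
To illustrate the idea (this is not the code in this PR, only a minimal Python sketch under assumptions), encoding with custom tokens can be thought of as splitting the input on the custom tokens first, emitting a fixed id for each one, and passing everything else to the unchanged BPE encoder. The names `encode_with_custom_tokens`, `custom_tokens`, and `bpe_encode` are hypothetical and do not refer to this library's API.

```python
import re
from typing import Callable, Dict, List


def encode_with_custom_tokens(
    text: str,
    custom_tokens: Dict[str, int],            # hypothetical: token string -> reserved id
    bpe_encode: Callable[[str], List[int]],   # hypothetical stand-in for the trained BPE encoder
) -> List[int]:
    """Emit a fixed id for each custom token; BPE-encode everything else."""
    if not custom_tokens:
        # No custom tokens: plain BPE encoding, nothing extra to do.
        return bpe_encode(text)

    # Try longer tokens first so overlapping tokens resolve to the longest match.
    alternatives = sorted(custom_tokens, key=len, reverse=True)
    pattern = re.compile("(" + "|".join(re.escape(t) for t in alternatives) + ")")

    ids: List[int] = []
    # With a capturing group, re.split keeps the matched custom tokens in the result.
    for piece in pattern.split(text):
        if not piece:
            continue
        if piece in custom_tokens:
            ids.append(custom_tokens[piece])
        else:
            ids.extend(bpe_encode(piece))
    return ids


def toy_bpe(s: str) -> List[int]:
    # Toy encoder for demonstration only: one id per character.
    return [ord(c) for c in s]


print(encode_with_custom_tokens("a<sep>b", {"<sep>": 1000}, toy_bpe))
```

In this sketch the extra work happens only when custom tokens are present, which matches the claim that encoding is unaffected when none are provided. Decoding would do the reverse: map reserved ids back to their token strings and decode the remaining ids with BPE.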

Encoding speed is not affected when no custom tokens are provided.
Providing custom tokens increases encoding time by about 10%, which should be acceptable.

Development

Successfully merging this pull request may close these issues.

Special Tokens