Feature request: Huggingface tokenizer support #94

Open
melang982 opened this issue Mar 31, 2024 · 1 comment

Comments

@melang982

As far as I know, the World tokenizer training code is not available, so those of us who need a custom tokenizer train a Hugging Face (HF) tokenizer instead (the pip rwkv package, the RWKV-LM trainer, and json2binidx_tool all support it).
Currently, an HF tokenizer doesn't work with ai00_server; loading one fails with:

[ai00_server::middleware] reload model failed: failed to parse vocabulary: invalid value: expected key to be a number in quotes at line 2 column 3
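
The error suggests that the vocabulary parser expects a JSON object whose keys are quoted token indices, while a pretty-printed HF `tokenizer.json` starts with a key such as `"version"` on line 2. Here is a minimal Python sketch for checking which layout a file has. It assumes the index-keyed layout looks like `{"1": "a", ...}`; the function name `detect_vocab_format` is hypothetical and not part of ai00_server or web-rwkv.

```python
import json


def detect_vocab_format(text: str) -> str:
    """Classify a vocabulary JSON string as 'indexed', 'hf-tokenizer', or 'unknown'."""
    data = json.loads(text)
    if not isinstance(data, dict):
        return "unknown"
    # Assumption: the index-keyed layout maps quoted integers to tokens, e.g. {"1": "a"}.
    if data and all(key.isascii() and key.isdigit() for key in data):
        return "indexed"
    # HF tokenizer.json files carry a top-level "model" object that holds the "vocab".
    model = data.get("model")
    if isinstance(model, dict) and "vocab" in model:
        return "hf-tokenizer"
    return "unknown"
```

Running this on the file passed to ai00_server should show whether the parse failure comes from the HF layout and not from a malformed file.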

@melang982
Author

I implemented this; see pull request cryscan/web-rwkv#23.
