
[SLM] Support BERT architecture. Implement a text embedding module #2249

Merged 5 commits into mlc-ai:main on May 10, 2024

Conversation

@rickzx (Contributor) commented Apr 29, 2024

This PR supports text embedding in MLC-LLM with a BERT encoder-only model.

Example usage: https://github.com/rickzx/mlc-llm/blob/18aa7ee378b826a61ce4baa98e4bab1bf3d64038/python/mlc_llm/embeddings/embeddings.ipynb
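For readers unfamiliar with how an encoder-only model like BERT produces a single text embedding: the encoder emits one hidden vector per token, and a pooling step (masked mean pooling is a common choice) collapses them into one fixed-size vector. The sketch below shows only that pooling step in NumPy; it is illustrative and does not reflect MLC-LLM's actual API — the function name and shapes are assumptions for the example.

```python
import numpy as np

def mean_pool(token_embeddings: np.ndarray, attention_mask: np.ndarray) -> np.ndarray:
    """Masked mean pooling over a BERT-style encoder's token outputs.

    token_embeddings: (batch, seq_len, hidden) per-token hidden states
    attention_mask:   (batch, seq_len), 1 for real tokens, 0 for padding

    Returns: (batch, hidden) sentence embeddings.
    """
    # Broadcast the mask over the hidden dimension so padding
    # positions contribute nothing to the sum.
    mask = attention_mask[:, :, None].astype(token_embeddings.dtype)
    summed = (token_embeddings * mask).sum(axis=1)
    # Divide by the number of real tokens; clip guards against
    # an all-padding row producing a division by zero.
    counts = mask.sum(axis=1).clip(min=1e-9)
    return summed / counts

# Toy example: batch of 1, seq_len 3 (last position is padding), hidden size 2.
emb = np.array([[[1.0, 2.0], [3.0, 4.0], [100.0, 100.0]]])
mask = np.array([[1, 1, 0]])
print(mean_pool(emb, mask))  # [[2. 3.]] -- padding token is ignored
```

In practice the pooled vector is often L2-normalized afterwards so that cosine similarity between embeddings reduces to a dot product.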

@tqchen (Contributor) commented Apr 30, 2024

This is a good first step toward embedding support through a Python-level API. It would be great to also think about what it takes to bring this into the ThreadEngine. In that case we would need to support multiple models, but we would also have the opportunity to expose a universal embedding endpoint.

@tqchen (Contributor) commented May 7, 2024

Please fix the Jenkins CI here.

@rickzx (Contributor, Author) commented May 7, 2024

> Please fix the Jenkins CI here.

Should be addressed by #2292. I'm triggering a rebuild now

@rickzx (Contributor, Author) commented May 8, 2024

To fix the CUDA error: apache/tvm#16982

@rickzx rickzx merged commit 459ffe3 into mlc-ai:main May 10, 2024
1 of 2 checks passed
3 participants