
Intel Advanced Matrix Extensions (AMX) support #1632

Open
ahmetcanik opened this issue Feb 29, 2024 · 1 comment
Labels
enhancement New feature or request

Comments

@ahmetcanik

Hello CTranslate2 developers,

I am a user of your library and I appreciate your work on providing a fast and accurate inference engine. I am wondering whether you have any plans to support Intel Advanced Matrix Extensions (AMX) for CPU inference. According to Intel, AMX can speed up inference several times over for certain models and data types.

I have tried compiling CTranslate2 from source with the -mamx-tile -mamx-int8 -mamx-bf16 flags, but it seems that additional steps are required to enable AMX (perhaps adding a new kernel such as vec_amx.h and modifying vec_avx512.h to enable AMX tile operations).
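For context on what those additional steps might involve: the -mamx-* flags only let the compiler accept AMX intrinsics; they do not rewrite existing kernels to use tiles. Before any AMX kernel can run, the CPU has to report the feature, and on Linux (kernel 5.16 or later) the process has to ask for permission to use the tile data state. Below is a minimal sketch of that runtime check. It is not CTranslate2 code; the names `AmxFeatures`, `decode_amx_bits`, `query_amx_features`, `request_amx_tile_permission`, and `amx_usable` are hypothetical, and the sketch assumes GCC or Clang on x86.

```cpp
#include <cstdint>

#if (defined(__x86_64__) || defined(__i386__)) && (defined(__GNUC__) || defined(__clang__))
#include <cpuid.h>
#define SKETCH_HAS_CPUID 1
#endif

#if defined(__linux__) && defined(__x86_64__)
#include <sys/syscall.h>
#include <unistd.h>
#define SKETCH_HAS_ARCH_PRCTL 1
#endif

// Hypothetical helper type, not part of CTranslate2.
struct AmxFeatures {
  bool bf16;
  bool tile;
  bool int8;
};

// Decode the AMX bits of CPUID.(EAX=7, ECX=0):EDX.
// Bit 22 = AMX-BF16, bit 24 = AMX-TILE, bit 25 = AMX-INT8.
inline AmxFeatures decode_amx_bits(std::uint32_t edx) {
  AmxFeatures f;
  f.bf16 = ((edx >> 22) & 1u) != 0;
  f.tile = ((edx >> 24) & 1u) != 0;
  f.int8 = ((edx >> 25) & 1u) != 0;
  return f;
}

// Query the running CPU; reports no AMX support on non-x86 builds.
inline AmxFeatures query_amx_features() {
#ifdef SKETCH_HAS_CPUID
  unsigned int eax = 0, ebx = 0, ecx = 0, edx = 0;
  if (__get_cpuid_count(7, 0, &eax, &ebx, &ecx, &edx)) {
    return decode_amx_bits(edx);
  }
#endif
  return decode_amx_bits(0);
}

// Ask the Linux kernel for permission to use the AMX tile data state.
// Returns false on other platforms or when the kernel rejects the request.
inline bool request_amx_tile_permission() {
#ifdef SKETCH_HAS_ARCH_PRCTL
  constexpr long kArchReqXcompPerm = 0x1023;  // ARCH_REQ_XCOMP_PERM
  constexpr long kXfeatureXtiledata = 18;     // XFEATURE_XTILEDATA
  return syscall(SYS_arch_prctl, kArchReqXcompPerm, kXfeatureXtiledata) == 0;
#else
  return false;
#endif
}

// True only if the CPU reports AMX tiles and the OS grants their use.
inline bool amx_usable() {
  return query_amx_features().tile && request_amx_tile_permission();
}
```

A dispatcher could call `amx_usable()` once at startup and fall back to the AVX-512 path when it returns false. The tile kernels themselves would still have to be written with the AMX intrinsics or delegated to a backend that already implements them.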

I would appreciate it if you could share your thoughts on this topic and let me know if AMX support is feasible and desirable for CTranslate2.

Thank you for your time and attention.

@minhthuc2502
Collaborator

Hello,
Thank you for your suggestion. We have no plans to do this at the moment, and some work would be needed to implement AMX for some operations. It would be nice to have, so I will look into it in more detail.

@minhthuc2502 minhthuc2502 added the enhancement New feature or request label Mar 5, 2024