Converting model to AWS Inferentia hardware using Optimum-cli #90

Open · garbit opened this issue Aug 25, 2023 · 0 comments

garbit commented Aug 25, 2023

I'm trying to run the model on AWS Inferentia (inf1 hardware) for deployment; however, I can't get the optimum-cli neuron tooling to work.

Has anyone had a similar experience? Here's what I've done so far:

  • Launched an inf1 instance
  • Installed Python 3.9
  • Installed optimum-cli
  • Ran the export command below

```
optimum-cli export neuron --model /root/multilingual_debiased-0b549669.ckpt --task token-classification --batch_size 30 --sequence_length 512 --auto_cast matmul --auto_cast_type bf16 multilingual_debiased-0b549669
```
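
I'm not sure this is the root cause, but `optimum-cli export neuron` generally expects a Hugging Face hub ID or a directory containing `config.json` plus the weights, not a raw Lightning `.ckpt` file. If that's what's failing here, one workaround is to rebuild a transformers model from the checkpoint and `save_pretrained` it first. A rough sketch only: `BASE_MODEL`, `NUM_LABELS`, and the `"model."` prefix are all assumptions about this particular checkpoint, and I used a sequence-classification head, so swap in `AutoModelForTokenClassification` (and keep `--task` consistent) if the model really is token-level:

```python
# Sketch (not verified against this exact checkpoint): convert the raw .ckpt
# into a Hugging Face model directory that the Neuron exporter can consume.
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

BASE_MODEL = "xlm-roberta-base"  # assumption: the base the checkpoint was fine-tuned from
NUM_LABELS = 16                  # assumption: adjust to the checkpoint's classifier head

# Lightning-style checkpoints usually nest the weights under "state_dict"
ckpt = torch.load("/root/multilingual_debiased-0b549669.ckpt", map_location="cpu")
state_dict = ckpt.get("state_dict", ckpt)

# Lightning often prefixes parameter names (e.g. "model."); strip it if present
state_dict = {k.removeprefix("model."): v for k, v in state_dict.items()}

model = AutoModelForSequenceClassification.from_pretrained(BASE_MODEL, num_labels=NUM_LABELS)
missing, unexpected = model.load_state_dict(state_dict, strict=False)
print("missing:", missing, "unexpected:", unexpected)  # sanity-check the key mapping

model.save_pretrained("multilingual_debiased_hf")
AutoTokenizer.from_pretrained(BASE_MODEL).save_pretrained("multilingual_debiased_hf")
```

Then point the exporter at the saved directory instead of the `.ckpt` (task name here matches the sequence-classification head above; the output directory name is arbitrary):

```
optimum-cli export neuron --model multilingual_debiased_hf --task text-classification --batch_size 30 --sequence_length 512 --auto_cast matmul --auto_cast_type bf16 multilingual_debiased_neuron
```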