Guide for MLC benchmarks is not working out of the box #529

Open
ramyadhadidi opened this issue May 15, 2024 · 3 comments

ramyadhadidi commented May 15, 2024

Hello, the guide at https://github.com/dusty-nv/jetson-containers/blob/master/packages/llm/mlc/README.md seems to be outdated. Specifically, mlc_llm.build cannot be found when I run the command below in the latest mlc Docker container. Maybe I am missing something.

./run.sh $(./autotag mlc) \
  python3 -m mlc_llm.build \
    --model Llama-2-7b-chat-hf \
    --quantization q4f16_ft \
    --artifact-path /data/models/mlc/dist \
    --max-seq-len 4096 \
    --target cuda \
    --use-cuda-graph \
    --use-flash-attn-mqa

The guide could be updated to use the following commands instead:

python3 -m mlc_llm convert_weight /data/models/mlc/dist/Llama-2-7b-chat-hf --quantization q4f16_ft --output /data/models/mlc/dist/Llama-2-7b-chat-hf-q4f16_ft

python3 -m mlc_llm gen_config /data/models/mlc/dist/Llama-2-7b-chat-hf --quantization q4f16_ft --output /data/models/mlc/dist/Llama-2-7b-chat-hf-q4f16_ft --conv-template llama-2
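
The two commands above share the same model directory, quantization, and output path, so they can be wrapped in one small helper. This is only a sketch: the function name `mlc_convert` and the `DRY_RUN` switch are made up for illustration, and the `mlc_llm` arguments are copied from the commands quoted above rather than checked against a particular MLC release. By default it only prints the commands, so it can be reviewed before anything runs.

```shell
# Hypothetical helper: print (DRY_RUN=1, the default) or execute (DRY_RUN=0)
# the two conversion steps quoted above.
# Usage: mlc_convert <model_dir> <quantization> [conv_template]
mlc_convert() (
  model_dir="$1"
  quant="$2"
  conv_template="${3:-llama-2}"
  out_dir="${model_dir}-${quant}"

  # In dry-run mode every command is prefixed with echo instead of being run.
  if [ "${DRY_RUN:-1}" = "1" ]; then run="echo"; else run=""; fi

  # Step 1: convert and quantize the weights; stop if this step fails.
  $run python3 -m mlc_llm convert_weight "$model_dir" \
    --quantization "$quant" --output "$out_dir" || exit 1

  # Step 2: write the chat config into the same output directory.
  $run python3 -m mlc_llm gen_config "$model_dir" \
    --quantization "$quant" --output "$out_dir" --conv-template "$conv_template"
)

# Dry run with the values from this issue.
mlc_convert /data/models/mlc/dist/Llama-2-7b-chat-hf q4f16_ft
```

Setting `DRY_RUN=0` inside the container would execute both steps instead of printing them.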

ramyadhadidi (Author) commented

After some debugging, I found that the container to use is mlc-builder, not mlc. The guide doesn't mention the difference between the two or the correct command.
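
One way to tell which interface a given container provides, before following either set of commands, is to probe for the module from a shell inside it. This is a minimal sketch that assumes only that `python3` is on the PATH. The function name and the three labels are made up for illustration, and "new" means only that `mlc_llm` imports without a `build` submodule, not that every subcommand is present.

```shell
# Hypothetical probe: report which MLC Python interface is importable here.
#   legacy  -> mlc_llm.build exists (the guide's original command should work)
#   new     -> mlc_llm imports, but has no build submodule
#   missing -> mlc_llm cannot be imported (or python3 is unavailable)
detect_mlc_cli() {
  if python3 -c 'import mlc_llm.build' >/dev/null 2>&1; then
    echo "legacy"
  elif python3 -c 'import mlc_llm' >/dev/null 2>&1; then
    echo "new"
  else
    echo "missing"
  fi
}

detect_mlc_cli
```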

ramyadhadidi changed the title from "Guide for MLC is outdated due to MLC updates" to "Guide for MLC is for benchmarks not working out of the box" on May 16, 2024
ramyadhadidi changed the title from "Guide for MLC is for benchmarks not working out of the box" to "Guide for MLC is for benchmarks is not working out of the box" on May 16, 2024

dusty-nv (Owner) commented

Ah, thanks @ramyadhadidi. Yes, I have various versions of MLC floating around, and the latest one was built after their transition from mlc_llm.build to the mlc_llm convert_weight workflow. It seems like I pushed the builder container but not the deployment container. Which version of JetPack-L4T are you on?

ramyadhadidi (Author) commented

I'm on the latest release:
L4T_VERSION=36.3.0 JETPACK_VERSION=6.0 CUDA_VERSION=12.2

ramyadhadidi changed the title from "Guide for MLC is for benchmarks is not working out of the box" to "Guide for MLC benchmarks is not working out of the box" on May 20, 2024