Qwen-7B build failed on Windows with trtllm-0.9.0 #1571

Open
bigbigQI opened this issue May 10, 2024 · 2 comments
Labels
bug Something isn't working

Comments

@bigbigQI

System Info

  • Platform: Windows
  • version: trtllm-0.9.0

Who can help?

No response

Information

  • The official example scripts
  • My own modified scripts

Tasks

  • An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
  • My own task or dataset (give details below)

Reproduction

I built the Qwen-7B engine successfully with trtllm-0.7.1. After upgrading trtllm to 0.9.0, the engine still builds successfully, but inference with it fails.
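For reference, the int4 weight-only engine came from the standard examples/qwen workflow; the commands below are a rough sketch based on the 0.9.0 examples/qwen README, not copied from my original run (the checkpoint directory name is illustrative; the engine path matches the run command below):

python convert_checkpoint.py --model_dir ./tmp/Qwen/7B/ --output_dir ./tllm_checkpoint_1gpu_int4 --dtype float16 --use_weight_only --weight_only_precision int4

trtllm-build --checkpoint_dir ./tllm_checkpoint_1gpu_int4 --output_dir ./trt_engines/weight_only_int4/ --gemm_plugin float16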

When I run this:

python ../run.py --input_text "你好,请问你叫什么?" --max_output_len=50 --tokenizer_dir ./tmp/Qwen/7B/ --engine_dir=./trt_engines/weight_only_int4/

Expected behavior

The error messages are as follows:

no entry found for key
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
Traceback (most recent call last):
  File "D:\apps\trtllm_0.9\TensorRT-LLM\examples\run.py", line 564, in <module>
    main(args)
  File "D:\apps\trtllm_0.9\TensorRT-LLM\examples\run.py", line 484, in main
    print_output(tokenizer,
  File "D:\apps\trtllm_0.9\TensorRT-LLM\examples\run.py", line 278, in print_output
    output_text = tokenizer.decode(outputs)
  File "D:\anaconda\envs\trtllm_env_new\lib\site-packages\transformers\tokenization_utils_base.py", line 3782, in decode
    return self._decode(
  File "C:\Users\nv\.cache\huggingface\modules\transformers_modules\Qwen-7B-Chat\tokenization_qwen.py", line 276, in _decode
    return self.tokenizer.decode(token_ids, errors=errors or self.errors)
  File "D:\anaconda\envs\trtllm_env_new\lib\site-packages\tiktoken\core.py", line 258, in decode
    return self._core_bpe.decode_bytes(tokens).decode("utf-8", errors=errors)
pyo3_runtime.PanicException: no entry found for key
[TensorRT-LLM][ERROR] class tensorrt_llm::common::TllmException: [TensorRT-LLM][ERROR] CUDA runtime error in ::cudaFreeHost(ptr): driver shutting down (C:\Users\tejaswinp\workspace\tekit\cpp\tensorrt_llm/runtime/tllmBuffers.h:169)
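For context, "no entry found for key" is the panic message tiktoken's Rust core produces when decode is handed a token ID that has no entry in its decoder maps. With the tiktoken version shown in the traceback, a minimal sketch like the following should trigger the same PanicException (the encoding name and the oversized token ID here are illustrative assumptions, not values from this run):

import tiktoken

# Qwen's tokenization_qwen.py wraps a tiktoken Encoding; any encoding
# shows the same failure mode when given an out-of-vocabulary token ID.
enc = tiktoken.get_encoding("cl100k_base")

# 10**9 has no entry in the decoder maps, so the Rust side panics with
# "no entry found for key" instead of raising a normal Python exception.
enc.decode([10**9])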

actual behavior

I suspect the cause is that a token generated by the Qwen engine falls outside the Qwen tokenizer's vocabulary.
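One way to check this hypothesis is to inspect the raw output IDs just before tokenizer.decode is called in run.py. A rough diagnostic sketch, assuming the Qwen tokenizer reports its vocabulary size via len() and that output_ids holds the generated IDs (both names are illustrative):

# Hypothetical check, placed just before tokenizer.decode(...) in run.py.
vocab_size = len(tokenizer)

bad_ids = [t for t in output_ids if not 0 <= t < vocab_size]
if bad_ids:
    print(f"out-of-vocabulary token ids from the engine: {bad_ids}")

# Temporary workaround: drop the offending IDs before decoding. This only
# masks the symptom; a correct engine should never emit such IDs.
text = tokenizer.decode([t for t in output_ids if 0 <= t < vocab_size])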

additional notes

What can I do to solve this problem?

bigbigQI added the bug label May 10, 2024
@zhangyu68

It's probably a problem with the quantized model, or something already went wrong at the fine-tuning stage.

@bigbigQI (Author)

> It's probably a problem with the quantized model, or something already went wrong at the fine-tuning stage.

Does the fine-tuning here mean SFT? I converted directly from the model on Hugging Face; it was never fine-tuned.
