An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
My own task or dataset (give details below)
Reproduction
I built the Qwen-7B engine successfully with TensorRT-LLM 0.7.1. After upgrading TensorRT-LLM to 0.9.0, the Qwen engine still builds successfully, but inference with it fails with the following error:
no entry found for key
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
Traceback (most recent call last):
  File "D:\apps\trtllm_0.9\TensorRT-LLM\examples\run.py", line 564, in <module>
    main(args)
  File "D:\apps\trtllm_0.9\TensorRT-LLM\examples\run.py", line 484, in main
    print_output(tokenizer,
  File "D:\apps\trtllm_0.9\TensorRT-LLM\examples\run.py", line 278, in print_output
    output_text = tokenizer.decode(outputs)
  File "D:\anaconda\envs\trtllm_env_new\lib\site-packages\transformers\tokenization_utils_base.py", line 3782, in decode
    return self._decode(
  File "C:\Users\nv\.cache\huggingface\modules\transformers_modules\Qwen-7B-Chat\tokenization_qwen.py", line 276, in _decode
    return self.tokenizer.decode(token_ids, errors=errors or self.errors)
  File "D:\anaconda\envs\trtllm_env_new\lib\site-packages\tiktoken\core.py", line 258, in decode
    return self._core_bpe.decode_bytes(tokens).decode("utf-8", errors=errors)
pyo3_runtime.PanicException: no entry found for key
[TensorRT-LLM][ERROR] class tensorrt_llm::common::TllmException: [TensorRT-LLM][ERROR] CUDA runtime error in ::cudaFreeHost(ptr): driver shutting down (C:\Users\tejaswinp\workspace\tekit\cpp\tensorrt_llm/runtime/tllmBuffers.h:169)
actual behavior
I suspect this happens because the Qwen engine generates token IDs that fall outside the range the Qwen tokenizer can decode.
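One way to check this hypothesis is to inspect the generated IDs before they reach `tokenizer.decode`. The sketch below is only an illustration: `split_decodable_ids` is a hypothetical helper, not part of TensorRT-LLM or the Qwen tokenizer, and it assumes that the valid IDs are exactly `0 .. vocab_size - 1` and that the end-of-sequence ID is known. What to pass as `vocab_size` and `end_id` for a real run (for example `len(tokenizer)`) is also an assumption that would need to be verified.

```python
from typing import List, Optional, Sequence, Tuple


def split_decodable_ids(
    token_ids: Sequence[int],
    vocab_size: int,
    end_id: Optional[int] = None,
) -> Tuple[List[int], List[int]]:
    """Hypothetical helper: split generated IDs into (decodable, out_of_range).

    Assumes valid token IDs are exactly 0 .. vocab_size - 1.
    """
    ids = list(token_ids)

    # Drop everything from the first end-of-sequence token onward, since
    # the padding that follows it is not meant to be decoded.
    if end_id is not None and end_id in ids:
        ids = ids[: ids.index(end_id)]

    # Separate the IDs the tokenizer should know from the ones it cannot.
    decodable = [t for t in ids if 0 <= t < vocab_size]
    out_of_range = [t for t in ids if not 0 <= t < vocab_size]
    return decodable, out_of_range


if __name__ == "__main__":
    # Toy values only: a 10-entry vocabulary with 2 as the end-of-sequence ID.
    kept, rejected = split_decodable_ids([5, 7, 12, 3, 2, 2], vocab_size=10, end_id=2)
    print("decodable:", kept)
    print("out of range:", rejected)
```

If the second list is non-empty for the real engine output, that would support the idea that the engine emits IDs the tokenizer has no entry for. Decoding only the first list would then avoid the panic, although it would not explain why the 0.9.0 engine produces those IDs.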
additional notes
What can I do to solve this problem?