TllmXqaJit runtime error when building Yi-6B fp8 with TRTLLM-0.10.0.dev2024050700 #1586
Closed
2 of 4 tasks
Labels
bug
System Info
GPU: RTX 4090
OS: Docker (image built via make from the tensorrt-llm repo)
TensorRT-LLM version: 0.10.0.dev2024050700
Driver: 535.171.04
CUDA version: 12.4
Who can help?
@byshiue
Information
Tasks
examples folder (such as GLUE/SQuAD, ...)
Reproduction
Expected behavior
The build succeeds.
actual behavior
[05/13/2024-07:00:20] [TRT] [W] [RemoveDeadLayers] Input Tensor position_ids is unused or used only at compile-time, but is not being removed.
[05/13/2024-07:00:20] [TRT] [I] Global timing cache in use. Profiling results in this builder pass will be stored.
[05/13/2024-07:01:41] [TRT] [I] [GraphReduction] The approximate region cut reduction algorithm is called.
[05/13/2024-07:01:41] [TRT] [I] Detected 14 inputs and 1 output network tensors.
terminate called after throwing an instance of 'tensorrt_llm::common::TllmException'
what(): [TensorRT-LLM][ERROR] TllmXqaJit runtime error in tllmXqaJitCreateAndCompileProgram(&program, &context): NVRTC Internal Error (/src/tensorrt_llm/cpp/tensorrt_llm/kernels/decoderMaskedMultiheadAttention/decoderXQAImplJIT/compileEngine.cpp:65)
1 0x7fcba6c4e5b4 /usr/local/lib/python3.10/dist-packages/tensorrt_llm/libs/libtensorrt_llm.so(+0x6935b4) [0x7fcba6c4e5b4]
2 0x7fcba6d8ca59 tensorrt_llm::kernels::jit::CompileEngine::compile() const + 169
3 0x7fcba6d8e63b tensorrt_llm::kernels::jit::CubinObjRegistryTemplate<tensorrt_llm::kernels::XQAKernelFullHashKey, tensorrt_llm::kernels::XQAKernelFullHasher>::getCubin(tensorrt_llm::kernels::XQAKernelFullHashKey const&, tensorrt_llm::kernels::jit::CompileEngine*) + 267
4 0x7fcba6d8e077 tensorrt_llm::kernels::DecoderXQAImplJIT::prepare(tensorrt_llm::kernels::XQAParams const&) + 87
5 0x7fcb6aa94efb /usr/local/lib/python3.10/dist-packages/tensorrt_llm/libs/libnvinfer_plugin_tensorrt_llm.so(+0xbdefb) [0x7fcb6aa94efb]
6 0x7fcb6aab140d /usr/local/lib/python3.10/dist-packages/tensorrt_llm/libs/libnvinfer_plugin_tensorrt_llm.so(+0xda40d) [0x7fcb6aab140d]
7 0x7fcbc9cbbf38 /usr/local/tensorrt/lib/libnvinfer.so.10(+0xd87f38) [0x7fcbc9cbbf38]
8 0x7fcbc9cbc85c /usr/local/tensorrt/lib/libnvinfer.so.10(+0xd8885c) [0x7fcbc9cbc85c]
9 0x7fcbc9d35caf /usr/local/tensorrt/lib/libnvinfer.so.10(+0xe01caf) [0x7fcbc9d35caf]
10 0x7fcbc9d0e4e0 /usr/local/tensorrt/lib/libnvinfer.so.10(+0xdda4e0) [0x7fcbc9d0e4e0]
11 0x7fcbc9d1507c /usr/local/tensorrt/lib/libnvinfer.so.10(+0xde107c) [0x7fcbc9d1507c]
12 0x7fcbc9d17071 /usr/local/tensorrt/lib/libnvinfer.so.10(+0xde3071) [0x7fcbc9d17071]
13 0x7fcbc995c61c /usr/local/tensorrt/lib/libnvinfer.so.10(+0xa2861c) [0x7fcbc995c61c]
14 0x7fcbc9961837 /usr/local/tensorrt/lib/libnvinfer.so.10(+0xa2d837) [0x7fcbc9961837]
15 0x7fcbc99621af /usr/local/tensorrt/lib/libnvinfer.so.10(+0xa2e1af) [0x7fcbc99621af]
16 0x7fcbd78a6478 /usr/local/lib/python3.10/dist-packages/tensorrt/tensorrt.so(+0xa6478) [0x7fcbd78a6478]
17 0x7fcbd78457a3 /usr/local/lib/python3.10/dist-packages/tensorrt/tensorrt.so(+0x457a3) [0x7fcbd78457a3]
additional notes
Yi-9B also encountered the same problem.
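For triage, it can help to separate the frames in the backtrace that carry a demangled symbol from the ones that only show a library offset. The following is a minimal sketch, not part of TensorRT-LLM: the helper name `symbolized_frames` and the `SAMPLE` string are made up for illustration, and the regular expression assumes the frame layout seen in the log above (index, address, symbol, ` + offset`).

```python
import re

# Matches backtrace lines that end in "<symbol> + <decimal offset>", e.g.
# "2 0x7fcba6d8ca59 tensorrt_llm::kernels::jit::CompileEngine::compile() const + 169".
# Lines that only show "lib.so(+0x...) [0x...]" do not match.
FRAME_RE = re.compile(r"^\s*(\d+)\s+(0x[0-9a-fA-F]+)\s+(.+?)\s+\+\s+(\d+)\s*$")


def symbolized_frames(log_text):
    """Return (frame_index, symbol, offset) for every frame that has a symbol name."""
    frames = []
    for line in log_text.splitlines():
        match = FRAME_RE.match(line)
        if match:
            frames.append((int(match.group(1)), match.group(3), int(match.group(4))))
    return frames


# Three frames copied from the backtrace in this report (hypothetical sample variable).
SAMPLE = """\
1 0x7fcba6c4e5b4 /usr/local/lib/python3.10/dist-packages/tensorrt_llm/libs/libtensorrt_llm.so(+0x6935b4) [0x7fcba6c4e5b4]
2 0x7fcba6d8ca59 tensorrt_llm::kernels::jit::CompileEngine::compile() const + 169
4 0x7fcba6d8e077 tensorrt_llm::kernels::DecoderXQAImplJIT::prepare(tensorrt_llm::kernels::XQAParams const&) + 87
"""

if __name__ == "__main__":
    for index, symbol, offset in symbolized_frames(SAMPLE):
        print(index, symbol, offset)
```

On the full log this keeps only frames 2 to 4, which all sit in the XQA JIT path (`CompileEngine::compile`, `CubinObjRegistryTemplate::getCubin`, `DecoderXQAImplJIT::prepare`).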