Issues: NVIDIA/TensorRT-LLM
[Issue Template]Short one-line summary of the issue #270
#783
opened Jan 1, 2024 by
juney-nvidia
Open
Issues list
How to test the benchmark of Llama3 and Vicuna2 of TensorRT-LLM by benchmark.py
#1597
opened May 14, 2024 by
Ourspolaire1
Fail to run Mixtral 8x7b with tp size 4 on w4a16
bug
Something isn't working
#1596
opened May 14, 2024 by
gloritygithub11
2 of 4 tasks
rotary_scaling build command doesn't work
bug
Something isn't working
#1595
opened May 13, 2024 by
avianion
4 tasks done
enableBlockReuse option is not available for tensorrt_llm.runtime.ModelRunner
bug
#1594
opened May 13, 2024 by
yupbank
2 of 4 tasks
Question: Do we support input log probabilities with C++ inflight backend?
#1593
opened May 13, 2024 by
sindhuvahinis
Classification with LoRA and LLAMA/Mistral example
bug
Something isn't working
#1592
opened May 13, 2024 by
bjayakumar
2 of 4 tasks
Top-P sampling occasionally produces invalid tokens
bug
Something isn't working
#1590
opened May 13, 2024 by
AlessioNetti
4 tasks done
When converting the Qwen 110B GPTQ checkpoint, the shape of qkv_bias is not divisible by 3
#1589
opened May 13, 2024 by
CallmeZhangChenchen
Inference of Qwen1.5-14B with 2x RTX4090D failed on main branch
#1588
opened May 13, 2024 by
Fred-cell
TllmXqaJit runtime error when build Yi-6B fp8 with TRTLLM-0.10.0.dev2024050700
bug
Something isn't working
#1586
opened May 13, 2024 by
kimbaol
2 of 4 tasks
[Quantization] [mixtral_8x22B] NotImplementedError: Cannot copy out of meta tensor; no data!
bug
Something isn't working
#1585
opened May 13, 2024 by
Godlovecui
2 of 4 tasks
getPluginCreator could not find plugin: Gemmtensorrt_llm version: 1
bug
Something isn't working
#1584
opened May 13, 2024 by
gloritygithub11
2 of 4 tasks
Fail to build int4_awq on Mixtral 8x7b
bug
Something isn't working
#1580
opened May 12, 2024 by
gloritygithub11
2 of 4 tasks
Failed to quantize Starcoder2 with FP8
bug
Something isn't working
#1578
opened May 11, 2024 by
wxsms
2 of 4 tasks
Does enc-dec model support inflight batching?
question
Further information is requested
#1573
opened May 10, 2024 by
Oldpan
Why is the calculation result of tensorrt-llm version llava1.5 different from the output of HF?
triaged
Issue has been triaged by maintainers
#1572
opened May 10, 2024 by
bleedingfight
2 of 4 tasks
Qwen-7B build failed on Windows with trtllm-0.9.0
bug
Something isn't working
#1571
opened May 10, 2024 by
bigbigQI
4 tasks
Why does H200 show only a small improvement over H100 on Mistral-7B?
perf
Issue about performance number
#1570
opened May 10, 2024 by
shixuan94
Missing logits in Executor API when using return_generation_logits
bug
Something isn't working
triaged
Issue has been triaged by maintainers
#1569
opened May 10, 2024 by
AlessioNetti
2 of 4 tasks