Pull requests: NVIDIA/TensorRT-LLM

Fix missing link in perf-best-practices.md
#1587 opened May 13, 2024 by bloodeagle40234
Fix the error of Ada traits for fpA_intB.
#1583 opened May 12, 2024 by JamesTheZ
[feat]: Support weight only gemm with 2bit
#1568 opened May 9, 2024 by gavinchen430
Use cls variable instead of ModelRunner
#1551 opened May 7, 2024 by bloodeagle40234
Update perf-best-practices.md
#1545 opened May 6, 2024 by sam-india-007
Loading Medusa Safetensors + AWQ Conversion correction
#1535 opened May 2, 2024 by Tushar-ml
Add note on build Llama v3
#1522 opened Apr 29, 2024 by sammcj
Update perf-overview.md
#1521 opened Apr 29, 2024 by snowmanwwg
Support SDXL and its distributed inference
#1514 opened Apr 28, 2024 by Zars19
Remove the <s> token from post_prompt of multimodal
#1508 opened Apr 26, 2024 by yupbank
[ModelRunner] Fix stop & bad word list pointer offset.
#1486 opened Apr 22, 2024 by fjosw
Update model_runner_cpp.py
#1435 opened Apr 9, 2024 by RoyHe
Update summarize.py
#1406 opened Apr 6, 2024 by biubiu3721
serialize rotary base to config
#1403 opened Apr 4, 2024 by tonylek
Support internlm2
#1392 opened Apr 2, 2024 by RunningLeon
llama convert add rotary_scaling param in cli_args
#1385 opened Apr 1, 2024 by activezhao
[Doc] Fix mistral v0.1 build instructions
#1373 opened Mar 29, 2024 by minwhoo
Add SmoothQuant for T5 (decoder only right now)
#1366 opened Mar 27, 2024 by eycheung
Relax python dependencies
#1346 opened Mar 24, 2024 by tdeboissiere