NVIDIA / TensorRT-LLM Public

Fork 703
Star 6.7k

Pull requests: NVIDIA/TensorRT-LLM

Labels 24 Milestones 0

92 Open 152 Closed

Fix missing link in perf-best-practices.md

#1587 opened May 13, 2024 by bloodeagle40234

Fix the error of Ada traits for fpA_intB.

#1583 opened May 12, 2024 by JamesTheZ

[feat]: Support weight only gemm with 2bit

#1568 opened May 9, 2024 by gavinchen430

1

Update customAllReduceKernels.cu - line 120's typo was edited

#1558 opened May 8, 2024 by sjbae1999

Use cls variable instead of ModelRunner

#1551 opened May 7, 2024 by bloodeagle40234

Update perf-best-practices.md

#1545 opened May 6, 2024 by sam-india-007

[fix] export failure with CUDA driver < 526 and pynvml>=11.5.0

#1537 opened May 3, 2024 by CoderHam

Use first bad_words as extra parameters, and implement min-p

#1536 opened May 2, 2024 by pathorn • Draft

1

Loading Medusa Safetensors + AWQ Conversion correction

#1535 opened May 2, 2024 by Tushar-ml

Define hf_config explicitly for convert_hf_mpt_legacy

#1534 opened May 2, 2024 by bloodeagle40234

Add note on build Llama v3

#1522 opened Apr 29, 2024 by sammcj

Update perf-overview.md

#1521 opened Apr 29, 2024 by snowmanwwg

Support SDXL and its distributed inference

#1514 opened Apr 28, 2024 by Zars19

Remove the <s> token from post_prompt of multimodal

#1508 opened Apr 26, 2024 by yupbank

1

fix: correct cudaSetDevice error when GPUs per node are fewer than their ranks in inter-node inference

#1495 opened Apr 24, 2024 by littlefatfat

[ModelRunner] Fix stop & bad word list pointer offset.

#1486 opened Apr 22, 2024 by fjosw

Update model_runner_cpp.py

#1435 opened Apr 9, 2024 by RoyHe

Update summarize.py

#1406 opened Apr 6, 2024 by biubiu3721

serialize rotary base to config

#1403 opened Apr 4, 2024 by tonylek

Support internlm2

#1392 opened Apr 2, 2024 by RunningLeon

8

llama convert add rotary_scaling param in cli_args

#1385 opened Apr 1, 2024 by activezhao

[Doc] Fix mistral v0.1 build instructions

#1373 opened Mar 29, 2024 by minwhoo

Add SmoothQuant for T5 (decoder only right now)

#1366 opened Mar 27, 2024 by eycheung

2

Relax python dependencies

#1346 opened Mar 24, 2024 by tdeboissiere

[feat]: Add Option to convert and run distil-whisper large-v3

#1337 opened Mar 22, 2024 by IbrahimAmin1

1
