
TensorRT-LLM Requests #632

Open
24 of 41 tasks
ncomly-nvidia opened this issue Dec 11, 2023 · 6 comments
Labels
good first issue Good for newcomers

Comments


ncomly-nvidia commented Dec 11, 2023

Hi all, this issue will track the feature requests you've made to TensorRT-LLM & provide a place to see what TRT-LLM is currently working on.

Last update: Jan 14th, 2024
🚀 = in development

Models

Decoder Only

Encoder / Encoder-Decoder

Multi-Modal

Other

Features & Optimizations

KV Cache

Quantization

Sampling

Workflow

Front-ends

Integrations

Usage / Installation

Platform Support


teis-e commented Apr 4, 2024

Please add CohereAI!!

CohereForAI/c4ai-command-r-plus


EwoutH commented Apr 22, 2024

Llama 3 would be great (both 8B and 70B): #1470

Maybe quantized to 8 or even 4 bit.

@StephennFernandes

Currently, Llama 3 throws a bunch of errors when converting to TensorRT-LLM.

Any idea about the status of Llama 3 support?


EwoutH commented Apr 23, 2024

Phi-3-mini should be amazing! Such a small 3.8B model could run quantized on many GPUs, with as little as 4GB VRAM.
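The 4GB figure is roughly plausible on a back-of-the-envelope basis. A minimal sketch (my own estimate, not an official TensorRT-LLM calculation), counting weights only and ignoring KV cache and activation memory, which add more on top:

```python
def weight_gib(n_params: float, bits_per_weight: int) -> float:
    """Approximate weight memory in GiB for a model with n_params parameters."""
    return n_params * bits_per_weight / 8 / 2**30

phi3_mini = 3.8e9  # parameter count from the comment above

print(f"fp16: {weight_gib(phi3_mini, 16):.1f} GiB")  # ~7.1 GiB
print(f"int8: {weight_gib(phi3_mini, 8):.1f} GiB")   # ~3.5 GiB
print(f"int4: {weight_gib(phi3_mini, 4):.1f} GiB")   # ~1.8 GiB
```

So at int8 or int4 the weights alone fit under 4 GiB, leaving some headroom for the KV cache at short context lengths.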


oscarbg commented May 4, 2024

+1 for Phi-3


user-0a commented May 18, 2024

+1 for Command R Plus!

CohereForAI/c4ai-command-r-plus
