Issues: triton-inference-server/tensorrtllm_backend
Issues list
- #475: [Question] Best practices to track inputs and predictions? (opened May 24, 2024 by FernandoDorado)
- #472: tensorrt_llm_bls disregards temperature setting [bug] (opened May 23, 2024 by janpetrov)
- #468: random_seed seems to be ignored (or at least inconsistent) for inflight_batcher_llm [bug] (opened May 21, 2024 by dyoshida-continua)
- #467: Unexpected error when creating modelInstanceState: [json.exception.out_of_range.403] key 'name' not found [bug] (opened May 21, 2024 by Godlovecui)
- #464: [Bug] Zero-temperature curl request affects non-zero-temperature requests [bug] (opened May 20, 2024 by Hao-YunDeng)
- #463: Can you provide an example of a visual language model or multimodal model launched by Triton server? (opened May 20, 2024 by lzcchl)
- #462: How to deploy one model instance across multiple GPUs to tackle the OOM problem? (opened May 16, 2024 by shil3754)
- #461: decoding_mode top_k_top_p does not take effect for llama2; results differ from Hugging Face [bug] (opened May 16, 2024 by yjjiang11)
- #460: Implement XC-Cache to improve long-context inference performance (opened May 15, 2024 by avianion)
- #459: Tritonserver won't start up running Smaug 34b [bug] (opened May 15, 2024 by workuser12345)
- #457: Mixtral 8x7B-v0.1 hangs after serving a few requests [bug] (opened May 15, 2024 by aaditya-srivathsan)
- #455: [tensorrt-llm backend] A question about launch_triton_server.py (opened May 13, 2024 by victorsoda)
- #448: Example gpu_device_ids for multi-model usage? [question] (opened May 9, 2024 by vnkc1)
- #442: InFlightBatching seems not to be working [need more info, triaged] (opened May 6, 2024 by larme)
- #440: Deployment failed for BERT [triaged] (opened May 3, 2024 by vivekjoshi556)
- #438: Deploying Mixtral-8x7B-v0.1 with Triton 24.02 on A100 (160GB) raises "Cuda Runtime (out of memory)" exception [bug, triaged] (opened Apr 29, 2024 by kelkarn)
- #437: GptManager's scalability issues with input & output parameters [feature request] (opened Apr 28, 2024 by service-kit)
- #435: Encountered an error in forward function: std::bad_cast [bug] (opened Apr 26, 2024 by wangqy1216)
- #429: max_batch_size seems to have no impact on model performance [bug] (opened Apr 23, 2024 by VitalyPetrov)
- #428: Performance issue with return_context_logits enabled in TensorRT-LLM [bug] (opened Apr 23, 2024 by gywlssww)
- #425: Seg fault after loading models in official example [bug] (opened Apr 20, 2024 by LeatherDeerAU)
- #424: Can't launch Triton server following docs: expecting [TensorRT] library version 9.2.0.5, got 9.3.0.1 [bug] (opened Apr 20, 2024 by conway-abacus)