Issues: triton-inference-server/tensorrtllm_backend

Issues list

There is a problem with llama 7B model pre-processing after using triton server bug Something isn't working
#445 opened May 8, 2024 by Graham1025
2 of 4 tasks
InFlightBatching seems not working bug Something isn't working
#442 opened May 6, 2024 by larme
2 of 4 tasks
Deployment failed for BERT
#440 opened May 3, 2024 by vivekjoshi556
How to post sample parameters (like top_k, temperature) for triton http server bug Something isn't working
#436 opened Apr 26, 2024 by wanzhenchn
2 of 4 tasks
Encountered an error in forward function: std::bad_cast bug Something isn't working
#435 opened Apr 26, 2024 by wangqy1216
1 of 4 tasks
LLama 7B model can't get longer output text after using triton server bug Something isn't working
#434 opened Apr 26, 2024 by XiaobingSuper
2 of 4 tasks
max_batch_size seems to have no impact on model performance bug Something isn't working triaged Issue has been triaged by maintainers
#429 opened Apr 23, 2024 by VitalyPetrov
3 of 4 tasks
Performance Issue with return_context_logits Enabled in TensorRT-LLM bug Something isn't working
#428 opened Apr 23, 2024 by gywlssww
2 of 4 tasks
Seg fault after loaded models in official example bug Something isn't working
#425 opened Apr 20, 2024 by LeatherDeerAU
2 of 4 tasks
Performance Issue with return_context_logits Enabled in TensorRT-LLM bug Something isn't working triaged Issue has been triaged by maintainers
#419 opened Apr 19, 2024 by metterian
2 of 4 tasks
Filtering beam_search output tensors results in a string output vs list triaged Issue has been triaged by maintainers
#418 opened Apr 18, 2024 by nikhilshandilya
Warmup Example of loading LoRa weights triaged Issue has been triaged by maintainers
#417 opened Apr 18, 2024 by TheCodeWrangler
Block reuse is currently not supported with beam width > 1 triaged Issue has been triaged by maintainers
#411 opened Apr 16, 2024 by tonylek
Supporting beam search in streaming mode feature request New feature or request
#408 opened Apr 13, 2024 by tonylek
Support bfloat16 LoRa Adaptors bug Something isn't working triaged Issue has been triaged by maintainers
#403 opened Apr 11, 2024 by TheCodeWrangler
Example of LoRa weights triaged Issue has been triaged by maintainers
#399 opened Apr 9, 2024 by TheCodeWrangler