Issues: predibase/lorax
Add all launcher args as optional in the Helm charts [enhancement] (#465, opened May 9, 2024 by tgaddair)
Retrieve all lora models from Huggingface hub by base model setting (#463, opened May 8, 2024 by svjack)
Improve async load for adapters to avoid main thread lockups in server [enhancement] (#457, opened May 3, 2024 by tgaddair)
Batch inference endpoint (OpenAI compatible) [enhancement] (#448, opened Apr 30, 2024 by tgaddair)
Fallback to Flash Attention v1 for pre-Ampere GPUs [enhancement] [good first issue] (#440, opened Apr 26, 2024 by tgaddair)
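Issue #440 hinges on which GPU architectures each FlashAttention generation supports: v2 requires Ampere (compute capability 8.0) or newer, while v1 also runs on Turing (7.5). A minimal sketch of that dispatch logic, using a hypothetical `pick_flash_attention_version` helper (an illustration, not LoRAX's actual code):

```python
def pick_flash_attention_version(compute_capability):
    """Return 2, 1, or None for a (major, minor) CUDA compute capability.

    FlashAttention v2 requires Ampere (sm_80) or newer; v1 also runs on
    Turing (sm_75). Older architectures get None, meaning a non-flash
    attention path should be used instead.
    """
    major, minor = compute_capability
    sm = major * 10 + minor
    if sm >= 80:
        return 2   # Ampere, Ada, Hopper, ...
    if sm >= 75:
        return 1   # Turing, e.g. a T4
    return None    # e.g. V100 (sm_70): no FlashAttention support
```

In a real server the tuple would come from `torch.cuda.get_device_capability()`, e.g. `(7, 5)` on a T4.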
Improve the latency of load_batched_adapter_weights [enhancement] (#433, opened Apr 22, 2024 by thincal)
Inference with AWQ quantized base model + compile enabled results in <unk> tokens (#426, opened Apr 19, 2024 by thincal; 4 tasks)
Error: Warmup(Generation("'bool' object has no attribute 'dtype'")) (#422, opened Apr 18, 2024 by KrisWongz; 1 of 4 tasks)
Can't run Mistral quantized on T4 [enhancement] (#417, opened Apr 16, 2024 by emillykkejensen; 2 of 4 tasks)
LoRAX server with 2 GPUs and multiple adapters becomes permanently faster in swapping ONLY after parallel execution of requests (#395, opened Apr 8, 2024 by lighteternal; 1 of 4 tasks)
In Structured Output, a JSON schema with a date string format will yield invalid JSON (#392, opened Apr 5, 2024 by oscarjohansson94; 2 of 4 tasks)
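To make #392 concrete: the JSON Schema "date" format constrains a string field to an RFC 3339 full-date, so a conforming generator must emit the value with surrounding quotes. A minimal sketch of such a schema and a valid instance (an assumed illustration, not the reporter's exact reproduction):

```python
import json

# A schema of the shape described in the issue: a string field constrained
# to the JSON Schema "date" format (RFC 3339 full-date, e.g. "2024-04-05").
schema = {
    "type": "object",
    "properties": {
        "birthday": {"type": "string", "format": "date"},
    },
    "required": ["birthday"],
}

# A conforming generation must quote the date value; unquoted output such
# as {"birthday": 2024-04-05} is not valid JSON and fails to parse.
valid = '{"birthday": "2024-04-05"}'
parsed = json.loads(valid)
```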
Need some help: "You need to decrease --max-batch-prefill-tokens." (#390, opened Apr 5, 2024 by KrisWongz; 4 tasks)
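For context on #390: the flag named in the error message is passed at launch time. A hedged sketch of an invocation (the model id and token value are placeholders, not the reporter's setup; the flag name itself comes from the error message):

```shell
# Placeholder values; lowering --max-batch-prefill-tokens reduces the
# prefill memory needed during warmup.
lorax-launcher \
  --model-id mistralai/Mistral-7B-Instruct-v0.1 \
  --max-batch-prefill-tokens 2048
```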
Misleading/wrong OpenAPI schema in REST API docs for structured output (#389, opened Apr 5, 2024 by oscarjohansson94; 2 of 4 tasks)
Add support for AQLM quantization [enhancement] (#388, opened Apr 4, 2024 by tgaddair)