Issues: predibase/lorax
Add all launcher args as optional in the Helm charts [enhancement] (#465, opened May 9, 2024 by tgaddair)
Retrieve all lora models from Huggingface hub by base model setting (#463, opened May 8, 2024 by svjack)
Improve async load for adapters to avoid main thread lockups in server [enhancement] (#457, opened May 3, 2024 by tgaddair)
Batch inference endpoint (OpenAI compatible) [enhancement] (#448, opened Apr 30, 2024 by tgaddair)
Fallback to Flash Attention v1 for pre-Ampere GPUs [enhancement] [good first issue] (#440, opened Apr 26, 2024 by tgaddair)
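Issue #440 hinges on which GPU architectures each FlashAttention generation supports: v2 requires Ampere (compute capability 8.0) or newer, while v1 also runs on Turing (7.5). A minimal sketch of that dispatch logic, using a hypothetical `pick_flash_attention_version` helper (an illustration, not LoRAX's actual code):

```python
def pick_flash_attention_version(compute_capability):
    """Return 2, 1, or None for a (major, minor) CUDA compute capability.

    FlashAttention v2 requires Ampere (sm_80) or newer; v1 also runs on
    Turing (sm_75). Older architectures get None, meaning a non-flash
    attention path should be used instead.
    """
    major, minor = compute_capability
    sm = major * 10 + minor
    if sm >= 80:
        return 2   # Ampere, Ada, Hopper, ...
    if sm >= 75:
        return 1   # Turing, e.g. a T4
    return None    # e.g. V100 (sm_70): no FlashAttention support
```

In a real server the tuple would come from `torch.cuda.get_device_capability()`, e.g. `(7, 5)` on a T4.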
Improve the latency of load_batched_adapter_weights [enhancement] (#433, opened Apr 22, 2024 by thincal)
Inference with AWQ quantized base model + compile enabled results in <unk> tokens (#426, opened Apr 19, 2024 by thincal; 4 tasks)
Error: Warmup(Generation("'bool' object has no attribute 'dtype'")) (#422, opened Apr 18, 2024 by KrisWongz; 1 of 4 tasks)
Can't run Mistral quantized on T4 [enhancement] (#417, opened Apr 16, 2024 by emillykkejensen; 2 of 4 tasks)
LoRAX server with 2 GPUs and multiple adapters becomes permanently faster in swapping ONLY after parallel execution of requests (#395, opened Apr 8, 2024 by lighteternal; 1 of 4 tasks)
In Structured Output, a JSON schema with a date string format will yield invalid JSON (#392, opened Apr 5, 2024 by oscarjohansson94; 2 of 4 tasks)
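To make #392 concrete: the JSON Schema "date" format constrains a string field to an RFC 3339 full-date, so a conforming generator must emit the value with surrounding quotes. A minimal sketch of such a schema and a valid instance (an assumed illustration, not the reporter's exact reproduction):

```python
import json

# A schema of the shape described in the issue: a string field constrained
# to the JSON Schema "date" format (RFC 3339 full-date, e.g. "2024-04-05").
schema = {
    "type": "object",
    "properties": {
        "birthday": {"type": "string", "format": "date"},
    },
    "required": ["birthday"],
}

# A conforming generation must quote the date value; unquoted output such
# as {"birthday": 2024-04-05} is not valid JSON and fails to parse.
valid = '{"birthday": "2024-04-05"}'
parsed = json.loads(valid)
```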
Need some help: "You need to decrease --max-batch-prefill-tokens." (#390, opened Apr 5, 2024 by KrisWongz; 4 tasks)
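For context on #390: the flag named in the error message is passed at launch time. A hedged sketch of an invocation (the model id and token value are placeholders, not the reporter's setup; the flag name itself comes from the error message):

```shell
# Placeholder values; lowering --max-batch-prefill-tokens reduces the
# prefill memory needed during warmup.
lorax-launcher \
  --model-id mistralai/Mistral-7B-Instruct-v0.1 \
  --max-batch-prefill-tokens 2048
```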
Misleading/wrong OpenAPI schema in REST API docs for structured output (#389, opened Apr 5, 2024 by oscarjohansson94; 2 of 4 tasks)
Add support for AQLM quantization [enhancement] (#388, opened Apr 4, 2024 by tgaddair)