The most flexible way to serve AI/ML models in production - Build Model Inference Service, LLM APIs, Inference Graph/Pipelines, Compound AI systems, Multi-Modal, RAG as a Service, and more!
CLIP as a service - Embed images and sentences; object recognition, visual reasoning, image classification, and reverse image search
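Reverse image search with CLIP boils down to nearest-neighbor lookup over embedding vectors. Below is a minimal sketch of that retrieval step, assuming the CLIP embeddings have already been computed elsewhere (the random vectors stand in for real image embeddings):

```python
import numpy as np

def cosine_sim(a, b):
    # Cosine similarity between two embedding vectors
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

# Hypothetical pre-computed CLIP embeddings: 5 indexed images (512-dim each)
rng = np.random.default_rng(1)
index = rng.normal(size=(5, 512))

# Query: a near-duplicate of image 2 (its embedding plus slight noise)
query = index[2] + 0.01 * rng.normal(size=512)

# Reverse image search = pick the index entry most similar to the query
best = max(range(len(index)), key=lambda i: cosine_sim(index[i], query))
```

In production the brute-force `max` scan would typically be replaced by an approximate nearest-neighbor index, but the scoring logic is the same.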
Online Inference API for NLP Transformer models - summarization, text classification, sentiment analysis, and more
Learn the ins and outs of efficiently serving Large Language Models (LLMs). Dive into optimization techniques, including KV caching and Low Rank Adapters (LoRA), and gain hands-on experience with Predibase’s LoRAX framework inference server.
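KV caching, one of the optimization techniques mentioned above, avoids recomputing attention keys and values for past tokens during autoregressive decoding: each step computes K/V only for the newest token and appends them to a cache. A toy single-head sketch (random vectors stand in for projected hidden states; `KVCache` is an illustrative name, not an API from any particular framework):

```python
import numpy as np

def attention(q, K, V):
    # Scaled dot-product attention for a single query vector
    scores = K @ q / np.sqrt(q.shape[-1])
    w = np.exp(scores - scores.max())
    w /= w.sum()
    return w @ V

class KVCache:
    """Append-only cache of per-token key/value vectors."""
    def __init__(self, d):
        self.K = np.empty((0, d))
        self.V = np.empty((0, d))

    def append(self, k, v):
        self.K = np.vstack([self.K, k])
        self.V = np.vstack([self.V, v])

d = 4
rng = np.random.default_rng(0)
cache = KVCache(d)
outputs = []
for _ in range(3):  # three decode steps
    # Only the new token's k, v, q are computed this step
    k, v, q = rng.normal(size=(3, d))
    cache.append(k, v)
    # Attention runs over all cached tokens, none recomputed
    outputs.append(attention(q, cache.K, cache.V))
```

This turns the per-step attention cost from quadratic recomputation into a single query against the growing cache, which is why KV cache memory, not compute, often becomes the bottleneck when serving LLMs.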