BentoML is an open-source model serving library for building model inference APIs and multi-model serving systems with any open-source or custom AI models. It provides everything you need for serving optimization and model packaging, and simplifies production deployment via ☁️ BentoCloud.
- 🍱 BentoML: The Unified Model Serving Framework
- 🦾 OpenLLM: Self-hosting Large Language Models Made Easy
- ☁️ BentoCloud: Inference Platform for fast-moving AI teams
👀 Follow us on X @bentomlai and LinkedIn
📖 Read our blog