Triton Inference Server with Python backend and transformers
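As a hedged illustration of that pattern, here is a minimal sketch of what a Triton Python backend model.py wrapping a transformers pipeline can look like; the tensor names ("TEXT", "LABELS") and the checkpoint are illustrative assumptions, not taken from any listed repository.

```python
# model.py -- minimal sketch of a Triton Python backend wrapping a
# transformers pipeline. Tensor names and the checkpoint are illustrative.
import numpy as np
import triton_python_backend_utils as pb_utils
from transformers import pipeline


class TritonPythonModel:
    def initialize(self, args):
        # Load the Hugging Face pipeline once, when Triton loads the model.
        self.pipe = pipeline(
            "sentiment-analysis",
            model="distilbert-base-uncased-finetuned-sst-2-english",
        )

    def execute(self, requests):
        responses = []
        for request in requests:
            # Input tensor "TEXT" holds UTF-8 encoded strings.
            texts = pb_utils.get_input_tensor_by_name(request, "TEXT").as_numpy()
            results = self.pipe([t.decode("utf-8") for t in texts.flatten()])
            labels = np.array([r["label"] for r in results], dtype=object)
            out = pb_utils.Tensor("LABELS", labels)
            responses.append(pb_utils.InferenceResponse(output_tensors=[out]))
        return responses
```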
Generative AI reference workflows optimized for accelerated infrastructure and microservice architecture.
Cassandra plugin for NVIDIA DALI
Tiny configuration for Triton Inference Server
The Sumen model integrates with Triton Inference Server
An alternative to Triton Inference Server that boosts DL service throughput 1.5-4x through ensemble pipeline serving with concurrent CUDA streams, supporting PyTorch/LibTorch frontends and TensorRT, CVCUDA, and other backends.
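The concurrent-streams idea behind that throughput claim can be sketched in a few lines of PyTorch; the toy models and shapes below are assumptions chosen only to show two independent forward passes overlapping on separate CUDA streams.

```python
# Toy sketch of concurrent CUDA streams in PyTorch: two independent
# forward passes are issued on separate streams so the GPU can overlap
# them. The models and input shapes are illustrative only.
import torch

model_a = torch.nn.Linear(1024, 1024).cuda().eval()
model_b = torch.nn.Linear(1024, 1024).cuda().eval()
x_a = torch.randn(64, 1024, device="cuda")
x_b = torch.randn(64, 1024, device="cuda")

stream_a = torch.cuda.Stream()
stream_b = torch.cuda.Stream()
# Make both streams wait for the default stream that created the inputs.
stream_a.wait_stream(torch.cuda.current_stream())
stream_b.wait_stream(torch.cuda.current_stream())

with torch.no_grad():
    with torch.cuda.stream(stream_a):
        y_a = model_a(x_a)          # enqueued on stream_a
    with torch.cuda.stream(stream_b):
        y_b = model_b(x_b)          # enqueued on stream_b, may overlap

torch.cuda.synchronize()            # wait for both streams to finish
```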
The Triton backend for the ONNX Runtime.
Miscellaneous codes and writings for MLOps
Streamlit Dockerized Computer Vision App with Triton Inference Server and PostgreSQL database
ClearML - Model-Serving Orchestration and Repository Solution
A proof-of-concept implementation of industrial computer vision systems. The project explores scalability and performance within the NVIDIA ecosystem, aiming to provide an example scaffold for building a system accessible to non-technical users.
OpenAI-compatible API for the TensorRT-LLM Triton backend
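Because the API is OpenAI-compatible, a standard OpenAI client should be able to talk to the endpoint; the base URL, API key, and model name in this sketch are placeholder assumptions to adjust for the actual deployment.

```python
# Sketch of calling an OpenAI-compatible endpoint in front of the
# TensorRT-LLM Triton backend. base_url, api_key, and the model name
# are placeholders, not values from any specific repository.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="not-needed")

completion = client.chat.completions.create(
    model="ensemble",  # hypothetical model name exposed by the server
    messages=[{"role": "user", "content": "Summarize Triton in one line."}],
    max_tokens=64,
)
print(completion.choices[0].message.content)
```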
Deploy DL/ML inference pipelines with minimal extra code.
A DeepStream/Triton Server sample application that uses YOLOv7, YOLOv7-QAT, and YOLOv9 models to perform inference on video files or RTSP streams.
Provides an ensemble model to deploy a YOLOv8 TensorRT model to Triton
MLModelService wrapping Nvidia's Triton Server
A simple classification example that explains how Triton works.
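For reference, a minimal classification request with the Triton HTTP client might look like the sketch below; the model name and tensor names ("classifier", "INPUT", "OUTPUT") are hypothetical and must match the deployed model's config.pbtxt.

```python
# Minimal sketch of a classification request with the Triton HTTP client.
# Model name and tensor names are placeholders.
import numpy as np
import tritonclient.http as httpclient

client = httpclient.InferenceServerClient(url="localhost:8000")

batch = np.random.rand(1, 3, 224, 224).astype(np.float32)
inp = httpclient.InferInput("INPUT", list(batch.shape), "FP32")
inp.set_data_from_numpy(batch)

result = client.infer(model_name="classifier", inputs=[inp])
scores = result.as_numpy("OUTPUT")
print("predicted class:", int(scores.argmax()))
```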
C++ application to perform computer vision tasks using Nvidia Triton Server for model inference
An image retrieval system that uses a deep ResNet for feature extraction, Locally Optimized Product Quantization for storage and retrieval, and NVIDIA technologies such as TensorRT and Triton Server for efficient deployment, all accessible through a FastAPI-powered web API.