My implementation of BiSeNet, with BiSeNetV2 added
Generative AI reference workflows optimized for accelerated infrastructure and microservice architecture.
This repository deploys YOLOv4 as an optimized TensorRT engine to Triton Inference Server
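As a taste of what querying such a deployment looks like, here is a minimal client sketch. It assumes the server listens on localhost:8000 and serves a model named "yolov4" with an FP32 input tensor "input" of shape 1x3x608x608 and an output tensor "detections"; all of these names and shapes are assumptions to be checked against the model's config.pbtxt.

```python
import numpy as np
import tritonclient.http as httpclient

# Connect to Triton's HTTP endpoint (default port 8000).
client = httpclient.InferenceServerClient(url="localhost:8000")

# Dummy preprocessed batch: NCHW float32 in [0, 1]; the 608x608
# resolution is an assumption about the engine's input size.
image = np.random.rand(1, 3, 608, 608).astype(np.float32)

# "input" is an assumed tensor name; read the real one from config.pbtxt.
infer_input = httpclient.InferInput("input", list(image.shape), "FP32")
infer_input.set_data_from_numpy(image)

# "yolov4" and "detections" are likewise assumed names.
result = client.infer(model_name="yolov4", inputs=[infer_input])
print(result.as_numpy("detections"))
```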
The Triton backend for the ONNX Runtime.
ClearML - Model-Serving Orchestration and Repository Solution
Deploy a Stable Diffusion model with ONNX/TensorRT + Triton Inference Server
Deploy DL/ML inference pipelines with minimal extra code.
OpenAI compatible API for TensorRT LLM triton backend
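Because such a gateway speaks the OpenAI wire format, any plain HTTP client can call it. A minimal sketch, assuming the gateway runs on localhost:8000 and serves a model named "ensemble" (both are assumptions; substitute your deployment's values):

```python
import requests

resp = requests.post(
    "http://localhost:8000/v1/chat/completions",
    json={
        "model": "ensemble",  # hypothetical model name
        "messages": [{"role": "user", "content": "Say hello."}],
        "max_tokens": 64,
    },
    timeout=60,
)
resp.raise_for_status()
print(resp.json()["choices"][0]["message"]["content"])
```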
Hardware-accelerated DNN model inference ROS 2 packages using NVIDIA Triton/TensorRT, for both Jetson and x86_64 with a CUDA-capable GPU
An alternative to Triton Inference Server: boosts DL service throughput 1.5-4x via ensemble pipeline serving with concurrent CUDA streams, using a PyTorch/LibTorch frontend and TensorRT/CVCUDA (among others) backends
This repository contains AI bootcamp material consisting of a computer vision workflow
Compare multiple optimization methods on Triton to improve model-serving performance
Build a recommender system with PyTorch + Redis + Elasticsearch + Feast + Triton + Flask: vector recall, DeepFM ranking, and a web application.
Set up CI for DL with CUDA/cuDNN/TensorRT/onnx2trt/onnxruntime/onnxsim/PyTorch/Triton-Inference-Server/Bazel/Tesseract/PaddleOCR/NVIDIA-Docker/MinIO/Supervisord on AGX or PC from scratch.
Advanced inference pipeline using NVIDIA Triton Inference Server for CRAFT text detection (PyTorch), including a PyTorch -> ONNX -> TensorRT converter and inference pipelines (TensorRT, multi-format Triton server). Supported model formats for Triton inference: TensorRT engine, TorchScript, ONNX
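The first hop of that converter chain, PyTorch -> ONNX, can be sketched as follows. The stand-in network and the 768x768 input size are placeholders, not the repository's actual CRAFT model:

```python
import torch
import torch.nn as nn

# Stand-in network; the repository's actual CRAFT model would go here.
model = nn.Sequential(nn.Conv2d(3, 16, kernel_size=3, padding=1), nn.ReLU())
model.eval()

# Export with a dynamic batch dimension; the 768x768 input size is an
# assumption, not CRAFT's required resolution.
dummy = torch.randn(1, 3, 768, 768)
torch.onnx.export(
    model,
    dummy,
    "model.onnx",
    input_names=["input"],
    output_names=["output"],
    opset_version=13,
    dynamic_axes={"input": {0: "batch"}, "output": {0: "batch"}},
)
# The ONNX file can then be compiled into a TensorRT engine, e.g.:
#   trtexec --onnx=model.onnx --saveEngine=model.plan
```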
Provides an ensemble model to deploy a YOLOv8 ONNX model to Triton
Diffusion Model for Voice Conversion
Triton face detection & recognition
MagFace on Triton Inference Server using TensorRT