haixuanTao/bert-onnx-rs-server
Demo BERT ONNX server written in Rust

This demo showcases onnxruntime-rs running BERT inference on a CUDA 11 GPU, served by actix-web and tokenized with the Hugging Face tokenizers library.
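To make the request flow concrete, here is a minimal std-only sketch of the pipeline the server implements: query text → token ids → model → predicted class. The tokenizer and ONNX session are stubbed with hypothetical stand-ins (the real code uses the `tokenizers` and `onnxruntime` crates inside an actix-web handler); function names here are illustrative, not the repo's actual API.

```rust
// Sketch of the server's request flow: text -> token ids -> model -> label.
// Both the tokenizer and the ONNX session are std-only stand-ins.

// Hypothetical stand-in for the Hugging Face tokenizer: map each
// whitespace-separated word to a toy id (its character length).
fn tokenize(text: &str) -> Vec<i64> {
    text.split_whitespace().map(|w| w.len() as i64).collect()
}

// Hypothetical stand-in for the ONNX session: the real code would call
// the session's run method on the GPU and read back logits.
fn run_model(input_ids: &[i64]) -> Vec<f32> {
    let sum: i64 = input_ids.iter().sum();
    vec![-(sum as f32), sum as f32] // fake two-class logits
}

// Pick the arg-max class, as a classification handler would.
fn predict(text: &str) -> usize {
    let logits = run_model(&tokenize(text));
    logits
        .iter()
        .enumerate()
        .max_by(|a, b| a.1.partial_cmp(b.1).unwrap())
        .map(|(i, _)| i)
        .unwrap()
}

fn main() {
    println!("class = {}", predict("Hello World"));
}
```

In the real server, each of these stubs is replaced by the corresponding crate call, and the whole pipeline runs inside an async actix-web route handler.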

Requirements

  • Linux x86_64
  • NVIDIA GPU with CUDA 11 (Not sure if CUDA 10 works)
  • Rust (obviously)
  • git lfs for the models

Installation

export ORT_USE_CUDA=1
git lfs install
cargo build --release

Run

cargo run --release

or run the built binary directly, with the ONNX Runtime GPU shared libraries on the loader path:

export LD_LIBRARY_PATH=path/to/onnxruntime-linux-x64-gpu-1.8.0/lib:${LD_LIBRARY_PATH}
./target/release/onnx-server

Call

curl http://localhost:8080/\?data=Hello+World
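Note that `Hello+World` is form-encoded: the `+` stands for a space, so the handler receives `data=Hello World`. A minimal std-only sketch of that decoding, assuming the usual form-encoding rules (actix-web's `Query` extractor does this for you in the real server):

```rust
// Decode a form-encoded query value: '+' becomes a space and %XX escapes
// become their byte. Minimal sketch only; actix-web's Query extractor
// performs this decoding automatically in the real server.
fn decode_query_value(raw: &str) -> String {
    let bytes = raw.as_bytes();
    let mut out = Vec::new();
    let mut i = 0;
    while i < bytes.len() {
        match bytes[i] {
            b'+' => {
                out.push(b' ');
                i += 1;
            }
            b'%' if i + 2 < bytes.len() => {
                // Two hex digits follow the '%'.
                if let Ok(b) = u8::from_str_radix(&raw[i + 1..i + 3], 16) {
                    out.push(b);
                    i += 3;
                } else {
                    out.push(b'%');
                    i += 1;
                }
            }
            b => {
                out.push(b);
                i += 1;
            }
        }
    }
    String::from_utf8_lossy(&out).into_owned()
}

fn main() {
    println!("{}", decode_query_value("Hello+World"));
}
```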

Python alternative

To compare against a standard Python server, the same server implemented with FastAPI is included as src/python_alternative.py.

Install

pip install -r requirements.txt

Run

cd src
uvicorn python_alternative:app --reload --workers 1

Call

curl http://localhost:8000/\?data=Hello+World

Training and converting to ONNX

The training pipeline is in another repo: https://github.com/haixuanTao/bert-onnx-rs-pipeline

About

A demo server serving BERT through ONNX with GPU, written in Rust with <3
