Skip to content

yinqiwen/lmsf

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

26 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Rust LLM Serving Framework

Features

  • Paged Attention
  • Continuous Batch
  • Quantization
    • awq
    • squeezellm
  • Models
    • llama
    • gemma
    • chatglm

Getting Started

Examples

$ cargo run --release --example llm_engine_example -- --model <llma model dir> --gpu-memory-utilization 0.95 --block-size 8 --max-model-len 1024

API Server

$ cargo build --release
$ ./target/release/entrypoints --model <llma model dir> --gpu-memory-utilization 0.95 --block-size 8 --max-model-len 1024 --host 0.0.0.0 --port 8000

About

No description, website, or topics provided.

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published