This repository contains scripts and outputs for vLLM service performance testing and availability validation.
- start_server.sh: Shell script to start the vLLM service.
- hf_test.py: Python script for testing performance using method 1 (Hugging Face Transformers).
- vllm_test.py: Python script for testing performance using method 2 (vLLM).
- test.sh: Shell script to verify vLLM service availability.
- inference_stats.txt: Output file containing inference performance statistics.
- vllm_inference_stats.txt: Output file with additional vLLM inference statistics.
Start the vLLM Service:
bash start_server.sh
Check vLLM Availability:
bash test.sh
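The contents of test.sh are not reproduced in this README. Assuming start_server.sh exposes vLLM's OpenAI-compatible HTTP API (default port 8000), a minimal availability probe could look like the following sketch; the base URL and port are assumptions, not values taken from the scripts:

```python
import json
import urllib.error
import urllib.request


def check_server(base_url: str = "http://127.0.0.1:8000", timeout: float = 3.0) -> bool:
    """Return True if a vLLM OpenAI-compatible server answers /v1/models."""
    try:
        with urllib.request.urlopen(f"{base_url}/v1/models", timeout=timeout) as resp:
            payload = json.load(resp)
            # The endpoint lists the served model(s) under the "data" key.
            return resp.status == 200 and "data" in payload
    except (urllib.error.URLError, OSError, ValueError):
        # Connection refused, timeout, or non-JSON body: treat as "not up".
        return False


if __name__ == "__main__":
    print("server up:", check_server())
```

If the check returns False, confirm the service was started and that the port matches the one configured in start_server.sh.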
Run Performance Tests:
- Method 1:
python hf_test.py
- Method 2:
python vllm_test.py
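The internals of hf_test.py and vllm_test.py are not shown here, but both methods share the same general shape: time each generation call and derive throughput. A backend-agnostic sketch, where `generate` is a stand-in for the real Hugging Face or vLLM call:

```python
import time
from typing import Callable, List


def benchmark(generate: Callable[[str], List[int]], prompts: List[str]) -> dict:
    """Time each generate() call and report aggregate latency/throughput."""
    latencies, total_tokens = [], 0
    for prompt in prompts:
        start = time.perf_counter()
        tokens = generate(prompt)  # stand-in for the actual model call
        latencies.append(time.perf_counter() - start)
        total_tokens += len(tokens)
    elapsed = sum(latencies)
    return {
        "requests": len(prompts),
        "total_tokens": total_tokens,
        "avg_latency_s": elapsed / len(prompts),
        "tokens_per_s": total_tokens / elapsed if elapsed > 0 else 0.0,
    }


if __name__ == "__main__":
    # Dummy generator: pretends every prompt yields 8 tokens.
    print(benchmark(lambda p: list(range(8)), ["hello", "world"]))
```

Swapping the dummy generator for a real pipeline call lets the same harness compare both backends on identical prompts.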
Outputs:
inference_stats.txt and vllm_inference_stats.txt contain the performance results.
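The exact layout of the two stats files is not specified in this README; latency summaries are a typical content. As a hedged illustration, per-request latencies could be aggregated into that kind of plain-text summary like so:

```python
import statistics
from typing import List


def summarize_latencies(latencies_s: List[float]) -> str:
    """Render a small plain-text latency summary (mean / p50 / p95)."""
    ordered = sorted(latencies_s)
    # Nearest-rank p95, clamped to the last element for small samples.
    p95 = ordered[min(len(ordered) - 1, int(0.95 * len(ordered)))]
    lines = [
        f"requests:   {len(ordered)}",
        f"mean (s):   {statistics.mean(ordered):.4f}",
        f"p50 (s):    {statistics.median(ordered):.4f}",
        f"p95 (s):    {p95:.4f}",
    ]
    return "\n".join(lines)


if __name__ == "__main__":
    print(summarize_latencies([0.12, 0.10, 0.15, 0.11]))
```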
Requirements:
- Python 3.x
- vLLM 0.6.3.post1
Author: Zh1yuShen