[WIP] Include GPT Fast in torch.compile nightly benchmark workflow #2857

sachanub · 2023-12-20T01:59:56Z

Description

This PR adds the GPT Fast model, using Llama 7B weights with int4 quantization, to the torch.compile nightly benchmark workflow.

Steps to download the Llama 7B weights on the benchmark host:

Ran a temporary workflow in commit 1e6088e that downloads the weights using the HUGGING_FACE_HUB_TOKEN.

Results of successful run: https://github.com/pytorch/serve/actions/runs/7271384883/job/19811851851?pr=2857

Testing:

Ran the benchmark workflow in commit 31936c9.

Results of the successful run: https://github.com/pytorch/serve/actions/runs/7272224847/job/19813999840?pr=2857
Benchmark report file: report.md

Updates in benchmark-ab.py script:

Also updated the benchmark-ab.py script to pass -l in the ab commands. With this flag, ab does not report an error when response lengths vary between requests, which is expected for generated text (https://httpd.apache.org/docs/2.4/programs/ab.html).
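
As a rough illustration of the -l change, the sketch below assembles an ab command line in Python. It is not the code from benchmark-ab.py: the helper name `build_ab_command`, its parameters, the endpoint URL, and the input file name are all assumptions made for this example. Only the ab flags themselves (-l, -n, -c, -p, -T) come from the Apache documentation linked above.

```python
import shlex


def build_ab_command(url, requests=100, concurrency=10,
                     input_file=None, content_type="application/json"):
    """Return an ApacheBench argument list (hypothetical helper, not from the PR)."""
    # -l: do not report an error when response lengths differ between requests
    # -n: total number of requests, -c: number of concurrent requests
    cmd = ["ab", "-l", "-n", str(requests), "-c", str(concurrency)]
    if input_file is not None:
        # -p: file containing the POST body, -T: Content-Type header for that body
        cmd += ["-p", input_file, "-T", content_type]
    cmd.append(url)
    return cmd


if __name__ == "__main__":
    # The URL and file name below are placeholders for illustration only.
    command = build_ab_command(
        "http://127.0.0.1:8080/predictions/gpt_fast",
        input_file="prompt.txt",
    )
    print(shlex.join(command))
```

Building the command as a list rather than a single string keeps quoting explicit and lets the list be passed directly to `subprocess.run` if desired.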

lxning · 2023-12-20T20:33:04Z

benchmarks/models_config/gpt_fast_torch_compile_gpu.yaml

+gpt_fast:
+    7b_int4:
+        benchmark_engine: "ab"
+        url: https://torchserve.pytorch.org/mar_files/gpt_fast_7b_int4.mar


Please clearly specify the model in the name, e.g. Llama-2-7b-hf.

lxning · 2023-12-20T20:33:13Z

benchmarks/models_config/gpt_fast_torch_compile_gpu.yaml

+        backend_profiling: False
+        exec_env: "local"
+        processors:
+            - "cpu"


cpu should be removed.

chauhang · 2024-04-14T09:40:55Z

@namannandan @lxning What is the work remaining for this PR?

Ubuntu and others added 10 commits December 20, 2023 01:59

Download Llama 7B model weights (8597d62)

Retrigger workflow (9d3e531)

Retrigger workflow (bfea7b3)

Retrigger workflow (eb6bba7)

Retrigger workflow (1e6088e)

Run GPT Fast benchmark test (46abd5c)

Retrigger tests (4b80cb7)

Retrigger tests (31936c9)

Revert changes made for GPT Fast test (27f0ab1)

Merge branch 'master' into gpt_fast_benchmark (33655c2)

sachanub changed the title ~~Include GPT Fast in torch.compile nightly benchmark workflow~~ [WIP] Include GPT Fast in torch.compile nightly benchmark workflow Dec 20, 2023

lxning reviewed Dec 20, 2023
