[Tracking Issue] Benchmarks #154

Open
yaoyaoding opened this issue Apr 5, 2023 · 134 comments

Comments

@yaoyaoding
Member

yaoyaoding commented Apr 5, 2023

This issue tracks performance benchmarks of Hidet against other Dynamo backends in PyTorch.

The benchmark scripts that produce these reports are located at hidet/scripts/bench.
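The scripts under hidet/scripts/bench are not reproduced in this issue. As a rough sketch only, the column names in the reports below plausibly correspond to `torch.compile` settings: `eager` as the uncompiled model, `reduce-overhead` and `max-autotune` as `mode` values of the default backend, and `hidet(2)` as `backend="hidet"`. That mapping, the helper name `median_latency_ms`, and the choice of milliseconds are all assumptions; the reports do not state a unit, and the meaning of the `(2)` suffix is not explained here.

```python
import statistics
import time

# ASSUMPTION: how each table column might map to torch.compile keyword
# arguments. None means the model is run uncompiled (eager mode).
COLUMN_TO_COMPILE_KWARGS = {
    "eager": None,
    "reduce-overhead": {"mode": "reduce-overhead"},
    "max-autotune": {"mode": "max-autotune"},
    "hidet(2)": {"backend": "hidet"},
}


def median_latency_ms(fn, warmup=3, repeat=10):
    """Call fn a few times untimed, then return the median wall-clock time in ms."""
    for _ in range(warmup):
        fn()
    samples = []
    for _ in range(repeat):
        start = time.perf_counter()
        fn()
        samples.append((time.perf_counter() - start) * 1000.0)
    return statistics.median(samples)


# Stand-in CPU workload so the sketch runs without PyTorch or a GPU.
latency = median_latency_ms(lambda: sum(range(100_000)))
print(f"median latency: {latency:.3f} ms")
```

With PyTorch available, each non-`None` entry could be passed as `torch.compile(model, **kwargs)`. For GPU work, `fn` would also need to call `torch.cuda.synchronize()` before returning, otherwise only the kernel launch is timed.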

@yaoyaoding pinned this issue on Apr 5, 2023
@hidet-org deleted a comment from the github-actions bot on Apr 5, 2023
@yaoyaoding
Member Author

2023-04-05

  • Hidet version: 0.2.3.dev
  • PyTorch version: 2.0.0+cu117
  • OS: Ubuntu 22.04.2 LTS
  • GPU: NVIDIA A10G
  • GPU driver: 530.30.02 (12.1)
  • Git diff: 63d75a7...ab69a97
| model | inputs | eager | reduce-overhead | max-autotune | hidet(2) |
| --- | --- | --- | --- | --- | --- |
| resnet50 | f32[1,3,224,224] | 4.656 | 3.261 | 3.302 | 1.481 |
| resnet50 | f16[1,3,224,224] | 5.663 | 3.395 | 3.460 | 1.217 |
| model/bert-base-uncased | f32, bs=1, seq=128 | 6.060 | 3.095 | 2.920 | 2.335 |
| model/bert-base-uncased | f16, bs=1, seq=128 | 6.444 | 1.425 | 1.099 | 1.923 |

Time: 3.35 hours
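To read a row as relative speedups rather than raw numbers, each backend's value can be divided into the eager value. A minimal sketch using the resnet50 f32 row of the 2023-04-05 report; the helper name is hypothetical, and it assumes the numbers are timings where lower is better, which the issue does not state explicitly.

```python
def speedup_over_eager(row):
    """Return eager / backend for every non-eager column of one table row."""
    eager = row["eager"]
    return {name: eager / value for name, value in row.items() if name != "eager"}


# resnet50, f32[1,3,224,224], NVIDIA A10G, 2023-04-05
resnet50_f32 = {
    "eager": 4.656,
    "reduce-overhead": 3.261,
    "max-autotune": 3.302,
    "hidet(2)": 1.481,
}

speedups = speedup_over_eager(resnet50_f32)
for name, ratio in speedups.items():
    print(f"{name}: {ratio:.2f}x over eager")
```

A ratio above 1 means the backend's number is smaller than eager's for that row.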

@yaoyaoding
Member Author

2023-04-07

  • Hidet version: 0.2.3.dev
  • PyTorch version: 2.0.0+cu117
  • OS: Ubuntu 22.04.2 LTS
  • GPU: NVIDIA GeForce RTX 3090
  • GPU driver: 530.30.02 (12.1)
  • Git diff: 1721669...6289f46
| model | inputs | eager | reduce-overhead | max-autotune | hidet(2) |
| --- | --- | --- | --- | --- | --- |
| resnet50 | f32[1,3,224,224] | 1.542 | 1.408 | 1.407 | 1.291 |
| resnet50 | f16[1,3,224,224] | 1.557 | 1.250 | 1.253 | 1.089 |
| model/bert-base-uncased | f32, bs=1, seq=128 | 3.037 | 2.727 | 2.631 | 2.012 |
| model/bert-base-uncased | f16, bs=1, seq=128 | 1.865 | 1.288 | 1.017 | 1.715 |

Time: 2.47 hours

@yaoyaoding
Member Author

2023-04-08

  • Hidet version: 0.2.3.dev
  • PyTorch version: 2.0.0+cu117
  • OS: Ubuntu 22.04.2 LTS
  • GPU: NVIDIA GeForce RTX 3090
  • GPU driver: 530.30.02 (12.1)
  • Git diff: 6289f46...68faaa5
| model | inputs | eager | reduce-overhead | max-autotune | hidet(2) |
| --- | --- | --- | --- | --- | --- |
| resnet50 | f32[1,3,224,224] | 1.533 | 1.402 | 1.399 | 1.303 |
| resnet50 | f16[1,3,224,224] | 1.591 | 1.250 | 1.294 | 1.122 |
| model/bert-base-uncased | f32, bs=1, seq=128 | 3.033 | 2.807 | 2.634 | 2.008 |
| model/bert-base-uncased | f16, bs=1, seq=128 | 1.830 | 1.289 | 1.014 | 1.683 |

Time: 2.45 hours

@yaoyaoding
Member Author

2023-04-09

  • Hidet version: 0.2.3.dev
  • PyTorch version: 2.0.0+cu117
  • OS: Ubuntu 22.04.2 LTS
  • GPU: NVIDIA GeForce RTX 3090
  • GPU driver: 530.30.02 (12.1)
  • Git diff: 68faaa5...68faaa5
| model | inputs | eager | reduce-overhead | max-autotune | hidet(2) |
| --- | --- | --- | --- | --- | --- |
| resnet50 | f32[1,3,224,224] | 1.537 | 1.404 | 1.399 | 1.293 |
| resnet50 | f16[1,3,224,224] | 1.578 | 1.234 | 1.258 | 1.055 |
| model/bert-base-uncased | f32, bs=1, seq=128 | 2.867 | 2.738 | 2.515 | 2.014 |
| model/bert-base-uncased | f16, bs=1, seq=128 | 1.863 | 1.287 | 1.014 | 1.688 |

Time: 2.45 hours

@yaoyaoding
Member Author

2023-04-10

  • Hidet version: 0.2.3.dev
  • PyTorch version: 2.0.0+cu117
  • OS: Ubuntu 22.04.2 LTS
  • GPU: NVIDIA GeForce RTX 3090
  • GPU driver: 530.30.02 (12.1)
  • Git diff: 68faaa5...da54417
| model | inputs | eager | reduce-overhead | max-autotune | hidet(2) |
| --- | --- | --- | --- | --- | --- |
| resnet50 | f32[1,3,224,224] | 1.541 | 1.406 | 1.404 | 1.280 |
| resnet50 | f16[1,3,224,224] | 1.577 | 1.243 | 1.288 | 1.116 |
| model/bert-base-uncased | f32, bs=1, seq=128 | 2.958 | 2.809 | 2.612 | 2.015 |
| model/bert-base-uncased | f16, bs=1, seq=128 | 1.871 | 1.289 | 1.015 | 1.683 |

Time: 2.18 hours

@yaoyaoding
Member Author

2023-04-11

  • Hidet version: 0.2.3.dev
  • PyTorch version: 2.0.0+cu117
  • OS: Ubuntu 22.04.2 LTS
  • GPU: NVIDIA GeForce RTX 3090
  • GPU driver: 530.30.02 (12.1)
  • Git diff: da54417...3a7b972
| model | inputs | eager | reduce-overhead | max-autotune | hidet(2) |
| --- | --- | --- | --- | --- | --- |
| resnet50 | f32[1,3,224,224] | 1.534 | 1.404 | 1.401 | 1.288 |
| resnet50 | f16[1,3,224,224] | 1.614 | 1.261 | 1.280 | 1.040 |
| model/bert-base-uncased | f32, bs=1, seq=128 | 2.911 | 2.811 | 2.631 | 2.031 |
| model/bert-base-uncased | f16, bs=1, seq=128 | 1.843 | 1.289 | 1.013 | 1.574 |

Time: 2.18 hours

@yaoyaoding
Member Author

2023-04-11

  • Hidet version: 0.2.3.dev
  • PyTorch version: 2.0.0+cu117
  • OS: Ubuntu 20.04.6 LTS
  • GPU: NVIDIA A10G
  • GPU driver: 530.30.02 (12.1)
  • Git commit: 634a3a2
| model | inputs | eager | reduce-overhead | max-autotune | hidet(2) |
| --- | --- | --- | --- | --- | --- |
| resnet50 | f32[1,3,224,224] | 4.584 | 3.263 | 3.316 | 1.512 |
| resnet50 | f16[1,3,224,224] | 5.466 | 3.376 | 3.420 | 1.130 |
| model/bert-base-uncased | f32, bs=1, seq=128 | 6.085 | 3.093 | 2.898 | 2.354 |
| model/bert-base-uncased | f16, bs=1, seq=128 | 6.288 | 1.425 | 1.095 | 1.863 |

Time: 8.44 hours

@yaoyaoding
Member Author

2023-04-12

  • Hidet version: 0.2.3.dev
  • PyTorch version: 2.0.0+cu117
  • OS: Ubuntu 20.04.6 LTS
  • GPU: NVIDIA A10G
  • GPU driver: 530.30.02 (12.1)
  • Git diff: 634a3a2...7f634d8
| model | inputs | eager | reduce-overhead | max-autotune | hidet(2) |
| --- | --- | --- | --- | --- | --- |
| resnet50 | f32[1,3,224,224] | 4.604 | 3.222 | 3.265 | 1.483 |
| resnet50 | f16[1,3,224,224] | 5.401 | 3.365 | 3.412 | 1.141 |
| model/bert-base-uncased | f32, bs=1, seq=128 | 6.085 | 3.082 | 2.896 | 2.357 |
| model/bert-base-uncased | f16, bs=1, seq=128 | 6.224 | 1.426 | 1.097 | 1.829 |

Time: 7.91 hours

@yaoyaoding
Member Author

2023-04-14

  • Hidet version: 0.2.3.dev
  • PyTorch version: 2.0.0+cu118
  • OS: Ubuntu 22.04.2 LTS
  • GPU: NVIDIA GeForce RTX 3090
  • GPU driver: 530.30.02 (12.0)
  • Git diff: 3a7b972...ef81b2a
| model | inputs | eager | reduce-overhead | max-autotune | hidet(2) |
| --- | --- | --- | --- | --- | --- |
| resnet50 | f32[1,3,224,224] | 1.534 | 1.400 | 1.319 | 1.279 |
| resnet50 | f16[1,3,224,224] | 1.623 | 1.207 | 1.226 | 1.006 |
| model/bert-base-uncased | f32, bs=1, seq=128 | 2.941 | 2.716 | 2.593 | 2.006 |
| model/bert-base-uncased | f16, bs=1, seq=128 | 1.909 | 1.299 | 0.963 | 1.573 |

Time: 2.82 hours

@yaoyaoding
Member Author

2023-04-13

  • Hidet version: 0.2.3.dev
  • PyTorch version: 2.0.0+cu117
  • OS: Ubuntu 20.04.6 LTS
  • GPU: NVIDIA A10G
  • GPU driver: 530.30.02 (12.1)
  • Git diff: 7f634d8...ef81b2a
| model | inputs | eager | reduce-overhead | max-autotune | hidet(2) |
| --- | --- | --- | --- | --- | --- |
| resnet50 | f32[1,3,224,224] | 4.651 | 3.271 | 3.282 | 1.509 |
| resnet50 | f16[1,3,224,224] | 5.466 | 3.403 | 3.446 | 1.143 |
| model/bert-base-uncased | f32, bs=1, seq=128 | 6.155 | 3.080 | 2.896 | 2.350 |
| model/bert-base-uncased | f16, bs=1, seq=128 | 6.249 | 1.426 | 1.097 | 1.831 |

Time: 7.95 hours

@yaoyaoding
Member Author

2023-04-15

  • Hidet version: 0.2.3.dev
  • PyTorch version: 2.0.0+cu118
  • OS: Ubuntu 22.04.2 LTS
  • GPU: NVIDIA GeForce RTX 3090
  • GPU driver: 530.30.02 (12.0)
  • Git diff: ef81b2a...ef81b2a
| model | inputs | eager | reduce-overhead | max-autotune | hidet(2) |
| --- | --- | --- | --- | --- | --- |
| resnet50 | f32[1,3,224,224] | 1.539 | 1.409 | 1.404 | 1.291 |
| resnet50 | f16[1,3,224,224] | 1.580 | 1.175 | 1.200 | 1.008 |
| model/bert-base-uncased | f32, bs=1, seq=128 | 3.045 | 2.720 | 2.677 | 2.003 |
| model/bert-base-uncased | f16, bs=1, seq=128 | 1.949 | 1.297 | 1.014 | 1.587 |

Time: 2.68 hours

@yaoyaoding
Member Author

2023-04-14

  • Hidet version: 0.2.3.dev
  • PyTorch version: 2.0.0+cu117
  • OS: Ubuntu 20.04.6 LTS
  • GPU: NVIDIA A10G
  • GPU driver: 530.30.02 (12.1)
  • Git diff: ef81b2a...ef81b2a
| model | inputs | eager | reduce-overhead | max-autotune | hidet(2) |
| --- | --- | --- | --- | --- | --- |
| resnet50 | f32[1,3,224,224] | 4.722 | 3.330 | 3.370 | 1.483 |
| resnet50 | f16[1,3,224,224] | 5.482 | 3.421 | 3.434 | 1.141 |
| model/bert-base-uncased | f32, bs=1, seq=128 | 6.154 | 3.085 | 2.901 | 2.346 |
| model/bert-base-uncased | f16, bs=1, seq=128 | 6.346 | 1.426 | 1.097 | 1.828 |

Time: 8.10 hours

@yaoyaoding
Member Author

2023-04-16

  • Hidet version: 0.2.3.dev
  • PyTorch version: 2.0.0+cu118
  • OS: Ubuntu 22.04.2 LTS
  • GPU: NVIDIA GeForce RTX 3090
  • GPU driver: 530.30.02 (12.0)
  • Git diff: ef81b2a...ef81b2a
| model | inputs | eager | reduce-overhead | max-autotune | hidet(2) |
| --- | --- | --- | --- | --- | --- |
| resnet50 | f32[1,3,224,224] | 1.532 | 1.401 | 1.396 | 1.289 |
| resnet50 | f16[1,3,224,224] | 1.625 | 1.199 | 1.231 | 1.009 |
| model/bert-base-uncased | f32, bs=1, seq=128 | 3.078 | 2.861 | 2.682 | 2.005 |
| model/bert-base-uncased | f16, bs=1, seq=128 | 1.959 | 1.299 | 1.013 | 1.592 |

Time: 2.75 hours

@yaoyaoding
Member Author

2023-04-15

  • Hidet version: 0.2.3.dev
  • PyTorch version: 2.0.0+cu117
  • OS: Ubuntu 20.04.6 LTS
  • GPU: NVIDIA A10G
  • GPU driver: 530.30.02 (12.1)
  • Git diff: ef81b2a...ef81b2a
| model | inputs | eager | reduce-overhead | max-autotune | hidet(2) |
| --- | --- | --- | --- | --- | --- |
| resnet50 | f32[1,3,224,224] | 4.817 | 3.375 | 3.422 | 1.482 |
| resnet50 | f16[1,3,224,224] | 5.511 | 3.415 | 3.452 | 1.142 |
| model/bert-base-uncased | f32, bs=1, seq=128 | 7.157 | 3.092 | 2.903 | 2.351 |
| model/bert-base-uncased | f16, bs=1, seq=128 | 6.398 | 1.426 | 1.098 | 1.829 |

Time: 8.09 hours

@yaoyaoding
Member Author

2023-04-17

  • Hidet version: 0.2.3.dev
  • PyTorch version: 2.0.0+cu118
  • OS: Ubuntu 22.04.2 LTS
  • GPU: NVIDIA GeForce RTX 3090
  • GPU driver: 530.30.02 (12.0)
  • Git diff: ef81b2a...ef81b2a
| model | inputs | eager | reduce-overhead | max-autotune | hidet(2) |
| --- | --- | --- | --- | --- | --- |
| resnet50 | f32[1,3,224,224] | 1.542 | 1.402 | 1.404 | 1.280 |
| resnet50 | f16[1,3,224,224] | 1.610 | 1.203 | 1.215 | 1.009 |
| model/bert-base-uncased | f32, bs=1, seq=128 | 2.954 | 2.710 | 2.678 | 1.962 |
| model/bert-base-uncased | f16, bs=1, seq=128 | 1.926 | 1.299 | 1.014 | 1.622 |

Time: 2.74 hours

@yaoyaoding
Member Author

2023-04-16

  • Hidet version: 0.2.3.dev
  • PyTorch version: 2.0.0+cu117
  • OS: Ubuntu 20.04.6 LTS
  • GPU: NVIDIA A10G
  • GPU driver: 530.30.02 (12.1)
  • Git diff: ef81b2a...ef81b2a
| model | inputs | eager | reduce-overhead | max-autotune | hidet(2) |
| --- | --- | --- | --- | --- | --- |
| resnet50 | f32[1,3,224,224] | 4.629 | 3.305 | 3.331 | 1.481 |
| resnet50 | f16[1,3,224,224] | 5.719 | 3.485 | 3.522 | 1.144 |
| model/bert-base-uncased | f32, bs=1, seq=128 | 6.184 | 3.084 | 2.897 | 2.337 |
| model/bert-base-uncased | f16, bs=1, seq=128 | 6.372 | 1.426 | 1.096 | 1.838 |

Time: 8.08 hours

@yaoyaoding
Member Author

2023-04-18

  • Hidet version: 0.2.3.dev
  • PyTorch version: 2.0.0+cu118
  • OS: Ubuntu 22.04.2 LTS
  • GPU: NVIDIA GeForce RTX 3090
  • GPU driver: 530.30.02 (12.0)
  • Git diff: ef81b2a...48e57cb
| model | inputs | eager | reduce-overhead | max-autotune | hidet(2) |
| --- | --- | --- | --- | --- | --- |
| resnet50 | f32[1,3,224,224] | 1.543 | 1.410 | 1.406 | 1.281 |
| resnet50 | f16[1,3,224,224] | 1.639 | 1.207 | 1.216 | 1.009 |
| model/bert-base-uncased | f32, bs=1, seq=128 | 3.082 | 2.855 | 2.556 | 2.021 |
| model/bert-base-uncased | f16, bs=1, seq=128 | 1.943 | 1.298 | 1.015 | 1.610 |

Time: 2.76 hours

@yaoyaoding
Member Author

2023-04-17

  • Hidet version: 0.2.3.dev
  • PyTorch version: 2.0.0+cu117
  • OS: Ubuntu 20.04.6 LTS
  • GPU: NVIDIA A10G
  • GPU driver: 530.30.02 (12.1)
  • Git diff: ef81b2a...3e7d959
| model | inputs | eager | reduce-overhead | max-autotune | hidet(2) |
| --- | --- | --- | --- | --- | --- |
| resnet50 | f32[1,3,224,224] | 4.643 | 3.228 | 3.296 | 1.516 |
| resnet50 | f16[1,3,224,224] | 5.600 | 3.431 | 3.448 | 1.146 |
| model/bert-base-uncased | f32, bs=1, seq=128 | 6.233 | 3.083 | 2.896 | 2.346 |
| model/bert-base-uncased | f16, bs=1, seq=128 | 6.154 | 1.424 | 1.095 | 1.828 |

Time: 8.54 hours

@yaoyaoding
Member Author

2023-04-18

  • Hidet version: 0.2.3.dev
  • PyTorch version: 2.0.0+cu117
  • OS: Ubuntu 20.04.6 LTS
  • GPU: NVIDIA A10G
  • GPU driver: 530.30.02 (12.1)
  • Git diff: 3e7d959...f5afc42
| model | inputs | eager | reduce-overhead | max-autotune | hidet(2) |
| --- | --- | --- | --- | --- | --- |
| resnet50 | f32[1,3,224,224] | 4.707 | 3.267 | 3.308 | 1.506 |
| resnet50 | f16[1,3,224,224] | 5.583 | 3.431 | 3.498 | 1.141 |
| model/bert-base-uncased | f32, bs=1, seq=128 | 6.188 | 3.084 | 2.898 | 2.339 |
| model/bert-base-uncased | f16, bs=1, seq=128 | 6.277 | 1.426 | 1.095 | 1.828 |

Time: 8.06 hours

@yaoyaoding
Member Author

2023-04-20

  • Hidet version: 0.2.3.dev
  • PyTorch version: 2.0.0+cu118
  • OS: Ubuntu 22.04.2 LTS
  • GPU: NVIDIA GeForce RTX 3090
  • GPU driver: 530.30.02 (12.0)
  • Git diff: 48e57cb...67cd640
| model | inputs | eager | reduce-overhead | max-autotune | hidet(2) |
| --- | --- | --- | --- | --- | --- |
| resnet50 | f32[1,3,224,224] | 1.517 | 1.390 | 1.293 | 1.272 |
| resnet50 | f16[1,3,224,224] | 1.623 | 1.197 | 1.228 | 1.042 |
| model/bert-base-uncased | f32, bs=1, seq=128 | 3.090 | 2.849 | 2.597 | 2.012 |
| model/bert-base-uncased | f16, bs=1, seq=128 | 1.922 | 1.297 | 0.990 | 1.572 |

Time: 2.55 hours

@yaoyaoding
Member Author

2023-04-21

  • Hidet version: 0.2.3.dev
  • PyTorch version: 2.0.0+cu118
  • OS: Ubuntu 22.04.2 LTS
  • GPU: NVIDIA GeForce RTX 3090
  • GPU driver: 530.30.02 (12.0)
  • Git diff: 67cd640...2165662
| model | inputs | eager | reduce-overhead | max-autotune | hidet(2) |
| --- | --- | --- | --- | --- | --- |
| resnet50 | f32[1,3,224,224] | 1.535 | 1.409 | 1.403 | 1.282 |
| resnet50 | f16[1,3,224,224] | 1.609 | 1.160 | 1.192 | 1.042 |
| model/bert-base-uncased | f32, bs=1, seq=128 | 2.907 | 2.809 | 2.650 | 2.046 |
| model/bert-base-uncased | f16, bs=1, seq=128 | 1.960 | 1.297 | 1.046 | 1.572 |

Time: 2.50 hours

@yaoyaoding
Member Author

2023-04-22

  • Hidet version: 0.2.3.dev
  • PyTorch version: 2.0.0+cu118
  • OS: Ubuntu 22.04.2 LTS
  • GPU: NVIDIA GeForce RTX 3090
  • GPU driver: 530.30.02 (12.0)
  • Git diff: 2165662...f361211
| model | inputs | eager | reduce-overhead | max-autotune | hidet(2) |
| --- | --- | --- | --- | --- | --- |
| resnet50 | f32[1,3,224,224] | 1.537 | 1.411 | 1.398 | 1.267 |
| resnet50 | f16[1,3,224,224] | 1.645 | 1.211 | 1.233 | 1.044 |
| model/bert-base-uncased | f32, bs=1, seq=128 | 3.097 | 2.859 | 2.675 | 2.006 |
| model/bert-base-uncased | f16, bs=1, seq=128 | 1.951 | 1.298 | 1.046 | 1.577 |

Time: 2.56 hours

@yaoyaoding
Member Author

2023-04-22

  • Hidet version: 0.2.3.dev
  • PyTorch version: 2.0.0+cu117
  • OS: Ubuntu 20.04.6 LTS
  • GPU: NVIDIA A10G
  • GPU driver: 530.30.02 (12.1)
  • Git diff: f5afc42...f361211
| model | inputs | eager | reduce-overhead | max-autotune | hidet(2) |
| --- | --- | --- | --- | --- | --- |
| resnet50 | f32[1,3,224,224] | 4.780 | 3.355 | 3.410 | 1.506 |
| resnet50 | f16[1,3,224,224] | 5.658 | 3.445 | 3.513 | 1.146 |
| model/bert-base-uncased | f32, bs=1, seq=128 | 6.238 | 3.083 | 2.895 | 2.350 |
| model/bert-base-uncased | f16, bs=1, seq=128 | 6.477 | 1.426 | 1.095 | 1.829 |

Time: 8.19 hours

@yaoyaoding
Member Author

2023-04-23

  • Hidet version: 0.2.3.dev
  • PyTorch version: 2.0.0+cu118
  • OS: Ubuntu 22.04.2 LTS
  • GPU: NVIDIA GeForce RTX 3090
  • GPU driver: 530.30.02 (12.0)
  • Git diff: f361211...9a65fa2
| model | inputs | eager | reduce-overhead | max-autotune | hidet(2) |
| --- | --- | --- | --- | --- | --- |
| resnet50 | f32[1,3,224,224] | 1.543 | 1.409 | 1.403 | 1.289 |
| resnet50 | f16[1,3,224,224] | 1.620 | 1.192 | 1.218 | 1.045 |
| model/bert-base-uncased | f32, bs=1, seq=128 | 3.092 | 2.853 | 2.651 | 1.995 |
| model/bert-base-uncased | f16, bs=1, seq=128 | 1.939 | 1.299 | 1.046 | 1.545 |

Time: 2.61 hours

@yaoyaoding
Member Author

2023-04-23

  • Hidet version: 0.2.3.dev
  • PyTorch version: 2.0.0+cu117
  • OS: Ubuntu 20.04.6 LTS
  • GPU: NVIDIA A10G
  • GPU driver: 530.30.02 (12.1)
  • Git diff: f361211...9a65fa2
| model | inputs | eager | reduce-overhead | max-autotune | hidet(2) |
| --- | --- | --- | --- | --- | --- |
| resnet50 | f32[1,3,224,224] | 4.851 | 3.350 | 3.398 | 1.480 |
| resnet50 | f16[1,3,224,224] | 5.689 | 3.520 | 3.547 | 1.147 |
| model/bert-base-uncased | f32, bs=1, seq=128 | 6.228 | 3.082 | 2.898 | 2.334 |
| model/bert-base-uncased | f16, bs=1, seq=128 | 6.454 | 1.426 | 1.097 | 1.820 |

Time: 8.35 hours

@yaoyaoding
Member Author

2023-04-24

  • Hidet version: 0.2.3.dev
  • PyTorch version: 2.0.0+cu118
  • OS: Ubuntu 22.04.2 LTS
  • GPU: NVIDIA GeForce RTX 3090
  • GPU driver: 530.30.02 (12.0)
  • Git diff: 9a65fa2...9a65fa2
| model | inputs | eager | reduce-overhead | max-autotune | hidet(2) |
| --- | --- | --- | --- | --- | --- |
| resnet50 | f32[1,3,224,224] | 1.530 | 1.403 | 1.398 | 1.312 |
| resnet50 | f16[1,3,224,224] | 1.605 | 1.191 | 1.231 | 1.045 |
| model/bert-base-uncased | f32, bs=1, seq=128 | 3.100 | 2.860 | 2.598 | 2.013 |
| model/bert-base-uncased | f16, bs=1, seq=128 | 1.926 | 1.298 | 1.047 | 1.546 |

Time: 2.47 hours

@yaoyaoding
Member Author

2023-04-24

  • Hidet version: 0.2.3.dev
  • PyTorch version: 2.0.0+cu117
  • OS: Ubuntu 20.04.6 LTS
  • GPU: NVIDIA A10G
  • GPU driver: 530.30.02 (12.1)
  • Git diff: 9a65fa2...9a65fa2
| model | inputs | eager | reduce-overhead | max-autotune | hidet(2) |
| --- | --- | --- | --- | --- | --- |
| resnet50 | f32[1,3,224,224] | 4.863 | 3.351 | 3.393 | 1.505 |
| resnet50 | f16[1,3,224,224] | 5.575 | 3.427 | 3.452 | 1.148 |
| model/bert-base-uncased | f32, bs=1, seq=128 | 6.187 | 3.083 | 2.897 | 2.341 |
| model/bert-base-uncased | f16, bs=1, seq=128 | 6.351 | 1.426 | 1.098 | 1.823 |

Time: 8.26 hours

@yaoyaoding
Member Author

2023-06-28

  • Hidet version: 0.3.0.dev
  • PyTorch version: 2.0.1+cu118
  • OS: Ubuntu 22.04.2 LTS
  • GPU: NVIDIA GeForce RTX 4090
  • GPU driver: 530.30.02 (12.1)
  • Git diff: f3aad89...f3aad89
| model | inputs | eager | reduce-overhead | max-autotune | hidet(2) |
| --- | --- | --- | --- | --- | --- |
| resnet50 | f32[1,3,224,224] | 1.211 | 1.062 | 1.092 | 0.732 |
| resnet50 | f16[1,3,224,224] | 1.420 | 1.095 | 1.115 | 0.477 |
| model/bert-base-uncased | f32, bs=1, seq=128 | 2.057 | 1.863 | 1.682 | 1.289 |
| model/bert-base-uncased | f16, bs=1, seq=128 | 1.934 | 0.711 | 0.798 | 0.896 |

Time: 2.16 hours

@yaoyaoding
Member Author

2023-06-29

  • Hidet version: 0.3.0.dev
  • PyTorch version: 2.0.1+cu118
  • OS: Ubuntu 22.04.2 LTS
  • GPU: NVIDIA GeForce RTX 4090
  • GPU driver: 530.30.02 (12.1)
  • Git diff: f3aad89...7c52c9d
| model | inputs | eager | reduce-overhead | max-autotune | hidet(2) |
| --- | --- | --- | --- | --- | --- |
| resnet50 | f32[1,3,224,224] | 1.219 | 1.117 | 1.087 | 0.732 |
| resnet50 | f16[1,3,224,224] | 1.421 | 1.077 | 1.095 | 0.526 |
| model/bert-base-uncased | f32, bs=1, seq=128 | 2.053 | 1.862 | 1.696 | 1.165 |
| model/bert-base-uncased | f16, bs=1, seq=128 | 1.909 | 0.712 | 0.799 | 0.886 |

Time: 2.24 hours

@yaoyaoding
Member Author

2023-06-30

  • Hidet version: 0.3.0.dev
  • PyTorch version: 2.0.1+cu118
  • OS: Ubuntu 22.04.2 LTS
  • GPU: NVIDIA GeForce RTX 4090
  • GPU driver: 530.30.02 (12.1)
  • Git diff: 7c52c9d...d881416
| model | inputs | eager | reduce-overhead | max-autotune | hidet(2) |
| --- | --- | --- | --- | --- | --- |
| resnet50 | f32[1,3,224,224] | 1.221 | 1.055 | 1.076 | 0.735 |
| resnet50 | f16[1,3,224,224] | 1.428 | 1.084 | 1.096 | 0.526 |
| model/bert-base-uncased | f32, bs=1, seq=128 | 2.052 | 1.870 | 1.705 | 1.162 |
| model/bert-base-uncased | f16, bs=1, seq=128 | 1.894 | 0.712 | 0.799 | 0.883 |

Time: 2.25 hours

@yaoyaoding
Member Author

2023-07-01

  • Hidet version: 0.3.0.dev
  • PyTorch version: 2.0.1+cu118
  • OS: Ubuntu 22.04.2 LTS
  • GPU: NVIDIA GeForce RTX 4090
  • GPU driver: 530.30.02 (12.1)
  • Git diff: d881416...790e775
| model | inputs | eager | reduce-overhead | max-autotune | hidet(2) |
| --- | --- | --- | --- | --- | --- |
| resnet50 | f32[1,3,224,224] | 1.227 | 1.044 | 1.084 | 0.737 |
| resnet50 | f16[1,3,224,224] | 1.407 | 1.082 | 1.101 | 0.528 |
| model/bert-base-uncased | f32, bs=1, seq=128 | 2.054 | 1.867 | 1.699 | 1.285 |
| model/bert-base-uncased | f16, bs=1, seq=128 | 1.913 | 0.713 | 0.799 | 0.885 |

Time: 2.26 hours

@yaoyaoding
Member Author

2023-07-02

  • Hidet version: 0.3.0.dev
  • PyTorch version: 2.0.1+cu118
  • OS: Ubuntu 22.04.2 LTS
  • GPU: NVIDIA GeForce RTX 4090
  • GPU driver: 530.30.02 (12.1)
  • Git diff: 790e775...664f9f0
| model | inputs | eager | reduce-overhead | max-autotune | hidet(2) |
| --- | --- | --- | --- | --- | --- |
| resnet50 | f32[1,3,224,224] | 1.212 | 1.057 | 1.101 | 0.739 |
| resnet50 | f16[1,3,224,224] | 1.432 | 1.089 | 1.110 | 0.523 |
| model/bert-base-uncased | f32, bs=1, seq=128 | 2.054 | 1.863 | 1.707 | 1.277 |
| model/bert-base-uncased | f16, bs=1, seq=128 | 1.891 | 0.712 | 0.798 | 0.880 |

Time: 2.27 hours

@yaoyaoding
Member Author

2023-07-03

  • Hidet version: 0.3.0.dev
  • PyTorch version: 2.0.1+cu118
  • OS: Ubuntu 22.04.2 LTS
  • GPU: NVIDIA GeForce RTX 4090
  • GPU driver: 530.30.02 (12.1)
  • Git diff: 664f9f0...664f9f0
| model | inputs | eager | reduce-overhead | max-autotune | hidet(2) |
| --- | --- | --- | --- | --- | --- |
| resnet50 | f32[1,3,224,224] | 1.227 | 1.076 | 1.110 | 0.734 |
| resnet50 | f16[1,3,224,224] | 1.401 | 1.080 | 1.087 | 0.527 |
| model/bert-base-uncased | f32, bs=1, seq=128 | 2.051 | 1.773 | 1.688 | 1.164 |
| model/bert-base-uncased | f16, bs=1, seq=128 | 1.921 | 0.715 | 0.799 | 0.881 |

Time: 2.26 hours

@yaoyaoding
Member Author

2023-07-04

  • Hidet version: 0.3.0.dev
  • PyTorch version: 2.0.1+cu118
  • OS: Ubuntu 22.04.2 LTS
  • GPU: NVIDIA GeForce RTX 4090
  • GPU driver: 530.30.02 (12.1)
  • Git diff: 664f9f0...664f9f0
| model | inputs | eager | reduce-overhead | max-autotune | hidet(2) |
| --- | --- | --- | --- | --- | --- |
| resnet50 | f32[1,3,224,224] | 1.236 | 1.065 | 1.088 | 0.736 |
| resnet50 | f16[1,3,224,224] | 1.398 | 1.092 | 1.085 | 0.524 |
| model/bert-base-uncased | f32, bs=1, seq=128 | 1.978 | 1.789 | 1.701 | nan |
| model/bert-base-uncased | f16, bs=1, seq=128 | 1.899 | 0.711 | 0.798 | nan |

Time: 1.97 hours
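The two nan entries in this report mean no hidet(2) number is available for bert-base-uncased in this run; the report does not say why. When post-processing these tables, nan needs explicit handling, because every ordering comparison with nan is false and `min()` or sorting can then silently return a misleading result. A minimal sketch with a hypothetical helper name, assuming lower values are better:

```python
import math


def fastest_backend(row):
    """Return the column with the smallest value, ignoring nan entries.

    Returns None when every entry is nan.
    """
    valid = {name: value for name, value in row.items() if not math.isnan(value)}
    if not valid:
        return None
    return min(valid, key=valid.get)


# model/bert-base-uncased, f32, bs=1, seq=128, from the 2023-07-04 report
bert_f32 = {
    "eager": 1.978,
    "reduce-overhead": 1.789,
    "max-autotune": 1.701,
    "hidet(2)": float("nan"),
}

print(fastest_backend(bert_f32))
```

Filtering first, rather than passing the raw values to `min()`, keeps the result independent of where the nan happens to sit in the row.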

@yaoyaoding
Member Author

2023-07-05

  • Hidet version: 0.3.0.dev
  • PyTorch version: 2.0.1+cu118
  • OS: Ubuntu 22.04.2 LTS
  • GPU: NVIDIA GeForce RTX 4090
  • GPU driver: 530.30.02 (12.1)
  • Git diff: 664f9f0...1c1cd11
| model | inputs | eager | reduce-overhead | max-autotune | hidet(2) |
| --- | --- | --- | --- | --- | --- |
| resnet50 | f32[1,3,224,224] | 1.233 | 1.064 | 1.071 | 0.733 |
| resnet50 | f16[1,3,224,224] | 1.409 | 1.067 | 1.094 | 0.529 |
| model/bert-base-uncased | f32, bs=1, seq=128 | 2.040 | 1.856 | 1.695 | 1.167 |
| model/bert-base-uncased | f16, bs=1, seq=128 | 1.924 | 0.710 | 0.797 | 0.883 |

Time: 2.28 hours

@yaoyaoding
Member Author

2023-07-06

  • Hidet version: 0.3.0.dev
  • PyTorch version: 2.0.1+cu118
  • OS: Ubuntu 22.04.2 LTS
  • GPU: NVIDIA GeForce RTX 4090
  • GPU driver: 530.30.02 (12.1)
  • Git diff: 1c1cd11...ac2b489
| model | inputs | eager | reduce-overhead | max-autotune | hidet(2) |
| --- | --- | --- | --- | --- | --- |
| resnet50 | f32[1,3,224,224] | 1.220 | 1.076 | 1.099 | 0.733 |
| resnet50 | f16[1,3,224,224] | 1.420 | 1.088 | 1.114 | 0.526 |
| model/bert-base-uncased | f32, bs=1, seq=128 | 1.972 | 1.868 | 1.702 | 1.165 |
| model/bert-base-uncased | f16, bs=1, seq=128 | 1.862 | 0.715 | 0.801 | 0.882 |

Time: 2.26 hours

@yaoyaoding
Member Author

2023-07-07

  • Hidet version: 0.3.0.dev
  • PyTorch version: 2.0.1+cu118
  • OS: Ubuntu 22.04.2 LTS
  • GPU: NVIDIA GeForce RTX 4090
  • GPU driver: 530.30.02 (12.1)
  • Git diff: ac2b489...eb55317
| model | inputs | eager | reduce-overhead | max-autotune | hidet(2) |
| --- | --- | --- | --- | --- | --- |
| resnet50 | f32[1,3,224,224] | 1.221 | 1.065 | 1.083 | 0.733 |
| resnet50 | f16[1,3,224,224] | 1.429 | 1.099 | 1.108 | 0.525 |
| model/bert-base-uncased | f32, bs=1, seq=128 | 2.052 | 1.869 | 1.704 | 1.293 |
| model/bert-base-uncased | f16, bs=1, seq=128 | 1.906 | 0.710 | 0.797 | 0.938 |

Time: 2.26 hours

@yaoyaoding
Member Author

2023-07-08

  • Hidet version: 0.3.0.dev
  • PyTorch version: 2.0.1+cu118
  • OS: Ubuntu 22.04.2 LTS
  • GPU: NVIDIA GeForce RTX 4090
  • GPU driver: 530.30.02 (12.1)
  • Git diff: eb55317...02d9a10
| model | inputs | eager | reduce-overhead | max-autotune | hidet(2) |
| --- | --- | --- | --- | --- | --- |
| resnet50 | f32[1,3,224,224] | 1.232 | 1.061 | 1.077 | 0.728 |
| resnet50 | f16[1,3,224,224] | 1.398 | 1.097 | 1.099 | 0.519 |
| model/bert-base-uncased | f32, bs=1, seq=128 | 2.055 | 1.865 | 1.698 | 1.189 |
| model/bert-base-uncased | f16, bs=1, seq=128 | 1.893 | 0.712 | 0.798 | 0.884 |

Time: 2.25 hours

@yaoyaoding
Member Author

2023-07-09

  • Hidet version: 0.3.0.dev
  • PyTorch version: 2.0.1+cu118
  • OS: Ubuntu 22.04.2 LTS
  • GPU: NVIDIA GeForce RTX 4090
  • GPU driver: 530.30.02 (12.1)
  • Git diff: 02d9a10...02d9a10
| model | inputs | eager | reduce-overhead | max-autotune | hidet(2) |
| --- | --- | --- | --- | --- | --- |
| resnet50 | f32[1,3,224,224] | 1.225 | 1.057 | 1.076 | 0.730 |
| resnet50 | f16[1,3,224,224] | 1.397 | 1.086 | 1.107 | 0.521 |
| model/bert-base-uncased | f32, bs=1, seq=128 | 2.042 | 1.865 | 1.699 | 1.284 |
| model/bert-base-uncased | f16, bs=1, seq=128 | 1.978 | 0.712 | 0.796 | 0.883 |

Time: 2.26 hours

@yaoyaoding
Member Author

2023-07-10

  • Hidet version: 0.3.0.dev
  • PyTorch version: 2.0.1+cu118
  • OS: Ubuntu 22.04.2 LTS
  • GPU: NVIDIA GeForce RTX 4090
  • GPU driver: 530.30.02 (12.1)
  • Git diff: 02d9a10...02d9a10
| model | inputs | eager | reduce-overhead | max-autotune | hidet(2) |
| --- | --- | --- | --- | --- | --- |
| resnet50 | f32[1,3,224,224] | 1.250 | 1.081 | 1.097 | 0.731 |
| resnet50 | f16[1,3,224,224] | 1.412 | 1.088 | 1.114 | 0.520 |
| model/bert-base-uncased | f32, bs=1, seq=128 | 2.049 | 1.803 | 1.699 | 1.180 |
| model/bert-base-uncased | f16, bs=1, seq=128 | 1.936 | 0.712 | 0.799 | 0.881 |

Time: 2.26 hours

@yaoyaoding
Member Author

2023-07-11

  • Hidet version: 0.3.0.dev
  • PyTorch version: 2.0.1+cu118
  • OS: Ubuntu 22.04.2 LTS
  • GPU: NVIDIA GeForce RTX 4090
  • GPU driver: 530.30.02 (12.1)
  • Git diff: 02d9a10...02d9a10
| model | inputs | eager | reduce-overhead | max-autotune | hidet(2) |
| --- | --- | --- | --- | --- | --- |
| resnet50 | f32[1,3,224,224] | 1.231 | 1.042 | 1.075 | 0.730 |
| resnet50 | f16[1,3,224,224] | 1.393 | 1.082 | 1.091 | 0.522 |
| model/bert-base-uncased | f32, bs=1, seq=128 | 2.051 | 1.866 | 1.703 | 1.159 |
| model/bert-base-uncased | f16, bs=1, seq=128 | 1.909 | 0.710 | 0.797 | 0.882 |

Time: 2.25 hours

@yaoyaoding
Member Author

2023-07-15

  • Hidet version: 0.3.0.dev
  • PyTorch version: 2.0.1+cu117
  • OS: Ubuntu 22.04.2 LTS
  • GPU: NVIDIA GeForce RTX 4090
  • GPU driver: 530.30.02 (12.2)
  • Git diff: 02d9a10...99feace
| model | inputs | eager | reduce-overhead | max-autotune | hidet(2) |
| --- | --- | --- | --- | --- | --- |
| resnet50 | f32[1,3,224,224] | 1.947 | 1.832 | 1.865 | 0.729 |
| resnet50 | f16[1,3,224,224] | 4.156 | 3.690 | 3.621 | 0.524 |
| model/bert-base-uncased | f32, bs=1, seq=128 | 2.056 | 1.866 | 1.700 | 1.161 |
| model/bert-base-uncased | f16, bs=1, seq=128 | 1.732 | 0.713 | 0.800 | 0.882 |

Time: 2.46 hours
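Compared with the 2023-07-11 report, the resnet50 rows in this report are much larger in the three PyTorch columns while the hidet(2) column is nearly unchanged. The environment bullets also differ between the two reports (PyTorch 2.0.1+cu118 to 2.0.1+cu117, driver CUDA 12.1 to 12.2); the issue does not comment on the cause. A small sketch of how consecutive reports could be compared automatically; the function name and the 20% threshold are illustrative assumptions, not part of the benchmark scripts.

```python
def changed_columns(previous, current, threshold=0.2):
    """Return {column: relative change} for columns that moved by more than threshold."""
    flagged = {}
    for name, old in previous.items():
        change = (current[name] - old) / old
        if abs(change) > threshold:
            flagged[name] = change
    return flagged


# resnet50, f16[1,3,224,224], NVIDIA GeForce RTX 4090
report_2023_07_11 = {"eager": 1.393, "reduce-overhead": 1.082, "max-autotune": 1.091, "hidet(2)": 0.522}
report_2023_07_15 = {"eager": 4.156, "reduce-overhead": 3.690, "max-autotune": 3.621, "hidet(2)": 0.524}

flagged = changed_columns(report_2023_07_11, report_2023_07_15)
for name, change in flagged.items():
    print(f"{name}: {change:+.0%}")
```

A check like this only reports that a number moved; deciding whether the code or the environment is responsible still requires comparing the report headers.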

@yaoyaoding
Member Author

2023-07-16

  • Hidet version: 0.3.0.dev
  • PyTorch version: 2.0.1+cu117
  • OS: Ubuntu 22.04.2 LTS
  • GPU: NVIDIA GeForce RTX 4090
  • GPU driver: 530.30.02 (12.2)
  • Git diff: 99feace...99feace
| model | inputs | eager | reduce-overhead | max-autotune | hidet(2) |
| --- | --- | --- | --- | --- | --- |
| resnet50 | f32[1,3,224,224] | 1.944 | 1.857 | 1.864 | 0.734 |
| resnet50 | f16[1,3,224,224] | 3.949 | 3.638 | 3.648 | 0.518 |
| model/bert-base-uncased | f32, bs=1, seq=128 | 2.041 | 1.865 | 1.704 | 1.165 |
| model/bert-base-uncased | f16, bs=1, seq=128 | 1.733 | 0.712 | 0.799 | 0.880 |

Time: 2.47 hours

@yaoyaoding
Member Author

2023-07-18

  • Hidet version: 0.3.0.dev
  • PyTorch version: 2.0.1+cu117
  • OS: Ubuntu 22.04.2 LTS
  • GPU: NVIDIA GeForce RTX 4090
  • GPU driver: 530.30.02 (12.2)
  • Git diff: 99feace...e3b01bb
| model | inputs | eager | reduce-overhead | max-autotune | hidet(2) |
| --- | --- | --- | --- | --- | --- |
| resnet50 | f32[1,3,224,224] | 1.949 | 1.852 | 1.857 | 0.732 |
| resnet50 | f16[1,3,224,224] | 4.102 | 3.609 | 3.876 | 0.521 |
| model/bert-base-uncased | f32, bs=1, seq=128 | 2.050 | 1.815 | 1.707 | 1.149 |
| model/bert-base-uncased | f16, bs=1, seq=128 | 1.718 | 0.714 | 0.801 | 0.855 |

Time: 2.48 hours

@yaoyaoding
Member Author

2023-07-19

  • Hidet version: 0.3.0.dev
  • PyTorch version: 2.0.1+cu117
  • OS: Ubuntu 22.04.2 LTS
  • GPU: NVIDIA GeForce RTX 4090
  • GPU driver: 530.30.02 (12.2)
  • Git diff: e3b01bb...799b0d6
| model | inputs | eager | reduce-overhead | max-autotune | hidet(2) |
| --- | --- | --- | --- | --- | --- |
| resnet50 | f32[1,3,224,224] | 1.967 | 1.858 | 1.868 | 0.728 |
| resnet50 | f16[1,3,224,224] | 4.233 | 3.880 | 3.755 | 0.521 |
| model/bert-base-uncased | f32, bs=1, seq=128 | 2.038 | 1.865 | 1.673 | 1.150 |
| model/bert-base-uncased | f16, bs=1, seq=128 | 1.743 | 0.712 | 0.796 | 0.855 |

Time: 2.47 hours

@yaoyaoding
Member Author

2023-07-20

  • Hidet version: 0.3.0.dev
  • PyTorch version: 2.0.1+cu117
  • OS: Ubuntu 22.04.2 LTS
  • GPU: NVIDIA GeForce RTX 4090
  • GPU driver: 530.30.02 (12.2)
  • Git diff: 799b0d6...bebc41b
| model | inputs | eager | reduce-overhead | max-autotune | hidet(2) |
| --- | --- | --- | --- | --- | --- |
| resnet50 | f32[1,3,224,224] | 1.918 | 1.852 | 1.863 | 0.736 |
| resnet50 | f16[1,3,224,224] | 3.973 | 3.766 | 3.647 | 0.522 |
| model/bert-base-uncased | f32, bs=1, seq=128 | 2.057 | 1.865 | 1.701 | 1.149 |
| model/bert-base-uncased | f16, bs=1, seq=128 | 1.753 | 0.710 | 0.796 | 0.859 |

Time: 2.47 hours

@yaoyaoding
Member Author

2023-07-21

  • Hidet version: 0.3.0.dev
  • PyTorch version: 2.0.1+cu117
  • OS: Ubuntu 22.04.2 LTS
  • GPU: NVIDIA GeForce RTX 4090
  • GPU driver: 530.30.02 (12.2)
  • Git diff: bebc41b...bebc41b
| model | inputs | eager | reduce-overhead | max-autotune | hidet(2) |
| --- | --- | --- | --- | --- | --- |
| resnet50 | f32[1,3,224,224] | 1.952 | 1.865 | 1.865 | 0.731 |
| resnet50 | f16[1,3,224,224] | 3.967 | 3.760 | 3.638 | 0.524 |
| model/bert-base-uncased | f32, bs=1, seq=128 | 2.057 | 1.838 | 1.698 | 1.142 |
| model/bert-base-uncased | f16, bs=1, seq=128 | 1.725 | 0.712 | 0.801 | 0.865 |

Time: 2.47 hours

@yaoyaoding
Member Author

2023-07-22

  • Hidet version: 0.3.0.dev
  • PyTorch version: 2.0.1+cu117
  • OS: Ubuntu 22.04.2 LTS
  • GPU: NVIDIA GeForce RTX 4090
  • GPU driver: 530.30.02 (12.2)
  • Git diff: bebc41b...bebc41b
| model | inputs | eager | reduce-overhead | max-autotune | hidet(2) |
| --- | --- | --- | --- | --- | --- |
| resnet50 | f32[1,3,224,224] | 1.943 | 1.854 | 1.866 | 0.728 |
| resnet50 | f16[1,3,224,224] | 4.128 | 3.874 | 3.752 | 0.520 |
| model/bert-base-uncased | f32, bs=1, seq=128 | 2.055 | 1.829 | 1.683 | 1.151 |
| model/bert-base-uncased | f16, bs=1, seq=128 | 1.726 | 0.712 | 0.797 | 0.849 |

Time: 2.44 hours

@yaoyaoding
Member Author

2023-07-23

  • Hidet version: 0.3.0.dev
  • PyTorch version: 2.0.1+cu117
  • OS: Ubuntu 22.04.2 LTS
  • GPU: NVIDIA GeForce RTX 4090
  • GPU driver: 530.30.02 (12.2)
  • Git diff: bebc41b...1b40ef4
| model | inputs | eager | reduce-overhead | max-autotune | hidet(2) |
| --- | --- | --- | --- | --- | --- |
| resnet50 | f32[1,3,224,224] | 1.937 | 1.853 | 1.860 | 0.730 |
| resnet50 | f16[1,3,224,224] | 3.972 | 3.765 | 3.752 | 0.521 |
| model/bert-base-uncased | f32, bs=1, seq=128 | 2.006 | 1.867 | 1.701 | 1.146 |
| model/bert-base-uncased | f16, bs=1, seq=128 | 1.733 | 0.713 | 0.800 | 0.856 |

Time: 2.48 hours

@yaoyaoding
Member Author

2023-07-24

  • Hidet version: 0.3.0.dev
  • PyTorch version: 2.0.1+cu117
  • OS: Ubuntu 22.04.2 LTS
  • GPU: NVIDIA GeForce RTX 4090
  • GPU driver: 530.30.02 (12.2)
  • Git diff: 1b40ef4...1b40ef4
| model | inputs | eager | reduce-overhead | max-autotune | hidet(2) |
| --- | --- | --- | --- | --- | --- |
| resnet50 | f32[1,3,224,224] | 1.952 | 1.853 | 1.863 | 0.728 |
| resnet50 | f16[1,3,224,224] | 3.934 | 3.743 | 3.658 | 0.524 |
| model/bert-base-uncased | f32, bs=1, seq=128 | 2.050 | 1.840 | 1.701 | 1.145 |
| model/bert-base-uncased | f16, bs=1, seq=128 | 1.729 | 0.711 | 0.799 | 0.858 |

Time: 2.48 hours

@yaoyaoding
Member Author

2023-07-25

  • Hidet version: 0.3.0.dev
  • PyTorch version: 2.0.1+cu117
  • OS: Ubuntu 22.04.2 LTS
  • GPU: NVIDIA GeForce RTX 4090
  • GPU driver: 530.30.02 (12.2)
  • Git diff: 1b40ef4...fcdec26
| model | inputs | eager | reduce-overhead | max-autotune | hidet(2) |
| --- | --- | --- | --- | --- | --- |
| resnet50 | f32[1,3,224,224] | 1.926 | 1.854 | 1.807 | 0.728 |
| resnet50 | f16[1,3,224,224] | 3.893 | 3.858 | 3.876 | 0.519 |
| model/bert-base-uncased | f32, bs=1, seq=128 | 2.049 | 1.855 | 1.697 | 1.161 |
| model/bert-base-uncased | f16, bs=1, seq=128 | 1.710 | 0.714 | 0.799 | 0.862 |

Time: 2.25 hours

@yaoyaoding
Member Author

2023-07-26

  • Hidet version: 0.3.0.dev
  • PyTorch version: 2.0.1+cu117
  • OS: Ubuntu 22.04.2 LTS
  • GPU: NVIDIA GeForce RTX 4090
  • GPU driver: 530.30.02 (12.2)
  • Git diff: fcdec26...b356f3d
| model | inputs | eager | reduce-overhead | max-autotune | hidet(2) |
| --- | --- | --- | --- | --- | --- |
| resnet50 | f32[1,3,224,224] | 1.880 | 1.847 | 1.861 | 0.728 |
| resnet50 | f16[1,3,224,224] | 3.905 | 3.745 | 3.656 | 0.519 |
| model/bert-base-uncased | f32, bs=1, seq=128 | 2.051 | 1.859 | 1.696 | 1.149 |
| model/bert-base-uncased | f16, bs=1, seq=128 | 1.732 | 0.709 | 0.796 | 0.864 |

Time: 2.26 hours

@yaoyaoding
Member Author

2023-07-27

  • Hidet version: 0.3.0.dev
  • PyTorch version: 2.0.1+cu117
  • OS: Ubuntu 22.04.2 LTS
  • GPU: NVIDIA GeForce RTX 4090
  • GPU driver: 530.30.02 (12.2)
  • Git diff: b356f3d...b356f3d
| model | inputs | eager | reduce-overhead | max-autotune | hidet(2) |
| --- | --- | --- | --- | --- | --- |
| resnet50 | f32[1,3,224,224] | 1.952 | 1.852 | 1.859 | 0.737 |
| resnet50 | f16[1,3,224,224] | 3.908 | 3.850 | 3.686 | 0.520 |
| model/bert-base-uncased | f32, bs=1, seq=128 | 2.049 | 1.865 | 1.699 | 1.265 |
| model/bert-base-uncased | f16, bs=1, seq=128 | 1.730 | 0.713 | 0.800 | 0.907 |

Time: 2.25 hours

@yaoyaoding
Member Author

2023-07-28

  • Hidet version: 0.3.0.dev
  • PyTorch version: 2.0.1+cu117
  • OS: Ubuntu 22.04.2 LTS
  • GPU: NVIDIA GeForce RTX 4090
  • GPU driver: 530.30.02 (12.2)
  • Git diff: b356f3d...b356f3d
| model | inputs | eager | reduce-overhead | max-autotune | hidet(2) |
| --- | --- | --- | --- | --- | --- |
| resnet50 | f32[1,3,224,224] | 1.947 | 1.854 | 1.858 | 0.729 |
| resnet50 | f16[1,3,224,224] | 4.162 | 3.862 | 3.768 | 0.525 |
| model/bert-base-uncased | f32, bs=1, seq=128 | 2.050 | 1.866 | 1.701 | 1.148 |
| model/bert-base-uncased | f16, bs=1, seq=128 | 1.727 | 0.710 | 0.796 | 0.853 |

Time: 2.26 hours

@yaoyaoding
Member Author

2023-07-29

  • Hidet version: 0.3.0.dev
  • PyTorch version: 2.0.1+cu117
  • OS: Ubuntu 22.04.2 LTS
  • GPU: NVIDIA GeForce RTX 4090
  • GPU driver: 530.30.02 (12.2)
  • Git diff: b356f3d...a491cb3
| model | inputs | eager | reduce-overhead | max-autotune | hidet(2) |
| --- | --- | --- | --- | --- | --- |
| resnet50 | f32[1,3,224,224] | 1.940 | 1.851 | 1.859 | 0.725 |
| resnet50 | f16[1,3,224,224] | 3.905 | 3.580 | 3.743 | 0.521 |
| model/bert-base-uncased | f32, bs=1, seq=128 | 2.035 | 1.864 | 1.706 | 1.148 |
| model/bert-base-uncased | f16, bs=1, seq=128 | 1.742 | 0.713 | 0.798 | 0.913 |

Time: 2.26 hours

@yaoyaoding
Member Author

2023-07-30

  • Hidet version: 0.3.0.dev
  • PyTorch version: 2.0.1+cu117
  • OS: Ubuntu 22.04.2 LTS
  • GPU: NVIDIA GeForce RTX 4090
  • GPU driver: 530.30.02 (12.2)
  • Git diff: a491cb3...a491cb3
| model | inputs | eager | reduce-overhead | max-autotune | hidet(2) |
| --- | --- | --- | --- | --- | --- |
| resnet50 | f32[1,3,224,224] | 1.953 | 1.855 | 1.858 | 0.735 |
| resnet50 | f16[1,3,224,224] | 3.931 | 3.604 | 3.583 | 0.519 |
| model/bert-base-uncased | f32, bs=1, seq=128 | 2.048 | 1.865 | 1.633 | 1.148 |
| model/bert-base-uncased | f16, bs=1, seq=128 | 1.743 | 0.709 | 0.795 | 0.857 |

Time: 2.26 hours

@yaoyaoding
Member Author

2023-07-31

  • Hidet version: 0.3.0.dev
  • PyTorch version: 2.0.1+cu117
  • OS: Ubuntu 22.04.2 LTS
  • GPU: NVIDIA GeForce RTX 4090
  • GPU driver: 530.30.02 (12.2)
  • Git diff: a491cb3...0532154
| model | inputs | eager | reduce-overhead | max-autotune | hidet(2) |
| --- | --- | --- | --- | --- | --- |
| resnet50 | f32[1,3,224,224] | 1.913 | 1.805 | 1.870 | 0.730 |
| resnet50 | f16[1,3,224,224] | 3.900 | 3.782 | 3.674 | 0.520 |
| model/bert-base-uncased | f32, bs=1, seq=128 | 2.051 | 1.753 | 1.699 | 1.148 |
| model/bert-base-uncased | f16, bs=1, seq=128 | 1.730 | 0.709 | 0.795 | 0.916 |

Time: 2.26 hours
