Neural Engine Support Matrix

Performance results were measured on 07/10/2022 with an Intel(R) Xeon(R) Platinum 8375C processor on an AWS c6i.12xlarge instance.

Performance varies by use, configuration and other factors. See platform configuration for configuration details. For more complete information about performance and benchmark results, visit www.intel.com/benchmarks

Performance

Model Sparsity Sequence Length Max Throughput (samples/sec) Latency (ms) Batch Size Instance Cores/Instance Requirement
BERT Large 90% 16 2652 36.513 8 12 2 BatchSize <= 8 && Latency <= 50ms
85% 2181 43.628 8 12 2
80% 1719 41.91 6 12 2
75% 1487 48.255 6 12 2
70% 1220 39.143 4 12 2
90% 32 1304 46.066 5 12 2
85% 1102 43.9 4 12 2
80% 870 41.487 3 12 2
75% 750 47.82 3 12 2
70% 690 46.188 4 8 3
90% 48 910 39.921 3 12 2
85% 729 49.71 3 12 2
80% 619 38.821 4 6 4
75% 499 47.933 2 12 2
70% 429 41.936 3 6 4
90% 64 680 35.318 2 12 2
85% 557 43.262 2 12 2
80% 467 38.589 3 6 4
75% 405 44.415 3 6 4
70% 339 35.473 2 4 6
90% 80 513 47.119 2 12 2
85% 389 40.945 2 8 3
80% 360 49.936 3 6 4
75% 274 43.843 2 6 4
70% 259 46.242 3 6 4
90% 96 442 36.241 2 8 3
85% 364 44.016 2 8 3
80% 303 40.13 2 6 4
75% 266 45.05 2 6 4
70% 221 35.973 2 6 4
90% 112 349 45.938 2 8 3
85% 275 43.614 2 6 4
80% 217 37.039 2 4 6
75% 188 42.301 2 4 6
70% 166 48.033 2 6 4
90% 128 320 49.769 2 8 3
85% 263 45.724 2 6 4
80% 207 38.819 2 4 6
75% 181 44.021 2 4 6
70% 160 49.979 2 6 4
90% 384 74 41.235 1 3 8
85% 63 48.091 2 2 12
80% 51 38.914 1 2 12
75% 46 43.698 1 2 12
70% 42 47.905 1 2 12
BERT Base 90% 16 8972 16.076 6 24 1 BatchSize <= 8 && Latency <= 20ms
85% 7192 19.906 6 24 1
80% 5482 17.509 4 24 1
75% 4808 19.952 4 24 1
70% 3678 19.409 3 24 1
90% 32 4750 19.975 3 24 1
85% 3647 19.867 3 24 1
80% 2921 16.587 4 12 2
75% 2576 18.768 4 12 2
70% 2131 16.913 3 12 2
90% 48 2804 17.096 4 12 2
85% 2263 16.037 3 12 2
80% 1938 18.756 3 12 2
75% 1581 15.119 2 12 2
70% 1409 17.015 2 12 2
90% 64 2116 17.113 3 12 2
85% 1777 19.972 3 12 2
80% 1474 16.392 2 12 2
75% 1278 18.74 4 6 4
70% 1137 15.752 3 6 4
90% 80 1594 15.124 4 6 4
85% 1347 17.9 4 6 4
80% 1126 16 3 6 4
75% 993 18.258 3 6 4
70% 890 19.877 3 6 4
90% 96 1319 18.28 4 6 4
85% 1086 16.63 3 6 4
80% 931 19.545 3 6 4
75% 833 14.332 2 6 4
70% 747 16.141 2 6 4
90% 112 1106 16.31 3 6 4
85% 924 19.446 4 6 4
80% 719 16.698 4 4 6
75% 633 19.124 2 6 4
70% 501 15.816 2 4 6
90% 128 961 18.549 3 6 4
85% 807 14.868 2 6 4
80% 701 17.117 4 4 6
75% 613 19.618 2 6 4
70% 515 15.484 2 4 6
BERT Mini 90% 16 75384 0.989 3 24 1 BatchSize <= 8 && Latency <= 1ms
85% 55628 0.917 2 24 1
80% 49120 0.957 4 12 2
75% 41598 0.85 3 12 2
70% 39218 0.913 3 12 2
90% 32 31211 0.788 1 24 1
85% 28399 0.848 1 24 1
80% 24910 0.966 4 6 4
75% 20505 0.88 3 6 4
70% 18575 0.865 2 8 3
90% 48 25485 0.987 1 24 1
85% 17468 0.924 2 8 3
80% 16771 0.971 2 8 3
75% 16004 0.996 2 8 3
70% 13381 0.894 2 6 4
90% 64 13990 0.863 1 12 2
85% 13154 0.917 2 6 4
80% 12535 0.973 2 6 4
75% 12102 0.998 2 6 4
70% 8426 0.961 2 4 6
90% 80 8736 0.917 1 8 3
85% 8282 0.967 2 6 4
80% 6659 0.898 2 3 8
75% 6477 0.933 2 3 8
70% 6290 0.974 2 3 8
90% 96 8598 0.931 1 8 3
85% 6449 0.935 2 3 8
80% 6263 0.965 2 3 8
75% 6080 0.988 2 3 8
70% 3684 0.817 1 3 8
90% 112 6246 0.978 1 6 4
85% 6374 0.947 1 6 4
80% 6026 0.998 1 6 4
75% 3300 0.926 1 3 8
70% 3226 0.935 1 3 8
90% 128 6221 0.958 1 6 4
85% 6322 0.96 1 6 4
80% 6081 0.985 1 6 4
75% 3368 0.894 1 3 8
70% 3264 0.924 1 3 8
DistilBERT 90% 16 15460 6.296 8 12 2 BatchSize <= 8 && Latency <= 10ms
85% 16 13129 7.363 8 12 2
80% 16 11323 8.578 8 12 2
75% 16 10072 9.599 8 12 2
70% 16 8689 7.44 8 8 3
90% 32 7901 8.205 8 8 3
85% 32 6737 9.562 8 8 3
80% 32 5440 8.904 8 6 4
75% 32 4920 9.786 8 6 4
70% 32 4460 8.053 6 6 4
90% 48 5280 9.179 6 8 3
85% 48 4339 9.244 5 8 3
80% 48 3634 9.928 5 6 4
75% 48 3255 9.211 5 6 4
70% 48 2928 8.197 4 6 4
90% 64 3681 9.527 6 6 4
85% 64 2965 8.106 6 4 6
80% 64 2741 8.862 3 8 3
75% 64 2351 8.527 5 4 6
70% 64 2150 9.282 5 4 6
90% 80 2918 8.316 4 6 4
85% 80 2505 9.602 4 6 4
80% 80 2010 7.995 4 4 6
75% 80 1872 8.609 4 4 6
70% 80 1706 9.442 4 4 6
90% 96 2409 9.983 4 6 4
85% 96 1993 8.013 4 4 6
80% 96 1671 9.627 4 4 6
75% 96 1484 8.131 4 3 8
70% 96 1342 8.967 4 3 8
90% 112 1910 8.456 4 4 6
85% 112 1659 9.687 4 4 6
80% 112 1293 9.245 4 3 8
75% 112 1230 9.798 4 3 8
70% 112 1091 8.261 3 3 8
90% 128 1627 9.908 4 4 6
85% 128 1397 8.684 3 4 6
80% 128 1130 7.957 3 3 8
75% 128 1081 8.353 3 3 8
70% 128 982 9.187 3 3 8
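The columns above can be sanity-checked against one another. The following is a minimal Python sketch, assuming (this is not stated in the source) that throughput is roughly instances × batch size ÷ latency, and that each model's requirement is a simple upper bound on batch size and latency. The `Row` class and both helper functions are hypothetical names introduced only for illustration; the numbers are copied from the first BERT Large row.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Row:
    """One table row (hypothetical container, not part of any library)."""
    throughput: float          # reported max throughput, samples/sec
    latency_ms: float          # reported latency, milliseconds
    batch_size: int
    instances: int
    cores_per_instance: int


def estimated_throughput(row: Row) -> float:
    """Rough estimate: every instance finishes one batch per latency period.

    Assumption: instances run concurrently and independently, so the
    reported figure should land near this value, not match it exactly.
    """
    return row.instances * row.batch_size / (row.latency_ms / 1000.0)


def meets_requirement(row: Row, max_batch: int, max_latency_ms: float) -> bool:
    """Check a requirement of the form BatchSize <= B && Latency <= L ms."""
    return row.batch_size <= max_batch and row.latency_ms <= max_latency_ms


# BERT Large, 90% sparsity, sequence length 16 (values taken from the table).
bert_large_90_seq16 = Row(2652, 36.513, 8, 12, 2)

estimate = estimated_throughput(bert_large_90_seq16)
relative_gap = abs(estimate - bert_large_90_seq16.throughput) / bert_large_90_seq16.throughput

print(f"estimated throughput: {estimate:.0f} samples/sec")
print(f"relative gap to reported value: {relative_gap:.1%}")
print("meets BatchSize <= 8 && Latency <= 50ms:",
      meets_requirement(bert_large_90_seq16, max_batch=8, max_latency_ms=50.0))
print("cores in use:",
      bert_large_90_seq16.instances * bert_large_90_seq16.cores_per_instance)
```

For this row the estimate falls within a few percent of the reported figure, and the instance count times cores per instance equals the 24 cores per socket listed in the platform configuration.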

Platform Configuration

Manufacturer Amazon EC2
Product Name c6i.12xlarge
BIOS Version 1
OS Ubuntu 20.04.3 LTS
Kernel 5.15.0-1021-aws
Microcode 0xd000331
IRQ Balance Disabled
CPU Model Intel(R) Xeon(R) Platinum 8375C CPU @ 2.90GHz
Base Frequency 2.9GHz
Maximum Frequency 3.9GHz
All-core Maximum Frequency 3.5GHz
CPU(s) 48
Thread(s) per Core N/A
Core(s) per Socket 24
Socket(s) 1
NUMA Node(s) 1
Turbo Enabled
Frequency Governor Default
Max C-State 9
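Every row in the performance table uses an instance count and a cores-per-instance value whose product is 24, matching the Core(s) per Socket entry above. As a rough illustration of how such a split could be laid out, the sketch below partitions the cores into contiguous groups, one group per instance. The function name `core_ranges` and the 0-based contiguous core numbering are assumptions made for this example; the source does not describe how instances were actually pinned, and the real core-to-CPU mapping depends on the operating system.

```python
from typing import List


def core_ranges(instances: int, cores_per_instance: int,
                total_cores: int = 24) -> List[List[int]]:
    """Split cores 0..total_cores-1 into contiguous per-instance groups.

    Hypothetical helper: assumes physical cores are numbered 0..23 and
    that each instance gets its own non-overlapping block of cores.
    """
    if instances < 1 or cores_per_instance < 1:
        raise ValueError("instances and cores_per_instance must be positive")
    if instances * cores_per_instance > total_cores:
        raise ValueError("requested layout needs more cores than available")
    return [
        list(range(i * cores_per_instance, (i + 1) * cores_per_instance))
        for i in range(instances)
    ]


# Layout used by many rows in the table: 12 instances with 2 cores each.
for index, cores in enumerate(core_ranges(12, 2)):
    print(f"instance {index}: cores {cores}")
```

A list such as `[0, 1]` could then be passed to whatever CPU-affinity mechanism the deployment uses; that choice is left open here.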