Hi, I am using benchdnn from oneDNN v2.6.3 and from oneDNN v3.4.1 to see how performance has improved for specific M, N, K dimensions with the AVX512_VNNI (int8) kernels.
The dimensions come from the LLM variants used to generate tokens. The table lists the M, N, K dimensions for one LLM variant at input length 1024, followed by the execution time with v3.4.1 and v2.6.3.
As the table shows, the two versions perform on par. Why are we not seeing an improvement? Are there specific tweaks needed to observe the kernel enhancements?
Sample benchdnn commands I am using for this activity:
v2.6.3 -->
./benchdnn --matmul --mode=p --cfg=u8s8s8 --stag=ab --wtag=any --dtag=ab --fix-times-per-prb=200 --perf-template=%prb%,%-time%,%+time%,%0time%,%-Gflops%,%+Gflops%,%0Gflops%,%bw% 4012x4096:4096x16384
v3.4.1 -->
./benchdnn --matmul --mode=p --dt=u8:s8:s8 --stag=ab --wtag=any --dtag=ab --fix-times-per-prb=200 --perf-template=%prb%,%-time%,%+time%,%0time%,%-Gflops%,%+Gflops%,%0Gflops%,%bw% 4012x4096:4096x16384
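To make the difference between the two invocations explicit, here is a minimal Python sketch that builds both command lines from one shared flag list. It only assembles argument lists and does not run anything. The helper name build_cmd and the version-keyed table are hypothetical; the flags themselves are copied from the commands above, and the MxK:KxN problem layout is inferred from the 4012x4096:4096x16384 example.

```python
# Data-type flag per version, exactly as used in the two commands above.
# Keyed by the two versions from the post only; no claim is made about
# which release changed the flag.
DATA_TYPE_FLAG = {
    "v2.6.3": "--cfg=u8s8s8",
    "v3.4.1": "--dt=u8:s8:s8",
}

PERF_TEMPLATE = (
    "--perf-template=%prb%,%-time%,%+time%,%0time%,"
    "%-Gflops%,%+Gflops%,%0Gflops%,%bw%"
)


def build_cmd(version, m, n, k, benchdnn="./benchdnn"):
    """Return the benchdnn argument list for one matmul problem.

    The problem descriptor is written as MxK:KxN, matching the
    4012x4096:4096x16384 example in the post.
    """
    problem = f"{m}x{k}:{k}x{n}"
    return [
        benchdnn,
        "--matmul",
        "--mode=p",
        DATA_TYPE_FLAG[version],
        "--stag=ab",
        "--wtag=any",
        "--dtag=ab",
        "--fix-times-per-prb=200",
        PERF_TEMPLATE,
        problem,
    ]


if __name__ == "__main__":
    for version in ("v2.6.3", "v3.4.1"):
        print(version, "-->", " ".join(build_cmd(version, 4012, 16384, 4096)))
```

Passing the list to subprocess.run from the matching build directory would launch the benchmark; apart from the data-type flag, the two argument lists are identical.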
Kernel triggered: brg:avx512_core_vnni for both versions.
I have also noticed a difference in the weight tag between the versions: with v2.6.3 we see wei_s8::blocked:BA16a64b4a::f0, and with v3.4.1 we see wei_s8:a:blocked:BA16a64b4a::f0. What is the difference between these two tags?
I am trying to replicate your numbers with the oneDNN versions you mentioned.
The results are the following:
M,N,K = 4012,16384,4096; ratio v3.4.1 over v2.6.3 is 1.16x
M,N,K = 4012,4096,4096; ratio v3.4.1 over v2.6.3 is 1.14x
M,N,K = 4012,130528,4096; ratio v3.4.1 over v2.6.3 is 1.1x
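For anyone comparing their own runs, the sketch below shows one way to turn two measured times into a speedup ratio and a throughput figure. It assumes the conventional 2*M*N*K operation count for a matmul and reads "ratio v3.4.1 over v2.6.3" as old time divided by new time; both are assumptions, and the function names and the timings in the example are hypothetical, not measured values.

```python
def matmul_ops(m, n, k):
    """Operation count for an MxK by KxN matmul (one multiply and one add per term)."""
    return 2 * m * n * k


def gflops(m, n, k, time_ms):
    """Throughput in Gflop/s for a single execution that took time_ms milliseconds."""
    return matmul_ops(m, n, k) / (time_ms * 1e-3) / 1e9


def speedup(old_time_ms, new_time_ms):
    """How many times faster the new run is; values above 1.0 mean an improvement."""
    return old_time_ms / new_time_ms


if __name__ == "__main__":
    # Hypothetical timings in milliseconds, for illustration only.
    t_old, t_new = 116.0, 100.0
    m, n, k = 4012, 16384, 4096
    print(f"speedup: {speedup(t_old, t_new):.2f}x")
    print(f"old: {gflops(m, n, k, t_old):.1f} Gflop/s")
    print(f"new: {gflops(m, n, k, t_new):.1f} Gflop/s")
```

Feeding in the %-time% (minimum time) column from each version's output gives a like-for-like comparison, provided both runs use the same machine and thread count.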
For v3.4.1, I am using the latest oneAPI compiler, 2024.1, and the latest TBB, 2021.12.
For v2.6.3, I am using oneAPI compiler 2021.3 and TBB 2021.3.
Since you have not shared your system details (number of cores, etc.), we can't compare the absolute numbers directly, but comparing the two oneDNN versions on my side, v3.4.1 is faster than v2.6.3. I am using a bare-metal 3rd Gen Intel Xeon processor.
Please share your compiler and TBB versions and your machine details.
Regarding the weight tag: in oneDNN v3.4.1, the "a" field indicates that the memory descriptor was created with format kind any.
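To see exactly where the two weight tags differ, here is a small Python sketch that splits both strings on ":" and reports the positions that do not match. The helper name diff_fields is hypothetical, and the sketch does not attach names to the fields; it only compares them position by position.

```python
def diff_fields(tag_a, tag_b, sep=":"):
    """Return (index, field_a, field_b) for every position where the tags differ."""
    fields_a = tag_a.split(sep)
    fields_b = tag_b.split(sep)
    if len(fields_a) != len(fields_b):
        raise ValueError("tags have a different number of fields")
    return [
        (i, fa, fb)
        for i, (fa, fb) in enumerate(zip(fields_a, fields_b))
        if fa != fb
    ]


if __name__ == "__main__":
    v263 = "wei_s8::blocked:BA16a64b4a::f0"   # tag reported with v2.6.3
    v341 = "wei_s8:a:blocked:BA16a64b4a::f0"  # tag reported with v3.4.1
    for index, old, new in diff_fields(v263, v341):
        print(f"field {index}: {old!r} -> {new!r}")
```

The only differing position is the second field, which is empty in the v2.6.3 string and "a" in the v3.4.1 string. The blocked layout BA16a64b4a is the same in both.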