update vkpeak 20230812 #284

nihui · 2023-08-05T11:43:17Z

https://github.com/nihui/vkpeak/releases/tag/20230812

Adds an fp16-matrix value for all devices that support VK_KHR_cooperative_matrix, such as the RTX 20 series and newer, and RDNA3.
It reflects the compute throughput of the Tensor Cores or a similar AI engine on the device.
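
To make the unit concrete: a matrix multiply-accumulate D = A×B + C on an M×N×K tile performs 2·M·N·K floating-point operations (one multiply and one add per term). The sketch below converts an operation count and an elapsed time into GFLOPS. The tile size, operation count, and timing are hypothetical values chosen for illustration, and this is not necessarily how vkpeak itself counts operations.

```python
def mma_flops(m: int, n: int, k: int) -> int:
    """FLOPs in one M x N x K multiply-accumulate: one mul + one add per term."""
    return 2 * m * n * k


def gflops(total_flops: float, seconds: float) -> float:
    """Convert a FLOP count and an elapsed time into GFLOPS (1e9 FLOP/s)."""
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    return total_flops / seconds / 1e9


# Hypothetical numbers for illustration only (not measured values).
tile_flops = mma_flops(16, 16, 16)   # FLOPs for a single 16x16x16 tile operation
total = tile_flops * 1_000_000       # assume one million tile operations
print(f"{gflops(total, 0.001):.2f} GFLOPS")  # assume they took 1 ms in total
```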

At the moment, all NVIDIA Turing and newer devices are known to work.
RDNA3 devices work with the latest Windows driver (130+ TFLOPS measured on my 7900 XTX graphics card).

In the future, the Linux Mesa driver will follow, bringing this extension to Intel and other devices.

Sample output on an NVIDIA T4:

[action@VM-116-181-centos build]$ ./vkpeak 0
device       = GRID T4-8C

fp32-scalar  = 3823.95 GFLOPS
fp32-vec4    = 3796.63 GFLOPS

fp16-scalar  = 3599.11 GFLOPS
fp16-vec4    = 7203.46 GFLOPS
fp16-matrix  = 29188.25 GFLOPS

fp64-scalar  = 127.15 GFLOPS
fp64-vec4    = 127.13 GFLOPS

int32-scalar = 3667.11 GIOPS
int32-vec4   = 3741.25 GIOPS

int16-scalar = 3707.29 GIOPS
int16-vec4   = 3797.13 GIOPS
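
For readers who want to compare the numbers above, here is a minimal sketch that parses this output format and computes how fp16-matrix relates to the vector results. The `parse_vkpeak` helper and its regular expression are assumptions written for this illustration; they are not part of vkpeak. On the T4 sample, fp16-matrix comes out at roughly 4x fp16-vec4 and roughly 7.7x fp32-vec4.

```python
import re

# The sample output from this post, copied verbatim.
SAMPLE = """\
device       = GRID T4-8C

fp32-scalar  = 3823.95 GFLOPS
fp32-vec4    = 3796.63 GFLOPS

fp16-scalar  = 3599.11 GFLOPS
fp16-vec4    = 7203.46 GFLOPS
fp16-matrix  = 29188.25 GFLOPS

fp64-scalar  = 127.15 GFLOPS
fp64-vec4    = 127.13 GFLOPS

int32-scalar = 3667.11 GIOPS
int32-vec4   = 3741.25 GIOPS

int16-scalar = 3707.29 GIOPS
int16-vec4   = 3797.13 GIOPS
"""

# Matches lines such as "fp16-matrix  = 29188.25 GFLOPS".
LINE_RE = re.compile(r"^(\S+)\s*=\s*([\d.]+)\s+(GFLOPS|GIOPS)$")


def parse_vkpeak(text: str) -> dict[str, float]:
    """Hypothetical helper: map each metric name to its numeric value."""
    results = {}
    for line in text.splitlines():
        match = LINE_RE.match(line.strip())
        if match:  # the "device = ..." line and blank lines are skipped
            results[match.group(1)] = float(match.group(2))
    return results


results = parse_vkpeak(SAMPLE)
ratio_fp16 = results["fp16-matrix"] / results["fp16-vec4"]
ratio_fp32 = results["fp16-matrix"] / results["fp32-vec4"]
print(f"fp16-matrix vs fp16-vec4: {ratio_fp16:.2f}x")
print(f"fp16-matrix vs fp32-vec4: {ratio_fp32:.2f}x")
```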


nihui changed the title ~~update vkpeak 20230805~~ update vkpeak 20230812 Aug 12, 2023
