cublasgemm-benchmark

A simple, repeatable benchmark for validating GPU performance using cuBLAS matrix multiplication.

How to run

Make sure your CUDA toolkit is set up (nvcc on $PATH, shared libraries on $LD_LIBRARY_PATH, headers on $CPATH). Then run the following command to start the test:

$ ./run.sh

The code computes C = alpha*A*B + beta*C with square matrices A, B and C and repeats the computation 2 times (adjustable: more repetitions give a more stable result).
The sizes of A, B and C go up to (16384, 16384) in the default test (also adjustable to fit your GPU memory size).
By default the benchmark targets the GeForce GTX TITAN BLACK (sm_35) (adjustable) and tests cublasSgemm (cublasHgemm can be used instead on Pascal GPUs).

Uncomment line 11 in gemm.cu and line 4 in run.sh to test float16 matrix multiplication (cublasHgemm) on a Tesla P100 GPU. This requires CUDA 8.0.

Example Testing Result

An example test result can be found here.

The "pstate" ranges from P0 to P12, where P0 is the maximum performance state and P12 is the minimum performance state.
