Skip to content

yzhaiustc/Optimizing-SGEMV-on-NVIDIA-GPUs

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

5 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Simple-SGEMV-on-GPU

An implementation of SGEMV with performance comparable to cuBLAS.

Sample run on Nvidia RTX 2080 Super.

Sample run1 (testing mysgemv):

./sgemv 20480 20480 1 
m = 20480, n = 20480.
Testing my sgemv.
Start the sanity check...
Sanity check passed. Start performance benchmarking...
Average elasped time: 0.003594 second, performance: 233.399858 GFLOPS.

Sample run2 (testing cublasSgemv):

./sgemv 20480 20480 2
m = 20480, n = 20480.
Testing cuBLAS SGEMV.
Average elasped time: 0.003627 second, performance: 231.293983 GFLOPS.

Full data can be found here.

About

An implementation of SGEMV with performance comparable to cuBLAS.

Topics

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published