yzhaiustc

Yujia Zhai yzhaiustc

130 followers · 18 following

@NVIDIA
Santa Clara, California
https://yzhaiustc.github.io/

Pinned

Optimizing-SGEMM-on-NVIDIA-Turing-GPUs (Public)

Optimizing SGEMM kernel functions on NVIDIA GPUs to near-cuBLAS performance.

Cuda · 236 stars · 41 forks

Optimizing-DGEMM-on-Intel-CPUs-with-AVX512F (Public)

Stepwise optimization of DGEMM on CPU, eventually outperforming Intel MKL, even with multithreading.

C · 90 stars · 18 forks

Optimizing-SGEMV-on-NVIDIA-GPUs (Public)

An implementation of SGEMV with performance comparable to cuBLAS.

Cuda · 7 stars · 6 forks

Optimizing-DGEMV-on-Intel-CPUs (Public)

Highly optimized DGEMV on CPU with both serial and parallel performance better than MKL and OpenBLAS.

C · 3 stars · 1 fork

NVIDIA/cutlass (Public)

CUDA Templates for Linear Algebra Subroutines

C++ · 4.6k stars · 807 forks