
Add suport for int8 matmul in afcuda #1656

Open
WilliamTambellini opened this issue Nov 30, 2016 · 20 comments · May be fixed by #3508
Comments

@WilliamTambellini
Contributor

I see these new features in cuda 8:
"Native FP16 and INT8 computation for deep learning and other workloads;" :
https://devblogs.nvidia.com/parallelforall/cuda-8-features-revealed/
This feature request is to be able to use it through AF at least for matmul and arithmetic (+ - * /)
Thanks
WT.

@arcfide

arcfide commented Nov 30, 2016

I would also like to mention my own interest in having general support for int8 as a feature.

@pavanky pavanky added this to the v3.5.0 milestone Nov 30, 2016
@umar456
Member

umar456 commented Nov 30, 2016

This is a great idea, but we need to be careful to up/down-convert on certain hardware when adding this feature. For example, fp16 performance on compute_61 (non-Tesla Pascal) cards is absolutely abysmal. We will also need to support this on older hardware. I wonder how CUDA provides fallbacks in those cases.
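For illustration, a minimal sketch (plain CUDA runtime calls, not ArrayFire internals) of how a backend could detect whether the device has native int8 dot-product support (DP4A, compute capability 6.1+) and otherwise fall back to up-converted compute:

```cpp
// Hedged sketch: gate the native int8 path on device compute capability.
// DP4A (4-way int8 dot product with int32 accumulate) is available on sm_61 and newer.
#include <cuda_runtime.h>
#include <cstdio>

int main() {
    int dev = 0;
    cudaDeviceProp prop;
    cudaGetDeviceProperties(&prop, dev);

    bool has_dp4a = (prop.major > 6) || (prop.major == 6 && prop.minor >= 1);
    std::printf("sm_%d%d: %s\n", prop.major, prop.minor,
                has_dp4a ? "native int8 dot product (DP4A) available"
                         : "up-convert and fall back to int32/fp32 compute");
    return 0;
}
```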

@pavanky
Member

pavanky commented Nov 30, 2016

AFAIK, there is no support for half-precision floating-point numbers in the C and C++ standards, so fp16 is going to be a bit tricky to support in a general-purpose manner.

int8 can be supported easily (although it is a bit tedious).
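As a side note on why the two differ: int8 already has a standard fixed-width type, while fp16 has no standard C++ type (as of C++14/17) and needs a library or compiler extension. A tiny illustration (plain C++, not ArrayFire code):

```cpp
// int8 is covered by the standard <cstdint> fixed-width types; fp16 is not.
#include <cstdint>
#include <type_traits>

static_assert(sizeof(std::int8_t) == 1, "int8_t is a 1-byte signed integer");
static_assert(std::is_signed<std::int8_t>::value, "int8_t is signed");
// No std::float16_t exists here; fp16 requires a library type (e.g. the 'half' library)
// or vendor intrinsics such as CUDA's __half, which is why fp16 is the harder of the two.

int main() { return 0; }
```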

@umar456
Member

umar456 commented Nov 30, 2016

Found the half library. It seems to be well documented. Probably need to run some tests on performance and compatibility with native types.
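For reference, a minimal sketch of what using that standalone half library could look like (header and type names assumed from its documentation; not ArrayFire code):

```cpp
// Hedged sketch using the header-only "half" library (half.hpp, namespace half_float).
#include "half.hpp"
#include <iostream>

int main() {
    using half_float::half;
    half a(1.5f), b(2.25f);                       // constructed from float, stored in 16 bits
    half c = a * b;                               // overloaded operators, result rounded to half
    std::cout << static_cast<float>(c) << "\n";   // convert back to float for printing
    return 0;
}
```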

@pavanky
Member

pavanky commented Nov 30, 2016

@umar456 Does Boost have anything similar?

@WilliamTambellini
Contributor Author

Would int8 be easier to implement than fp16, especially the int8 path of the CUDA 8 Pascal GPU backend?

@pavanky
Member

pavanky commented Nov 30, 2016

@WilliamTambellini Yes, int8 support is much easier to accomplish.

@WilliamTambellini
Contributor Author

OK, so we had better split this feature/issue in two: int8 and fp16.
Would you mind?

@shehzan10
Member

The issue with int8 (char) is that we already use it for b8. That is a distinction we will have to keep in mind.

@pavanky
Member

pavanky commented Nov 30, 2016

@shehzan10 int8_t is a different datatype that can be used for i8.
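A small illustration of that distinction (plain C++, not ArrayFire code):

```cpp
// int8_t is a signed integer type (signed char on common platforms) and is a distinct
// type from plain char, which b8 is currently backed by, so the two can coexist.
#include <cstdint>
#include <type_traits>

static_assert(!std::is_same<std::int8_t, char>::value,
              "int8_t is not the same type as plain char");
static_assert(sizeof(std::int8_t) == sizeof(char), "both are 1 byte wide");

int main() { return 0; }
```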

@WilliamTambellini WilliamTambellini changed the title Add suport for int8 and/or fp16 Add suport for int8 Nov 30, 2016
@WilliamTambellini
Contributor Author

OK, thanks.
WARNING: I have renamed this feature request to "int8" in order to limit its scope to int8 support. Anyone interested in fp16 should create another GitHub issue/ticket.
Cheers
W.

@pavanky pavanky mentioned this issue Dec 14, 2016
@WilliamTambellini
Contributor Author

I confirm our interest in int8, mainly via CUDA.
Could anyone summarize which parts of ArrayFire would need to be modified in order to take advantage of INT8 hardware acceleration?
Cheers

@pavanky
Member

pavanky commented Feb 3, 2017

@WilliamTambellini This involves changing a lot of files across the 3 backends. It is not entirely straightforward.

@arcfide

arcfide commented Feb 4, 2017

I'd just like to add my vote for int8 support.

@mlloreda mlloreda modified the milestones: v3.5.1, v3.5.0 May 22, 2017
@pavanky pavanky modified the milestones: v3.6.0, v3.5.1 Jun 16, 2017
@WilliamTambellini
Contributor Author

Seen here:
https://devblogs.nvidia.com/parallelforall/mixed-precision-programming-cuda-8/
"cuBLAS is a GPU library for dense linear algebra— an implementation of BLAS, the Basic Linear Algebra Subroutines. cuBLAS has support for mixed precision in several matrix-matrix multiplication routines. cublasHgemm is a FP16 dense matrix-matrix multiply routine that uses FP16 for compute as well as for input and output. cublasSgemmEx() computes in FP32, but the input data can be FP32, FP16, or INT8, and the output can be FP32 or FP16. cublasGemm() is a new routine in CUDA 8 that allows specification of the computation precision, including INT8 computation (which uses DP4A)."

@umar456 Would it be possible to do a minimal implementation in AF that calls cublasGemm() when the af::array datatype is int8 or int16?
Thanks

@mlloreda mlloreda modified the milestones: v3.6.0, v3.7.0 Mar 1, 2018
@WilliamTambellini
Contributor Author

WilliamTambellini commented Jun 19, 2018

Attaching a minimal PoC (seen on the NVIDIA forum) of int8 matmul using cublasGemmEx:
int8cublas.cu.txt
TBC.
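Since the attachment itself is not inlined above, here is a hedged, self-contained sketch (not the attached file) of what an int8 matmul through cublasGemmEx can look like: int8 A/B inputs with int32 accumulation, which maps to DP4A on sm_61+ hardware. Dimensions and leading dimensions are kept multiples of 4 because the int8 path has alignment restrictions; check the cuBLAS docs of your toolkit for the exact rules.

```cpp
// Hedged sketch: C(int32) = A(int8) * B(int8) via cublasGemmEx, column-major, no transposes.
#include <cublas_v2.h>
#include <cuda_runtime.h>
#include <cstdint>
#include <cstdio>
#include <vector>

int main() {
    const int m = 4, n = 4, k = 4;  // keep dims/leading dims multiples of 4 for the int8 path

    std::vector<int8_t>  hA(m * k, 1), hB(k * n, 2);
    std::vector<int32_t> hC(m * n, 0);

    int8_t  *dA = nullptr, *dB = nullptr;
    int32_t *dC = nullptr;
    cudaMalloc(&dA, hA.size() * sizeof(int8_t));
    cudaMalloc(&dB, hB.size() * sizeof(int8_t));
    cudaMalloc(&dC, hC.size() * sizeof(int32_t));
    cudaMemcpy(dA, hA.data(), hA.size() * sizeof(int8_t), cudaMemcpyHostToDevice);
    cudaMemcpy(dB, hB.data(), hB.size() * sizeof(int8_t), cudaMemcpyHostToDevice);

    cublasHandle_t handle;
    cublasCreate(&handle);

    const int32_t alpha = 1, beta = 0;  // alpha/beta are int32 when computing in 32-bit integer
    cublasStatus_t st = cublasGemmEx(handle, CUBLAS_OP_N, CUBLAS_OP_N, m, n, k,
                                     &alpha,
                                     dA, CUDA_R_8I, m,    // int8 inputs
                                     dB, CUDA_R_8I, k,
                                     &beta,
                                     dC, CUDA_R_32I, m,   // int32 output
                                     CUDA_R_32I,          // compute type (CUBLAS_COMPUTE_32I in cuBLAS 11+)
                                     CUBLAS_GEMM_DFALT);  // default algorithm selection

    cudaMemcpy(hC.data(), dC, hC.size() * sizeof(int32_t), cudaMemcpyDeviceToHost);
    std::printf("status=%d, C[0]=%d (expected %d)\n", (int)st, (int)hC[0], 2 * k);

    cublasDestroy(handle);
    cudaFree(dA); cudaFree(dB); cudaFree(dC);
    return 0;
}
```

On pre-sm_61 hardware the call may fail (e.g. CUBLAS_STATUS_NOT_SUPPORTED), which is where the up/down-conversion fallback discussed earlier in the thread would have to kick in.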

@WilliamTambellini WilliamTambellini changed the title Add suport for int8 Add suport for int8 matmul for the afcuda backend Jun 19, 2018
@WilliamTambellini
Contributor Author

@umar456 this one is not needed for 3.7.0 on my side.

@WilliamTambellini
Contributor Author

Could we please remove this one from the 3.7.0 scope?

@umar456 umar456 removed this from the v3.7.0 milestone Dec 18, 2019
@WilliamTambellini
Contributor Author

Any update on this?

@WilliamTambellini WilliamTambellini changed the title Add suport for int8 matmul for the afcuda backend Add suport for int8 matmul for afcuda Aug 29, 2023
@WilliamTambellini WilliamTambellini changed the title Add suport for int8 matmul for afcuda Add suport for int8 matmul in afcuda Aug 29, 2023
@WilliamTambellini
Contributor Author

(image attachment)

@verstatx verstatx linked a pull request Oct 4, 2023 that will close this issue