
Feature request: AMD GPU support with oneDNN AMD support #1072

Open
santhoshtr opened this issue Feb 9, 2023 · 17 comments
Labels: enhancement (New feature or request), help wanted (Extra attention is needed)

Comments

@santhoshtr

Hi, CTranslate2 uses oneDNN. Recent oneDNN versions have support for AMD GPUs; it requires the Intel oneAPI DPC++ compiler. The same approach could potentially enable NVIDIA GPU support too.

It would help with running MT models on AMD GPUs. With ROCm, this would be a fully open-source way to run MT models on GPUs.

Thanks

@guillaumekln added the enhancement and help wanted labels on Feb 9, 2023
@guillaumekln
Collaborator

Hello,

Currently we only use oneDNN for specific operators such as matrix multiplication and convolution, but a full MT model contains many other operators (softmax, layer norm, gather, concat, etc.). Even though some of them are available in oneDNN, it would require quite some work to specialize all operations for AMD GPUs.
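
For illustration, here is a hypothetical sketch of what per-device operator specialization looks like; the identifiers are illustrative, not CTranslate2's actual internals:

```cpp
// Hypothetical sketch of per-device operator dispatch; identifiers are
// illustrative and not CTranslate2's real API.
#include <algorithm>
#include <cmath>

enum class Device { CPU, CUDA, ROCM };

// Each operator needs an implementation per device backend.
template <Device D>
void softmax(const float* x, float* y, int n);

template <>
void softmax<Device::CPU>(const float* x, float* y, int n) {
  float max_val = *std::max_element(x, x + n);
  float sum = 0.f;
  for (int i = 0; i < n; ++i) {
    y[i] = std::exp(x[i] - max_val);  // subtract max for numerical stability
    sum += y[i];
  }
  for (int i = 0; i < n; ++i)
    y[i] /= sum;
}

// A ROCm port would have to supply the equivalent of this for every
// operator a model uses: softmax, layer norm, gather, concat, etc.
// template <> void softmax<Device::ROCM>(const float* x, float* y, int n);
```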

At this time I don't plan to work on this feature, but it would indeed be a nice one to have!

@leuc

leuc commented Apr 4, 2023

I wanted to try faster-whisper on an Intel A770 dGPU (16 GB). Fuller use of oneDNN could also enable support for that hardware.

@towel

towel commented Apr 5, 2023

I'm migrating a transcription component to faster-whisper, and since I'm using an AMD GPU, I'd appreciate ROCm support in faster-whisper even more.

@phineas-pta

@towel did you manage to get faster-whisper working on AMD?

@CristianPi

Any way to run this with an AMD GPU?

@MidnightKittenCat

Any update on this?

@guillaumekln
Collaborator

I still don't plan to work on this at this time, and as far as I know no one else is working on it. I expect complete ROCm support would be quite some work.

@lenhone

lenhone commented Sep 10, 2023

I had a go at converting the existing CUDA code to ROCm a few months ago but could never get it to build, which is not surprising as I have zero C++ or CMakeLists skills.

curand, cublas, cudnn, cuda, and cub appear to map to HIP with minor adjustments, but I could never get the CMakeLists to include Thrust (the version supplied by ROCm), and compilation always halted after producing too many errors.
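
For context, a rough sketch of how the CUDA headers and libraries correspond to their HIP/ROCm counterparts; exact header paths vary across ROCm versions, so treat this as an assumption-laden illustration:

```cpp
// Illustrative CUDA -> HIP correspondence; header locations differ
// between ROCm releases, so the paths here are an approximation.
#ifdef __HIP_PLATFORM_AMD__
#  include <hip/hip_runtime.h>    // cuda_runtime.h -> hip_runtime.h
#  include <hipblas.h>            // cublas_v2.h    -> hipblas.h
#  include <hipcub/hipcub.hpp>    // cub/cub.cuh    -> hipcub.hpp
#else
#  include <cuda_runtime.h>
#  include <cublas_v2.h>
#  include <cub/cub.cuh>
#endif
// Thrust maps to rocThrust, which keeps the thrust:: namespace, so on the
// CMake side the build mainly needs to locate rocThrust's headers rather
// than CUDA's Thrust.
#include <thrust/device_vector.h>
```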

@TheJKM

TheJKM commented Oct 26, 2023

I started trying to port CTranslate2 to ROCm last weekend and decided to share my (non-working) results here. The code is available in the rocm branch of my fork.

Basically, hipify was able to convert most of the code automatically. I added a new CMake config option to enable compiling with ROCm, and so far calling the HIP compiler works; however, it breaks the other options and requires a CMake version new enough to have HIP language support.

Current issues are some CUDA library dependencies I have not looked at yet, and the use of the bfloat16 data type. While the latest ROCm has a drop-in replacement for CUDA bf16 (see ROCm/ROCm#2534), it is currently missing some operators. Therefore, I'm trying to disable bf16 entirely for now, but without luck so far.
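
One possible way to fence off bf16 is a compile-time alias around the drop-in type discussed in ROCm/ROCm#2534; a sketch, assuming the type and header names from that issue:

```cpp
// Hypothetical compile-time switch between the CUDA and ROCm bf16 types.
// __hip_bfloat16 is the drop-in replacement discussed in ROCm/ROCm#2534;
// since some of its operators are still missing, the fallback is to
// compile with bf16 support disabled altogether.
#ifdef __HIP_PLATFORM_AMD__
#  include <hip/hip_bf16.h>
   using bfloat16_t = __hip_bfloat16;
#else
#  include <cuda_bf16.h>
   using bfloat16_t = __nv_bfloat16;
#endif
```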

Right now this work aims only at making it work, not at integrating HIP/ROCm into the (CMake) infrastructure.

In case someone wants to have a look at the code and help with porting, feel free to check out my fork. Unfortunately, I don't expect to have much time in the near future for this project.

@BBC-Esq

BBC-Esq commented Oct 26, 2023

This is awesome, dude. I wish I had programming experience to help with this, but alas I don't. I've been looking for ways to enable GPU acceleration for AMD GPUs using CTranslate2... Let me know if I can help in any way, whether by testing or what have you.

@BBC-Esq

BBC-Esq commented Oct 26, 2023

Have you gotten it to work at all yet?

@MidnightKittenCat

MidnightKittenCat commented Oct 26, 2023

> Have you gotten it to work at all yet?

“I started trying to port CTranslate2 to ROCm last weekend and decided to share my (non-working) results here”

I believe that should answer your question.

@commandline-be

> I started trying to port CTranslate2 to ROCm last weekend and decided to share my (non-working) results here. […] Unfortunately, I don't expect to have much time in the near future for this project.

Thanks for sharing; I'm very interested in the ripple effects this port may have for other projects.

ROCm 6.0 is now available, which I believe addresses specifically what you're referencing.

FYI: https://repo.radeon.com/amdgpu/6.0/ubuntu/dists/jammy/

I've tried all kinds of dumb, uninformed stuff trying to get LibreTranslate to work with ROCm, to no avail. It depends on too recent a CUDA version to be tricked by ROCm. The latest PyTorch + ROCm 5.7 also did not work out well.

https://rocm.docs.amd.com/projects/install-on-linux/en/latest/how-to/3rd-party/pytorch-install.html

@commandline-be

commandline-be commented Jan 3, 2024

So, would it make sense to create an experimental version of CTranslate2 using a more recent oneDNN, which does have AMD GPU support?

from https://github.com/oneapi-src/oneDNN
oneAPI Deep Neural Network Library (oneDNN) is an open-source cross-platform performance library of basic building blocks for deep learning applications. oneDNN is part of oneAPI. The library is optimized for Intel(R) Architecture Processors, Intel Graphics, and Arm* 64-bit Architecture (AArch64)-based processors. oneDNN has experimental support for the following architectures: NVIDIA* GPU, AMD* GPU, OpenPOWER* Power ISA (PPC64), IBMz* (s390x), and RISC-V.

https://github.com/oneapi-src/oneDNN?tab=readme-ov-file#system-requirements
The SYCL runtime with AMD GPU support requires:
- oneAPI DPC++ Compiler with support for HIP AMD
- AMD ROCm, version 5.3 or later
- MIOpen, version 2.18 or later (optional if AMD ROCm includes the required version of MIOpen)
- rocBLAS, version 2.45.0 or later (optional if AMD ROCm includes the required version of rocBLAS)

https://github.com/oneapi-src/oneDNN/blob/main/src/gpu/amd/README.md
Support for the AMD backend is implemented via the SYCL HIP backend. The feature is disabled by default; users must enable it at build time with the CMake option DNNL_GPU_VENDOR=AMD. AMD GPUs can then be used via the oneDNN engine abstraction: the engine should be created using the dnnl::engine::kind::gpu engine kind, or the user can provide a sycl::device object that corresponds to an AMD GPU.
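
As a concrete sketch (using the oneDNN v3 C++ API, and assuming oneDNN was built with -DDNNL_GPU_VENDOR=AMD under the oneAPI DPC++ compiler), creating a GPU engine and running a trivial primitive as a smoke test might look like this:

```cpp
// Minimal smoke test: create a GPU engine and run one eltwise primitive.
// Assumes oneDNN was configured with -DDNNL_GPU_VENDOR=AMD.
#include <dnnl.hpp>
#include <iostream>

int main() {
  // kind::gpu dispatches to the AMD backend in a DNNL_GPU_VENDOR=AMD build.
  dnnl::engine eng(dnnl::engine::kind::gpu, /*index=*/0);
  dnnl::stream strm(eng);

  // A tiny ReLU, just to confirm the backend executes.
  dnnl::memory::desc md({1, 8}, dnnl::memory::data_type::f32,
                        dnnl::memory::format_tag::ab);
  dnnl::memory src(md, eng), dst(md, eng);

  auto relu_pd = dnnl::eltwise_forward::primitive_desc(
      eng, dnnl::prop_kind::forward_inference,
      dnnl::algorithm::eltwise_relu, md, md, /*alpha=*/0.f);
  dnnl::eltwise_forward(relu_pd).execute(
      strm, {{DNNL_ARG_SRC, src}, {DNNL_ARG_DST, dst}});
  strm.wait();
  std::cout << "oneDNN GPU engine ran successfully." << std::endl;
}
```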

@vince62s
Member

vince62s commented Jan 3, 2024

As said in the Feb 2023 comment, "Even though some of them are available in oneDNN, it would require quite some work to specialize all operations for AMD GPUs." Since no one is making those changes, this won't move forward.

@tannisroot

Do I understand correctly that support for Intel discrete GPUs would be covered by this issue as well?

@katzmike

I am not a developer, but I work at AMD and handle developer relationships. We would like to assist with the effort to enable CTranslate2 for AMD dGPUs and iGPUs. We will have engineers investigate, but we may also be able to provide hardware to the lead contributors of this effort. Please contact me via michael dot katz at amd dot com if this would help.

@minhthuc2502 minhthuc2502 pinned this issue Apr 12, 2024