
no kernel image is available for execution on the device #806

Open
liu1352183717 opened this issue Apr 24, 2024 · 10 comments

@liu1352183717

[ctranslate2] [thread 10436] [warning] The compute type inferred from the saved model is float16, but the target device or backend do not support efficient float16 computation. The model weights have been automatically converted to use the float32 compute type instead.
Detected language 'zh' with probability 1.000000
Traceback (most recent call last):
File "C:\Users\Administrator\Desktop\pythonProject\main.py", line 23, in <module>
for segment in segments:
File "C:\Users\Administrator\.conda\envs\faster_whisper\lib\site-packages\faster_whisper\transcribe.py", line 1106, in restore_speech_timestamps
for segment in segments:
File "C:\Users\Administrator\.conda\envs\faster_whisper\lib\site-packages\faster_whisper\transcribe.py", line 511, in generate_segments
encoder_output = self.encode(segment)
File "C:\Users\Administrator\.conda\envs\faster_whisper\lib\site-packages\faster_whisper\transcribe.py", line 762, in encode
return self.model.encode(features, to_cpu=to_cpu)
RuntimeError: parallel_for failed: cudaErrorNoKernelImageForDevice: no kernel image is available for execution on the device
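
For reference, a minimal diagnostic sketch (assuming the pip-installed ctranslate2 from this environment is importable) that reports what the installed CTranslate2 build supports on this GPU:

# Hedged diagnostic sketch: check the installed CTranslate2 build against the local GPU.
# Assumes ctranslate2 is the pip package installed alongside faster-whisper.
import ctranslate2

print("CTranslate2 version:", ctranslate2.__version__)
print("CUDA devices visible:", ctranslate2.get_cuda_device_count())
# If this raises or only lists CPU-friendly types, the wheel's CUDA kernels
# likely do not cover this GPU's compute capability.
print("Supported CUDA compute types:", ctranslate2.get_supported_compute_types("cuda"))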

@Purfview
Contributor

Purfview commented Apr 24, 2024

What is your GPU model?

@liu1352183717
Author

GeForce GTX 950

@LeonVeganMan

Hi,

Same problem here with a Tesla M40, CUDA 12.4.

Thanks.
Ciao.
L.

@laraws

laraws commented May 10, 2024

Same issue; my GPU is a GTX 950M.

@seclog

seclog commented May 12, 2024

I had the same problem and solved it. The version of CTranslate2 bundled with faster-whisper is too new and no longer supports older CUDA GPUs, such as compute capability 5.0 cards. Either replace the ctranslate2.dll in the faster-whisper installation directory with the one from CTranslate2 3.24, or download the CTranslate2 4.2 source, compile a DLL for your native CUDA architecture, and use it to replace ctranslate2.dll in the faster-whisper installation directory.
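
A quick way to confirm which CTranslate2 build is actually in use before swapping any files (a sketch, assuming it was installed via pip alongside faster-whisper):

# Sketch, not an official fix: report the CTranslate2 version in this environment.
# Per this thread, 4.x builds no longer ship kernels for compute capability 5.0 cards,
# so downgrading (e.g. pip install ctranslate2==3.24.0) is one workaround to try.
import ctranslate2
print(ctranslate2.__version__)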

@laraws

laraws commented May 13, 2024

Thanks for your reply. I tried it, but it didn't work.

@seclog

seclog commented May 16, 2024

The GTX 950M has compute capability 5.0, so you have the same class of hardware as me. Attached is my own ctranslate2.dll, compiled on Windows 11 against cuDNN 9; you can try replacing the ctranslate2.dll in the faster-whisper script directory with it. Make sure you install a recent graphics driver, CUDA Toolkit 12, and cuDNN 9 (which supports CUDA 12), otherwise the other DLL dependencies will fail to load. This DLL may not work on other cards because it was built only for compute capability 5.0, so the safest option is still to fall back to the CTranslate2 3.24 DLL.
ctranslate2_cuda5.0.zip

(screenshots attached: fastwhisper2024-05-17 044350, fastwhisper2024-05-17 2044350)

@laraws

laraws commented May 17, 2024

Thank you for your reply. My system is Ubuntu; can I still use the "ctranslate2.dll"?

@risacher

risacher commented May 25, 2024

@laraws No, a .dll is a Windows shared library and will not work on Ubuntu. I have also been trying to get faster-whisper to work on my Ubuntu 22.04 server with a GTX 750 Ti (which is compute capability 5.0, like others in this thread.) As noted above, the version of CTranslate2 bundled with faster-whisper does not support this, so I recompiled it from source.

This was slightly tricky as I wanted to be able to run on both CPU and GPU. CPU support requires Intel MKL, and the 22.04 package for that is missing a pkg-config file.

So, for reference of anyone else trying this, I configured CTranslate2 with: cmake .. -DWITH_MKL=ON -DWITH_CUDA=ON -DWITH_CUDNN=ON -DMKL_ROOT=/usr -DMKL_INCLUDE_DIR=/usr/include/mkl. In order for cmake to auto-detect the compute-capability, cmake must be run on the machine with the GPU and the NVIDIA driver and driver API must be installed and match. If you are compiling on a different machine you should be able to add -DCMAKE_CUDA_ARCHITECTURES=50 to force it to use compute capability 5.0.

Be sure to add /usr/local/lib to $LD_LIBRARY_PATH and /usr/local/bin to $PATH, and delete the ctranslate2.so files installed in ~/.local/lib/python3.10/site-packages/. (I think they were in site-packages/ctranslate2.libs/ or something like that). I think to compile CTranslate2 I also had to have CUDA installed and nvcc must be in the PATH, so I needed to add /usr/local/cuda/bin to PATH and /usr/local/cuda/lib64 to LD_LIBRARY_PATH.
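
For what it's worth, a small sanity check (my own sketch, assuming the rebuilt library is installed under /usr/local/lib) to confirm the stale bundled copy is no longer the one Python picks up:

# Hypothetical sanity check: confirm which ctranslate2 package Python loads
# and that LD_LIBRARY_PATH points at the rebuilt shared library.
import os
import ctranslate2

print("ctranslate2 package location:", ctranslate2.__file__)
print("LD_LIBRARY_PATH:", os.environ.get("LD_LIBRARY_PATH", "(not set)"))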

All that said, when I tried to run the sample code on the GPU with a 30-second sample, it segfaulted, so I'd be interested if anyone else gets it to work. It works on the CPU, but too slow for my application.

@risacher

I also note that CTranslate2 apparently thinks the GTX 750 Ti only really supports float32, which I determined by running:

import ctranslate2
t = ctranslate2.get_supported_compute_types("cuda")
print(f"get_supported_compute_types(\"cuda\"): {t}")

As a result I requantized the model to float32 like this:

ct2-transformers-converter --model openai/whisper-small.en --output_dir f32-whisper-small.en --copy_files tokenizer.json --quantization float32
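
If it helps anyone, a minimal sketch (assuming the converted directory above is passed to faster-whisper as a local model path) of loading that float32 model:

# Sketch: load the locally converted float32 model and force float32 on the GPU.
from faster_whisper import WhisperModel

model = WhisperModel("f32-whisper-small.en", device="cuda", compute_type="float32")
segments, info = model.transcribe("sample.wav")  # "sample.wav" is a placeholder audio file
for segment in segments:
    print(segment.start, segment.end, segment.text)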
