CUDA 11 error (invalid resource handle) after destroying FFT plan & using a new one #308

Open
vincefn opened this issue Dec 4, 2020 · 3 comments

vincefn commented Dec 4, 2020

Problem

I have found an issue when using CUDA 11.1: creating an FFT plan, using it, performing another operation (a simple sum reduction), then deleting the plan, creating a new one and repeating the same steps ends with a cuFuncSetBlockShape failed: invalid resource handle error.

The following minimal example reproduces the issue (it needs to be run in a fresh session for reproducibility):

import numpy as np
import pycuda.gpuarray as cua
import pycuda.autoinit
import skcuda.fft as cu_fft

fft_shape = (128, 128)

plan = cu_fft.Plan(fft_shape, np.complex64, np.complex64, batch=1)
a = cua.to_gpu(np.random.uniform(0, 1, fft_shape).astype(np.complex64))
cu_fft.fft(a, a, plan)  # in-place complex-to-complex FFT
tmp = cua.sum(a)        # sum reduction - works

del plan                # destroy the first plan

plan = cu_fft.Plan(fft_shape, np.complex64, np.complex64, batch=1)
cu_fft.fft(a, a, plan)  # still works
tmp = cua.sum(a)        # fails: cuFuncSetBlockShape failed: invalid resource handle

Running the above code in a fresh Python session always ends up with the following error:

---> 17 tmp = cua.sum(a)

~/dev/py38-env/lib/python3.8/site-packages/pycuda/gpuarray.py in sum(a, dtype, stream, allocator)
   1639     from pycuda.reduction import get_sum_kernel
   1640     krnl = get_sum_kernel(dtype, a.dtype)
-> 1641     return krnl(a, stream=stream, allocator=allocator)
   1642
   1643
~/dev/py38-env/lib/python3.8/site-packages/pycuda/reduction.py in __call__(self, *args, **kwargs)
    283
    284             # print block_count, seq_count, self.block_size, sz
--> 285             f((block_count, 1), (self.block_size, 1, 1), stream,
    286                     *([result.gpudata]+invocation_args+[seq_count, sz]),
    287                     **kwargs)

~/dev/py38-env/lib/python3.8/site-packages/pycuda/driver.py in function_prepared_async_call(func, grid, block, stream, *args, **kwargs)
    547     def function_prepared_async_call(func, grid, block, stream, *args, **kwargs):
    548         if isinstance(block, tuple):
--> 549             func._set_block_shape(*block)
    550         else:
    551             from warnings import warn

LogicError: cuFuncSetBlockShape failed: invalid resource handle

The error occurs during the pycuda sum reduction, but it seems to be triggered by deleting the plan and re-creating another one, so it may be due to cuFFT.
I noted that the CUDA 11.1 release notes indicate: "After successfully creating a plan, cuFFT now enforces a lock on the cufftHandle. Subsequent calls to any planning function with the same cufftHandle will fail", but I have no idea whether that can be related.
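
For what it's worth, a slightly expanded version of the repro script above (same calls, only extra print statements; to be run in a fresh session) could help narrow down whether deleting the first plan alone, or only creating the second plan, invalidates the reduction kernel launch:

import numpy as np
import pycuda.gpuarray as cua
import pycuda.autoinit
import skcuda.fft as cu_fft

fft_shape = (128, 128)
a = cua.to_gpu(np.random.uniform(0, 1, fft_shape).astype(np.complex64))

plan = cu_fft.Plan(fft_shape, np.complex64, np.complex64, batch=1)
cu_fft.fft(a, a, plan)
print("sum after first plan:", cua.sum(a).get())    # works

del plan
print("sum after del plan:", cua.sum(a).get())      # does deleting the plan alone break it?

plan = cu_fft.Plan(fft_shape, np.complex64, np.complex64, batch=1)
print("sum after second plan:", cua.sum(a).get())   # or only once a new plan exists?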

Environment


  • OS platform: Linux (tested on power64/Debian 10, and also on fresh x86_64 cloud machines from vast.ai based on https://hub.docker.com/r/nvidia/cuda/ images, e.g. nvidia/cuda:11.1-devel or nvidia/cuda:11.0-devel)
  • Python version: 3.8 (the issue probably does not depend on it)
  • CUDA version: 11.0 (with driver 455.45.01) , 11.1 (with driver 450.80.02, 455.23.05 or 455.38)
  • PyCUDA version: pycuda.VERSION = (2020, 1)
  • scikit-cuda version: latest git 806ee27 (0.53 pip-installed also has the issue)

vincefn commented Jan 3, 2021

I also tested this under Windows 10 with CUDA 11.2, and the issue is reproduced with the above code snippet.

The CUDA 11.2 release notes list among the known issues: "cuFFT planning and plan estimation functions may not restore correct context affecting CUDA driver API applications".
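
If the problem really is that cuFFT planning leaves a different context current, one idea that might be worth testing (an untested sketch, not a confirmed fix) is to make the context created by pycuda.autoinit current again after creating or destroying a plan, before launching further pycuda kernels:

import pycuda.autoinit
import pycuda.driver as cuda

# after cu_fft.Plan(...) or del plan, re-activate pycuda's context
pycuda.autoinit.context.push()
try:
    tmp = cua.sum(a)    # the reduction kernel should now run in the expected context
finally:
    cuda.Context.pop()  # restore the previous context stack state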


vincefn commented Jan 5, 2022

Under Linux with the CUDA toolkit 11.5 installed in a conda environment (cufftGetVersion() reports 106000; driver 460.91.03), the issue is still present, even though the CUDA release notes no longer mention it (?)...

@dimitsev

Related? #330
