I am trying to set up a dynamic kernel in which a KA (KernelAbstractions.jl) kernel launches a CUDA kernel. The final objective is to have dynamic parallelism using only KernelAbstractions. This is an MWE comparing launching the parent kernel with CUDA and launching it with KA.
The child kernel:
function child!(a)
    i = threadIdx().x
    @inbounds a[i] = i
    return nothing
end
In my experience, dynamic parallelism doesn't have the best performance, and of course we will need to figure out what it means for at least one other backend.
CUDA implementation (runs)
KA implementation
returns
Is this expected?
I guess it might be a problem of KA setting up maxthreads=1 in the kernel call.
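For readers following along, here is a minimal sketch of what a plain CUDA.jl dynamic launch of `child!` could look like. It is not the code from the original post: the name `parent!`, the array length of 32, and the single-thread parent launch are assumptions made for illustration. It relies on the `dynamic=true` keyword of `@cuda` for device-side launches and needs a CUDA-capable GPU to run.

```julia
using CUDA

# Child kernel from the issue: each thread writes its own index.
function child!(a)
    i = threadIdx().x
    @inbounds a[i] = i
    return nothing
end

# Hypothetical parent kernel (name and shape are assumptions):
# it launches child! from the device with one thread per element.
function parent!(a)
    @cuda dynamic=true threads=length(a) child!(a)
    return nothing
end

a = CUDA.zeros(Int, 32)

# Launch the parent with a single thread so child! is launched exactly once.
@cuda threads=1 parent!(a)
synchronize()

result = Array(a)  # copy back to the host for inspection
```

The KA variant discussed in this issue would presumably wrap the same device-side launch in an `@kernel` function. That is the part reported as failing here, so it is left out of the sketch.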