CUDA out of memory with complex_matmul #20
I am also working with this code and ran into the same problem. I think the out-of-memory error happens because the ComplexCNN layer has to FFT all of its inputs and inverse-FFT all of its outputs, and that is where the memory goes. If we want to build a CNN model in the Fourier domain, we should keep tensors in that domain instead of applying fft and ifft at every layer. Of course, if you have enough memory, transforming back and forth frequently may help the model learn from both domains (Fourier and image space) and give better results. One advantage of working in the Fourier domain is that a single large convolution kernel can stand in for a stack of small ones: since the data and the kernel are multiplied pointwise in the Fourier domain, convolution efficiency improves, so in this setting we should tend toward simpler networks. I am also just starting to research this area, so this may be inaccurate; I recommend consulting peer-reviewed papers. I would welcome discussing these ideas together.
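A minimal sketch of the point above, using NumPy rather than this repository's PyTorch code (so only an illustration): by the convolution theorem, multiplying the FFTs pointwise and inverting gives a circular convolution, and the cost of the pointwise product does not grow with the kernel size.

```python
import numpy as np

def fft_circular_conv(x, k):
    """Circular convolution computed as a pointwise product in the Fourier domain."""
    n = len(x)
    K = np.fft.fft(k, n)                  # zero-pad the kernel to the signal length
    return np.fft.ifft(np.fft.fft(x) * K).real

rng = np.random.default_rng(0)
x = rng.standard_normal(64)
k = rng.standard_normal(5)

# Reference: direct circular convolution, y[i] = sum_j x[(i - j) % n] * k[j]
direct = np.array([sum(x[(i - j) % len(x)] * k[j] for j in range(len(k)))
                   for i in range(len(x))])
```

The two results agree to floating-point precision, whatever the kernel length.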
@aminaab96 Sorry for the late response here. Do you think this is a bug with the `complex_matmul` function?
@GuanghuiFU You're correct about keeping tensors in the Fourier domain. When it's possible, that will help to reduce latency and memory footprints for your models. Unfortunately, I don't think it's possible to perform pointwise activation functions (e.g. ReLU) in the Fourier domain.
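To illustrate why pointwise activations block a fully-Fourier pipeline (a NumPy sketch, not code from this repository): the Fourier transform is linear, ReLU is not, so naively applying a ReLU directly to the Fourier coefficients does not correspond to a spatial ReLU.

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.standard_normal(32)
X = np.fft.fft(x)

# FFT of the spatially activated signal...
spatial_then_fft = np.fft.fft(np.maximum(x, 0))

# ...versus a naive "pointwise ReLU" applied directly to the coefficients.
naive_fourier_relu = np.maximum(X.real, 0) + 1j * np.maximum(X.imag, 0)
```

For generic inputs the two results differ, which is why the tensor has to be transformed back to position space before each nonlinearity.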
Thank you for your reply. First of all, I really like your code; it is correct both theoretically and practically. I've been reading some papers and code recently, trying to build a neural network for complex numbers, and would love to hear your advice. [1] Rempe, Moritz, et al. "k-strip: A novel segmentation algorithm in k-space for the application of skull stripping." arXiv preprint arXiv:2205.09706 (2022).
I may not be the best person to ask. 😅 But I'll try to give an answer, based on my limited knowledge. Although that paper uses complex convolutions, it doesn't seem like Fourier convolutions are applied anywhere. It's possible that there's a connection between the two, but I don't immediately see what that would be. In my mind, Fourier convolutions are essentially an algorithmic trick to try and compute (spatial) convolutions more efficiently. The spatial convolution is the end result we care about. It's definitely true, as you mentioned earlier, that you could perform multiple convolutions in Fourier space, without converting back to position-space in between. Pointwise (spatial) activations like ReLU cannot be performed in Fourier-space. If the complex neural network is just operating on complex-valued (spatial) inputs, then it seems valid to use "Complex ReLU" and other pointwise activation functions. But I'm not sure that Fourier convolutions will be much use to you. Hope that helps somehow! Please let me know if I'm misunderstanding something -- I'm happy to discuss more. 😄
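For completeness, the "Complex ReLU" mentioned above is commonly defined by applying ReLU to the real and imaginary parts independently (a sketch only; the exact definition varies across papers):

```python
import numpy as np

def complex_relu(z):
    """CReLU: ReLU applied separately to the real and imaginary parts."""
    return np.maximum(z.real, 0) + 1j * np.maximum(z.imag, 0)

z = np.array([1.0 - 2.0j, -3.0 + 4.0j])
out = complex_relu(z)   # [1.+0.j, 0.+4.j]
```

This is a valid pointwise nonlinearity for complex-valued spatial inputs, but, as discussed above, applying it to Fourier coefficients is not equivalent to any spatial activation.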
Hello,
Thank you for the effort you put into making this work; however, I am very confused. When I apply this "FFTCONV2D" layer in a network (a ResNet, for example), on GPU I always get the error 'CUDA out of memory...'. It's caused by the `complex_matmul` function, which needs a lot of resources. How can I solve this problem, please?
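One workaround worth trying (sketched here with NumPy; `fft_conv2d` is a hypothetical stand-in, not this repository's API): peak memory during the FFT / multiply / inverse-FFT steps scales with how many samples are transformed at once, so splitting the batch into smaller chunks, and using the real-input transforms `rfft2`/`irfft2` that store only half the spectrum, can reduce the peak considerably at the cost of some speed.

```python
import numpy as np

def fft_conv2d(x, k):
    """Circular 2-D convolution via real-input FFTs (stand-in for the real layer)."""
    K = np.fft.rfft2(k, s=x.shape[-2:])          # zero-pad kernel to image size
    return np.fft.irfft2(np.fft.rfft2(x) * K, s=x.shape[-2:])

def chunked_fft_conv2d(batch, k, chunk=4):
    """Run the FFT convolution over small slices of the batch to cap peak memory."""
    return np.concatenate([fft_conv2d(batch[i:i + chunk], k)
                           for i in range(0, len(batch), chunk)])

rng = np.random.default_rng(2)
batch = rng.standard_normal((16, 32, 32))
k = rng.standard_normal((3, 3))
```

Chunking changes nothing numerically; in PyTorch the same idea applies per batch slice, and gradient checkpointing is another option if the memory is held by saved activations.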