preds are nan #28
Comments
I have this problem too.
Different machines behave differently. Can you try running it several times?
Yes. For me the problem goes away when I set the number of workers to 0 (though not always) or run in a Docker environment (no errors whatsoever). Another problem is that a large number of workers, such as the default of 4, filled up my 32 GB of memory.
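The worker setting mentioned above is usually controlled from the training config. A minimal sketch, assuming the project uses an mmdetection-style Python config in which a `data` dict carries a `workers_per_gpu` key; the key names and the batch size are assumptions and should be checked against the actual config file:

```python
# Hypothetical excerpt of an mmdetection-style config (key names assumed).
data = dict(
    samples_per_gpu=1,   # batch size per GPU (assumed value)
    workers_per_gpu=0,   # 0 = load data in the main process, no worker subprocesses
)
```

With 0 workers, data loading is slower, but no worker subprocesses are started, which may also reduce the host memory usage described above.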
Is it a CUDA memory error? what(): CUDA error: an illegal memory access was encountered
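CUDA errors like this are reported asynchronously, so the stack trace often points at an unrelated line. One way to narrow it down, offered as a debugging suggestion rather than a confirmed fix, is to force synchronous kernel launches with the `CUDA_LAUNCH_BLOCKING` environment variable, which has to be set before CUDA is initialised:

```python
import os

# Must be set before the first CUDA call (simplest: before importing torch).
os.environ["CUDA_LAUNCH_BLOCKING"] = "1"

# ... import torch and start training after this point ...
```

The same effect can be had by exporting the variable in the shell before launching the training script. Training runs slower in this mode, so it is only meant for locating the failing kernel.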
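I ran into a similar problem too: Aborted (core dumped). Strangely, the error does not occur when I debug remotely, but it does occur as soon as I run from a terminal on the remote server, although on rare occasions it runs normally.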
I also encountered this issue. Deleting the ./VoxFormer/deform_attn_3d directory and re-uploading it resolved the issue. I'm curious about the reason and hope the author can provide an explanation.
Thanks for your great work. I have an issue. In stage 2, my preds are NaN at the start of training, and training then fails with an error. Have you ever encountered this problem?
I am training VoxFormer-T.