[Running on windows 10] cuda runtime error (30) : unknown error at ..\aten\src\THC\THCGeneral.cpp:87 #17108
Comments
And also my CUDA version: nvcc: NVIDIA (R) Cuda compiler driver |
I think that this statement |
I am having the same issue here. My system:
And here is a sample code that reproduces the error:
Could this be a bug? |
I don't think cuda error 30 is an error on our side. Please try these things first.
|
OK, I did some extra tests, and it seems that it is some weird behavior only when running on an interactive shell. Here's what I have done (step-by-step)
I can run this file with either
Now, if I try to use exactly the same commands in an interactive shell, I get the error: With python:
or with ipython:
Any hints? |
@gmseabra |
@ChocolateDave ,
How does it work in a Jupyter notebook? |
@gmseabra @ChocolateDave |
Can you tell us what is your configuration? Thanks! |
@ChocolateDave , @kuretru: I am using:
I have already tried rebooting, removing and reinstalling CUDA, torch, Anaconda, etc., and the error persists. There must be something else going on here... |
@gmseabra |
I have tried all that, and the error is still there. Did you try to reproduce the error? |
@gmseabra
After a reboot
|
Thanks. I tried it all - reinstalled the whole Windows, then installed Visual Studio and CUDA Toolkit, installed Miniconda, installed PyTorch in a new environment, and still the same. The commands work from a file, but not interactively. Note: I'm using Python 3.7.1. If I update the packages in miniconda, I fall into the error described here: #17233 |
I'm sorry but the issues are not reproducible at my side. Could you please try these things to help me locate the problem?
Usually, the results should stay consistent regardless of whether interactive mode is on or not, so this is actually very weird. Maybe you should check whether both are using the exact same DLLs, with something like Process Explorer. |
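A lighter-weight first check, before reaching for Process Explorer, is to print the same environment facts from the script run and from the interactive session and compare them. The sketch below is only an illustration and was not posted in the thread: `describe_environment` is a hypothetical helper, and testing for `sys.ps1` is a rough heuristic for "interactive". It does not list loaded DLLs; it only shows which interpreter, which installed package location, and which PATH entries each session sees.

```python
import importlib.util
import os
import sys


def describe_environment(module_name="torch"):
    """Collect facts that should match between a script run and a REPL session.

    The module is located with find_spec, so it is not imported here.
    """
    spec = importlib.util.find_spec(module_name)
    return {
        "executable": sys.executable,
        # Rough heuristic: sys.ps1 only exists in an interactive interpreter.
        "interactive": hasattr(sys, "ps1") or bool(sys.flags.interactive),
        "module_origin": spec.origin if spec is not None else None,
        "path_entries": os.environ.get("PATH", "").split(os.pathsep),
    }


if __name__ == "__main__":
    for key, value in describe_environment().items():
        print(key, "=", value)
```

Running it once with `python thisfile.py` and once by pasting it into the interactive shell gives two listings to compare. If the interpreter or the package location differ, the two sessions are probably not loading the same binaries.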
I'll try that
OK, what should I look for here? Thanks for looking into the issue! |
Hi, I tried reverting to the CUDA drivers that come with the CUDA Development Kit, but I can't install them because I keep getting an error: "Windows cannot verify the driver signature... (Code 52)", so I have to stick with the most recent driver. My system is an Acer laptop with:
My exact procedure was:
Here are the results I get. In the end I also add details about my environment and the output of the deviceQuery app from the CUDA tests: Output of running the small program:
Output of interactive python interpreter:
Finally, here is the information about my conda environment:
And the output of the deviceQuery, from CUDA tests suite:
I've already tried reinstalling the system, uninstalling and reinstalling Anaconda and Miniconda, and nothing changes. Should I open a bug report? Thanks! |
Hi all, I just wanted to mention that I have just tried with the nightly build of pytorch, and the problem disappears. Using the nightly build available today (02/20/2019), I get the following:
So it seems that, at some point between stable and today's build, the issue has been resolved. |
@gmseabra I'm glad that it's solved. But I'm not sure which one is related to this. |
Thank you guys all for all your support.😄 I couldn't fix the problem so I decided to downgrade my python version to 3.6.8 and it somehow worked. |
Have you tried using the nightly-build? That did work fine for me (as of 02/20/2019). |
@peterjc123 Thanks. Is there any idea about when the "nightly build" becomes part of the "stable" distribution? |
@gmseabra It won't be too soon. Our release cycle is ~90 days. BTW, would you please try if removing |
Thanks.
Tried removing from I also tried copying those DLLs from my torch-nightly env to the torch env, but there was no difference either way. |
I am getting the same error as this with PyTorch 1.0.1 and CUDA 10. Indeed, updating to one of the nightly builds solved the issue, yet I stumbled upon a "classical nightly issue": some random Assertion Failure which prompted me to message PyTorch developers about it. This is getting really frustrating since I've been losing considerable time in reconfiguring my environment. I think I will have to downgrade some components now... EDIT: Downgrading to PyTorch 1.0.0 solved the issue for me as well. Clearly, there's a problem with 1.0.1. |
I am getting the same error. My setup:
My Jupyter Notebook test:
torch.cuda.is_available()
torch.backends.cudnn.enabled
torch.cuda.current_device()
RuntimeError Traceback (most recent call last)
C:\ProgramData\Anaconda3\lib\site-packages\torch\cuda\__init__.py in current_device()
C:\ProgramData\Anaconda3\lib\site-packages\torch\cuda\__init__.py in _lazy_init()
RuntimeError: cuda runtime error (30) : unknown error at ..\aten\src\THC\THCGeneral.cpp:87
torch.cuda.device(0)
torch.cuda.device_count()
torch.cuda.get_device_name(0)
RuntimeError Traceback (most recent call last)
C:\ProgramData\Anaconda3\lib\site-packages\torch\cuda\__init__.py in get_device_name(device)
C:\ProgramData\Anaconda3\lib\site-packages\torch\cuda\__init__.py in get_device_properties(device)
C:\ProgramData\Anaconda3\lib\site-packages\torch\cuda\__init__.py in init()
C:\ProgramData\Anaconda3\lib\site-packages\torch\cuda\__init__.py in _lazy_init()
RuntimeError: cuda runtime error (30) : unknown error at ..\aten\src\THC\THCGeneral.cpp:87
The THCGeneral.cpp code can be found at:
The code block in THCGeneral where the error is thrown is:
for (int i = 0; i < numDevices; ++i) {
}
Line 87 of this code is:
Any ideas why I and so many others are experiencing this exact same error? |
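For anyone who wants to repeat the notebook test above and paste the result into a report, the calls can be run one by one with each failure recorded instead of stopping at the first one. This is a minimal sketch, not code from the thread: `probe_cuda` is a hypothetical helper name, and it assumes nothing beyond the public `torch` and `torch.cuda` calls already shown in the test.

```python
def probe_cuda():
    """Run the notebook's CUDA calls in order, recording results or errors."""
    try:
        import torch
    except (ImportError, OSError) as exc:
        # torch is missing, or its native libraries failed to load.
        return [("import torch", f"failed: {type(exc).__name__}: {exc}")]

    steps = [
        ("torch.__version__", lambda: torch.__version__),
        ("torch.version.cuda", lambda: torch.version.cuda),
        ("torch.cuda.is_available()", torch.cuda.is_available),
        ("torch.backends.cudnn.enabled", lambda: torch.backends.cudnn.enabled),
        ("torch.cuda.device_count()", torch.cuda.device_count),
        ("torch.cuda.current_device()", torch.cuda.current_device),
        ("torch.cuda.get_device_name(0)", lambda: torch.cuda.get_device_name(0)),
    ]
    report = []
    for label, call in steps:
        try:
            report.append((label, repr(call())))
        except Exception as exc:
            # Keep going so that later calls are still attempted.
            report.append((label, f"failed: {type(exc).__name__}: {exc}"))
    return report


if __name__ == "__main__":
    for label, outcome in probe_cuda():
        print(f"{label:35s} {outcome}")
```

Running it both from a file and from an interactive session shows which call is the first to fail in each mode.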
Looks like the callback was accidentally triggered here. https://github.com/pytorch/pytorch/blame/master/torch/cuda/__init__.py#L188. Usually it won't happen. Anyway, I'll try to add a protection clause here for Windows. |
@peterjc123 Regarding the situation from https://forums.fast.ai/t/cuda-runtime-error-30-resnet-not-loading/38556/2, I think there is some internal error between Jupyter and PyTorch 1.0.1, since downgrading to PyTorch 1.0.0 solves the problem. I noticed several issues raised about the same problem, and this might be the best answer so far. |
My system: The following always works:
The following always fails for me:
My solution was to add to my scripts the call to |
I ran into the same problem (GTX 1050, Anaconda environment, Win10, latest PyTorch, tried with both pip and conda installs). Before the issue came up, PyTorch worked as usual. I didn't change any settings or install any packages; the error just appeared. |
Quote from @andrei-rusu
Downgrading PyTorch to 1.0.0 solved mine. I also needed to make sure the script ran in an administrator command prompt. Thanks! |
Downgrading to version 1.0.0 also worked for me. Also, I changed some nvidia graphics card settings (maximum performance=yes) which may also have contributed to get it working. |
Same issue here: Downgrading to pytorch 1.0.0 solved the issue |
Well, would you guys please check whether this error persists in the nightlies? Downgrading is a workaround here, but it does little to help locate the actual cause of this issue. Let me summarize all the known possible causes of this issue:
I'd be grateful if you could help me locate the issue. It's currently hard to fix because I cannot reproduce it on my side. |
Nightly builds for windows are available here: https://download.pytorch.org/whl/nightly/cu100/torch_nightly.html but only for version 1.0.0 |
@Jonas1312 You mean CUDA 10? If you are talking about the version of PyTorch, it will always build for the latest source every day. |
https://download.pytorch.org/whl/nightly/cu100/torch_nightly.html shows the following packages: https://pastebin.com/yYxdEqU5 I tried with the last windows build:
It's working, but I don't understand why it is showing version 1.0.0 even though it's built from the latest source. |
@JohnRambo Oh, I see. I will update the build scripts. |
@JohnRambo Should be fixed now. Looks like I forgot to sync with upstream after I sent these changes about the version change. |
@peterjc123 I've just installed torch_nightly-1.1.0.dev20190424-cp36-cp36m-win_amd64.whl and it seems that it fixed the issue:
Works with cuda 10.0 also! |
I just got this error for the first time today, after running PyTorch 1.0.1 (CUDA 10.0) on Windows 10 for months and months with no problems. In my case, the error only started happening when I updated my Nvidia graphics driver to 430.53 from 417.35. Luckily, simply reverting to driver version 417.35 caused the error to go away and everything works fine again. I did not need to touch my CUDA or Python environment to fix it, just roll back the graphics driver. Very odd, looks like Nvidia changed something in the driver code which is causing this. My setup: Windows 10 1607 64-bit PyTorch installed via pip |
I got a similar issue, with the error "RuntimeError: cuda runtime error (30) : unknown error at ..\aten\src\THC\THCGeneral.cpp:51", on my new machine (Windows 10, Nvidia RTX2070). (See also https://discuss.pytorch.org/t/a-error-when-using-gpu/32761.) |
We are building this time with the NVIDIA driver 418.96. But according to your test results, I don't know whether I should downgrade or upgrade it. However, if the problem is caused by the driver, we can actually do some tests on this. Also, if you have time, you can try whether building from source solves it. |
Actually, in the CUDA documentation I can only find a lower limit on the driver version for each CUDA version: https://docs.nvidia.com/cuda/cuda-toolkit-release-notes/index.html#major-components. It does not say what happens if we compile binaries with a newer GPU driver version, or if the driver version differs from the one on the user's PC. |
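Since several reports in this thread hinge on the installed driver version (417.35, 418.96, 430.39, 430.53), it helps to include the exact version when reporting. A minimal sketch, assuming `nvidia-smi` is on PATH (on some Windows installs it is not, in which case this returns `None`); `nvidia_driver_version` and `parse_version` are hypothetical helper names, not part of PyTorch or CUDA.

```python
import shutil
import subprocess


def nvidia_driver_version():
    """Return the driver version reported by nvidia-smi, or None if unavailable."""
    exe = shutil.which("nvidia-smi")
    if exe is None:
        return None
    try:
        result = subprocess.run(
            [exe, "--query-gpu=driver_version", "--format=csv,noheader"],
            capture_output=True,
            text=True,
            timeout=30,
            check=True,
        )
    except (subprocess.SubprocessError, OSError):
        return None
    lines = result.stdout.strip().splitlines()
    # One line per GPU; the driver version is the same for all of them.
    return lines[0].strip() if lines else None


def parse_version(text):
    """Turn a dotted version such as '417.35' into a comparable tuple of ints."""
    return tuple(int(part) for part in text.split("."))


if __name__ == "__main__":
    print("driver version:", nvidia_driver_version())
```

The tuple form makes comparisons against a documented minimum straightforward, for example `parse_version(found) >= parse_version(minimum)`, where the minimum comes from the release notes linked above.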
Updating the NVIDIA driver (to 430.39) also worked for me. |
Thanks! This is the exactly same thing that happens with me on Windows 10. If I use torch.cuda.current_device() before anything cuda-related, it works like a charm. |
For the record, this isn't supposed to be necessary, but it's possible this is broken. |
Guys, I seem to have found the root cause of this issue with the help of @Jonas1312 in #20635. It is caused by the fact that we changed the way we link against our libraries against |
Thank you! I have the same problem, and I had to restart Python every time. Following what you said, I added torch.cuda.current_device() right after import torch, and it works.
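The workaround several people describe, calling `torch.cuda.current_device()` immediately after `import torch` and before any other CUDA-related call, can be sketched as below. This is an illustration of what the commenters report, not an official fix: `init_cuda_early` is a hypothetical helper, and the error handling is an added assumption so that the snippet also runs on machines without a working CUDA setup.

```python
def init_cuda_early():
    """Force CUDA initialisation right after importing torch.

    Returns the current device index, or None if torch or CUDA is unavailable.
    """
    try:
        import torch
    except ImportError:
        return None
    try:
        # Reported workaround: make this the first CUDA-related call.
        return torch.cuda.current_device()
    except (RuntimeError, AssertionError):
        # Raised when there is no usable GPU or torch was built without CUDA.
        return None


if __name__ == "__main__":
    print("current CUDA device:", init_cuda_early())
```

In a script, the call would go at the very top, before anything such as `torch.set_default_tensor_type('torch.cuda.FloatTensor')`.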
fix was merged |
it worked for me as well import torch |
Yes, after rebooting my computer, this error is gone.
❓ Questions and Help
Please note that this issue tracker is not a help form and this issue will be closed.
We have a set of listed resources available on the website. Our primary means of support is our discussion forum:
While trying to run my test.py file on my anaconda prompt I got these messages below:
CUDA™ is AVAILABLE
Please assign a gpu core (int, <1): 0
THCudaCheck FAIL file=..\aten\src\THC\THCGeneral.cpp line=87 error=30 : unknown error
Traceback (most recent call last):
File "VSLcore.py", line 202, in <module>
DQNAgent()
File "VSLcore.py", line 87, in DQNAgent
torch.set_default_tensor_type('torch.cuda.FloatTensor')
File "D:\Softwares\Anaconda3\lib\site-packages\torch\__init__.py", line 158, in set_default_tensor_type
_C.set_default_tensor_type(t)
File "D:\Softwares\Anaconda3\lib\site-packages\torch\cuda\__init__.py", line 162, in _lazy_init
torch._C._cuda_init()
RuntimeError: cuda runtime error (30) : unknown error at ..\aten\src\THC\THCGeneral.cpp:87
What should I do?