
[Running on windows 10] cuda runtime error (30) : unknown error at ..\aten\src\THC\THCGeneral.cpp:87 #17108

Closed
juanwulu opened this issue Feb 14, 2019 · 75 comments
Labels
module: windows (Windows support for PyTorch) · needs reproduction (Someone else needs to try reproducing the issue given the instructions; no action needed from the user) · triaged (This issue has been looked at by a team member, and triaged and prioritized into an appropriate module)

Comments

@juanwulu

❓ Questions and Help


While trying to run my test.py file in the Anaconda prompt, I got the messages below:

CUDA™ is AVAILABLE
Please assign a gpu core (int, <1): 0
THCudaCheck FAIL file=..\aten\src\THC\THCGeneral.cpp line=87 error=30 : unknown error
Traceback (most recent call last):
File "VSLcore.py", line 202, in <module>
DQNAgent()
File "VSLcore.py", line 87, in DQNAgent
torch.set_default_tensor_type('torch.cuda.FloatTensor')
File "D:\Softwares\Anaconda3\lib\site-packages\torch\__init__.py", line 158, in set_default_tensor_type
_C.set_default_tensor_type(t)
File "D:\Softwares\Anaconda3\lib\site-packages\torch\cuda\__init__.py", line 162, in _lazy_init
torch._C._cuda_init()
RuntimeError: cuda runtime error (30) : unknown error at ..\aten\src\THC\THCGeneral.cpp:87

What should I do?

@pytorchbot added the module: windows (Windows support for PyTorch) label on Feb 14, 2019
@juanwulu
Author

And also my CUDA version:

nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2018 NVIDIA Corporation
Built on Sat_Aug_25_21:08:04_Central_Daylight_Time_2018
Cuda compilation tools, release 10.0, V10.0.130

@peterjc123
Collaborator

I think that this statement torch.set_default_tensor_type('torch.cuda.FloatTensor') should be replaced by torch.set_default_tensor_type(torch.cuda.FloatTensor).
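
For reference, a minimal sketch of both call forms (assuming CUDA initializes correctly; the tensor creation at the end is only illustrative):

import torch

# Suggested form: pass the tensor type object directly.
torch.set_default_tensor_type(torch.cuda.FloatTensor)

# Form from the traceback: pass the type as a string.
# torch.set_default_tensor_type('torch.cuda.FloatTensor')

# Either way, switching the default to a CUDA type triggers lazy CUDA
# initialization, which is where error 30 is raised when it fails.
x = torch.zeros(3)
print(x.device)  # expected: cuda:0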

@gmseabra

I am having the same issue here. My system:

  • Windows 10
  • NVIDIA GeForce GTX 1060
  • Python 3.7.1 (Anaconda)
  • PyTorch 1.0.1
  • CUDA 10

And here is a sample code that reproduces the error:

>ipython
Python 3.7.1 (default, Dec 10 2018, 22:54:23) [MSC v.1915 64 bit (AMD64)]
Type 'copyright', 'credits' or 'license' for more information
IPython 7.2.0 -- An enhanced Interactive Python. Type '?' for help.

In [1]: import torch

In [2]: torch.cuda.is_available()
Out[2]: True

In [3]: torch.cuda.device_count()
Out[3]: 1

In [4]: torch.cuda.current_device()
THCudaCheck FAIL file=..\aten\src\THC\THCGeneral.cpp line=87 error=30 : unknown error
---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
<ipython-input-4-3380d2c12118> in <module>
----> 1 torch.cuda.current_device()

C:\Anaconda3\lib\site-packages\torch\cuda\__init__.py in current_device()
    339 def current_device():
    340     r"""Returns the index of a currently selected device."""
--> 341     _lazy_init()
    342     return torch._C._cuda_getDevice()
    343

C:\Anaconda3\lib\site-packages\torch\cuda\__init__.py in _lazy_init()
    160             "Cannot re-initialize CUDA in forked subprocess. " + msg)
    161     _check_driver()
--> 162     torch._C._cuda_init()
    163     _cudart = _load_cudart()
    164     _cudart.cudaGetErrorName.restype = ctypes.c_char_p

RuntimeError: cuda runtime error (30) : unknown error at ..\aten\src\THC\THCGeneral.cpp:87

In [5]:

Could this be a bug?

@peterjc123
Collaborator

I don't think cuda error 30 is an error on our side. Please try these things first.

  1. Re-install latest GPU driver
  2. Reboot
  3. Ensure you have admin access

@gmseabra

OK, I did some extra tests, and it seems to be some weird behavior that only happens when running in an interactive shell. Here's what I have done (step by step):

  1. Prepare a simple file with the example:
> type torch_test.ipy
import torch
print("torch.cuda.is_available()   =", torch.cuda.is_available())
print("torch.cuda.device_count()   =", torch.cuda.device_count())
print("torch.cuda.device('cuda')   =", torch.cuda.device('cuda'))
print("torch.cuda.current_device() =", torch.cuda.current_device())

I can run this file with either Python or iPython, and it all works fine:

> python torch_test.ipy
torch.cuda.is_available()   = True
torch.cuda.device_count()   = 1
torch.cuda.device('cuda')   = <torch.cuda.device object at 0x0000021B331A0160>
torch.cuda.current_device() = 0

> ipython torch_test.ipy
torch.cuda.is_available()   = True
torch.cuda.device_count()   = 1
torch.cuda.device('cuda')   = <torch.cuda.device object at 0x000002B39C1FD390>
torch.cuda.current_device() = 0

Now, if I try to use exactly the same commands in an interactive shell, I get the error:

With python:

>python
Python 3.7.1 (default, Dec 10 2018, 22:54:23) [MSC v.1915 64 bit (AMD64)] :: Anaconda, Inc. on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import torch
>>> print("torch.cuda.is_available()   =", torch.cuda.is_available())
torch.cuda.is_available()   = True
>>> print("torch.cuda.device_count()   =", torch.cuda.device_count())
torch.cuda.device_count()   = 1
>>> print("torch.cuda.device('cuda')   =", torch.cuda.device('cuda'))
torch.cuda.device('cuda')   = <torch.cuda.device object at 0x0000028CBD034198>
>>> print("torch.cuda.current_device() =", torch.cuda.current_device())
THCudaCheck FAIL file=..\aten\src\THC\THCGeneral.cpp line=87 error=30 : unknown error
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\Anaconda3\lib\site-packages\torch\cuda\__init__.py", line 341, in current_device
    _lazy_init()
  File "C:\Anaconda3\lib\site-packages\torch\cuda\__init__.py", line 162, in _lazy_init
    torch._C._cuda_init()
RuntimeError: cuda runtime error (30) : unknown error at ..\aten\src\THC\THCGeneral.cpp:87
>>> ^Z

or with ipython:

>ipython
Python 3.7.1 (default, Dec 10 2018, 22:54:23) [MSC v.1915 64 bit (AMD64)]
Type 'copyright', 'credits' or 'license' for more information
IPython 7.2.0 -- An enhanced Interactive Python. Type '?' for help.

In [1]: import torch

In [2]: print("torch.cuda.is_available()   =", torch.cuda.is_available())
torch.cuda.is_available()   = True

In [3]: print("torch.cuda.device_count()   =", torch.cuda.device_count())
torch.cuda.device_count()   = 1

In [4]: print("torch.cuda.device('cuda')   =", torch.cuda.device('cuda'))
torch.cuda.device('cuda')   = <torch.cuda.device object at 0x0000018A068007F0>

In [5]: print("torch.cuda.current_device() =", torch.cuda.current_device())
THCudaCheck FAIL file=..\aten\src\THC\THCGeneral.cpp line=87 error=30 : unknown error
---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
<ipython-input-5-f8c552eb6277> in <module>
----> 1 print("torch.cuda.current_device() =", torch.cuda.current_device())

C:\Anaconda3\lib\site-packages\torch\cuda\__init__.py in current_device()
    339 def current_device():
    340     r"""Returns the index of a currently selected device."""
--> 341     _lazy_init()
    342     return torch._C._cuda_getDevice()
    343

C:\Anaconda3\lib\site-packages\torch\cuda\__init__.py in _lazy_init()
    160             "Cannot re-initialize CUDA in forked subprocess. " + msg)
    161     _check_driver()
--> 162     torch._C._cuda_init()
    163     _cudart = _load_cudart()
    164     _cudart.cudaGetErrorName.restype = ctypes.c_char_p

RuntimeError: cuda runtime error (30) : unknown error at ..\aten\src\THC\THCGeneral.cpp:87

In [6]:

Any hints?

@juanwulu
Author

@gmseabra
Thanks for your post.
I tested it as you described and, surprisingly, got the complete opposite result from yours: it runs fine in the interactive shell but fails when run from a file.

@gmseabra

@ChocolateDave ,

@gmseabra
Thanks for your post.
I tested it as you described and, surprisingly, got the complete opposite result from yours: it runs fine in the interactive shell but fails when run from a file.

How does it work in a Jupyter notebook?

@kuretru

kuretru commented Feb 16, 2019

@gmseabra @ChocolateDave
I have the same problem as you. After a reboot, the problem was gone.

@gmseabra

@gmseabra @ChocolateDave
I have the same problem as you. After a reboot, the problem was gone.

Can you tell us what your configuration is? Thanks!

@gmseabra

gmseabra commented Feb 16, 2019

@ChocolateDave , @kuretru:
What are the versions of python, CUDA and PyTorch that you are using?

I am using:

  • Windows 10 v1809
  • Anaconda 3
  • Python 3.7.1
  • CUDA 10.0 (V10.0.130)
  • PyTorch 1.0.1 (py3.7_cuda100_cudnn7_1)
  • cudatoolkit 10.0.130

I have already tried rebooting, removing and reinstalling CUDA, torch, Anaconda, etc., and the error persists. There must be something else going on here...

@juanwulu
Author

@gmseabra
Thanks for all your advice. Pardon me for replying so late; I've been traveling and without my laptop, so I couldn't test my program in Jupyter.
If I remember it correctly, I'm currently using the same system configuration as yours.

@gmseabra

@peterjc123

I don't think cuda error 30 is an error on our side. Please try these things first.

Re-install latest GPU driver
Reboot
Ensure you have admin access

I have tried all that, and the error is still there. Did you try to reproduce the error?

@kuretru

kuretru commented Feb 17, 2019

@gmseabra
All environments are brand new; I reinstalled the OS on February 14th.
And I am using:

  • Nvidia GTX 860M
  • Windows 10 1809 x64
  • Python 3.7.2 x64
  • CUDA V10.0.130
  • PyTorch 1.0.1 (torch-1.0.1-cp37-cp37m-win_amd64.whl)
Python
>>> import torch
>>> torch.cuda.current_device()
RuntimeError: cuda runtime error (30) : unknown error at ..\aten\src\THC\THCGeneral.cpp:87

After a reboot:

Python
>>> import torch
>>> torch.cuda.current_device()
0

@gmseabra

Thanks. I tried it all: reinstalled Windows entirely, then installed Visual Studio and the CUDA Toolkit, installed Miniconda, and installed PyTorch in a new environment, and still the same. The commands work from a file, but not interactively.

Note: I'm using Python 3.7.1. If I update the packages in miniconda, I fall into the error described here: #17233

@peterjc123
Collaborator

peterjc123 commented Feb 19, 2019

I'm sorry, but the issue is not reproducible on my side. Could you please try these things to help me locate the problem?

  1. Install the GPU driver that ships with the CUDA installation
  2. Install the wheels package instead of the conda package

Usually, the results should stay consistent regardless of whether interactive mode is on or not, so it's actually very weird. Maybe you should check whether they are using the exact same DLLs with something like Process Explorer.
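
To compare the two sessions, here is a rough sketch (using psutil, which is not part of this thread; pip install psutil) that prints the CUDA/NVIDIA-related DLLs mapped into the current Python process, as an alternative to Process Explorer:

import psutil
import torch

# Touch CUDA the same way the failing session does before inspecting DLLs.
torch.cuda.is_available()

# List loaded modules whose path looks CUDA/NVIDIA related.
for m in psutil.Process().memory_maps():
    path = m.path.lower()
    if any(key in path for key in ("cuda", "cudnn", "cudart", "nvcuda", "nvfatbinaryloader")):
        print(m.path)

Running this once from a script and once from an interactive session and diffing the output should show whether different DLLs are being picked up.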

@gmseabra

Install the GPU driver that ships with the CUDA installation

I'll try that

Usually, the results should stay consistent regardless of whether interactive mode is on or not, so it's actually very weird. Maybe you should check whether they are using the exact same DLLs with something like Process Explorer.

OK, what should I look for here?

Thanks for looking into the issue!

@gmseabra

Hi, I tried reverting to the CUDA drivers that come with the CUDA Development Kit, but I can't install them because I keep getting an error: "Windows cannot verify the driver signature... (Code 52)", so I have to stick with the most recent driver.

My system is an Acer laptop with:

  • Windows 10 Home Single Language v 1809
  • GeForce GTX 1060, Driver version 25.21.14.1891 (In the GeForce Experience it shows as 418.91)
  • Miniconda with Python 3.7.1

My exact procedure was:

  1. Install Miniconda. Do not update anything.
  2. Clone base into a new env: (base) > conda create --name torch --clone base
  3. Activate the new env: (base) > conda activate torch
  4. Install pytorch: (torch) > conda install pytorch torchvision cudatoolkit=10.0 -c pytorch
  5. Deactivate / reactivate the env, just to be sure
  6. Try to run the simple example torch_test.py by: (torch) > python torch_test.py
  7. Try to run the same sequence of commands using the python interactive interpreter, see results below.

Here are the results I get. At the end I also include details about my environment and the output of the deviceQuery app from the CUDA tests:

Output of running the small program:

(torch) >python torch_test.py
torch.cuda.is_available()   = True
torch.cuda.device_count()   = 1
torch.cuda.device('cuda')   = <torch.cuda.device object at 0x000001FCD3A61F28>
torch.cuda.current_device() = 0

Output of interactive python interpreter:

(torch) > python
Python 3.7.1 (default, Dec 10 2018, 22:54:23) [MSC v.1915 64 bit (AMD64)] :: Anaconda, Inc. on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import torch
>>> torch.cuda.is_available()
True
>>> torch.cuda.device_count()
1
>>> torch.cuda.device('cuda')
<torch.cuda.device object at 0x000001E18C72D208>
>>> torch.cuda.current_device()
THCudaCheck FAIL file=..\aten\src\THC\THCGeneral.cpp line=87 error=30 : unknown error
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\Miniconda3\envs\torch\lib\site-packages\torch\cuda\__init__.py", line 341, in current_device
    _lazy_init()
  File "C:\Miniconda3\envs\torch\lib\site-packages\torch\cuda\__init__.py", line 162, in _lazy_init
    torch._C._cuda_init()
RuntimeError: cuda runtime error (30) : unknown error at ..\aten\src\THC\THCGeneral.cpp:87
>>>

Finally, here is the information about my conda environment:

(torch) >type torch_env.txt
# packages in environment at C:\Miniconda3\envs\torch:
#
# Name                    Version                   Build  Channel
asn1crypto                0.24.0                   py37_0
blas                      1.0                         mkl
ca-certificates           2018.03.07                    0
certifi                   2018.11.29               py37_0
cffi                      1.11.5           py37h74b6da3_1
chardet                   3.0.4                    py37_1
console_shortcut          0.1.1                         3
cryptography              2.4.2            py37h7a1dbc1_0
cudatoolkit               10.0.130                      0
freetype                  2.9.1                ha9979f8_1
icc_rt                    2019.0.0             h0cc432a_1
idna                      2.8                      py37_0
intel-openmp              2019.1                      144
jpeg                      9b                   hb83a4c4_2
libpng                    1.6.36               h2a8f88b_0
libtiff                   4.0.10               hb898794_2
menuinst                  1.4.14           py37hfa6e2cd_0
mkl                       2019.1                      144
mkl_fft                   1.0.10           py37h14836fe_0
mkl_random                1.0.2            py37h343c172_0
ninja                     1.8.2            py37he980bc4_1
numpy                     1.15.4           py37h19fb1c0_0
numpy-base                1.15.4           py37hc3f5095_0
olefile                   0.46                     py37_0
openssl                   1.1.1a               he774522_0
pillow                    5.4.1            py37hdc69c19_0
pip                       18.1                     py37_0
pycosat                   0.6.3            py37hfa6e2cd_0
pycparser                 2.19                     py37_0
pyopenssl                 18.0.0                   py37_0
pysocks                   1.6.8                    py37_0
python                    3.7.1                h8c8aaf0_6
pytorch                   1.0.1           py3.7_cuda100_cudnn7_1    pytorch
pywin32                   223              py37hfa6e2cd_1
requests                  2.21.0                   py37_0
ruamel_yaml               0.15.46          py37hfa6e2cd_0
setuptools                40.6.3                   py37_0
six                       1.12.0                   py37_0
sqlite                    3.26.0               he774522_0
tk                        8.6.8                hfa6e2cd_0
torchvision               0.2.1                      py_2    pytorch
urllib3                   1.24.1                   py37_0
vc                        14.1                 h0510ff6_4
vs2015_runtime            14.15.26706          h3a45250_0
wheel                     0.32.3                   py37_0
win_inet_pton             1.0.1                    py37_1
wincertstore              0.2                      py37_0
xz                        5.2.4                h2fa13f4_4
yaml                      0.1.7                hc54c509_2
zlib                      1.2.11               h62dcd97_3
zstd                      1.3.7                h508b16e_0

And the output of deviceQuery, from the CUDA test suite:

(torch) >type deviceQuery.out
deviceQuery.exe Starting...

 CUDA Device Query (Runtime API) version (CUDART static linking)

Detected 1 CUDA Capable device(s)

Device 0: "GeForce GTX 1060"
  CUDA Driver Version / Runtime Version          10.1 / 10.0
  CUDA Capability Major/Minor version number:    6.1
  Total amount of global memory:                 6144 MBytes (6442450944 bytes)
  (10) Multiprocessors, (128) CUDA Cores/MP:     1280 CUDA Cores
  GPU Max Clock rate:                            1733 MHz (1.73 GHz)
  Memory Clock rate:                             4004 Mhz
  Memory Bus Width:                              192-bit
  L2 Cache Size:                                 1572864 bytes
  Maximum Texture Dimension Size (x,y,z)         1D=(131072), 2D=(131072, 65536), 3D=(16384, 16384, 16384)
  Maximum Layered 1D Texture Size, (num) layers  1D=(32768), 2048 layers
  Maximum Layered 2D Texture Size, (num) layers  2D=(32768, 32768), 2048 layers
  Total amount of constant memory:               65536 bytes
  Total amount of shared memory per block:       49152 bytes
  Total number of registers available per block: 65536
  Warp size:                                     32
  Maximum number of threads per multiprocessor:  2048
  Maximum number of threads per block:           1024
  Max dimension size of a thread block (x,y,z): (1024, 1024, 64)
  Max dimension size of a grid size    (x,y,z): (2147483647, 65535, 65535)
  Maximum memory pitch:                          2147483647 bytes
  Texture alignment:                             512 bytes
  Concurrent copy and kernel execution:          Yes with 5 copy engine(s)
  Run time limit on kernels:                     Yes
  Integrated GPU sharing Host Memory:            No
  Support host page-locked memory mapping:       Yes
  Alignment requirement for Surfaces:            Yes
  Device has ECC support:                        Disabled
  CUDA Device Driver Mode (TCC or WDDM):         WDDM (Windows Display Driver Model)
  Device supports Unified Addressing (UVA):      Yes
  Device supports Compute Preemption:            Yes
  Supports Cooperative Kernel Launch:            No
  Supports MultiDevice Co-op Kernel Launch:      No
  Device PCI Domain ID / Bus ID / location ID:   0 / 1 / 0
  Compute Mode:
     < Default (multiple host threads can use ::cudaSetDevice() with device simultaneously) >

deviceQuery, CUDA Driver = CUDART, CUDA Driver Version = 10.1, CUDA Runtime Version = 10.0, NumDevs = 1
Result = PASS

I've already tried reinstalling the system, uninstalling and reinstalling Anaconda and Miniconda, and nothing changes.

Should I open a bug report?

Thanks!

@gmseabra

Hi all,

I just wanted to mention that I have just tried the nightly build of PyTorch, and the problem disappears. Using the nightly build available today (02/20/2019), I get the following:

(torch_nightly) >python
Python 3.7.1 (default, Dec 10 2018, 22:54:23) [MSC v.1915 64 bit (AMD64)] :: Anaconda, Inc. on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import torch
>>> torch.cuda.is_available()
True
>>> torch.cuda.current_device()
0
>>> quit()

So it seems that, at some point between stable and today's build, the issue has been resolved.

@peterjc123
Collaborator

@gmseabra I'm glad that it's solved. But I'm not sure which change is related to this.

@juanwulu
Author

juanwulu commented Feb 21, 2019

Thank you all for your support 😄, especially @gmseabra.

I couldn't fix the problem, so I decided to downgrade my Python version to 3.6.8, and it somehow worked.
The bugs may still exist on newer versions of Python, but for those who are currently stuck on this problem, downgrading your Python version might be a good solution.

@gmseabra

Thank you all for your support 😄, especially @gmseabra.
I couldn't fix the problem, so I decided to downgrade my Python version to 3.6.8, and it somehow worked.
The bugs may still exist on newer versions of Python, but for those who are currently stuck on this problem, downgrading your Python version might be a good solution.

Have you tried using the nightly build? That worked fine for me (as of 02/20/2019).

@gmseabra

gmseabra commented Feb 21, 2019

@gmseabra I'm glad that it's solved. But I'm not sure which change is related to this.

@peterjc123 Thanks. Is there any idea of when the nightly build will become part of the stable distribution?

@peterjc123
Collaborator

peterjc123 commented Feb 22, 2019

@gmseabra It won't be too soon. Our release cycle is ~90 days. BTW, would you please check whether removing nvcuda.dll and nvfatbinaryloader.dll from [Anaconda Root]\Lib\site-packages\torch\lib helps?
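
In case it helps, a tiny sketch (assuming the usual conda layout) to check whether those two DLLs are present before removing them:

import os
import torch

# torch's bundled DLLs live next to the package in a "lib" folder.
lib_dir = os.path.join(os.path.dirname(torch.__file__), "lib")
for name in ("nvcuda.dll", "nvfatbinaryloader.dll"):
    path = os.path.join(lib_dir, name)
    print(path, "->", "present" if os.path.isfile(path) else "not found")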

@gmseabra

@gmseabra It won't be too soon. Our release cycle is ~90 days.

Thanks.

BTW, would you please check whether removing nvcuda.dll and nvfatbinaryloader.dll from [Anaconda Root]\Lib\site-packages\torch\lib helps?

Tried removing them from [Miniconda3]\envs\torch\Lib\site-packages\torch\lib.

I also tried copying those DLLs from my torch-nightly env to the torch env, but there was no difference either way.

@andrei-rusu

andrei-rusu commented Feb 26, 2019

I am getting the same error as this with PyTorch 1.0.1 and CUDA 10. Indeed, updating to one of the nightly builds solved the issue, yet I stumbled upon a "classical nightly issue": some random Assertion Failure which prompted me to message PyTorch developers about it. This is getting really frustrating since I've been losing considerable time in reconfiguring my environment. I think I will have to downgrade some components now...

EDIT: Downgrading to PyTorch 1.0.0 solved the issue for me as well. Clearly, there's a problem with 1.0.1.

@jsmith8888

I am getting the same error.

My setup:

  • Nvidia GTX 1050Ti
  • Windows 10 Pro
  • Conda 4.6.7
  • Python 3.7.1
  • CUDA V10.0.130
  • PyTorch 1.0.1

My Jupyter Notebook Test:

torch.cuda.is_available()
True

torch.backends.cudnn.enabled
True

torch.cuda.current_device()

RuntimeError Traceback (most recent call last)
in
----> 1 torch.cuda.current_device()

C:\ProgramData\Anaconda3\lib\site-packages\torch\cuda\__init__.py in current_device()
    339 def current_device():
    340     r"""Returns the index of a currently selected device."""
--> 341     _lazy_init()
    342     return torch._C._cuda_getDevice()
    343

C:\ProgramData\Anaconda3\lib\site-packages\torch\cuda\__init__.py in _lazy_init()
    160             "Cannot re-initialize CUDA in forked subprocess. " + msg)
    161     _check_driver()
--> 162     torch._C._cuda_init()
    163     _cudart = _load_cudart()
    164     _cudart.cudaGetErrorName.restype = ctypes.c_char_p

RuntimeError: cuda runtime error (30) : unknown error at ..\aten\src\THC\THCGeneral.cpp:87

torch.cuda.device(0)
<torch.cuda.device at 0x21f81413fd0>

torch.cuda.device_count()
1

torch.cuda.get_device_name(0)

RuntimeError Traceback (most recent call last)
in
----> 1 torch.cuda.get_device_name(0)

C:\ProgramData\Anaconda3\lib\site-packages\torch\cuda\__init__.py in get_device_name(device)
    274         if :attr:`device` is None (default).
    275     """
--> 276     return get_device_properties(device).name
    277
    278

C:\ProgramData\Anaconda3\lib\site-packages\torch\cuda\__init__.py in get_device_properties(device)
    296 def get_device_properties(device):
    297     if not _initialized:
--> 298         init()  # will define _get_device_properties and _CudaDeviceProperties
    299     device = _get_device_index(device, optional=True)
    300     if device < 0 or device >= device_count():

C:\ProgramData\Anaconda3\lib\site-packages\torch\cuda\__init__.py in init()
    142     Does nothing if the CUDA state is already initialized.
    143     """
--> 144     _lazy_init()
    145
    146

C:\ProgramData\Anaconda3\lib\site-packages\torch\cuda\__init__.py in _lazy_init()
    160             "Cannot re-initialize CUDA in forked subprocess. " + msg)
    161     _check_driver()
--> 162     torch._C._cuda_init()
    163     _cudart = _load_cudart()
    164     _cudart.cudaGetErrorName.restype = ctypes.c_char_p

RuntimeError: cuda runtime error (30) : unknown error at ..\aten\src\THC\THCGeneral.cpp:87

The THCGeneral.cpp code can be found at:
https://github.com/pytorch/pytorch/blob/master/aten/src/THC/THCGeneral.cpp

The code block in THCGeneral where the error is thrown is:

for (int i = 0; i < numDevices; ++i) {
    THCCudaResourcesPerDevice* res = THCState_getDeviceResourcePtr(state, i);
    THCudaCheck(cudaSetDevice(i));

    /* The scratch space that we want to have available per each device is
       based on the number of SMs available per device. We guarantee a
       minimum of 128kb of space per device, but to future-proof against
       future architectures that may have huge #s of SMs, we guarantee that
       we have at least 16 bytes for each SM. */
    int numSM = at::cuda::getDeviceProperties(i)->multiProcessorCount;
    size_t sizePerStream =
        MIN_GLOBAL_SCRATCH_SPACE_PER_DEVICE >= numSM * MIN_GLOBAL_SCRATCH_SPACE_PER_SM_STREAM ?
        MIN_GLOBAL_SCRATCH_SPACE_PER_DEVICE :
        numSM * MIN_GLOBAL_SCRATCH_SPACE_PER_SM_STREAM;
    res->scratchSpacePerStream = sizePerStream;
}

Line 87 of this code is:
int numSM = at::cuda::getDeviceProperties(i)->multiProcessorCount;

Any ideas why I and so many others are experiencing this exact same error?

@peterjc123
Collaborator

Looks like the callback was accidentally triggered here. https://github.com/pytorch/pytorch/blame/master/torch/cuda/__init__.py#L188. Usually it won't happen. Anyway, I'll try to add a protection clause here for Windows.

@Yiyiyimu

@peterjc123 Regarding the situation in https://forums.fast.ai/t/cuda-runtime-error-30-resnet-not-loading/38556/2, I think there is some underlying conflict between Jupyter and PyTorch 1.0.1, since downgrading to PyTorch 1.0.0 could solve the problem.

I noticed several issues raised about the same problem, and this might be the best answer so far.

@AndreiCostinescu

AndreiCostinescu commented Apr 16, 2019

My system:
Windows 10
Cuda 10.1
Python 3.7.2
PyTorch 1.0.1
NVIDIA GeForce GTX 1050 Ti

The following always works:

import torch
torch.cuda.current_device()

The following always fails for me:

import torch
torch.cuda.is_available()
torch.cuda.current_device()  # fails here

My solution was to add a call to torch.cuda.current_device() to my scripts before any other CUDA calls.
Hope this gives a hint as to where to look for the issue :)
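
A minimal sketch of that workaround as it might sit at the top of a script (the early current_device() call is the only addition; the rest is just illustrative):

import torch

# Workaround: force CUDA's lazy initialization up front, before any other
# CUDA query, so that it does not fail later in the script.
torch.cuda.current_device()

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
x = torch.randn(2, 2, device=device)
print(x.device)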

@reinhub-1

I ran into the same problem (GTX 1050, Anaconda environment, Win10, latest PyTorch installed via both pip and conda).
I have uninstalled and reinstalled PyTorch in different environments several times, without success so far.

Before this issue came up, PyTorch worked as usual. I didn't change any settings or install any packages; it just appeared.

@deepseawhale

Quote from @andrei-rusu

I am getting the same error as this with PyTorch 1.0.1 and CUDA 10. Indeed, updating to one of the nightly builds solved the issue, yet I stumbled upon a "classical nightly issue": some random Assertion Failure which prompted me to message PyTorch developers about it. This is getting really frustrating since I've been losing considerable time in reconfiguring my environment. I think I will have to downgrade some components now...

EDIT: Downgrading to PyTorch 1.0.0 solved the issue for me as well. Clearly, there's a problem with 1.0.1.

Downgrading PyTorch to 1.0.0 solved mine; I also needed to make sure the script ran in an administrator command prompt. Thanks!

@reinhub-1

Downgrading to version 1.0.0 also worked for me. I also changed some Nvidia graphics card settings (maximum performance = yes), which may have contributed to getting it working.

@Jonas1312
Contributor

Same issue here:
Windows 10
NVIDIA GeForce GTX 940mx
Python 3.6.8
PyTorch 1.0.1
CUDA 10.1
cudnn 7.5

Downgrading to pytorch 1.0.0 solved the issue

@peterjc123
Collaborator

peterjc123 commented Apr 22, 2019

Well, would you guys please check whether this error persists in the nightlies? Downgrading is a workaround here, but it does little to help locate the actual cause of this issue. Let me summarize all the known factors that may cause this issue:

  1. From 1.0.0 to 1.0.1, we switched to using the CUDA libraries provided by the conda-forge channel in the conda package. Previously, we copied these libraries from the build machine into the binaries. We can ignore this factor if we use the pip package.
  2. The DLL loading process of Python in conda has changed. It started to use AddDllDirectory, which does not guarantee the loading order. We can ignore this factor if we downgrade Python to 3.6.7 or 3.7.1.
  3. The fix/issue mentioned by @ezyang: Unify cudaGetDeviceCount implementations (#18445). We can ignore this factor if we use the nightlies or build from source.

I'd be grateful if you could help me locate the issue (something like the environment dump sketched below would already help). It's currently hard to fix because I cannot reproduce it on my side.
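
A quick environment dump along these lines (a sketch; extend as needed) already covers the Python version and the PyTorch/CUDA/cuDNN versions involved in the factors above:

import sys
import torch

print("python          :", sys.version)
print("torch           :", torch.__version__)
print("built with CUDA :", torch.version.cuda)
print("cudnn           :", torch.backends.cudnn.version())
print("cuda available  :", torch.cuda.is_available())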

@Jonas1312
Contributor

Nightly builds for Windows are available here: https://download.pytorch.org/whl/nightly/cu100/torch_nightly.html but only for version 1.0.0

@peterjc123
Collaborator

@Jonas1312 You mean CUDA 10? If you are talking about the version of PyTorch, it is always built from the latest source every day.

@Jonas1312
Contributor

Jonas1312 commented Apr 22, 2019

https://download.pytorch.org/whl/nightly/cu100/torch_nightly.html shows the following packages: https://pastebin.com/yYxdEqU5

I tried with the last windows build:

c:\Users\Jonas\Desktop>python36 -m pip install torch_nightly-1.0.0.dev20190421-cp36-cp36m-win_amd64.whl
Processing c:\users\jonas\desktop\torch_nightly-1.0.0.dev20190421-cp36-cp36m-win_amd64.whl
Installing collected packages: torch-nightly
Successfully installed torch-nightly-1.0.0.dev20190421

c:\Users\Jonas\Desktop>python36
Python 3.6.8 (tags/v3.6.8:3c6b436a57, Dec 24 2018, 00:16:47) [MSC v.1916 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import torch
>>> print("torch.cuda.is_available()   =", torch.cuda.is_available())
torch.cuda.is_available()   = True
>>> print("torch.cuda.device_count()   =", torch.cuda.device_count())
torch.cuda.device_count()   = 1
>>> print("torch.cuda.device('cuda')   =", torch.cuda.device('cuda'))
torch.cuda.device('cuda')   = <torch.cuda.device object at 0x00000251657A7518>
>>> print("torch.cuda.current_device() =", torch.cuda.current_device())
torch.cuda.current_device() = 0
>>> torch.__version__
'1.0.0.dev20190421'
>>>

It's working, but I don't understand why it shows version 1.0.0 if it's built from the latest source.

@peterjc123
Collaborator

@JohnRambo Oh, I see. I will update the build scripts.

@peterjc123
Collaborator

@JohnRambo Should be fixed now. Looks like I forgot to sync with upstream after I sent these changes about the version change.

@Jonas1312
Contributor

Jonas1312 commented Apr 25, 2019

@peterjc123 I've just installed torch_nightly-1.1.0.dev20190424-cp36-cp36m-win_amd64.whl and it seems that it fixed the issue:

C:\Users\Jonas>nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2019 NVIDIA Corporation
Built on Fri_Feb__8_19:08:26_Pacific_Standard_Time_2019
Cuda compilation tools, release 10.1, V10.1.105

C:\Users\Jonas>python36
Python 3.6.8 (tags/v3.6.8:3c6b436a57, Dec 24 2018, 00:16:47) [MSC v.1916 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import torch
>>> print("torch.cuda.is_available()   =", torch.cuda.is_available())
torch.cuda.is_available()   = True
>>> print("torch.cuda.device_count()   =", torch.cuda.device_count())
torch.cuda.device_count()   = 1
>>> print("torch.cuda.device('cuda')   =", torch.cuda.device('cuda'))
torch.cuda.device('cuda')   = <torch.cuda.device object at 0x00000262DB837EB8>
>>> print("torch.cuda.current_device() =", torch.cuda.current_device())
torch.cuda.current_device() = 0
>>> torch.cuda.get_device_name(0)
'GeForce 940MX'
>>> torch.__version__
'1.1.0.dev20190424'
>>> a = torch.ones((1,1,1)).cuda()
>>> a
tensor([[[1.]]], device='cuda:0')
>>>

Works with cuda 10.0 also!

@trias702

trias702 commented May 1, 2019

I just got this error for the first time today, after running PyTorch 1.0.1 (CUDA 10.0) on Windows 10 for months and months with no problems.

In my case, the error only started happening when I updated my Nvidia graphics driver to 430.53 from 417.35. Luckily, simply reverting to driver version 417.35 made the error go away, and everything works fine again. I did not need to touch my CUDA or Python environment to fix it, just roll back the graphics driver. Very odd; it looks like Nvidia changed something in the driver code that is causing this.

My setup:

Windows 10 1607 64-bit
Python 3.6.8
PyTorch 1.0.1
CUDA 10.0

PyTorch installed via pip

@xiaodi68

xiaodi68 commented May 1, 2019

I got a similar issue, with the error "RuntimeError: cuda runtime error (30) : unknown error at ..\aten\src\THC\THCGeneral.cpp:51", on my new machine (Windows 10, Nvidia RTX 2070). (Also reported at https://discuss.pytorch.org/t/a-error-when-using-gpu/32761.)
I tried a lot of the suggested methods, such as downgrading CUDA and downgrading/upgrading Anaconda, but without success.
In my case, the error only went away after I installed the latest Nvidia gaming driver.
Hope it is helpful.

@peterjc123
Collaborator

We are building this time with NVIDIA driver 418.96, but according to your test results, I don't know whether I should downgrade or upgrade it. However, if the problem is caused by the driver, we can actually run some tests on this. Also, if you have time, you could try whether building from source solves it.

@peterjc123
Collaborator

Actually, from the CUDA documentation, I can only find that there is a lower limit on the driver version for each CUDA version: https://docs.nvidia.com/cuda/cuda-toolkit-release-notes/index.html#major-components. But it doesn't mention what happens if we compile binaries using newer versions of the GPU driver, or if the driver version mismatches the one on the user's PC.

@x1155665

x1155665 commented May 1, 2019

I got a similar issue, with the error "RuntimeError: cuda runtime error (30) : unknown error at ..\aten\src\THC\THCGeneral.cpp:51", on my new machine (Windows 10, Nvidia RTX 2070). (Also reported at https://discuss.pytorch.org/t/a-error-when-using-gpu/32761.)
I tried a lot of the suggested methods, such as downgrading CUDA and downgrading/upgrading Anaconda, but without success.
In my case, the error only went away after I installed the latest Nvidia gaming driver.
Hope it is helpful.

Updating the Nvidia driver (to 430.39) also worked for me.

@feferna

feferna commented May 2, 2019

My system:
Windows 10
Cuda 10.1
Python 3.7.2
PyTorch 1.0.1
NVIDIA GeForce GTX 1050 Ti

The following always works:

import torch
torch.cuda.current_device()

The following always fails for me:

import torch
torch.cuda.is_available()
torch.cuda.current_device()  # fails here

My solution was to add a call to torch.cuda.current_device() to my scripts before any other CUDA calls.
Hope this gives a hint as to where to look for the issue :)

Thanks! This is exactly the same thing that happens to me on Windows 10.

If I use torch.cuda.current_device() before anything cuda-related, it works like a charm.

@ezyang
Contributor

ezyang commented May 6, 2019

If I use torch.cuda.current_device() before anything cuda-related, it works like a charm.

For the record, this isn't supposed to be necessary, but it's possible this is broken.

@peterjc123
Collaborator

Guys, I seem to have found the root cause of this issue with the help of @Jonas1312 in #20635: it is caused by the fact that we changed the way we link our libraries against cudart. I have made the PR #21062. You can try whether it fixes your problem.
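
For anyone who wants to check which cudart their setup can see, a small sketch (my own check under standard-layout assumptions, not part of the PR) that lists the cudart DLLs bundled with torch and the ones in a system CUDA install:

import glob
import os
import torch

# cudart bundled next to the torch package, if any.
lib_dir = os.path.join(os.path.dirname(torch.__file__), "lib")
print("bundled in torch\\lib:", glob.glob(os.path.join(lib_dir, "cudart64_*.dll")))

# cudart from a system-wide CUDA Toolkit install, if CUDA_PATH is set.
cuda_path = os.environ.get("CUDA_PATH", "")
if cuda_path:
    print("system CUDA install :", glob.glob(os.path.join(cuda_path, "bin", "cudart64_*.dll")))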

@wwyi1828

My system:
Windows 10
Cuda 10.1
Python 3.7.2
PyTorch 1.0.1
NVIDIA GeForce GTX 1050 Ti

The following always works:

import torch
torch.cuda.current_device()

The following always fails for me:

import torch
torch.cuda.is_available()
torch.cuda.current_device()  # fails here

My solution was to add a call to torch.cuda.current_device() to my scripts before any other CUDA calls.
Hope this gives a hint as to where to look for the issue :)

Thank you! I have the same problem, and I needed to restart Python every time. Following what you said, I added torch.cuda.current_device() after import torch, and it works.

@ezyang
Contributor

ezyang commented Jun 13, 2019

The fix was merged.

@ezyang closed this as completed Jun 13, 2019
@n1tesla

n1tesla commented Aug 1, 2019

My system:
Windows 10
Cuda 10.1
Python 3.7.2
PyTorch 1.0.1
NVIDIA GeForce GTX 1050 Ti

The following always works:

import torch
torch.cuda.current_device()

The following always fails for me:

import torch
torch.cuda.is_available()
torch.cuda.current_device()  # fails here

My solution was to add a call to torch.cuda.current_device() to my scripts before any other CUDA calls.
Hope this gives a hint as to where to look for the issue :)

It worked for me as well:

import torch
torch.cuda.current_device()

@yuanzhoulvpi2017

I don't think cuda error 30 is an error on our side. Please try these things first.

  1. Re-install latest GPU driver
  2. Reboot
  3. Ensure you have admin access

Yes, after rebooting my computer, this error was gone.
