RTX 3090: CUDA error: no kernel image is available for execution on the device #209

Open
chenwang1701 opened this issue Apr 21, 2021 · 2 comments

Comments

@chenwang1701

Our setup is an RTX 3090 with CUDA 11.0, PyTorch 1.7.1+cu110, and Python 3.7.3.

We ran the following commands:

$ git clone https://github.com/mapillary/inplace_abn.git
$ cd inplace_abn
$ python setup.py install

The build failed with: unsupported GPU architecture 'compute_86'.
We then lowered the target architecture with 'export TORCH_CUDA_ARCH_LIST=7.5'. With that setting, training fails with the following error:


Traceback (most recent call last):
  File "train_and_eval.py", line 25, in <module>
    model.optimize_parameters()
  File "/run/determined/workdir/wangchen/skd/skd/3090ABN/networks/kd_model.py", line 189, in optimize_parameters
    self.forward()
  File "/run/determined/workdir/wangchen/skd/skd/3090ABN/networks/kd_model.py", line 142, in forward
    self.preds_S = self.parallel_student.train()(self.images, parallel=args.parallel)
  File "/opt/conda/lib/python3.7/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/run/determined/workdir/wangchen/skd/skd/3090ABN/utils/parallel.py", line 106, in forward
    return super().forward(inputs, **kwargs)
  File "/opt/conda/lib/python3.7/site-packages/torch/nn/parallel/data_parallel.py", line 159, in forward
    return self.module(*inputs[0], **kwargs[0])
  File "/opt/conda/lib/python3.7/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/run/determined/workdir/wangchen/skd/skd/3090ABN/networks/deeplab_combine.py", line 185, in forward
    x = self.relu1(self.bn1(self.conv1(x)))
  File "/opt/conda/lib/python3.7/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/opt/conda/lib/python3.7/site-packages/inplace_abn-1.1.1.dev2+g938ffd2.d20210420-py3.7-linux-x86_64.egg/inplace_abn/abn.py", line 322, in forward
    self.group,
  File "/opt/conda/lib/python3.7/site-packages/inplace_abn-1.1.1.dev2+g938ffd2.d20210420-py3.7-linux-x86_64.egg/inplace_abn/functions.py", line 322, in inplace_abn_sync
    1,
  File "/opt/conda/lib/python3.7/site-packages/inplace_abn-1.1.1.dev2+g938ffd2.d20210420-py3.7-linux-x86_64.egg/inplace_abn/functions.py", line 94, in forward
    count = count.to(dtype=var.dtype)
RuntimeError: CUDA error: no kernel image is available for execution on the device

@gdippolito

gdippolito commented Apr 28, 2021

You can try setting TORCH_CUDA_ARCH_LIST=8.0; it should be compatible with 8.6 (both are Ampere architectures). 7.5 is a different CUDA architecture (Turing), so it is expected not to work.
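
To make the compatibility rule above concrete, here is a minimal sketch in plain Python. It is not part of inplace_abn or PyTorch; the helper names parse_arch_list and can_run are hypothetical. It assumes the usual CUDA rule that a binary compiled for compute capability X.Y runs on a device with the same major version and a minor version of at least Y, and that an entry with a "+PTX" suffix can additionally be JIT-compiled for any newer device. Only numeric entries such as "7.5" or "8.0+PTX" are handled; named architectures are not.

```python
import re


def parse_arch_list(value):
    """Parse a TORCH_CUDA_ARCH_LIST-style string into [((major, minor), has_ptx), ...].

    Hypothetical helper: accepts entries separated by ';' or whitespace,
    each of the form 'X.Y' or 'X.Y+PTX'.
    """
    entries = []
    for token in re.split(r"[;\s]+", value.strip()):
        if not token:
            continue
        has_ptx = token.endswith("+PTX")
        version = token[: -len("+PTX")] if has_ptx else token
        major, minor = version.split(".")
        entries.append(((int(major), int(minor)), has_ptx))
    return entries


def can_run(arch_list, device_capability):
    """Return True if code built for arch_list should load on the given device.

    device_capability is a (major, minor) tuple, e.g. (8, 6) for an RTX 3090.
    """
    device = tuple(device_capability)
    for (major, minor), has_ptx in parse_arch_list(arch_list):
        # Binary (SASS) compatibility: same major version, device minor >= built minor.
        binary_ok = major == device[0] and minor <= device[1]
        # PTX can be JIT-compiled for any device at least as new as the build target.
        ptx_ok = has_ptx and (major, minor) <= device
        if binary_ok or ptx_ok:
            return True
    return False


print(can_run("7.5", (8, 6)))  # Turing-only binary on an Ampere device
print(can_run("8.0", (8, 6)))  # Ampere binary on a newer Ampere device
```

On a machine with PyTorch installed, torch.cuda.get_device_capability(0) returns the (major, minor) tuple for the first GPU, which can be passed as device_capability.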

@chenwang1701
Author

Thanks very much. It does work when I set TORCH_CUDA_ARCH_LIST=8.0.
