
[torch.export] RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu! #126674

Open
siahuat0727 opened this issue May 20, 2024 · 2 comments
Labels
module: export oncall: pt2 triaged This issue has been looked at by a team member, and triaged and prioritized into an appropriate module

Comments

@siahuat0727

siahuat0727 commented May 20, 2024

🐛 Describe the bug

The exported model fails to run inference on CUDA.

import torch
ep = torch.export.load('retina.pt2')
gm = ep.module()
gm(torch.rand(1, 3, 800, 1216))  # success

gm = ep.module().cuda()
gm(torch.rand(1, 3, 800, 1216).cuda())  # failed

retina.pt2

Maybe related to #121761, but the solution provided in this comment doesn't work.
@angelayi Could you have a look at this? Thank you.

(Sorry, I'm not providing the export code for now because it's a bit complicated.)

Versions

Collecting environment information...
PyTorch version: 2.3.0
Is debug build: False
CUDA used to build PyTorch: Could not collect
ROCM used to build PyTorch: N/A

OS: Ubuntu 18.04.6 LTS (x86_64)
GCC version: (Ubuntu 8.4.0-1ubuntu1~18.04) 8.4.0
Clang version: Could not collect
CMake version: version 3.10.2
Libc version: glibc-2.27

Python version: 3.8.19 (default, Mar 20 2024, 19:58:24) 

cc @ezyang @msaroufim @bdhirsh @anijain2305 @chauhang

@angelayi
Contributor

Can you please share the error message and describe what went wrong when you tried applying the suggestion in the comment?

@xmfan xmfan added needs reproduction Someone else needs to try reproducing the issue given the instructions. No action needed from user and removed needs reproduction Someone else needs to try reproducing the issue given the instructions. No action needed from user labels May 20, 2024
@siahuat0727
Author

Sure. Thanks for looking at this.

Traceback (most recent call last):
  File "/home/chensf/git/mmdeploy_export_onnx/reproduce_cuda_error_bug.py", line 7, in <module>
    gm(torch.rand(1, 3, 800, 1216).cuda())  # failed
  File "/root/miniconda3/envs/open-mmlab/lib/python3.10/site-packages/torch/fx/graph_module.py", line 737, in call_wrapped
    return self._wrapped_call(self, *args, **kwargs)
  File "/root/miniconda3/envs/open-mmlab/lib/python3.10/site-packages/torch/fx/graph_module.py", line 317, in __call__
    raise e
  File "/root/miniconda3/envs/open-mmlab/lib/python3.10/site-packages/torch/fx/graph_module.py", line 304, in __call__
    return super(self.cls, obj).__call__(*args, **kwargs)  # type: ignore[misc]
  File "/root/miniconda3/envs/open-mmlab/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1532, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/root/miniconda3/envs/open-mmlab/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1582, in _call_impl
    result = forward_call(*args, **kwargs)
  File "<eval_with_key>.7", line 817, in forward
  File "/root/miniconda3/envs/open-mmlab/lib/python3.10/site-packages/torch/_ops.py", line 595, in __call__
    return self_._op(*args, **kwargs)
RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu!

The error message when I tried to apply the suggestion is the same.

Traceback (most recent call last):
  File "/home/chensf/git/mmdeploy_export_onnx/reproduce_cuda_error_bug.py", line 20, in <module>
    gm(torch.rand(1, 3, 800, 1216).cuda())  # failed
  File "/root/miniconda3/envs/open-mmlab/lib/python3.10/site-packages/torch/fx/graph_module.py", line 737, in call_wrapped
    return self._wrapped_call(self, *args, **kwargs)
  File "/root/miniconda3/envs/open-mmlab/lib/python3.10/site-packages/torch/fx/graph_module.py", line 317, in __call__
    raise e
  File "/root/miniconda3/envs/open-mmlab/lib/python3.10/site-packages/torch/fx/graph_module.py", line 304, in __call__
    return super(self.cls, obj).__call__(*args, **kwargs)  # type: ignore[misc]
  File "/root/miniconda3/envs/open-mmlab/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1532, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/root/miniconda3/envs/open-mmlab/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1582, in _call_impl
    result = forward_call(*args, **kwargs)
  File "<eval_with_key>.7", line 817, in forward
  File "/root/miniconda3/envs/open-mmlab/lib/python3.10/site-packages/torch/_ops.py", line 595, in __call__
    return self_._op(*args, **kwargs)
RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu!

And this is the code I used to try the suggestion:

import torch
ep = torch.export.load('retina.pt2')
gm = ep.module()
gm(torch.rand(1, 3, 800, 1216))  # success

for node in ep.graph.nodes:
    if "device" in node.kwargs:
        kwargs = node.kwargs.copy()
        kwargs["device"] = "cuda"
        node.kwargs = kwargs

# Move state dict tensors to cuda
for k, v in ep.state_dict.items():
    if isinstance(v, torch.nn.Parameter):
        ep._state_dict[k] = torch.nn.Parameter(v.cuda())
    else:
        ep._state_dict[k] = v.cuda()

gm = ep.module()
gm(torch.rand(1, 3, 800, 1216).cuda())  # failed

Also note that a link to this model is provided in the issue referenced above.

@mlazos mlazos added the triaged label (This issue has been looked at by a team member, and triaged and prioritized into an appropriate module) May 22, 2024