🐛 [Bug] RuntimeError: [Error thrown at core/runtime/execute_engine.cpp:132] Expected inputs[i].is_cuda() to be true but got false Expected input tensors to have device cuda, found device cpu #2744
Comments
Hi - thanks for the report - I am able to reproduce the issue. For a quick workaround, try one of the following replacements:

```python
pseudo_images = torch.zeros(N, self.side_cells, self.side_cells, C).to(pn_feats)
batch_idxs = torch.arange(N).repeat_interleave(P).to(pillar_pixels)
##### Replace the above with:
pseudo_images = torch.zeros(N, self.side_cells, self.side_cells, C).cuda()
batch_idxs = torch.arange(N).repeat_interleave(P).cuda()
##### or
pseudo_images = torch.zeros(N, self.side_cells, self.side_cells, C, device=pn_feats.device)
batch_idxs = torch.arange(N, device=pillar_pixels.device).repeat_interleave(P)
##### or
pseudo_images = torch.zeros(N, self.side_cells, self.side_cells, C).to(pn_feats.device)
batch_idxs = torch.arange(N).repeat_interleave(P).to(pillar_pixels.device)
```

With respect to the Dynamo path, I have added #2747 to add support for this.
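As an aside (not from the thread): the reason the original `.to(pn_feats)` form differs from the suggested `.to(pn_feats.device)` is that passing a *tensor* to `Tensor.to()` matches both its device and its dtype, while passing a *device* only moves the tensor. A minimal CPU-only sketch, using stand-in tensors:

```python
import torch

# `ref` stands in for pn_feats (here float16 on CPU for illustration)
ref = torch.zeros(3, dtype=torch.float16)
x = torch.arange(3)  # int64 on CPU, stands in for batch_idxs

# .to(other_tensor) matches BOTH the device and the dtype of `other`
a = x.to(ref)
# .to(device) only moves the tensor; its dtype is unchanged
b = x.to(ref.device)

print(a.dtype)  # torch.float16
print(b.dtype)  # torch.int64
```

So the tensor-argument form silently changes `batch_idxs` to a floating dtype, which matters when it is later used as an index.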
Thanks for the workarounds, @gs-olive! I went with:

```python
pseudo_images = torch.zeros(N, self.side_cells, self.side_cells, C).to(pn_feats.device)
batch_idxs = torch.arange(N).repeat_interleave(P).to(pillar_pixels.device)
```

for consistency's sake, and that worked for me. For the Dynamo path, I tried enabling fallback with:

```python
trt_model = torch_tensorrt.compile(
    model,
    inputs=inputs,
    enabled_precisions=enabled_precisions,
    truncate_long_and_double=True,
    torch_executed_ops=["aten::masked_select"],
)
```

but was getting the same error. However, I just noticed there's actually an earlier error raised before the second error:
That newly discovered error in the Dynamo path led me to this issue and this issue, which was fixed here according to the comments.
Interestingly, this code, which just uses normal boolean indexing, seems to work with the Dynamo path:

```python
import torch
import torch_tensorrt
from torch import nn

DEVICE = "cuda:0"


class Indexer(nn.Module):
    def __init__(self, side_cells):
        super().__init__()
        self.side_cells = side_cells

    def forward(self, pn_feats, pillar_pixels):
        (N, P, C) = pn_feats.shape
        pseudo_images = torch.zeros(N, self.side_cells, self.side_cells, C).to(
            pn_feats.device
        )
        batch_idxs = torch.arange(N).repeat_interleave(P).to(pillar_pixels.device)
        rows = pillar_pixels[..., 0].flatten()
        cols = pillar_pixels[..., 1].flatten()
        mask = rows != -1
        batch_idxs = batch_idxs[mask]
        rows = rows[mask]
        cols = cols[mask]
        pn_feats = pn_feats.reshape(-1, C)[mask]
        pseudo_images[batch_idxs, rows, cols] = pn_feats
        return pseudo_images


def main():
    side_cells = 200
    pn_feats = torch.rand((1, 12000, 64)).to(DEVICE)
    pillar_pixels = torch.randint(0, side_cells, (1, 12000, 2)).to(DEVICE)
    pillar_pixels[0, 800:] = -1
    model = Indexer(side_cells).to(DEVICE)
    model.eval()
    with torch.no_grad():
        pt_preds = model(pn_feats, pillar_pixels)
        inputs = [
            torch_tensorrt.Input(pn_feats.shape),
            torch_tensorrt.Input(pillar_pixels.shape, dtype=torch.int32),
        ]
        enabled_precisions = {torch.half, torch.float32}
        trt_model = torch_tensorrt.compile(
            model,
            inputs=inputs,
            enabled_precisions=enabled_precisions,
            truncate_long_and_double=True,
            min_block_size=1,
        )
        trt_preds = trt_model(pn_feats, pillar_pixels.int())
        print((pt_preds == trt_preds[0]).sum())


if __name__ == "__main__":
    main()
```
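For anyone wanting to verify the boolean-indexing scatter in `forward` without a GPU or TensorRT, here is a CPU-only sanity check (shapes and pillar coordinates below are made up for illustration, and chosen so no two valid pillars share a cell) comparing it against an explicit loop:

```python
import torch

# Hypothetical small shapes standing in for (N, P, C) and side_cells above
N, P, C, side_cells = 2, 4, 3, 5
pn_feats = torch.rand(N, P, C)
# Unique (row, col) per batch; -1 marks padded pillars, as in the snippet above
pillar_pixels = torch.tensor(
    [[[0, 1], [2, 3], [4, 0], [-1, -1]],
     [[1, 1], [3, 2], [-1, -1], [-1, -1]]]
)

# Vectorized masked scatter, mirroring Indexer.forward
pseudo_images = torch.zeros(N, side_cells, side_cells, C)
batch_idxs = torch.arange(N).repeat_interleave(P)
rows = pillar_pixels[..., 0].flatten()
cols = pillar_pixels[..., 1].flatten()
mask = rows != -1
pseudo_images[batch_idxs[mask], rows[mask], cols[mask]] = pn_feats.reshape(-1, C)[mask]

# Reference: explicit loop over valid (non -1) pillars
expected = torch.zeros_like(pseudo_images)
for n in range(N):
    for p in range(P):
        r, c = pillar_pixels[n, p].tolist()
        if r != -1:
            expected[n, r, c] = pn_feats[n, p]

print(torch.equal(pseudo_images, expected))  # True
```

Note that advanced-index assignment with duplicate indices is undefined in PyTorch, so the check uses distinct coordinates; the random-coordinate repro above can in principle hit such collisions.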
Bug Description
The code below produces the following error:
This same code works fine with Torch-TensorRT 1.4.0. When using the Dynamo backend, I get the following error:
To Reproduce
Expected behavior
Environment
How you installed PyTorch (conda, pip, libtorch, source): nvcr.io/nvidia/pytorch:24.01-py3
Additional context