
❓ [Question] How to specify that certain aten operators must be run by LibTorch in C++? #2830

Open
demuxin opened this issue May 13, 2024 · 7 comments
Labels: question (Further information is requested)

Comments

demuxin commented May 13, 2024

❓ Question

When I compile the SwinTransformer model using Torch-TensorRT, an error appears:

terminate called after throwing an instance of 'c10::Error'
  what():  0 INTERNAL ASSERT FAILED at "../torch/csrc/jit/ir/alias_analysis.cpp":615, please report a bug to PyTorch. We don't have an op for aten::floor_divide but it isn't a special case.  Argument types: int, int, 

Candidates:
        aten::floor_divide(Tensor self, Tensor other) -> Tensor
        aten::floor_divide.Scalar(Tensor self, Scalar other) -> Tensor
        aten::floor_divide.out(Tensor self, Tensor other, *, Tensor(a!) out) -> Tensor(a!)
        aten::floor_divide.Scalar_out(Tensor self, Scalar other, *, Tensor(a!) out) -> Tensor(a!)

I checked out this link; the error occurs because Torch-TensorRT doesn't support the % op.

Fine, I can choose to run floor_divide with LibTorch instead:

torchtrt::ts::CompileSpec compile_settings({ input });
compile_settings.enabled_precisions.insert(build_type);
compile_settings.workspace_size = _1_GB;
compile_settings.truncate_long_and_double = true;
compile_settings.num_avg_timing_iters = 1;
compile_settings.torch_executed_ops.push_back("aten::floor_divide");  // here
torchtrt::ts::compile(model, compile_settings);

It's strange that this setting does not take effect; the error persists.

What can I do about this error?

Furthermore, how do I specify that certain aten operators must be run by LibTorch in C++?

Environment

Build information about Torch-TensorRT can be found by turning on debug messages

  • PyTorch Version (e.g., 1.0): 2.2.1
  • CPU Architecture: x86
  • OS (e.g., Linux): Ubuntu 22.04
  • How you installed PyTorch (conda, pip, libtorch, source):
  • Build command you used (if compiling from source):
  • Are you using local sources or building from archives:
  • Python version:
  • CUDA version: 12.2
  • GPU models and configuration:
  • Any other relevant information:
demuxin added the question label May 13, 2024
demuxin (Author) commented May 15, 2024

I came up with a solution: I use the code below to replace the % op:

# Remainder via truncating division; for the non-negative operands used here
# this matches Python's % (the two differ for negative values).
def TakeRemainder(x: int, y: int) -> int:
    return x - y * int(x / y)

And it works.

I still want to know why this setting doesn't take effect:

compile_settings.torch_executed_ops.push_back("aten::floor_divide"); 

gs-olive (Collaborator) commented:

Hi - thanks for the report. I think this may be related to the following lowering pass, where it's possible that both inputs are upcasted integers, so we accidentally construct a schema which is no longer valid:

      case c10::aten::floor_divide:
        new_node = g->create(c10::aten::floordiv, user->inputs(), 1);
        new_node->insertAfter(user);
        new_node->outputs()[0]->setType(c10::IntType::get());
        user->outputs()[0]->replaceAllUsesWith(new_node->outputs()[0]);
        user->destroy();
        break;

Regarding why compile_settings.torch_executed_ops.push_back("aten::floor_divide"); doesn't work - this is likely because the lowering pass puts the graph in an inconsistent or invalid state, so it doesn't have the opportunity to exclude conversion of floor_divide before failure, since the "lowering" phase happens prior to partitioning and conversion to TRT/Torch.
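To make the idea concrete, here is a minimal sketch of the kind of type-safety guard such a rewrite could carry. This is an illustration only, not the actual Torch-TensorRT patch; try_replace_with_floordiv is a hypothetical helper name, and whether this check would catch the exact failure above is an assumption:

#include <torch/csrc/jit/ir/ir.h>

// Hypothetical helper (illustration only): rewrite `user` (an aten::floor_divide
// node) into aten::floordiv, but only commit the rewrite when the new node
// matches a registered operator schema for its input types.
bool try_replace_with_floordiv(torch::jit::Graph* g, torch::jit::Node* user) {
  auto* new_node = g->create(c10::aten::floordiv, user->inputs(), 1);
  new_node->insertAfter(user);
  if (!new_node->maybeSchema()) {
    // No registered schema matches these input types; undo the rewrite and keep
    // the original node so later phases (e.g. partitioning) can decide what to do.
    new_node->destroy();
    return false;
  }
  new_node->outputs()[0]->setType(c10::IntType::get());
  user->outputs()[0]->replaceAllUsesWith(new_node->outputs()[0]);
  user->destroy();
  return true;
}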

demuxin (Author) commented May 16, 2024

> Hi - thanks for the report. I think this may be related to the following lowering pass, where it's possible that both inputs are upcasted integers, so we accidentally construct a schema which is no longer valid:

So this is a bug, right? Will you fix this bug in the future?

gs-olive (Collaborator) commented:

Yes, this appears to be a bug, and we can work on a fix for it. Do you have a reproducer script or model we could use to recreate the error?

demuxin (Author) commented May 17, 2024

Here is the code:

torch::Device* device_ = new torch::Device(torch::DeviceType::CUDA);
device_->set_index(0);

torch::jit::script::Module model = torch::jit::load(model_path);
model.to("cuda");
model.eval();
model.to(torch::kHalf);

std::vector<int64_t> input_dim{1, 3, 832, 1440};
auto input = torchtrt::Input(input_dim, torchtrt::DataType::kHalf);

size_t _1_GB = 1 << 30;
torchtrt::ts::CompileSpec compile_settings({ input });
compile_settings.enabled_precisions.insert(torchtrt::DataType::kHalf);
compile_settings.workspace_size = _1_GB;
compile_settings.truncate_long_and_double = true;
compile_settings.num_avg_timing_iters = 1;
torchtrt::ts::compile(model, compile_settings);

Additionally, I have provided the model via Google Drive.

gs-olive (Collaborator) commented May 24, 2024

Hello - thanks for the details. I am unable to access the model at that link; is it available somewhere else? Also, could you provide the full debug log as well, using the following logging level: torchtrt::logging::set_reportable_log_level(torchtrt::logging::Level::kGRAPH)?
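For reference, here is a minimal sketch of where that call could be placed, mirroring the reproducer above. The torch_tensorrt/logging.h include path and the trt_mod name are assumptions on my part, not taken from the reproducer:

#include "torch_tensorrt/logging.h"   // assumed header for the C++ logging API

// Raise verbosity before compiling so the lowered and partitioned graphs are logged.
torchtrt::logging::set_reportable_log_level(torchtrt::logging::Level::kGRAPH);
auto trt_mod = torchtrt::ts::compile(model, compile_settings);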

demuxin (Author) commented May 24, 2024

I changed the access settings for the model; the link is accessible now.
