Export Quantized Model to ONNX: NotImplementedError: Could not run 'quantized::conv2d.new' with arguments from the 'CPU' backend.
#1390
Comments
Some more details: if I wrap `x` with the quant/dequant stubs:

```python
# Define your main model here
class VeryComplexModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.quant = QuantStub()
        self.dequant = DeQuantStub()
        self.backbone = timm.create_model("vit_small_patch14_dinov2.lvd142m",
                                          pretrained=True)
        self.mlp = nn.Sequential(nn.Linear(self.backbone.num_features, 128),
                                 nn.ReLU(), nn.Linear(128, 10))

    def forward(self, x):
        x = self.quant(x)
        x = self.mlp(self.backbone(x))
        x = self.dequant(x)
        return x
```

I still have an error.
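For context, this `NotImplementedError` typically indicates that a converted quantized module received a plain fp32 tensor, i.e. the `QuantStub`/`DeQuantStub` boundaries do not actually enclose the quantized submodules. A minimal eager-mode QAT flow that converts cleanly in plain PyTorch might look like the sketch below (a tiny stand-in model, not the DINOv2 backbone):

```python
import torch
import torch.nn as nn


class TinyModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.quant = torch.quantization.QuantStub()      # fp32 -> quantized
        self.conv = nn.Conv2d(1, 2, 3)
        self.relu = nn.ReLU()
        self.fc = nn.Linear(1352, 10)
        self.dequant = torch.quantization.DeQuantStub()  # quantized -> fp32

    def forward(self, x):
        x = self.quant(x)
        x = self.relu(self.conv(x))
        x = x.flatten(1)
        x = self.fc(x)
        x = self.dequant(x)
        return x


model = TinyModel()
model.train()
model.qconfig = torch.quantization.get_default_qat_qconfig("fbgemm")
model_prepared = torch.quantization.prepare_qat(model)

# Stand-in for the training loop: one forward pass to populate observers
model_prepared(torch.randn(8, 1, 28, 28))

model_prepared.eval()
model_int8 = torch.quantization.convert(model_prepared)
out = model_int8(torch.randn(1, 1, 28, 28))
```

Every op between `self.quant` and `self.dequant` must have a quantized kernel; anything outside those boundaries stays fp32.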
I made some progress using a very simple model (not involving the ViT backbone):

```python
# Define your main model here
class VeryComplexModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.quant = torch.quantization.QuantStub()
        self.dequant = torch.quantization.DeQuantStub()
        self.backbone = nn.Sequential(
            nn.Conv2d(1, 2, 3),
            nn.ReLU(),
        )
        self.mlp = nn.Linear(1352, 10)

    def forward(self, x):
        x = self.quant(x)
        x = self.backbone(x)
        x = x.flatten(1)
        x = self.mlp(x)
        x = self.dequant(x)
        return x


# Then, define your LightningModule as usual
class Classifier(L.LightningModule):
    def __init__(self):
        super().__init__()
        # This is mandatory for the callbacks
        self.model = VeryComplexModel()

    def forward(self, x):
        return self.model(x)

    def training_step(self, batch, batch_idx):
        x, y = batch
        y_hat = self.forward(x)
        loss = F.cross_entropy(y_hat, y)
        return loss

    def configure_optimizers(self):
        optimizer = optim.Adam(self.parameters(), lr=1e-3)
        return [optimizer]
```

The issue seems to occur in the forward method of the `VisionTransformer` class. Do you have any clue? Maybe a config to exclude the
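One thing that might be worth trying is asking Neural Compressor to leave the whole ViT backbone in fp32 via `op_name_dict`. This is an untested sketch: the `"backbone.*"` pattern is my assumption about how the submodule names appear to Neural Compressor, not a verified config.

```python
from neural_compressor import QuantizationAwareTrainingConfig
from neural_compressor.utils.constant import FP32

# Hypothetical config: keep every op under the "backbone" submodule in
# fp32 and quantize only the rest of the model. The "backbone.*"
# pattern assumes op names are prefixed with the submodule path.
conf = QuantizationAwareTrainingConfig(
    op_name_dict={"backbone.*": FP32},
)
```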
Hi, any news on the issue? The only workaround I found has been to train normally first, export to ONNX, then apply PTQ directly on the ONNX model to avoid QAT entirely, but that's not a real fix :/
Hi @clementpoiret, sorry for the late response. This is not an export issue: the QAT PyTorch model you generated is invalid. Please refer to this document for the usage of QAT.
Dear @yuwenzho, thanks for your answer. You're right, I certainly have a bug in my callbacks. But even following the doc, I can't export dinov2 as ONNX. Here is a complete code snippet:

```python
import os

import timm
import torch
import torch.nn as nn
import torch.nn.functional as F
from neural_compressor import QuantizationAwareTrainingConfig
from neural_compressor.config import Torch2ONNXConfig
from neural_compressor.training import prepare_compression
from torch import optim, utils
from torch.utils.data import DataLoader
from torchvision import transforms
from torchvision.datasets import MNIST
from torchvision.transforms import ToTensor
from tqdm import tqdm


# Define your main model here
class VeryComplexModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.backbone = timm.create_model("vit_small_patch14_dinov2.lvd142m",
                                          pretrained=True)
        self.clf_layers = nn.Sequential(
            nn.Linear(self.backbone.num_features, 128), nn.ReLU(),
            nn.Linear(128, 10))

    def forward(self, x):
        # x = x.repeat(1, 3, 1, 1)
        # x = F.interpolate(x, size=(518, 518))
        x = self.backbone(x)
        x = self.clf_layers(x)
        return x


criterion = nn.CrossEntropyLoss()
model = VeryComplexModel()

dataset = MNIST(os.getcwd(), download=True, transform=ToTensor())
train_loader = utils.data.DataLoader(dataset)


def train(model, steps=10):
    optimizer = optim.Adam(model.parameters(), lr=0.001)
    model.train()
    for epoch in range(2):
        for i, (data, target) in enumerate(tqdm(train_loader)):
            if i > steps:
                break
            # repeat and interpolate to match the input shape
            data = data.repeat(1, 3, 1, 1)
            data = F.interpolate(data, size=(518, 518))
            optimizer.zero_grad()
            output = model(data)
            loss = criterion(output, target)
            loss.backward()
            optimizer.step()


conf = QuantizationAwareTrainingConfig()
compression_manager = prepare_compression(model, conf)
compression_manager.callbacks.on_train_begin()
model = compression_manager.model
train(model)
compression_manager.callbacks.on_train_end()
compression_manager.save("./output")

# Export as ONNX
model.export(
    "int8_model.onnx",
    Torch2ONNXConfig(
        dtype="int8",
        opset_version=17,
        quant_format="QDQ",
        example_inputs=torch.randn(1, 3, 518, 518),
        input_names=["input"],
        output_names=["output"],
        dynamic_axes={
            "input": {0: "batch_size"},
            "output": {0: "batch_size"},
        },
    ))
```
This error means that quantized transpose is not supported. @PenghuiCheng I tried setting the transpose to FP32 but it doesn't work; could you please help check?

```python
from neural_compressor.utils.constant import FP32

conf = QuantizationAwareTrainingConfig(
    op_type_dict={"transpose": FP32},
)
```
Hi @clementpoiret, as @yuwenzho mentioned above, quantized transpose is not supported. For your circumstance, you can try either creating a symbolic function to convert the operator and registering it as a custom symbolic function, or contributing to PyTorch to add the same symbolic function.
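For reference, the registration pattern looks roughly like this. The op name `mydomain::my_op` and the Identity mapping are placeholders, not the actual symbolic needed for the unsupported quantized op:

```python
import torch
from torch.onnx import register_custom_op_symbolic


# Placeholder symbolic: maps a hypothetical custom op to ONNX Identity.
# A real fix would build the proper ONNX node(s) for the unsupported
# operator via g.op(...) here.
def my_symbolic(g, input):
    return g.op("Identity", input)


# Register it for the opset version used at export time
register_custom_op_symbolic("mydomain::my_op", my_symbolic, 17)
```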
Dear all,

In order to easily use Intel Neural Compressor in our team, and because we use PyTorch Lightning, I am building Lightning callbacks to call your hooks where needed in Lightning's training loop. Here is the current state of the project: https://github.com/clementpoiret/lightning-nc

Unfortunately, I face issues when trying to export the quantized models to ONNX. Exporting an fp32 model does not cause issues.

Here is a toy example you can play with (requires torch 2.1 and lightning 2.1): when calling the `export(...)` fn, I end up with the following error. Do you have any clue?

I tried training completely on CPU by setting `accelerator="cpu"` on the `Trainer`, same issue.

Thanks a lot,
Clément.