
Question about how to use mnncompress #2847

Open
stricklandye opened this issue Apr 26, 2024 · 3 comments
Labels
question Further information is requested

Comments

stricklandye commented Apr 26, 2024

Problem description

According to the documentation, when using offline quantization you should remove the parameter-update part of the training code. My understanding is that this amounts to just feeding calibration data, i.e. running inference a number of times. In my experiment, however, the quantization did not take effect.

Sample code:

  import torch
  from torch.utils.data import DataLoader
  from datasets import load_dataset
  # NOTE: import path assumed from the mnncompress docs
  from mnncompress.pytorch import LSQQuantizer as Quantizer

  device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

  # load_pretrained_model() and transform_image() are my own helpers
  model = load_pretrained_model()
  quantizer = Quantizer(model, retain_sparsity=False, mode="offline", debug_info=True, bits=8)
  quant_model = quantizer.convert()
  quant_model.to(device)

  valid_dataset = load_dataset("zh-plus/tiny-imagenet", split="valid", streaming=True)
  valid_dataset = valid_dataset.map(transform_image, batched=True)
  valid_dataloader = DataLoader(valid_dataset, batch_size=4)

  # Prepare the quantization graph: run calibration batches, no parameter updates
  quant_model.eval()
  with torch.no_grad():
      for i, ele in enumerate(valid_dataloader):
          if i == 100:
              break
          image, label = ele["image"], ele["label"]
          image, label = image.to(device), label.to(device)
          output = quant_model(image)

  # Remove the inserted fake-quantization (QAT) nodes
  quantizer.strip_qat_ops()
  torch.save(quant_model.state_dict(), "quanted_mobilenet.pt")
  x = torch.randn(size=[1, 3, 224, 224]).to(device)
  torch.onnx.export(quant_model, x, "quanted_mobilenet.onnx", input_names=["input"], output_names=["output"])
  quantizer.save_compress_params("quanted_mobilenet.onnx", "compress_params_index.bin", append=False)

Model sizes before and after quantization:

9.8M  Apr 22 16:29  mobilenetv3-small-55df8e1f.pth
11M   Apr 26 18:09  quanted_mobilenet.onnx
10M   Apr 26 18:18  quant_mobilenet.mnn
62K   Apr 26 17:57  compress_params_index.bin

Is this a problem with how I am using it? Using quantize.out from the toolbox does work, though.

jxt1234 commented Apr 29, 2024

What exactly do you mean by "did not take effect"? After generating compress_params_index.bin, you need to convert the model again with this extra flag: --compressionParamsFile compress_params_index.bin
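
For concreteness, the converter invocation might look like the sketch below (file names taken from the script above; the --bizCode value is a placeholder):

  # re-run MNNConvert with the compression params generated by mnncompress
  MNNConvert -f ONNX --modelFile quanted_mobilenet.onnx \
      --MNNModel quant_mobilenet.mnn --bizCode MNN \
      --compressionParamsFile compress_params_index.bin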

jxt1234 added the question label on Apr 29, 2024
stricklandye (Author) commented

Thanks for the reply. By "did not take effect" I mean the model size did not change: in the example I gave, quant_mobilenet.mnn is essentially the same size as mobilenetv3-small-55df8e1f.pth.

stricklandye (Author) commented

I experimented a bit myself and found that the model apparently needs to be in train() mode? That seems different from PyTorch's native API: according to the PyTorch docs, PTQ is done in eval() mode. Why does MNN require train() mode?
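
For reference, a minimal sketch of the change described above (hypothetical; only the mode switch differs from the original script):

  # Per the finding above: run the calibration passes with the model in
  # train() mode so the quantizer's inserted observers record activation
  # ranges; no gradients or parameter updates are needed.
  quant_model.train()
  with torch.no_grad():
      for i, ele in enumerate(valid_dataloader):
          if i == 100:
              break
          quant_model(ele["image"].to(device))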
