
MNN inference is slower than PyTorch inference #2835

Open

jtyan123 opened this issue Apr 18, 2024 · 8 comments

Labels: question (Further information is requested)

Comments

jtyan123 commented Apr 18, 2024

Platform (if cross-compiling, please also list the cross-compilation target platform):

iOS

GitHub version:

release 2.8.1

Build method:

Ran the iOS demo from Xcode; the MNN.framework is from the 2.8.1 release.

Output of the test script:

build % python ../tools/script/testMNNFromOnnx.py /Users/yjt/Downloads/app/py模型/新模型/lightglue_new.onnx 
Dir exist
onnx/test.onnx
tensor(float)
tensor(float)
tensor(float)
tensor(float)
['matches0', 'mscores0']
inputs:
kpts0
onnx/
kpts1
onnx/
desc0
onnx/
desc1
onnx/
outputs:
onnx/matches0.txt (1, 2)
onnx/
onnx/mscores0.txt (1,)
onnx/
hw.cpufamily: 458787763 , size = 4
The device support i8sdot:1, support fp16:1, support i8mm: 0
Start to Convert Other Model Format To MNN Model..., target version: 2.8
[16:05:53] /Users/yjt/Downloads/MNN-2.8.1/tools/converter/source/onnx/onnxConverter.cpp:46: ONNX Model ir version: 8
[16:05:53] /Users/yjt/Downloads/MNN-2.8.1/tools/converter/source/onnx/onnxConverter.cpp:47: ONNX Model opset version: 17
Start to Optimize the MNN Net...
inputTensors : [ desc0, kpts0, desc1, kpts1, ]
outputTensors: [ matches0, mscores0, ]
Converted Success!
Check convert result by onnx, thredhold is 0.01
kpts0
kpts1
desc0
desc1
output: matches0
output: mscores0
matches0: (1, 2, )
mscores0: (1, )
TEST_SUCCESS

The relevant code is below; the inputs are filled from the outputs of another inference pass.

void testMNN() {
    // Inputs are the outputs of a previous inference pass.
    std::vector<MNN::Express::VARP> _mnnInputs;
    _mnnInputs.emplace_back(mnnOutput0[2]);
    _mnnInputs.emplace_back(mnn0);
    _mnnInputs.emplace_back(mnnOutput1[2]);
    _mnnInputs.emplace_back(mnn1);

    /*
    MNN::ScheduleConfig sConfig;
    sConfig.type = MNN_FORWARD_OPENCL;
    sConfig.numThread = 4;
    std::shared_ptr<MNN::Express::Executor::RuntimeManager> rtmgr(MNN::Express::Executor::RuntimeManager::createRuntimeManager(sConfig), MNN::Express::Executor::RuntimeManager::destroy);
    rtmgr->setCache(".cachefile");
    */
    MNN::Express::Module::Config mdconfig; // default module config
    mdconfig.shapeMutable = false;
    // Note: when the block above is enabled, rtmgr must also be passed to
    // Module::load as the fourth argument (in place of the nullptr below),
    // otherwise the module still runs on the default runtime.
    std::unique_ptr<MNN::Express::Module> mnnModule(MNN::Express::Module::load({"desc0", "kpts0", "desc1", "kpts1"}, {"matches0", "mscores0"}, model_file.c_str(), nullptr, &mdconfig));
    auto outputs = mnnModule->onForward(_mnnInputs);
}

If the commented-out block above is enabled, that is, a RuntimeManager is created with type set to MNN_FORWARD_OPENCL and used, inference becomes slightly faster than PyTorch, but that still seems below expectations: with a different model running image inference earlier, MNN took roughly 1/9 of PyTorch's time.
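For reference, a minimal sketch of that configuration with the RuntimeManager actually passed into Module::load; the member variables (mnnOutput0, mnn0, mnnOutput1, mnn1, model_file) are taken from the snippet above, and the function name is illustrative, not the exact code from this report:

void testMNNWithRuntime() {
    std::vector<MNN::Express::VARP> _mnnInputs;
    _mnnInputs.emplace_back(mnnOutput0[2]);
    _mnnInputs.emplace_back(mnn0);
    _mnnInputs.emplace_back(mnnOutput1[2]);
    _mnnInputs.emplace_back(mnn1);

    // Bind execution to an explicit backend via a RuntimeManager.
    MNN::ScheduleConfig sConfig;
    sConfig.type = MNN_FORWARD_OPENCL; // backend under test
    sConfig.numThread = 4;
    std::shared_ptr<MNN::Express::Executor::RuntimeManager> rtmgr(
        MNN::Express::Executor::RuntimeManager::createRuntimeManager(sConfig),
        MNN::Express::Executor::RuntimeManager::destroy);
    rtmgr->setCache(".cachefile"); // reuse tuned/compiled kernels across launches

    MNN::Express::Module::Config mdconfig;
    mdconfig.shapeMutable = false;
    // Passing rtmgr (not nullptr) is what actually routes execution to the
    // configured backend.
    std::unique_ptr<MNN::Express::Module> mnnModule(MNN::Express::Module::load(
        {"desc0", "kpts0", "desc1", "kpts1"}, {"matches0", "mscores0"},
        model_file.c_str(), rtmgr, &mdconfig));
    auto outputs = mnnModule->onForward(_mnnInputs);
}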

jtyan123 (Author) commented:

The model has been sent to the 120543985 email address.

jxt1234 (Collaborator) commented Apr 18, 2024

How did you enable OpenCL on iOS? Is this running in the simulator?

jxt1234 (Collaborator) commented Apr 18, 2024

On iOS, the GPU backend is normally MNN_FORWARD_METAL.
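For illustration, a minimal sketch of a Metal-backed RuntimeManager along the lines of this suggestion; the Precision_Low setting is an extra assumption (it permits fp16 on the GPU), not something prescribed in this thread:

MNN::ScheduleConfig sConfig;
sConfig.type = MNN_FORWARD_METAL;  // the GPU backend on iOS
MNN::BackendConfig bnConfig;
bnConfig.precision = MNN::BackendConfig::Precision_Low; // assumption: allow fp16
sConfig.backendConfig = &bnConfig;
std::shared_ptr<MNN::Express::Executor::RuntimeManager> rtmgr(
    MNN::Express::Executor::RuntimeManager::createRuntimeManager(sConfig),
    MNN::Express::Executor::RuntimeManager::destroy);
// rtmgr is then passed to Module::load, as in the earlier sketch.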

jtyan123 (Author) commented:

I'm not sure OpenCL was actually enabled; I only set the type field to MNN_FORWARD_OPENCL. After setting type to MNN_FORWARD_METAL, MNN inference took even longer, about 1 s more than PyTorch.

jtyan123 (Author) commented:

It was run on a real iPhone 15 Plus, not the simulator.

jxt1234 (Collaborator) commented Apr 19, 2024

How did you build MNN?

jtyan123 (Author) commented:

I see the same behavior with the iOS framework downloaded directly from your 2.8.1 release.

jxt1234 (Collaborator) commented Apr 23, 2024

How are you measuring the time? You normally need to start timing from the second forward pass and run many consecutive iterations. See the speed tests in project/ios/Playground and tools/cpp/ModuleBasic.cpp; a minimal sketch follows below.
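A timing sketch along those lines, assuming a loaded mnnModule and prepared _mnnInputs as in the code above; the helper name benchmarkModule and the run count are illustrative:

#include <MNN/expr/Expr.hpp>
#include <MNN/expr/Module.hpp>
#include <chrono>
#include <cstdio>
#include <vector>

static void benchmarkModule(MNN::Express::Module* mnnModule,
                            const std::vector<MNN::Express::VARP>& _mnnInputs) {
    // Warm-up: the first forward pass pays one-off costs (shape inference,
    // memory allocation, GPU kernel/shader compilation), so exclude it.
    {
        auto warm = mnnModule->onForward(_mnnInputs);
        warm[0]->readMap<float>(); // force execution; Express evaluates lazily
    }
    const int kRuns = 50; // arbitrary; enough to average out jitter
    auto begin = std::chrono::high_resolution_clock::now();
    for (int i = 0; i < kRuns; ++i) {
        auto outputs = mnnModule->onForward(_mnnInputs);
        outputs[0]->readMap<float>(); // make sure the backend finishes each run
    }
    auto end = std::chrono::high_resolution_clock::now();
    auto us  = std::chrono::duration_cast<std::chrono::microseconds>(end - begin).count();
    printf("average forward: %.3f ms over %d runs\n", us / 1000.0 / kRuns, kRuns);
}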

jxt1234 added the question (Further information is requested) label Apr 23, 2024