
MNN inference is slower than PyTorch inference #2835

Open

jtyan123 opened this issue Apr 18, 2024 · 8 comments

Labels: question (Further information is requested)

Comments

jtyan123 commented Apr 18, 2024

Platform (if cross-compiling, please also list the cross-compilation target platform):

iOS

GitHub version:

release 2.8.1

Build method:

Ran the iOS demo from Xcode; the MNN.framework is from the 2.8.1 release.

Output of the test script:

build % python ../tools/script/testMNNFromOnnx.py /Users/yjt/Downloads/app/py模型/新模型/lightglue_new.onnx 
Dir exist
onnx/test.onnx
tensor(float)
tensor(float)
tensor(float)
tensor(float)
['matches0', 'mscores0']
inputs:
kpts0
onnx/
kpts1
onnx/
desc0
onnx/
desc1
onnx/
outputs:
onnx/matches0.txt (1, 2)
onnx/
onnx/mscores0.txt (1,)
onnx/
hw.cpufamily: 458787763 , size = 4
The device support i8sdot:1, support fp16:1, support i8mm: 0
Start to Convert Other Model Format To MNN Model..., target version: 2.8
[16:05:53] /Users/yjt/Downloads/MNN-2.8.1/tools/converter/source/onnx/onnxConverter.cpp:46: ONNX Model ir version: 8
[16:05:53] /Users/yjt/Downloads/MNN-2.8.1/tools/converter/source/onnx/onnxConverter.cpp:47: ONNX Model opset version: 17
Start to Optimize the MNN Net...
inputTensors : [ desc0, kpts0, desc1, kpts1, ]
outputTensors: [ matches0, mscores0, ]
Converted Success!
Check convert result by onnx, thredhold is 0.01
kpts0
kpts1
desc0
desc1
output: matches0
output: mscores0
matches0: (1, 2, )
mscores0: (1, )
TEST_SUCCESS

The relevant code is below; the inputs are filled from the outputs of another inference pass.

void testMNN() {
    // Inputs are the outputs of a previous inference pass.
    std::vector<MNN::Express::VARP> _mnnInputs;
    _mnnInputs.emplace_back(mnnOutput0[2]);
    _mnnInputs.emplace_back(mnn0);
    _mnnInputs.emplace_back(mnnOutput1[2]);
    _mnnInputs.emplace_back(mnn1);

    /*
    MNN::ScheduleConfig sConfig;
    sConfig.type = MNN_FORWARD_OPENCL;
    sConfig.numThread = 4;
    std::shared_ptr<MNN::Express::Executor::RuntimeManager> rtmgr(MNN::Express::Executor::RuntimeManager::createRuntimeManager(sConfig), MNN::Express::Executor::RuntimeManager::destroy);
    rtmgr->setCache(".cachefile");
    */
    MNN::Express::Module::Config mdconfig; // default module config
    mdconfig.shapeMutable = false;
    // Note: when the block above is enabled, rtmgr must also be passed to
    // Module::load as the fourth argument (in place of the nullptr below),
    // otherwise the module still runs on the default runtime.
    std::unique_ptr<MNN::Express::Module> mnnModule(MNN::Express::Module::load({"desc0", "kpts0", "desc1", "kpts1"}, {"matches0", "mscores0"}, model_file.c_str(), nullptr, &mdconfig));
    auto outputs = mnnModule->onForward(_mnnInputs);
}

If the commented-out block above is enabled, that is, a RuntimeManager is created with type set to MNN_FORWARD_OPENCL and used, inference becomes slightly faster than PyTorch, but that still seems below expectations: with a different model running image inference earlier, MNN took roughly 1/9 of PyTorch's time.
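For reference, a minimal sketch of that configuration with the RuntimeManager actually passed into Module::load; the member variables (mnnOutput0, mnn0, mnnOutput1, mnn1, model_file) are taken from the snippet above, and the function name is illustrative, not the exact code from this report:

void testMNNWithRuntime() {
    std::vector<MNN::Express::VARP> _mnnInputs;
    _mnnInputs.emplace_back(mnnOutput0[2]);
    _mnnInputs.emplace_back(mnn0);
    _mnnInputs.emplace_back(mnnOutput1[2]);
    _mnnInputs.emplace_back(mnn1);

    // Bind execution to an explicit backend via a RuntimeManager.
    MNN::ScheduleConfig sConfig;
    sConfig.type = MNN_FORWARD_OPENCL; // backend under test
    sConfig.numThread = 4;
    std::shared_ptr<MNN::Express::Executor::RuntimeManager> rtmgr(
        MNN::Express::Executor::RuntimeManager::createRuntimeManager(sConfig),
        MNN::Express::Executor::RuntimeManager::destroy);
    rtmgr->setCache(".cachefile"); // reuse tuned/compiled kernels across launches

    MNN::Express::Module::Config mdconfig;
    mdconfig.shapeMutable = false;
    // Passing rtmgr (not nullptr) is what actually routes execution to the
    // configured backend.
    std::unique_ptr<MNN::Express::Module> mnnModule(MNN::Express::Module::load(
        {"desc0", "kpts0", "desc1", "kpts1"}, {"matches0", "mscores0"},
        model_file.c_str(), rtmgr, &mdconfig));
    auto outputs = mnnModule->onForward(_mnnInputs);
}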

jtyan123 (Author) commented:

The model has been sent to the 120543985 email address.

jxt1234 (Collaborator) commented Apr 18, 2024

How did you enable OpenCL on iOS? Is this running in the simulator?

jxt1234 (Collaborator) commented Apr 18, 2024

On iOS, the GPU backend is normally MNN_FORWARD_METAL.
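For illustration, a minimal sketch of a Metal-backed RuntimeManager along the lines of this suggestion; the Precision_Low setting is an extra assumption (it permits fp16 on the GPU), not something prescribed in this thread:

MNN::ScheduleConfig sConfig;
sConfig.type = MNN_FORWARD_METAL;  // the GPU backend on iOS
MNN::BackendConfig bnConfig;
bnConfig.precision = MNN::BackendConfig::Precision_Low; // assumption: allow fp16
sConfig.backendConfig = &bnConfig;
std::shared_ptr<MNN::Express::Executor::RuntimeManager> rtmgr(
    MNN::Express::Executor::RuntimeManager::createRuntimeManager(sConfig),
    MNN::Express::Executor::RuntimeManager::destroy);
// rtmgr is then passed to Module::load, as in the earlier sketch.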

jtyan123 (Author) commented:

I'm not sure OpenCL was actually enabled; I only set the type field to MNN_FORWARD_OPENCL. After setting type to MNN_FORWARD_METAL, MNN inference took even longer, about 1 s more than PyTorch.

jtyan123 (Author) commented:

It was run on a real iPhone 15 Plus, not the simulator.

jxt1234 (Collaborator) commented Apr 19, 2024

How did you build MNN?

jtyan123 (Author) commented:

I see the same behavior with the iOS framework downloaded directly from your 2.8.1 release.

jxt1234 (Collaborator) commented Apr 23, 2024

How are you measuring the time? You normally need to start timing from the second forward pass and run many consecutive iterations. See the speed tests in project/ios/Playground and tools/cpp/ModuleBasic.cpp; a minimal sketch follows below.
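A timing sketch along those lines, assuming a loaded mnnModule and prepared _mnnInputs as in the code above; the helper name benchmarkModule and the run count are illustrative:

#include <MNN/expr/Expr.hpp>
#include <MNN/expr/Module.hpp>
#include <chrono>
#include <cstdio>
#include <vector>

static void benchmarkModule(MNN::Express::Module* mnnModule,
                            const std::vector<MNN::Express::VARP>& _mnnInputs) {
    // Warm-up: the first forward pass pays one-off costs (shape inference,
    // memory allocation, GPU kernel/shader compilation), so exclude it.
    {
        auto warm = mnnModule->onForward(_mnnInputs);
        warm[0]->readMap<float>(); // force execution; Express evaluates lazily
    }
    const int kRuns = 50; // arbitrary; enough to average out jitter
    auto begin = std::chrono::high_resolution_clock::now();
    for (int i = 0; i < kRuns; ++i) {
        auto outputs = mnnModule->onForward(_mnnInputs);
        outputs[0]->readMap<float>(); // make sure the backend finishes each run
    }
    auto end = std::chrono::high_resolution_clock::now();
    auto us  = std::chrono::duration_cast<std::chrono::microseconds>(end - begin).count();
    printf("average forward: %.3f ms over %d runs\n", us / 1000.0 / kRuns, kRuns);
}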

jxt1234 added the question (Further information is requested) label Apr 23, 2024