
accuracy is much lower #2

Open
luhc15 opened this issue Dec 29, 2017 · 8 comments

@luhc15

luhc15 commented Dec 29, 2017

When I convert the voc101 model to the PyTorch version and test it on the VOC2012 val.txt, the mean IoU is 79.6%, much lower than the 85.41% reported by the authors. Are there any other details I have missed?
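For reference, here is a minimal sketch of how I compute mean IoU, following the usual VOC protocol (assuming 21 classes and treating label 255 as ignored void; `hist` is a 21x21 integer matrix accumulated over all images):

```python
import numpy as np

def update_confusion(hist, label, pred, n_classes=21):
    """Accumulate a confusion matrix; pixels with label 255 (void) are skipped."""
    mask = (label >= 0) & (label < n_classes)
    hist += np.bincount(
        n_classes * label[mask].astype(int) + pred[mask].astype(int),
        minlength=n_classes ** 2,
    ).reshape(n_classes, n_classes)
    return hist

def mean_iou(hist):
    """Per-class IoU = TP / (TP + FP + FN), averaged over the classes."""
    iou = np.diag(hist) / (hist.sum(axis=1) + hist.sum(axis=0) - np.diag(hist))
    return np.nanmean(iou)

# hist = np.zeros((21, 21), dtype=np.int64), then per image:
# hist = update_confusion(hist, gt_array, pred_array)
```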

@zhijiew

zhijiew commented Dec 30, 2017

Hi @luhc15, I'm also trying to use this code on the VOC dataset. Could you please share your test code? Thanks a lot!

@kazuto1011
Owner

Hi guys. I've never evaluated my converted model. One thing I found is that the mean values in demo.py are incorrect; they are slightly different from the ones given by the authors. For a precise evaluation, please refer to the original MATLAB code linked below and to Section 5.3, "PASCAL VOC 2012", in the paper. Have you already tried multi-scale testing?
https://github.com/hszhao/PSPNet/blob/master/evaluation/eval_all.m#L59
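For what it's worth, this is a sketch of the Caffe-style preprocessing I would expect (BGR channel order with mean subtraction; the values below are the common ImageNet means, so please verify them against eval_all.m rather than trusting this snippet):

```python
import numpy as np

# Caffe-style preprocessing: BGR channel order plus mean subtraction.
# NOTE: these are the common ImageNet means; the authoritative values
# are hard-coded in the authors' eval_all.m, so check them there.
MEAN_BGR = np.array([103.939, 116.779, 123.68])

def preprocess(image_rgb):
    """image_rgb: HxWx3 uint8 array in RGB order -> 3xHxW float array."""
    image = image_rgb[..., ::-1].astype(np.float64)  # RGB -> BGR
    image -= MEAN_BGR
    return image.transpose(2, 0, 1)  # HWC -> CHW for PyTorch
```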

@luhc15
Author

luhc15 commented Jan 2, 2018

@Littlebelly, you can refer to
https://github.com/wkentaro/pytorch-fcn
https://github.com/bodokaiser/piwise
I used the test code from them.

@luhc15
Author

luhc15 commented Jan 2, 2018

@kazuto1011 I used the VOC2007 val set as my test dataset. I tried multi-scale testing but got a worse result; maybe I should check the test details.
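For the record, my multi-scale attempt looks roughly like this (a sketch only; the scale set is my own choice, and `model` is assumed to return logits at its input resolution):

```python
import torch
import torch.nn.functional as F

# My own scale set, not necessarily the authors'.
SCALES = [0.5, 0.75, 1.0, 1.25, 1.5]

@torch.no_grad()
def predict_multiscale(model, image):
    """image: 1x3xHxW float tensor, already mean-subtracted."""
    _, _, H, W = image.shape
    prob = 0
    for s in SCALES:
        x = F.interpolate(image, scale_factor=s, mode="bilinear",
                          align_corners=False)
        logits = model(x)  # assumed to be 1 x n_classes x h x w
        logits = F.interpolate(logits, size=(H, W), mode="bilinear",
                               align_corners=False)
        prob = prob + F.softmax(logits, dim=1)
    return prob.argmax(dim=1)  # 1xHxW label map
```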

@zhijiew

zhijiew commented Jan 24, 2018

@luhc15 @kazuto1011
I evaluated this model: it reaches 87.42 mIoU on the val data (which contains 1,449 images), and if I use the mean RGB values from the original paper, it reaches 87.47 mIoU.
But when I evaluate the original model, it reaches 91.67 mIoU on the same val dataset (I guess the authors used all data, including both the training and validation sets, to train their model so as to get higher performance in the online test).
There is still a difference between the original Caffe model (91.67) and the PyTorch model (87.47). When you transferred this model from Caffe to PyTorch, did you skip any layers? @kazuto1011

Looking forward to your reply!

@kazuto1011
Owner

Thank you for reporting the results! I believe no layers are skipped; instead, I suspect slight differences between the frameworks, such as the interpolation method. Maybe we should compare the intermediate values of Caffe and PyTorch.
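For example, forward hooks could dump the PyTorch side for such a comparison (a sketch; which layers to inspect is up to you, and the Caffe counterparts would come from `net.blobs`):

```python
import torch

def capture_activations(model, layer_names):
    """Store outputs of the named submodules during forward passes.

    `layer_names` are placeholders; pick the layers you want to diff
    against the corresponding Caffe blobs (net.blobs on the Caffe side).
    """
    store = {}
    def make_hook(name):
        def hook(module, inputs, output):
            store[name] = output.detach().cpu()
        return hook
    for name, module in model.named_modules():
        if name in layer_names:
            module.register_forward_hook(make_hook(name))
    return store
```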
Anyway, I have also evaluated the converted model on the val set, averaging the softmax outputs of 12 multi-scaled and flipped inputs, and it reached 86.9% mIoU (close to yours?). I guess this result may be related to this issue. The authors' code performs a "sliced evaluation" in scale_process.m, although I am not sure whether it was also used for VOC2012. According to the issue, the score rose by about 15% on the Cityscapes dataset compared with simply rescaling the inputs. Did you use the original MATLAB code for the 91.67%? What are your evaluation procedures for both?
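By the way, as I read scale_process.m, the sliced evaluation amounts to running the network on overlapping crops and averaging where they overlap, roughly like the sketch below (the crop size and stride are placeholders, not the authors' values, and the model is assumed to return full-resolution logits):

```python
import torch
import torch.nn.functional as F

def _starts(full, crop, stride):
    """Crop start positions covering [0, full), including the last window."""
    starts = list(range(0, max(full - crop, 0) + 1, stride))
    if starts[-1] + crop < full:
        starts.append(full - crop)
    return starts

@torch.no_grad()
def predict_sliced(model, image, crop=473, stride=316, n_classes=21):
    """image: 1x3xHxW mean-subtracted CPU tensor.

    Runs the model on overlapping crops (zero-padded at the borders)
    and averages the softmax scores where the crops overlap.
    """
    _, _, H, W = image.shape
    scores = torch.zeros(1, n_classes, H, W)
    counts = torch.zeros(1, 1, H, W)
    for top in _starts(H, crop, stride):
        for left in _starts(W, crop, stride):
            bottom, right = min(top + crop, H), min(left + crop, W)
            patch = image[:, :, top:bottom, left:right]
            pad = (0, crop - (right - left), 0, crop - (bottom - top))
            logits = model(F.pad(patch, pad))  # assumed full-resolution output
            logits = logits[:, :, :bottom - top, :right - left]
            scores[:, :, top:bottom, left:right] += F.softmax(logits, dim=1)
            counts[:, :, top:bottom, left:right] += 1
    return scores / counts
```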

@zhijiew

zhijiew commented Jan 26, 2018

I used your code to generate the gray label images and the original MATLAB scripts included in the Caffe version to evaluate.

I also submitted results on the test data to the PASCAL VOC server; it reached 80.77 mIoU, which is much lower than PSPNet's performance on the leaderboard.
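In case the details matter, the gray-image step above is simply writing class indices as 8-bit pixel values, e.g.:

```python
import numpy as np
from PIL import Image

def save_gray_label(pred, path):
    """pred: HxW array of class indices (0-20 for VOC), saved as an
    8-bit grayscale PNG that the MATLAB evaluation scripts can read."""
    Image.fromarray(pred.astype(np.uint8), mode="L").save(path)
```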

@wangbofei11

Thanks for your excellent work! But I have also run into the accuracy problem. I tested some real-world images with the VOC model, and the results of the converted PyTorch model are slightly worse than those of the original Caffe model for almost all images. The test code and the input image resolution are the same.
