Fix accuracy issues for both Keras (all models) and PyTorch (inception_v3) #5

Open
wants to merge 4 commits into master

Conversation

rwightman

This should improve accuracy, as it makes preprocessing consistent with typical ImageNet handling. I'm not sure why Keras makes this so difficult. I defaulted the crop scaling to 0.875 for all models, but some, especially the larger inception/xception/nasnet variants, may work better with a larger value; I've seen 0.8975 used in places.

Inception_v3 in PyTorch now produces acc1: 77.32%, acc5: 93.43%.

ResNet50 in Keras: acc1: 74.79%, acc5: 91.96%.

* use bilinear interpolation; the Keras default of 'nearest' is wrong for ImageNet
* scale the shortest edge of the image to target_size / scale_factor before the center crop (see the sketch below)
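A minimal sketch of that preprocessing, assuming plain PIL (the helper and its names are illustrative, not the PR's actual code):

```python
from PIL import Image

def center_crop_load(path, target_size=224, crop_fraction=0.875):
    """Resize the shortest edge to target_size / crop_fraction, then center crop."""
    img = Image.open(path).convert('RGB')
    # e.g. 224 / 0.875 = 256: the crop covers 87.5% of the shortest edge.
    resize = int(round(target_size / crop_fraction))
    w, h = img.size
    if w < h:
        new_w, new_h = resize, int(round(h * resize / w))
    else:
        new_w, new_h = int(round(w * resize / h)), resize
    img = img.resize((new_w, new_h), Image.BILINEAR)  # bilinear, not 'nearest'
    left = (new_w - target_size) // 2
    top = (new_h - target_size) // 2
    return img.crop((left, top, left + target_size, top + target_size))
```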
@rwightman
Author

ResNet50 in Keras reaches acc1: 75.00%, acc5: 92.16% with the interpolation set to bicubic instead of bilinear.

I'd create a per-framework, per-model scale factor and interpolation method to get the best results. It'll differ based on the source of each model's original trained weights...
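As a sketch, that could look something like this (the entries below are illustrative; only the 0.875 default and the bicubic ResNet50 result come from this thread):

```python
# Hypothetical per-(framework, model) preprocessing settings.
EVAL_CONFIG = {
    ('keras', 'resnet50'): {'crop_fraction': 0.875, 'interpolation': 'bicubic'},
    ('pytorch', 'inception_v3'): {'crop_fraction': 0.875, 'interpolation': 'bilinear'},
}

def get_eval_config(framework, model):
    # Fall back to the global defaults when a model has no tuned entry.
    return EVAL_CONFIG.get((framework, model),
                           {'crop_fraction': 0.875, 'interpolation': 'bilinear'})
```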

@cgnorthcutt
Owner

@rwightman This will still work based on how the code works here: https://github.com/pytorch/vision/blob/master/torchvision/transforms/transforms.py#L182, but you're right that it's less readable and makes assumptions about whether the torchvision authors might choose to throw an exception at some point. I can remove it later on.
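For context, the torchvision behavior being relied on here is that `Resize` with a single int scales the shortest edge and preserves the aspect ratio, while an `(h, w)` tuple resizes to exactly that shape. A minimal illustration:

```python
import torchvision.transforms as transforms

# An int scales the shortest edge to 256, keeping the aspect ratio;
# a (256, 256) tuple would squash the image to exactly 256x256 instead.
preprocess = transforms.Compose([
    transforms.Resize(256),       # shortest edge -> 256
    transforms.CenterCrop(224),   # standard 224x224 center crop
])
```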

@cgnorthcutt
Owner

cgnorthcutt commented Apr 17, 2019

@rwightman On the broader scope of your PR: overall, this is great. The thing is, it fundamentally changes the goal of this package. Ideally, we have a package here that shows you benchmarks for out-of-the-box comparison. If we start adding more and more code to make Keras more like PyTorch... and then maybe PyTorch more like Keras... eventually this repo will just become a way to make PyTorch act like Keras and Keras act like PyTorch. Is that really what we want? Does it help researchers? I've run your code internally and everything looks good, but I'm fundamentally unsure this is the direction we want to go. Happy to hear your thoughts.

@rwightman
Author

rwightman commented Apr 18, 2019

@cgnorthcutt I understand what you mean w.r.t. your goal; it's ultimately up to you. You can provide code that does the eval as it should be done, as a reference for those trying to do the same; it appears many Keras users are confused by this and are looking for the proper way. Or you can leave it underperforming and make a statement that better defaults should be made obvious.

I'm not a Keras person and was actually blown away by the lack of support for evaluating on ImageNet with the defaults. I expected there'd be a one-liner 'evaluate' method with arguments to specify the necessary crop and scaling, but the available evaluate method doesn't appear to give you the freedom to do the necessary preprocessing. It seems you must resort to building the batches manually and running inference, as is done here. Mind blown.
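Roughly, the manual loop this forces looks like the following (a sketch; `center_crop_load` is the illustrative helper from the PR description above, and the model choice is arbitrary):

```python
import numpy as np
from keras.applications.resnet50 import ResNet50, preprocess_input

# model.evaluate() doesn't expose resize/crop control, so batches are built by hand.
model = ResNet50(weights='imagenet')

def predict_batch(image_paths, target_size=224):
    batch = np.stack([
        np.asarray(center_crop_load(p, target_size), dtype=np.float32)
        for p in image_paths
    ])
    return model.predict(preprocess_input(batch))
```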

The expectation, from the AlexNet-era ILSVRC challenges to now, has been to take a center crop of an image such that the crop is 87.5% of the shortest edge of the original. It started as 'scale the shortest edge to 256 and then center crop to 224', and the trend was followed for other input sizes like 299 for the inception models. This has generally been the formula by which subsequent ImageNet models have compared their results to each other in academic papers and code releases. Most of the source models (from Caffe, TensorFlow, etc.) that Keras pulled into its pretrained weights were built and evaluated under these assumptions, so evaluating those weights against ImageNet any other way will not reflect the actual capability of the model.
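To make the arithmetic concrete (the 342 for 299-pixel inputs is just 299 / 0.875 rounded, consistent with the convention described, not a number measured in this thread):

```python
# 224 / 256 = 0.875, so the crop covers 87.5% of the resized shortest edge.
for target in (224, 299):
    print(target, '-> resize shortest edge to', round(target / 0.875))
# 224 -> resize shortest edge to 256
# 299 -> resize shortest edge to 342
```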

The default arg of 'nearest neighbour' interpolation for image scaling in Keras is also a huge WTF. Every other framework defaults to either bilinear or bicubic, resulting in some sort of sane scaling between the source image size and the target network input size. Nearest neighbour is just not acceptable and results in a 2-3% drop in performance on top of the 2-3% from using the wrong crop...

@ozabluda

@rwightman

> The default arg of 'nearest neighbour' interpolation for image scaling in Keras is also a huge WTF. Every other framework defaults to either bilinear or bicubic, resulting in some sort of sane scaling between the source image size and the target network input size. Nearest neighbour is just not acceptable and results in a 2-3% drop in performance on top of the 2-3% from using the wrong crop...

Keras' internal data augmentation preprocessing was unconditionally changed from nearest to bilinear in PR keras-team/keras#8849,

but in load_img() you have to specify bilinear explicitly; see PR keras-team/keras#8435.
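For example (the path is a placeholder):

```python
from keras.preprocessing.image import load_img

# load_img() still defaults to interpolation='nearest'; bilinear must be
# requested explicitly when target_size triggers a resize.
img = load_img('example.jpg', target_size=(256, 256), interpolation='bilinear')
```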

Both PRs provide historical and related context.
