Fix accuracy issues for both Keras (all models) and PyTorch (inception_v3) #5

Open
wants to merge 4 commits into master

Conversation

rwightman

This should improve accuracy, as it makes preprocessing consistent with typical ImageNet handling. I'm not sure why Keras makes this so difficult. I defaulted the crop scaling to 0.875 for all models, but some, especially the larger inception/xception/nasnet variants, may work better with a larger value; I've seen 0.8975 used in places.

Inception_v3 in PyTorch now produces acc1: 77.32%, acc5: 93.43%.

ResNet50 in Keras: acc1: 74.79%, acc5: 91.96%.

* use bilinear interpolation; the Keras default of 'nearest' is wrong for ImageNet
* scale the shortest edge of the image to target_size / scale_factor before the center crop (see the sketch below)
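A minimal sketch of that preprocessing, assuming plain PIL (the helper and its names are illustrative, not the PR's actual code):

```python
from PIL import Image

def center_crop_load(path, target_size=224, crop_fraction=0.875):
    """Resize the shortest edge to target_size / crop_fraction, then center crop."""
    img = Image.open(path).convert('RGB')
    # e.g. 224 / 0.875 = 256: the crop covers 87.5% of the shortest edge.
    resize = int(round(target_size / crop_fraction))
    w, h = img.size
    if w < h:
        new_w, new_h = resize, int(round(h * resize / w))
    else:
        new_w, new_h = int(round(w * resize / h)), resize
    img = img.resize((new_w, new_h), Image.BILINEAR)  # bilinear, not 'nearest'
    left = (new_w - target_size) // 2
    top = (new_h - target_size) // 2
    return img.crop((left, top, left + target_size, top + target_size))
```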
@rwightman
Author

ResNet50 in Keras reaches acc1: 75.00%, acc5: 92.16% with the interpolation set to bicubic instead of bilinear.

I'd create a per-framework, per-model scale factor and interpolation method to get the best results. It'll differ based on the source of each model's original trained weights...
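As a sketch, that could look something like this (the entries below are illustrative; only the 0.875 default and the bicubic ResNet50 result come from this thread):

```python
# Hypothetical per-(framework, model) preprocessing settings.
EVAL_CONFIG = {
    ('keras', 'resnet50'): {'crop_fraction': 0.875, 'interpolation': 'bicubic'},
    ('pytorch', 'inception_v3'): {'crop_fraction': 0.875, 'interpolation': 'bilinear'},
}

def get_eval_config(framework, model):
    # Fall back to the global defaults when a model has no tuned entry.
    return EVAL_CONFIG.get((framework, model),
                           {'crop_fraction': 0.875, 'interpolation': 'bilinear'})
```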

@cgnorthcutt
Owner

@rwightman This will still work based on how the code works here: https://github.com/pytorch/vision/blob/master/torchvision/transforms/transforms.py#L182, but you're right that it's less readable and makes assumptions about whether the torchvision authors might choose to throw an exception at some point. I can remove it later on.
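For context, the torchvision behavior being relied on here is that `Resize` with a single int scales the shortest edge and preserves the aspect ratio, while an `(h, w)` tuple resizes to exactly that shape. A minimal illustration:

```python
import torchvision.transforms as transforms

# An int scales the shortest edge to 256, keeping the aspect ratio;
# a (256, 256) tuple would squash the image to exactly 256x256 instead.
preprocess = transforms.Compose([
    transforms.Resize(256),       # shortest edge -> 256
    transforms.CenterCrop(224),   # standard 224x224 center crop
])
```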

@cgnorthcutt
Owner

cgnorthcutt commented Apr 17, 2019

@rwightman On the broader scope of your PR: overall, this is great. The thing is, it fundamentally changes the goal of this package. Ideally, we have a package here that shows you benchmarks for out-of-the-box comparison. If we start adding more and more code to make Keras more like PyTorch... and then maybe PyTorch more like Keras... eventually this repo will just become a way to make PyTorch act like Keras and Keras act like PyTorch. Is that really what we want? Does it help researchers? I've run your code internally and everything looks good, but I'm fundamentally unsure this is the direction we want to go. Happy to hear your thoughts.

@rwightman
Author

rwightman commented Apr 18, 2019

@cgnorthcutt I understand what you mean w.r.t. your goal; it's ultimately up to you. You can provide code that does the eval as it should be done, as a reference for those trying to do the same; it appears many Keras users are confused by this and are looking for the proper way. Or you can leave it underperforming and make a statement that better defaults should be made obvious.

I'm not a Keras person and was actually blown away by the lack of support for evaluating on ImageNet with the defaults. I expected there'd be a one-liner 'evaluate' method with arguments to specify the necessary crop and scaling, but the available evaluate method doesn't appear to give you the freedom to do the necessary preprocessing. It seems you must resort to building the batches manually and running inference, as is done here. Mind blown.
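Roughly, the manual loop this forces looks like the following (a sketch; `center_crop_load` is the illustrative helper from the PR description above, and the model choice is arbitrary):

```python
import numpy as np
from keras.applications.resnet50 import ResNet50, preprocess_input

# model.evaluate() doesn't expose resize/crop control, so batches are built by hand.
model = ResNet50(weights='imagenet')

def predict_batch(image_paths, target_size=224):
    batch = np.stack([
        np.asarray(center_crop_load(p, target_size), dtype=np.float32)
        for p in image_paths
    ])
    return model.predict(preprocess_input(batch))
```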

The expectation, from the AlexNet-era ILSVRC challenges to now, has been to take a center crop of an image such that the crop is 87.5% of the shortest edge of the original. It started as 'scale the shortest edge to 256 and then center crop to 224', and the trend was followed for other input sizes like 299 for the inception models. This has generally been the formula by which subsequent ImageNet models have compared their results to each other in academic papers and code releases. Most of the source models (from Caffe, TensorFlow, etc.) that Keras pulled into its pretrained weights were built and evaluated under these assumptions, so evaluating those weights against ImageNet any other way will not reflect the actual capability of the model.
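To make the arithmetic concrete (the 342 for 299-pixel inputs is just 299 / 0.875 rounded, consistent with the convention described, not a number measured in this thread):

```python
# 224 / 256 = 0.875, so the crop covers 87.5% of the resized shortest edge.
for target in (224, 299):
    print(target, '-> resize shortest edge to', round(target / 0.875))
# 224 -> resize shortest edge to 256
# 299 -> resize shortest edge to 342
```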

The default arg of 'nearest neighbour' interpolation for image scaling in Keras is also a huge WTF. Every other framework defaults to either bilinear or bicubic, resulting in some sort of sane scaling between the source image size and the target network input size. Nearest neighbour is just not acceptable and results in a 2-3% drop in performance on top of the 2-3% from using the wrong crop...

@ozabluda

@rwightman

> The default arg of 'nearest neighbour' interpolation for image scaling in Keras is also a huge WTF. Every other framework defaults to either bilinear or bicubic, resulting in some sort of sane scaling between the source image size and the target network input size. Nearest neighbour is just not acceptable and results in a 2-3% drop in performance on top of the 2-3% from using the wrong crop...

Keras' internal data augmentation preprocessing was unconditionally changed from nearest to bilinear in PR keras-team/keras#8849,

but in load_img() you have to specify bilinear explicitly; see PR keras-team/keras#8435.
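For example (the path is a placeholder):

```python
from keras.preprocessing.image import load_img

# load_img() still defaults to interpolation='nearest'; bilinear must be
# requested explicitly when target_size triggers a resize.
img = load_img('example.jpg', target_size=(256, 256), interpolation='bilinear')
```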

Both PRs provide historical and related context.
