
Why are there better results when using images in range [0, 255] instead of [0, 1]? #1

Open
Nick-Morgan opened this issue Sep 2, 2020 · 1 comment

Comments

@Nick-Morgan

I was running into issues trying to re-create the original paper, and stumbled upon this repository.

I was able to re-create the results when using the Caffe pretrained model (which takes images in the [0, 255] range), but got drastically different results when using PyTorch's pretrained model (which takes images in the [0, 1] range). I noticed this tidbit of code in your repository:

# normalize using ImageNet's mean
# [0, 255] range worked much better for me than [0, 1] range (even though PyTorch models were trained on latter)
transform = transforms.Compose([
    transforms.ToTensor(),
    transforms.Lambda(lambda x: x.mul(255)),
    transforms.Normalize(mean=IMAGENET_MEAN_255, std=IMAGENET_STD_NEUTRAL)
])

I applied that same transformation and got results comparable to the original paper. I am somewhat confused about why this works, though. If PyTorch's VGG19 was trained on millions of images in the [0, 1] range, wouldn't it just interpret anything above 1 as pure white?
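One way to see why [0, 255] inputs don't saturate anything is that both pipelines are affine transforms of the same image, differing only by a fixed per-channel scale that the first (linear) conv layer could in principle absorb. A minimal NumPy sketch of that relationship (the constants are torchvision's published ImageNet statistics; the variable names are illustrative, not the repo's):

```python
import numpy as np

# Assumed values mirroring the repo's constants (illustrative names):
MEAN_1 = np.array([0.485, 0.456, 0.406]).reshape(3, 1, 1)  # torchvision ImageNet mean
STD_1 = np.array([0.229, 0.224, 0.225]).reshape(3, 1, 1)   # torchvision ImageNet std
MEAN_255 = MEAN_1 * 255                                    # same mean in [0, 255] units

x = np.random.rand(3, 4, 4)  # a fake image in [0, 1], channels-first

# Standard torchvision-style normalization: (x - mean) / std
standard = (x - MEAN_1) / STD_1

# The repo's pipeline: scale to [0, 255], subtract the [0, 255] mean,
# leave std "neutral" (i.e. divide by 1.0)
caffe_style = x * 255 - MEAN_255

# caffe_style = 255 * (x - mean) = standard * (255 * std): the two
# pipelines differ only by a per-channel linear rescaling, so nothing
# is clipped or "interpreted as white" — activations are just scaled.
print(np.allclose(caffe_style, standard * (255 * STD_1)))  # True
```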

@gordicaleksa
Owner

Hi Nick!

I have it on my backlog to try to make it work in the [0, 1] range, as that feels more natural for PyTorch models: as you said, they were pre-trained on [0, 1]-range imagery, in contrast with those old Caffe models.

What I did, because I was puzzled just as you are, was to pass a [0, 255]-range image (say, of a dog) into VGG and check whether the classification output was correct. It was. My hypothesis is that this works because of a kind of symmetry: the features VGG learned appear robust to that linear rescaling of the input, so it still produces correct classifications even for [0, 255]-range inputs.

It should also work for the [0, 1] range; I'd just need a bit more experimentation. If you figure it out before me, please feel free to create a PR and notify me.
