Regarding IMAGENET1K_V1 and IMAGENET1K_V2 weights #8382

Open
asusdisciple opened this issue Apr 17, 2024 · 0 comments

🐛 Describe the bug

I found a very strange "bug" while trying to find similar instances in a vector database of pictures. The model I used is ResNet50. The problem occurs only with the IMAGENET1K_V2 weights and does not appear with the legacy V1 weights (see https://pytorch.org/blog/how-to-train-state-of-the-art-models-using-torchvision-latest-primitives/).

When I calculate the cosine similarity of two almost identical pictures with the V1 weights, I get values > 0.95. With the V2 weights and the same pictures, I get values < 0.7. In layman's terms, with V2 near-identical pictures are no longer recognized as such. I have included two example pictures and the code to reproduce the problem below. Does anybody have a concise explanation for this behaviour?
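For reference, the quantity being compared is cos(a, b) = (a · b) / (‖a‖ ‖b‖). It depends only on the direction of the two feature vectors, not on their magnitude, so a drop from > 0.95 to < 0.7 means the V2 features of the two pictures point in noticeably different directions. A minimal, dependency-free sketch of the formula follows; the vectors `u`, `v`, and `w` are made-up illustrations, not model outputs.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Illustrative vectors (assumptions for the example, not real embeddings)
u = [1.0, 2.0, 3.0]
v = [2.0, 4.0, 6.0]   # same direction as u, twice the length
w = [3.0, -1.5, 0.0]  # orthogonal to u: 1*3 + 2*(-1.5) + 3*0 = 0
```

Scaling a vector leaves the similarity unchanged (`u` against `v`), while orthogonal vectors score zero (`u` against `w`).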

When you increase the size in transform.resize((x, y)), the problem gradually vanishes. However, this is not a good solution because it adds overhead during inference.

Would be happy for any insights on this topic :)

from torchvision import models
from torchvision.models import ResNet50_Weights
import torchvision.io
from torch import nn
import numpy as np
from numpy.linalg import norm


class Identity(nn.Module):
    """Pass-through module used to replace the classification head."""

    def forward(self, x):
        return x


# Get weights (switch to ResNet50_Weights.IMAGENET1K_V2 to see the low similarity)
weights = ResNet50_Weights.IMAGENET1K_V1
preprocess = weights.transforms()

model = models.resnet50(weights=weights).to("cuda:0")
model.fc = Identity()  # return the pooled features instead of class logits


def embed(path):
    # Read the image, add a batch dimension, preprocess on the GPU, return a NumPy feature vector
    image = torchvision.io.read_image(path)
    batch = preprocess(image.unsqueeze(dim=0).to("cuda:0"))
    return model(batch).cpu().detach().numpy().squeeze()


a = embed("/raid/..../datasets/lion/lion_ori_small.jpg")
b = embed("/raid/.../datasets/lion/lion_fake_small.jpg")
cosine = np.dot(a, b) / (norm(a) * norm(b))

Attached images: lion_fake, lion_ori

Versions

torchvision 0.19
