Using the cosine similarity between embeddings generated by OSNet-AIN doesn't provide good results, HELP!! #568

Open
IGlace opened this issue Jan 10, 2024 · 0 comments

IGlace commented Jan 10, 2024

Hello,

I'm working on a project that uses graph neural networks, built on embeddings of bounding-box crops of persons and the similarity between them.
The model I use is OSNet-AIN because, according to the literature, it is one of the best-performing models in the ReID field. The method I use to compare two images is simply cosine similarity: I compute the similarity between their embeddings and use that score as input to the graph.
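
For context, this is roughly how I turn a batch of embeddings into the similarity matrix that feeds the graph (a minimal sketch; cosine_similarity_matrix is just an illustrative helper name):

import torch
import torch.nn.functional as F

def cosine_similarity_matrix(embeddings: torch.Tensor) -> torch.Tensor:
    # L2-normalize each row so the dot product of two rows equals
    # their cosine similarity.
    normed = F.normalize(embeddings, p=2, dim=1)
    return normed @ normed.t()  # (N, N) matrix, values in [-1, 1]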

My problem is that the similarity scores are not logical. I ran a simple test: I took two pictures of person 1, one from the front and one from the back, and one picture of person 2 from the front. I extracted the exact bounding box of the person in each of the three images, computed embeddings for those crops with the OSNet-AIN model, and finally calculated two similarities: person 1 (front) vs. person 2 (front), and person 1 (front) vs. person 1 (back).

Following the idea behind ReID models, I expected the two views of person 1 to be more similar to each other than person 1 and person 2 seen from the same side. That is not what I got: person 1 and person 2 came out as more similar, apparently just because their pictures were taken from the same side, than the two views of person 1, which makes no sense to me. Could someone explain this behavior? Am I doing something wrong by comparing embeddings with cosine similarity alone? Is there something I'm missing? You can check the code I'm using for the experiment below.

import cv2
import torch
import torchvision.transforms as T
from torchreid.utils import FeatureExtractor

# Without a model_path, torchreid loads ImageNet-pretrained weights for
# osnet_ain_x1_0, not weights fine-tuned on a ReID dataset.
extractor = FeatureExtractor(
    model_name='osnet_ain_x1_0',
    device='cpu'
)

# Preprocessing matching torchreid's own defaults: resize to 256x128,
# convert to a tensor, and normalize with the ImageNet mean/std.
# FeatureExtractor skips its internal preprocessing when it receives a
# torch.Tensor, so the normalization has to be applied here.
image_size = (256, 128)
preprocess = T.Compose([
    T.ToPILImage(),
    T.Resize(image_size),
    T.ToTensor(),
    T.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])

# Load each crop, convert OpenCV's BGR to the RGB the model expects,
# and add a batch dimension.
img1 = cv2.imread("image_1.jpg")
img1 = cv2.cvtColor(img1, cv2.COLOR_BGR2RGB)
img1 = preprocess(img1).unsqueeze(0)

img2 = cv2.imread("image_2.jpg")
img2 = cv2.cvtColor(img2, cv2.COLOR_BGR2RGB)
img2 = preprocess(img2).unsqueeze(0)

# Extract both embeddings in a single forward pass.
concat_tensor = torch.cat([img1, img2])
with torch.no_grad():
    features = extractor(concat_tensor)

print("Similarity : ", torch.nn.functional.cosine_similarity(features[0], features[1], dim=0))