Problem with SwinV2 on the MSMT17 test set #367

Open
lianshengzhou opened this issue May 7, 2023 · 14 comments

@lianshengzhou

Hello, I'd like to ask why testing SwinV2 on MSMT17 fails with this error:
RuntimeError: The size of tensor a (4) must match the size of tensor b (32) at non-singleton dimension 1
The problem occurs at this step:
Traceback (most recent call last):
File "/content/Person_reID_baseline_pytorch/test.py", line 316, in <module>
gallery_feature = extract_feature(model,dataloaders['gallery'])
File "/content/Person_reID_baseline_pytorch/test.py", line 207, in extract_feature
outputs = model(input_img)
Why doesn't this error occur on the Market-1501 dataset?
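
To make the message concrete: it is what PyTorch reports when two tensors cannot be broadcast together, meaning some dimension has different sizes in each tensor and neither size is 1. Below is a minimal pure-Python sketch of that rule. The function name `broadcast_conflict` and the example shapes are hypothetical, chosen only to mirror the numbers 4 and 32 in the message; the actual shapes inside the model are not shown in this thread.

```python
def broadcast_conflict(shape_a, shape_b):
    """Return (dim, size_a, size_b) for the first conflict found, or None.

    Shapes are right-aligned, and trailing dimensions are checked first.
    """
    ndim = max(len(shape_a), len(shape_b))
    # Pad the shorter shape with leading 1s so both have ndim entries.
    a = (1,) * (ndim - len(shape_a)) + tuple(shape_a)
    b = (1,) * (ndim - len(shape_b)) + tuple(shape_b)
    for dim in range(ndim - 1, -1, -1):
        # Sizes are compatible when they are equal or when either is 1.
        if a[dim] != b[dim] and a[dim] != 1 and b[dim] != 1:
            return dim, a[dim], b[dim]
    return None


# Hypothetical shapes, not taken from the model in this issue.
print(broadcast_conflict((1, 4, 8), (1, 32, 8)))  # (1, 4, 32)
print(broadcast_conflict((1, 4, 8), (3, 1, 8)))   # None
```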

@lianshengzhou lianshengzhou changed the title from "Problem with Swin on the MSMT17 test set" to "Problem with SwinV2 on the MSMT17 test set" May 7, 2023
@chesianatalia

Hi, I have the same issue with a different dataset.
Here is the screenshot. I ran this months ago without any problem like this, and now the error has suddenly appeared. Do you know how to fix it? Thank you.

image

@layumi
Owner

layumi commented May 8, 2023

Hello. It looks like an input size problem.
Could you resize the images to the same size you used during training?

@lianshengzhou
Author

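But on the Market dataset, ResNet-50, DenseNet, and Swin Transformer V2 all run normally, and on the MSMT17 dataset both ResNet-50 and DenseNet run normally. Is this caused by a difference between the two datasets?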

@layumi
Owner

layumi commented May 11, 2023

@lianshengzhou
Hmm, MSMT17 should run as well. I suspect the input size is wrong. If it is convenient, could you print the input size and compare?
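
One way to do that comparison is to tally the shapes of the batches a loader yields, which also shows whether a smaller final batch is present. This is a minimal sketch, not code from the repository: the helper name `batch_shape_counts` is hypothetical, and it assumes each batch is an `(inputs, labels)` pair whose `inputs` has a `.shape` attribute, as PyTorch tensors do.

```python
from collections import Counter


def batch_shape_counts(loader, max_batches=None):
    """Count how many batches of each input shape the loader yields."""
    counts = Counter()
    for index, (inputs, _labels) in enumerate(loader):
        # Optionally stop early so large datasets are not fully scanned.
        if max_batches is not None and index >= max_batches:
            break
        counts[tuple(inputs.shape)] += 1
    return counts
```

In the test script this could be called as `print(batch_shape_counts(dataloaders['gallery']))` and compared with the same tally from the training loader.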

@chesianatalia

Hello, I've checked the input size and it is the same as during my training, but I'm still getting the same error. Is there any other solution for this issue? Thank you.
The screenshot is attached below:
image

This is the input size in my training:
image

@layumi
Owner

layumi commented May 11, 2023

Thank you @chesianatalia

I will try it and get back to you soon.

@layumi
Owner

layumi commented May 11, 2023

Thank you @chesianatalia @lianshengzhou

I found the cause: it is due to torch.jit.trace.
You can comment out these two lines, and the test should then run.
https://github.com/layumi/Person_reID_baseline_pytorch/blob/master/test.py#L306-L307

But I am also curious why torch.jit works with Market but not with MSMT. It is interesting.
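
As an alternative to commenting lines in and out, the tracing step could be made opt-in. The sketch below is only an illustration under stated assumptions: the `--use_trace` flag and the `maybe_trace` helper are hypothetical and are not part of the repository's test.py. The tracer is passed in as a callable so the switch itself can be checked without PyTorch installed.

```python
import argparse


def build_parser():
    parser = argparse.ArgumentParser(description="Sketch of an opt-in tracing switch")
    # Hypothetical flag; it is not an option of the repository's test.py.
    parser.add_argument("--use_trace", action="store_true",
                        help="trace the model with example inputs before testing")
    return parser


def maybe_trace(model, example_input, use_trace, tracer):
    """Return tracer(model, example_input) if use_trace is set, else the model unchanged."""
    if not use_trace:
        return model
    return tracer(model, example_input)
```

In a PyTorch script the call would look like `model = maybe_trace(model, example_input, args.use_trace, torch.jit.trace)`, so transformer models can simply be run without the flag.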

@layumi
Owner

layumi commented May 11, 2023

For your reference, you can check my MSMT test code at https://github.com/layumi/Person_reID_baseline_pytorch/blob/master/test_MSMT.py

@lianshengzhou
Author

@layumi @chesianatalia I tried it and it runs successfully, and the results are good too. Thank you for the prompt reply and help.

@chesianatalia

Thank you for your explanation, I'll work on it.

@chesianatalia

chesianatalia commented May 12, 2023

Hi @layumi, I've commented out the two lines you mentioned earlier, but a new error came up. I've never seen an error like this before. When I undo my change, the error is still there, and it now affects my other models too (not only SwinV2). I would appreciate your advice.

The screenshot is attached below:
image

Here is the full error:

File "/content/drive/MyDrive/ChesiaGraceNatalia-TA/Person_reID_baseline_pytorch/test.py", line 279, in <module>
model = load_network(model_structure)
File "/content/drive/MyDrive/ChesiaGraceNatalia-TA/Person_reID_baseline_pytorch/test.py", line 157, in load_network
network.load_state_dict(torch.load(save_path))
File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 2041, in load_state_dict
raise RuntimeError('Error(s) in loading state_dict for {}:\n\t{}'.format(
RuntimeError: Error(s) in loading state_dict for ft_net_swinv2:
Missing key(s) in state_dict: "model.layers.3.downsample.reduction.weight", "model.layers.3.downsample.norm.weight", "model.layers.3.downsample.norm.bias".
Unexpected key(s) in state_dict: "model.layers.0.downsample.reduction.weight", "model.layers.0.downsample.norm.weight", "model.layers.0.downsample.norm.bias", "model.layers.0.blocks.1.attn_mask", "model.layers.1.blocks.1.attn_mask", "model.layers.2.blocks.1.attn_mask", "model.layers.2.blocks.3.attn_mask", "model.layers.2.blocks.5.attn_mask", "model.layers.2.blocks.7.attn_mask", "model.layers.2.blocks.9.attn_mask", "model.layers.2.blocks.11.attn_mask", "model.layers.2.blocks.13.attn_mask", "model.layers.2.blocks.15.attn_mask", "model.layers.2.blocks.17.attn_mask".
size mismatch for model.layers.1.downsample.reduction.weight: copying a param with shape torch.Size([512, 1024]) from checkpoint, the shape in current model is torch.Size([256, 512]).
size mismatch for model.layers.1.downsample.norm.weight: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([256]).
size mismatch for model.layers.1.downsample.norm.bias: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([256]).
size mismatch for model.layers.2.downsample.reduction.weight: copying a param with shape torch.Size([1024, 2048]) from checkpoint, the shape in current model is torch.Size([512, 1024]).
size mismatch for model.layers.2.downsample.norm.weight: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([512]).
size mismatch for model.layers.2.downsample.norm.bias: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([512]).
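
An error like this means the parameter names and shapes of the model being built differ from those stored in the checkpoint. A minimal sketch for listing the differences before calling `load_state_dict` follows. The helper name `diff_state_shapes` is hypothetical, it works on plain `{name: shape}` dictionaries, and the two shapes marked as placeholders in the example are not taken from the log.

```python
def diff_state_shapes(model_shapes, checkpoint_shapes):
    """Compare two {parameter name: shape} mappings.

    Returns (missing, unexpected, mismatched):
      missing    - names the model expects but the checkpoint lacks
      unexpected - names the checkpoint has but the model does not
      mismatched - {name: (checkpoint_shape, model_shape)} for shared names
    """
    missing = sorted(set(model_shapes) - set(checkpoint_shapes))
    unexpected = sorted(set(checkpoint_shapes) - set(model_shapes))
    mismatched = {
        name: (checkpoint_shapes[name], model_shapes[name])
        for name in sorted(set(model_shapes) & set(checkpoint_shapes))
        if tuple(checkpoint_shapes[name]) != tuple(model_shapes[name])
    }
    return missing, unexpected, mismatched


model_shapes = {
    "model.layers.1.downsample.norm.weight": (256,),  # from the log
    "model.layers.3.downsample.norm.weight": (1,),    # placeholder shape, not from the log
}
checkpoint_shapes = {
    "model.layers.1.downsample.norm.weight": (512,),  # from the log
    "model.layers.0.downsample.norm.weight": (1,),    # placeholder shape, not from the log
}
print(diff_state_shapes(model_shapes, checkpoint_shapes))
```

With PyTorch, each mapping can be built as `{k: tuple(v.shape) for k, v in state_dict.items()}`, using `network.state_dict()` for the model and the result of `torch.load(save_path)` for the checkpoint.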

@layumi
Owner

layumi commented May 12, 2023

@chesianatalia

  1. Could you tell me which version of timm you are using?

  2. Could you re-train a new SwinV2 model and test the result?

@chesianatalia

@layumi Hi, I think you are right. I checked another issue and the cause was the timm version. I downgraded it and it worked. Thank you very much for your suggestion.
Have a great day!

@layumi
Owner

layumi commented May 13, 2023

Thank you @chesianatalia

Yes.

BTW, older timm versions such as 0.6 or 0.7 may not have a stable SwinV2 implementation.

For the latest timm, you also need Python >= 3.8; otherwise you may hit some "typing" errors.
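
Since both points depend on the environment, it can help to print the installed timm version and the Python version when reporting a problem. This is a small sketch using only the standard library; the helper names are hypothetical, and `importlib.metadata` itself requires Python 3.8 or newer.

```python
import sys
from importlib import metadata


def installed_version(package):
    """Return the installed version string of a package, or None if it is not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def python_at_least(major, minor):
    """True if the running interpreter is at least the given version."""
    return sys.version_info >= (major, minor)


print("timm:", installed_version("timm"))
print("Python >= 3.8:", python_at_least(3, 8))
```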
