Problem with SwinV2 on the MSMT17 test set #367

lianshengzhou · 2023-05-07T06:04:31Z

Hello, I would like to ask why this bug appears when I test SwinV2 on MSMT17:
RuntimeError: The size of tensor a (4) must match the size of tensor b (32) at non-singleton dimension 1
I found that the problem occurs at this step:
Traceback (most recent call last):
File "/content/Person_reID_baseline_pytorch/test.py", line 316, in
gallery_feature = extract_feature(model,dataloaders['gallery'])
File "/content/Person_reID_baseline_pytorch/test.py", line 207, in extract_feature
outputs = model(input_img)
Why does this bug not appear on the Market-1501 dataset?

chesianatalia · 2023-05-07T08:45:27Z

Hi, I have the same issue with a different dataset.
A screenshot is attached. I ran this months ago without any problem like this, and now the error has suddenly appeared. Do you know how to fix it? Thank you.

layumi · 2023-05-08T03:32:05Z

Hello. It looks like an input size problem.
Could you resize the images to the same size you used during training?

lianshengzhou · 2023-05-09T15:55:40Z

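However, on the Market dataset, ResNet-50, DenseNet, and Swin Transformer V2 all run normally, and on the MSMT17 dataset, ResNet-50 and DenseNet both run normally. Is this caused by a difference between the two datasets?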

layumi · 2023-05-11T05:28:27Z

@lianshengzhou
Hmm, MSMT17 should run as well. I suspect the input size is wrong. If it is convenient, could you print the input size and compare it?

chesianatalia · 2023-05-11T06:30:12Z

Hello, I've checked the input size and found that it is the same as during my training, but I'm still getting the same error. Is there any other solution for this issue? Thank you.
The screenshot is attached below:

This is the input size in my training:

layumi · 2023-05-11T12:59:52Z

Thank you @chesianatalia

I will try it and get back to you soon.

layumi · 2023-05-11T16:55:09Z

Thank you @chesianatalia @lianshengzhou

I found the cause. It is due to torch.jit.trace.
You can comment out these two lines. The test should then run fine.
https://github.com/layumi/Person_reID_baseline_pytorch/blob/master/test.py#L306-L307

I am also curious why torch.jit works with Market but not with MSMT. It is interesting.

layumi · 2023-05-11T17:05:16Z

Just for your reference.
You can check my test MSMT code at https://github.com/layumi/Person_reID_baseline_pytorch/blob/master/test_MSMT.py

lianshengzhou · 2023-05-12T05:45:08Z

@layumi @chesianatalia I tried it and it runs successfully, and the results are good too. Thank you for the prompt reply and help.

chesianatalia · 2023-05-12T09:31:27Z

Thank you for your explanation, I'll work on it.

chesianatalia · 2023-05-12T10:33:37Z

Hi @layumi, I commented out the two lines as you suggested, but a new error came up. I have never seen an error like this before. When I undo the change, the error remains, and it now affects my other models as well (not only SwinV2). I would appreciate your advice.

The screenshot is attached below:

Here is the full error:

File "/content/drive/MyDrive/ChesiaGraceNatalia-TA/Person_reID_baseline_pytorch/test.py", line 279, in
model = load_network(model_structure)
File "/content/drive/MyDrive/ChesiaGraceNatalia-TA/Person_reID_baseline_pytorch/test.py", line 157, in load_network
network.load_state_dict(torch.load(save_path))
File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 2041, in load_state_dict
raise RuntimeError('Error(s) in loading state_dict for {}:\n\t{}'.format(
RuntimeError: Error(s) in loading state_dict for ft_net_swinv2:
Missing key(s) in state_dict: "model.layers.3.downsample.reduction.weight", "model.layers.3.downsample.norm.weight", "model.layers.3.downsample.norm.bias".
Unexpected key(s) in state_dict: "model.layers.0.downsample.reduction.weight", "model.layers.0.downsample.norm.weight", "model.layers.0.downsample.norm.bias", "model.layers.0.blocks.1.attn_mask", "model.layers.1.blocks.1.attn_mask", "model.layers.2.blocks.1.attn_mask", "model.layers.2.blocks.3.attn_mask", "model.layers.2.blocks.5.attn_mask", "model.layers.2.blocks.7.attn_mask", "model.layers.2.blocks.9.attn_mask", "model.layers.2.blocks.11.attn_mask", "model.layers.2.blocks.13.attn_mask", "model.layers.2.blocks.15.attn_mask", "model.layers.2.blocks.17.attn_mask".
size mismatch for model.layers.1.downsample.reduction.weight: copying a param with shape torch.Size([512, 1024]) from checkpoint, the shape in current model is torch.Size([256, 512]).
size mismatch for model.layers.1.downsample.norm.weight: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([256]).
size mismatch for model.layers.1.downsample.norm.bias: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([256]).
size mismatch for model.layers.2.downsample.reduction.weight: copying a param with shape torch.Size([1024, 2048]) from checkpoint, the shape in current model is torch.Size([512, 1024]).
size mismatch for model.layers.2.downsample.norm.weight: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([512]).
size mismatch for model.layers.2.downsample.norm.bias: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([512]).
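
The three kinds of complaint in this error (missing keys, unexpected keys, size mismatches) can be listed up front by comparing parameter names and shapes before calling load_state_dict. What follows is only a minimal sketch: `diff_state_dicts` is a hypothetical helper, not part of this repository or of PyTorch, and it works on plain dictionaries that map a parameter name to its shape. With PyTorch, such a dictionary could be built as `{k: tuple(v.shape) for k, v in state_dict.items()}`. The sample entries are a small subset of the names above; shapes marked "assumed" do not appear in the error message and are for illustration only.

```python
from typing import Dict, List, Tuple

Shape = Tuple[int, ...]


def diff_state_dicts(
    checkpoint: Dict[str, Shape], model: Dict[str, Shape]
) -> Dict[str, List[str]]:
    """Hypothetical helper: compare checkpoint and model parameter shapes."""
    # Keys the model expects but the checkpoint does not provide.
    missing = sorted(k for k in model if k not in checkpoint)
    # Keys the checkpoint provides but the model does not know.
    unexpected = sorted(k for k in checkpoint if k not in model)
    # Keys present on both sides whose shapes differ.
    size_mismatch = sorted(
        k for k in model if k in checkpoint and checkpoint[k] != model[k]
    )
    return {
        "missing": missing,
        "unexpected": unexpected,
        "size_mismatch": size_mismatch,
    }


# Shapes saved in the checkpoint file.
checkpoint_shapes = {
    "model.layers.0.downsample.norm.weight": (256,),  # assumed shape
    "model.layers.1.downsample.reduction.weight": (512, 1024),
    "model.layers.1.downsample.norm.weight": (512,),
}

# Shapes of the model built by the currently installed library.
model_shapes = {
    "model.layers.1.downsample.reduction.weight": (256, 512),
    "model.layers.1.downsample.norm.weight": (256,),
    "model.layers.3.downsample.norm.weight": (1024,),  # assumed shape
}

report = diff_state_dicts(checkpoint_shapes, model_shapes)
for kind, keys in report.items():
    print(kind, keys)
```

A report in which the same layer names appear shifted by one stage, as here, suggests that the checkpoint and the current model were built by different versions of the model definition rather than that the checkpoint file is damaged.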

layumi · 2023-05-12T12:39:39Z

@chesianatalia

Which version of timm are you using?
Could you re-train a new SwinV2 model and test the result?

chesianatalia · 2023-05-13T03:39:22Z

@layumi Hi, I think you are right. I checked another issue, and the problem was due to the timm version. I downgraded it and it worked. Thank you very much for your suggestion.
Have a great day!

layumi · 2023-05-13T05:34:15Z

Thank you @chesianatalia

Yes.

BTW, older timm versions such as 0.6 or 0.7 may not have a stable SwinV2 implementation.

For the latest timm, you also need Python >= 3.8, otherwise you may run into some "typing" errors.
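
Since the fix here came down to library versions, it may help to record which Python and timm versions are active in the environment that trained the checkpoint and in the one that tests it. The snippet below is a small sketch using only the standard library. The helper names (`parse_version`, `python_is_new_enough`, `installed_version`) are hypothetical, and no specific timm version threshold is implied, because the thread does not state one.

```python
import re
import sys
from typing import Optional, Tuple


def parse_version(text: str) -> Tuple[int, ...]:
    """Turn '0.6.13' into (0, 6, 13); stop at the first non-numeric piece."""
    numbers = []
    for piece in text.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break
        numbers.append(int(match.group()))
    return tuple(numbers)


def python_is_new_enough(minimum: Tuple[int, int] = (3, 8)) -> bool:
    """Check the running interpreter against a minimum (major, minor)."""
    return tuple(sys.version_info[:2]) >= minimum


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of a package, or None if it is absent."""
    from importlib import metadata  # in the standard library since Python 3.8

    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


print("Python >= 3.8:", python_is_new_enough())
if python_is_new_enough():
    print("timm version:", installed_version("timm"))
```

Running this in both the training and the testing environment and comparing the output is one way to notice that a checkpoint was saved under a different timm release than the one loading it.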

lianshengzhou changed the title ~~Problem with Swin on the MSMT17 test set~~ Problem with SwinV2 on the MSMT17 test set May 7, 2023
