Some questions about the model structure and reproducing the results #39

Open
Liu-1994 opened this issue Apr 6, 2020 · 4 comments

@Liu-1994

Liu-1994 commented Apr 6, 2020

Hello, thank you very much for providing the implementation of the DG-Net model.
I ran into some problems while working with the project and would be grateful for any suggestions.

  1. Difference in model structure between the provided trained model and the code.
    I successfully evaluated the trained DG-Net model that you provided on GitHub. However, when I resumed training from that model, the project reported the following error:
    [Screenshot: 2020-04-06 15-52-58]
    I found that the extra weights belong to the non_local layer of the ResBlock. However, after I changed the res_type of the ContentEncoder and Decoder, the weights still did not match.
    [Screenshot: 2020-04-06 15-52-16]

  2. The reproduction does not reach the expected mAP.
    I evaluated the trained DG-Net model that you provided and got an mAP of 0.8609 with alpha set to 0.5. However, when I retrained, I only got an mAP of 0.8466. I loaded the teacher model and used the config configs/latest.yaml. Is there anything else I missed?
    By the way, I used reid_eval/test_2label.py for evaluation.

I will be grateful if you can give me some suggestions. Thank you!

@layumi
Contributor

layumi commented Apr 11, 2020

Hi @Liu-1994

  1. This is because I added the non-local layer to the model before the paper submission, but we did not use it. When we decided to release our code, we simplified it and removed the non-local layers, which take extra GPU memory. You can try loading the model with strict=False.

  2. Did you use fp16 or anything else? fp16 leads to a performance drop of 1 percent.
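The strict=False suggestion in point 1 can be sketched as follows. This is a minimal illustration, not DG-Net's actual code: `ReleasedBlock` and `CheckpointBlock` are hypothetical stand-ins for a block without and with a non_local layer, and the checkpoint is simulated in memory rather than read from the released file.

```python
import torch
import torch.nn as nn


class ReleasedBlock(nn.Module):
    """Hypothetical stand-in for a block in the released code (no non-local layer)."""

    def __init__(self):
        super().__init__()
        self.conv = nn.Conv2d(3, 3, kernel_size=3, padding=1)

    def forward(self, x):
        return self.conv(x)


class CheckpointBlock(ReleasedBlock):
    """Hypothetical stand-in for the older block that still owned a non_local layer."""

    def __init__(self):
        super().__init__()
        self.non_local = nn.Conv2d(3, 3, kernel_size=1)


# Simulate the provided checkpoint: it carries extra non_local.* weights.
checkpoint = CheckpointBlock().state_dict()

model = ReleasedBlock()
# strict=False skips keys that have no counterpart instead of raising an error.
result = model.load_state_dict(checkpoint, strict=False)

# The return value lists what was skipped, which is worth checking by eye.
print("unexpected keys:", result.unexpected_keys)
print("missing keys:", result.missing_keys)
```

Note that strict=False only tolerates missing or unexpected key names. A key that exists on both sides with a different tensor shape still raises an error, so it is worth inspecting the reported keys rather than assuming everything loaded.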

@Liu-1994
Author

@layumi Thank you for your reply.
I now understand the structure of the provided model.
About the mAP: I do not think I used fp16, since the config has apex: false and I have not installed NVIDIA/apex.
I have one small question. Since the provided trained model contains weights for the non_local layers, would the model structure affect the mAP?
By the way, which version of PyTorch did you use? According to PyTorch's official documentation, PyTorch 1.1.0 and later changed the expected order of lr_scheduler.step() and optimizer.step(), which may affect reproduction of the results.
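For reference, the call order that PyTorch 1.1.0 and later expect looks roughly like the sketch below. It is a generic toy loop, not DG-Net's training code: the single parameter, the SGD optimizer, and the StepLR settings are all placeholders chosen only to make the ordering visible.

```python
import torch

# A single dummy parameter stands in for a real model.
param = torch.nn.Parameter(torch.zeros(1))
optimizer = torch.optim.SGD([param], lr=0.1)
# Placeholder schedule: multiply the learning rate by 0.1 every 2 scheduler steps.
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=2, gamma=0.1)

lrs = []
for step in range(4):
    # Record the learning rate that this update will use.
    lrs.append(optimizer.param_groups[0]["lr"])
    param.grad = torch.ones(1)  # stand-in for loss.backward()
    optimizer.step()            # update the weights first
    scheduler.step()            # then advance the schedule

print(lrs)
```

If scheduler.step() is called before optimizer.step() on these versions, PyTorch emits a warning and the first value of the schedule is skipped, so a script written for 1.0.0 and run unchanged on a newer version can follow a slightly shifted learning-rate schedule.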

@layumi
Contributor

layumi commented Apr 12, 2020

@Liu-1994
No, it will not affect the performance, since I did not include the non-local layer in the forward function.
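The point about the forward function can be checked with a small sketch. `Block` here is a hypothetical module, not the DG-Net ResBlock: it registers a non_local layer (so the weights appear in the state dict) but never calls it in forward.

```python
import torch
import torch.nn as nn


class Block(nn.Module):
    """Hypothetical block: non_local is registered but never used in forward."""

    def __init__(self):
        super().__init__()
        self.conv = nn.Conv2d(3, 3, kernel_size=3, padding=1)
        self.non_local = nn.Conv2d(3, 3, kernel_size=1)

    def forward(self, x):
        # Only conv contributes to the output.
        return self.conv(x)


torch.manual_seed(0)
block = Block().eval()
x = torch.randn(1, 3, 8, 8)

with torch.no_grad():
    before = block(x)
    # Perturb the unused layer's weights in place.
    for p in block.non_local.parameters():
        p.add_(1.0)
    after = block(x)

# The unused weights are saved with the model but do not change the output.
print("non_local.weight" in block.state_dict())  # True
print(torch.equal(before, after))                # True
```

Such a layer still costs memory and enlarges the checkpoint, which matches the reason given for removing it from the released code.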

I probably used PyTorch 1.0.0 before the paper submission.
By the way, how many GPUs do you use? It may affect the performance.

@Liu-1994
Author

@layumi Thanks for your reply.
I believe I am using only one GPU, since gpu-ids is set to the single value 0.
