
doubts about shared classifier in val phase #10

Open
Zysty opened this issue Aug 31, 2022 · 4 comments
Zysty commented Aug 31, 2022

Hi! Thanks for sharing your work.

There may be a discrepancy between the paper and the code.
The paper states that "we share the classifier between with or without the BatchFormer during training, which can thus be removed during testing".
In the code, however, the output of the val phase, "logits", is the average of "self.logits" and "logits_old".
As a result, it seems that BatchFormer is still used in the val phase.
Could you please clarify this?

```python
self.logits = (self.logits + logits_old) / 2.
```
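For reference, my understanding of the training-time scheme described in the paper is roughly the following (a minimal sketch; `batchformer_forward` and the other names are mine, not from this repository):

```python
import torch
import torch.nn as nn

def batchformer_forward(x, y, batchformer, is_training):
    """Shared-classifier scheme as I understand it from the paper (sketch).

    During training, the batch of features is duplicated: one copy stays
    as-is, the other passes through BatchFormer, and both copies go through
    the SAME classifier. At test time BatchFormer can then simply be dropped.
    """
    if not is_training:
        return x, y  # no BatchFormer in the val/test phase
    old_x = x
    # BatchFormer: a Transformer encoder over the batch dimension, i.e.
    # the batch of size N is treated as a sequence of length N.
    x = batchformer(x.unsqueeze(1)).squeeze(1)
    x = torch.cat([old_x, x], dim=0)  # features without / with BatchFormer
    y = torch.cat([y, y], dim=0)      # duplicate the labels accordingly
    return x, y

# Usage sketch: both streams feed one shared classifier head.
batchformer = nn.TransformerEncoder(
    nn.TransformerEncoderLayer(d_model=512, nhead=4), num_layers=1)
classifier = nn.Linear(512, 10)
feats = torch.randn(8, 512)            # a mini-batch of 8 feature vectors
labels = torch.randint(0, 10, (8,))
feats, labels = batchformer_forward(feats, labels, batchformer, is_training=True)
logits = classifier(feats)             # (16, 10): one shared classifier for both streams
```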

Zysty changed the title from "doubts about shared classifier in test phase" to "doubts about shared classifier in val phase" on Aug 31, 2022
zhihou7 commented Aug 31, 2022

Hi @Zysty,
Thanks for your questions. That averaging is for an ablation study that we do not use in our main experiments; we did not include it in the paper. Note that there is a condition at L320.

The program executes L321 only when you set eval_batch. "eval_batch" means we evaluate the method with a mini-batch; I just wanted to check the result when averaging the features before and after BatchFormer. Empirically, this kind of evaluation does not improve the performance once we share the classifier during the training phase.
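Concretely, the guarded code is roughly the following (a paraphrased sketch, not the verbatim source):

```python
# Sketch of the condition around L320-L321 (paraphrased):
if self.eval_batch:
    # Ablation/debugging path only: average the logits computed from the
    # two streams (with and without BatchFormer).
    self.logits = (self.logits + logits_old) / 2.
# Standard inference never sets eval_batch, so the averaging is skipped
# and only self.logits is used in the val phase.
```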

Sorry for the redundant code. I mainly conducted the ablation study, the visualized comparison (self.debug), and the gradient analysis (self.debug) on top of BalancedSoftmax, so there may be some redundant code in BalancedSoftmax.

Feel free to post if you have other questions.

Regards,

Zysty commented Aug 31, 2022

Thank you for the prompt reply.
So in practice, "self.logits" is the only term used in the val phase.

Looking forward to hearing good news from BatchFormerV2, V3, and so on. Haha :)

zhihou7 commented Aug 31, 2022

Yes. When I run inference, I do not use eval_batch; it is only for debugging and the ablation study.

Thanks. A new work would need to provide a novel insight compared to the current one; otherwise, it would mainly present a generalized version and show the possibility of new model architectures compared to the current work.

zhihou7 commented Sep 9, 2022

Hi @Zysty,
Thanks for your questions. I remember I provided this ablation study in Appendix C.2 of BatchFormerV2, which presents a quantitative illustration.
