
Questions about hyper-parameters in the finetuning stage #152

Open
falcon-xu opened this issue Mar 6, 2022 · 3 comments

@falcon-xu

Hi, this is wonderful work.
I am trying to replicate the fine-tuning results reported in the paper. As you know, it often takes a long time to tune hyper-parameters before transformer-based models train well, so it would be really helpful if you could share all of the settings.

I saw some settings in #105 and #45, but they do not cover the other model sizes and datasets, such as DeiT-Ti and iNat.

Could you please summarize the settings for every model size on each of the fine-tuning datasets mentioned in the paper?

A table would probably be clearest, something like the following:

| model type | pretrained dataset | finetuned dataset | lr | bs | wd | sched | epochs | warmup | ... |
|------------|--------------------|-------------------|----|----|----|-------|--------|--------|-----|
| ViT-B      |                    |                   |    |    |    |       |        |        |     |
| ViT-L      |                    |                   |    |    |    |       |        |        |     |
| ...        |                    |                   |    |    |    |       |        |        |     |
| DeiT-Ti    |                    |                   |    |    |    |       |        |        |     |
| DeiT-S     |                    |                   |    |    |    |       |        |        |     |
| ...        |                    |                   |    |    |    |       |        |        |     |
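
For concreteness, here is a minimal sketch of the kind of command I have in mind, assuming the argument names of this repo's main.py; the checkpoint path, dataset choice, and every hyper-parameter value below are placeholders I made up, not the official recipe:

```bash
# Hypothetical fine-tuning run -- argument names assume DeiT's main.py,
# and all values are placeholders rather than the settings from the paper.
python -m torch.distributed.launch --nproc_per_node=8 --use_env main.py \
    --model deit_tiny_patch16_224 \
    --finetune /path/to/pretrained_checkpoint.pth \
    --data-set INAT --data-path /path/to/inat \
    --batch-size 64 \
    --lr 5e-5 --weight-decay 1e-8 \
    --sched cosine --epochs 300 --warmup-epochs 5 \
    --output_dir /path/to/finetune_output
```

The table above would ideally pin down exactly these values for each model / dataset pair.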

I really appreciate it.

@TouvronHugo TouvronHugo self-assigned this Mar 29, 2022
@TouvronHugo
Contributor

Hi @lostsword,
Thank you for your suggestion.
As soon as I have some time, I will complete this table and add it to the README.
I'll keep you informed.
Best,
Hugo

@falcon-xu
Author

> Hi @lostsword, Thank you for your suggestion. As soon as I have some time, I will complete this table and add it to the README. I'll keep you informed. Best, Hugo

OK. Thanks a lot.

@HashmatShadab

Hi!

This would help a lot! @TouvronHugo, any update on when this would be possible?
