Can't reproduce the results for GLUE and hyperparameter misalignment #149

Open
nbasyl opened this issue Nov 22, 2023 · 4 comments
@nbasyl

nbasyl commented Nov 22, 2023

Hi,
Thanks for the great work.

I am trying to reproduce the RoBERTa-large results on the NLU tasks. Using the provided fine-tuning scripts, I got a CoLA score of 0 and an MNLI score of 31.3. I then found that the hyperparameters in the provided training scripts do not match those in the paper. For example, roberta_large_cola.sh sets the lr to 3e-4, while the paper uses 2e-4. Which settings should I follow to reproduce the reported results?

Looking forward to your reply!

Best,
Sean

@nbasyl
Author

nbasyl commented Nov 22, 2023

I changed the lr in the CoLA training script to 2e-4, which fixed the problem of the CoLA eval correlation staying at a constant 0, but I still can't reproduce the MNLI result :(
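
For readers hitting the same symptom: CoLA is scored with the Matthews correlation coefficient, and that metric is 0 whenever a model predicts the same class for every example. A constant 0 is therefore consistent with training having collapsed to a single class (for example, from a learning rate that is too high), although the thread does not confirm that this is what happened here. The sketch below is a minimal pure-Python version of the metric, not the evaluation code from the repository; the labels are toy values chosen only for illustration.

```python
import math


def matthews_corrcoef(y_true, y_pred):
    """Binary Matthews correlation coefficient.

    Returns 0.0 when the denominator is zero, which is the usual
    convention and is what happens when every prediction is the same class.
    """
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)

    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom


# Toy labels for illustration only (not CoLA data).
labels = [1, 1, 0, 1, 0, 1, 1, 0]
collapsed = [1] * len(labels)  # a model that always predicts class 1

print(matthews_corrcoef(labels, collapsed))  # 0.0
print(matthews_corrcoef(labels, labels))     # 1.0
```

Checking the distribution of predicted classes on the eval set is a quick way to tell a collapsed model apart from a genuinely uncorrelated one.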

@nbasyl
Author

nbasyl commented Nov 22, 2023

However, I am still only getting a CoLA score of 62.82. Has anyone encountered a similar problem when trying to reproduce the results?

@zxchasing

> But I am still only getting 62.82 CoLA score, anyone encountered similar problem when trying to reproduce the result

Hi, did you solve this problem?

@Car-pe

Car-pe commented Apr 14, 2024

> I changed the lr in the CoLA training script to 2e-4 and solved the CoLA constant 0 eval correlation value problem, but still couldn't reproduce the MNLI result :(

My CoLA result is 63.48, which matches the paper. The random seeds used are (1 3 13 37 71). However, I cannot reproduce the other tasks; only CoLA matches the paper.
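
When comparing against a paper across several seeds, it helps to report the same aggregate the paper uses and to show the spread alongside it. The sketch below is one way to summarize per-seed scores for the seeds quoted above. The function name `summarize_runs` and the input format (a dict from seed to eval score) are assumptions for illustration, not part of the repository, and which aggregate the paper reports should be checked against the paper itself.

```python
import statistics

# Seeds quoted in the comment above.
SEEDS = (1, 3, 13, 37, 71)


def summarize_runs(scores_by_seed):
    """Summarize one task's eval scores across all expected seeds.

    scores_by_seed: dict mapping seed -> eval score (hypothetical format).
    Raises ValueError if any expected seed is missing, so a partial
    sweep is not silently reported as a full one.
    """
    missing = [s for s in SEEDS if s not in scores_by_seed]
    if missing:
        raise ValueError(f"missing runs for seeds: {missing}")

    scores = [scores_by_seed[s] for s in SEEDS]
    return {
        "median": statistics.median(scores),
        "mean": statistics.fmean(scores),
        "stdev": statistics.stdev(scores),  # sample standard deviation
        "min": min(scores),
        "max": max(scores),
    }
```

Calling `summarize_runs` with the five per-seed scores for a task gives a single row that can be set next to the paper's number; a large `stdev` or a wide `min`/`max` range suggests the gap may be seed variance rather than a hyperparameter mismatch.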
