Errors when I run generation #36

Open
ChaoGaoUCR opened this issue Aug 9, 2023 · 5 comments


@ChaoGaoUCR

Dear Authors,
Sorry for bothering you.
I am hitting errors on every dataset I tried to run the evaluation on.

[image: screenshot of the error message]
Could you please take a look at this?

Thanks!

@HZQ950419
Collaborator

Hi,

According to the error message, one possible reason is that the fine-tuning of the model crashed. Can you check the training loss from your fine-tuning run? If the training loss goes to 0 and the eval loss goes to NaN, the training crashed.
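The check described above can be sketched in a few lines. This is only an illustration, not code from the repository: it assumes the training logs are available as a list of dicts with optional `loss` and `eval_loss` keys (the shape of the `log_history` that the Hugging Face `Trainer` records, which is an assumption about this setup). The function name `training_collapsed` and the `zero_tol` threshold are hypothetical. Note that it flags either symptom on its own, which is slightly stricter than requiring both.

```python
import math


def training_collapsed(log_history, zero_tol=1e-8):
    """Return True if the logs show a training loss of ~0 or any NaN loss.

    `log_history` is assumed to be a list of dicts with optional
    "loss" (training) and "eval_loss" (evaluation) entries.
    """
    for entry in log_history:
        train_loss = entry.get("loss")
        eval_loss = entry.get("eval_loss")
        # A training loss that is exactly (or almost) 0, or NaN, is suspicious.
        if train_loss is not None and (
            math.isnan(train_loss) or abs(train_loss) <= zero_tol
        ):
            return True
        # An evaluation loss of NaN is the second symptom from the thread.
        if eval_loss is not None and math.isnan(eval_loss):
            return True
    return False


# Made-up example logs, for illustration only.
healthy_run = [
    {"step": 10, "loss": 1.42},
    {"step": 20, "loss": 1.10, "eval_loss": 1.18},
]
crashed_run = [
    {"step": 10, "loss": 1.42},
    {"step": 20, "loss": 0.0},
    {"step": 20, "eval_loss": float("nan")},
]

print(training_collapsed(healthy_run))
print(training_collapsed(crashed_run))
```

If the check returns `True` for a run, the saved adapter weights from that run are probably unusable for generation, so the checkpoint is worth inspecting before debugging the evaluation script.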

@ChaoGaoUCR
Author

Thank you so much for the fast reply.
Actually, I deleted the config part and it passed...
I am trying to find out why...
I think you are right, the tuned model may have some problems...

@HZQ950419
Collaborator

Hi,

According to the issue tloen/alpaca-lora#408, this looks like a CUDA issue. I can't reproduce the error on my side, but one workaround is to comment out lines 51-53 in evaluate.py.

If you have further questions, please let us know!

@ChaoGaoUCR
Author

ChaoGaoUCR commented Aug 9, 2023

Thank you so much,
The problem got resolved!!
I downgraded CUDA to 11.6 and everything works now!
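For anyone comparing their setup against the one reported here, a minimal sketch for printing the CUDA version that the installed PyTorch build targets is shown below. It assumes PyTorch is the framework in use; the helper name `cuda_report` is hypothetical. The script falls back gracefully when `torch` is not installed.

```python
import importlib.util


def cuda_report():
    """Collect basic CUDA information from PyTorch, if it is installed."""
    report = {"torch_installed": False, "torch_cuda": None, "gpu_available": False}
    # Skip the import entirely when torch is missing from the environment.
    if importlib.util.find_spec("torch") is None:
        return report
    import torch

    report["torch_installed"] = True
    # CUDA version the PyTorch build was compiled against (None for CPU-only builds).
    report["torch_cuda"] = torch.version.cuda
    # Whether a usable CUDA device is visible at runtime.
    report["gpu_available"] = torch.cuda.is_available()
    return report


print(cuda_report())
```

This reports the CUDA version PyTorch was built with, which can differ from the system toolkit version, so both are worth checking when a version mismatch is suspected.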

@ZeguanXiao

> Hi,
>
> According to the error message, one possible reason is that the fine-tuning of the model crashed. Can you check the training loss when you are fine-tuning the model? if the training loss goes to 0 and eval loss goes to nan, it means the training crashed.

How can the training crash be resolved?
