Errors when I run generation #36

ChaoGaoUCR · 2023-08-09T02:56:20Z

Dear Authors,
Sorry for bothering you.
I am hitting errors on every dataset I tried to run the evaluation on.

Could you please take a look at this?

Thanks!

HZQ950419 · 2023-08-09T04:43:37Z

Hi,

According to the error message, one possible reason is that the fine-tuning of the model crashed. Can you check the training loss from your fine-tuning run? If the training loss goes to 0 and the eval loss goes to nan, it means the training crashed.
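In case it helps, here is a minimal sketch of that check. It assumes the training logs are available as a list of dicts with `step`, `loss`, and `eval_loss` keys (the shape of Hugging Face `Trainer`'s `trainer.state.log_history`); the helper name `find_training_crash` and the sample records are made up for illustration and are not part of this repo.

```python
import math


def find_training_crash(log_history):
    """Return the first step whose logged losses look like a crashed run, else None.

    A run is treated as crashed when the training loss is exactly 0 (or NaN)
    or when the eval loss is NaN.
    """
    for record in log_history:
        step = record.get("step")
        train_loss = record.get("loss")
        eval_loss = record.get("eval_loss")
        if train_loss is not None and (train_loss == 0.0 or math.isnan(train_loss)):
            return step
        if eval_loss is not None and math.isnan(eval_loss):
            return step
    return None


# Hypothetical log records, only to show the expected input shape.
example_logs = [
    {"step": 10, "loss": 1.92},
    {"step": 20, "loss": 1.41},
    {"step": 20, "eval_loss": 1.38},
    {"step": 30, "loss": 0.0},
    {"step": 30, "eval_loss": float("nan")},
]

print(find_training_crash(example_logs))  # 30
```

If this returns a step number, the checkpoint saved after that step is probably unusable for generation.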

ChaoGaoUCR · 2023-08-09T04:52:28Z

Thank you so much for the fast reply.
Actually, I deleted the config part and it passed...
I am trying to find out why...
I think you are right, the tuned model may have some problems...

HZQ950419 · 2023-08-09T13:35:25Z

Hi,

According to the issue tloen/alpaca-lora#408, it seems like a CUDA issue. However, I can't reproduce the error on my side. I did find a workaround, though: commenting out lines 51-53 in evaluate.py.

If you have further questions, please let us know!

ChaoGaoUCR · 2023-08-09T17:08:01Z

Thank you so much,
The problem got resolved!!
I downgraded CUDA to 11.6 and everything works now!
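For anyone who wants to confirm which CUDA version their environment actually uses, a small sketch follows. It assumes PyTorch is the framework in use; note that `torch.version.cuda` reports the CUDA version PyTorch was built against, which may differ from the system toolkit. The helper name `cuda_build_version` is hypothetical.

```python
def cuda_build_version():
    """Return the CUDA version PyTorch was built with, or None if unavailable.

    None is returned both when PyTorch is not installed and when the
    installed PyTorch is a CPU-only build.
    """
    try:
        import torch
    except ImportError:
        return None
    return torch.version.cuda


version = cuda_build_version()
if version is None:
    print("No CUDA-enabled PyTorch build found")
else:
    print(f"PyTorch was built with CUDA {version}")
```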

ZeguanXiao · 2024-05-07T12:58:30Z

> Hi,
>
> According to the error message, one possible reason is that the fine-tuning of the model crashed. Can you check the training loss from your fine-tuning run? If the training loss goes to 0 and the eval loss goes to nan, it means the training crashed.

How can I resolve the crash of the experiment?
