
DKD: the NCKD loss becomes infinity after one iteration of training #60

Open
OuYangg opened this issue Jan 30, 2024 · 2 comments


OuYangg commented Jan 30, 2024

First of all, thanks to the authors for open-sourcing this. DKD is very good work!

I trained a binary classification model and wanted to use DKD for distillation, but after one iteration of training, the output of the student model became very strange, causing the loss to become Inf.

Some details:
iteration 1: [screenshots omitted]
iteration 2: [screenshot omitted]

OuYangg (Author) commented Jan 30, 2024

After I lowered the learning rate, the loss is no longer NaN, but the NCKD loss is still 0.
[screenshot omitted]

Zzzzz1 (Collaborator) commented Feb 19, 2024

The NCKD loss cannot be applied to binary classification tasks, since there is no meaningful "non-target class" distribution: with only one non-target class, its probability after masking is always 1.0 (for both teacher and student), which may introduce problems when calculating the KLD.
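For illustration, here is a minimal PyTorch sketch of the non-target (NCKD) term as it is typically computed (masking the ground-truth class out of the logits before the softmax, as described in the DKD paper). This is a sketch under those assumptions, not necessarily the exact code in this repository or in the reporter's training script. With only two classes, a single class survives the masking, so the teacher's and student's non-target distributions both collapse to a constant 1.0 and the KL term is numerically zero:

```python
import torch
import torch.nn.functional as F

def nckd_sketch(logits_student, logits_teacher, target, temperature=4.0):
    # One-hot mask of the ground-truth class for each sample in the batch.
    gt_mask = torch.zeros_like(logits_student).scatter_(1, target.unsqueeze(1), 1.0)
    # Suppress the target logit with a large negative offset, so the softmax
    # renormalizes over the non-target classes only.
    log_p_student = F.log_softmax(logits_student / temperature - 1000.0 * gt_mask, dim=1)
    p_teacher = F.softmax(logits_teacher / temperature - 1000.0 * gt_mask, dim=1)
    # KL divergence between the teacher's and student's non-target distributions.
    return F.kl_div(log_p_student, p_teacher, reduction="batchmean") * temperature ** 2

# Binary case: after masking the target class, exactly one class remains,
# so both non-target distributions are (numerically) a constant 1.0 and the
# KL term carries no signal.
torch.manual_seed(0)
logits_s = torch.randn(8, 2)           # hypothetical student logits (2 classes)
logits_t = torch.randn(8, 2)           # hypothetical teacher logits (2 classes)
labels = torch.randint(0, 2, (8,))
print(nckd_sketch(logits_s, logits_t, labels))   # prints ~0.0
```

This matches the behaviour reported above: once the learning rate no longer blows up the logits, the NCKD component simply stays at 0 on a binary task.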
