MAML requires 2nd-order gradients, which are expensive to compute.
To speed up training, we use a 1st-order approximation of the gradients at the beginning of training, and then switch to the full 2nd-order gradients after `SECOND_ORDER_GRAD_ITER` iterations.
In other words, `SECOND_ORDER_GRAD_ITER` decides for how many steps the gradients are approximated to 1st order.
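The switch described above can be sketched numerically. This is a minimal, hypothetical example (not the repo's code) using a scalar task loss `(w - t)^2`, where the full 2nd-order meta-gradient differs from the 1st-order approximation only by the inner-update Jacobian term:

```python
# Hypothetical sketch of switching from the 1st-order MAML approximation
# to full 2nd-order gradients after SECOND_ORDER_GRAD_ITER steps.
# Task loss: (w - t)^2, one inner-loop gradient step.

ALPHA = 0.1                 # inner-loop learning rate (assumed value)
SECOND_ORDER_GRAD_ITER = 3  # steps that use the 1st-order approximation

def meta_gradient(w, t, step):
    """Outer-loop gradient of the post-update loss w.r.t. the initial w."""
    grad_inner = 2.0 * (w - t)          # d/dw of (w - t)^2
    w_adapted = w - ALPHA * grad_inner  # one inner-loop update: w' = w - a*g
    grad_outer = 2.0 * (w_adapted - t)  # d/dw' of (w' - t)^2
    if step < SECOND_ORDER_GRAD_ITER:
        # 1st-order approximation: treat dw'/dw as identity,
        # i.e. drop the Hessian term of the inner update.
        return grad_outer
    # Full 2nd-order gradient: chain rule through the inner update.
    # For this quadratic loss, dw'/dw = 1 - 2*ALPHA.
    return grad_outer * (1.0 - 2.0 * ALPHA)

for step in range(5):
    print(step, meta_gradient(w=1.0, t=0.0, step=step))
```

Early steps return the cheaper approximate gradient; once `step` reaches the threshold, the extra Jacobian factor (here `1 - 2*ALPHA`) is included.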
As for `self.total_loss1`, you are right: it is not used for training. You may ignore that loss and its corresponding optimizer.
Dear Sir,

Amazing work! Congratulations!

I have a question: could you kindly tell me the full checkpoint path where I should place the model trained with large-scale training, so that I can use it as a pre-trained model for meta-transfer training? Currently the code says that there is no checkpoint file.

I'm looking forward to your reply. Thanks in advance.
What is the usage of `SECOND_ORDER_GRAD_ITER=0` and `self.total_loss1`?

As for `SECOND_ORDER_GRAD_ITER=0`: if we have already finished pre-training on large-scale datasets, I think it is useless in this meta-transfer learning step.

As for `self.total_loss1`: in this meta-transfer learning step, `total_loss1` is never used by any optimizer. Is that correct?