MAML requires 2nd-order gradients, which are expensive to compute.
To speed up training, we use a 1st-order approximation of the gradients at the beginning of training, and then switch to the full 2nd-order gradients after `SECOND_ORDER_GRAD_ITER` iterations.
In other words, `SECOND_ORDER_GRAD_ITER` decides for how many steps the gradients are approximated to 1st order.
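The switch described above can be sketched numerically. This is a minimal, hypothetical example (not the repo's code) using a scalar task loss `(w - t)^2`, where the full 2nd-order meta-gradient differs from the 1st-order approximation only by the inner-update Jacobian term:

```python
# Hypothetical sketch of switching from the 1st-order MAML approximation
# to full 2nd-order gradients after SECOND_ORDER_GRAD_ITER steps.
# Task loss: (w - t)^2, one inner-loop gradient step.

ALPHA = 0.1                 # inner-loop learning rate (assumed value)
SECOND_ORDER_GRAD_ITER = 3  # steps that use the 1st-order approximation

def meta_gradient(w, t, step):
    """Outer-loop gradient of the post-update loss w.r.t. the initial w."""
    grad_inner = 2.0 * (w - t)          # d/dw of (w - t)^2
    w_adapted = w - ALPHA * grad_inner  # one inner-loop update: w' = w - a*g
    grad_outer = 2.0 * (w_adapted - t)  # d/dw' of (w' - t)^2
    if step < SECOND_ORDER_GRAD_ITER:
        # 1st-order approximation: treat dw'/dw as identity,
        # i.e. drop the Hessian term of the inner update.
        return grad_outer
    # Full 2nd-order gradient: chain rule through the inner update.
    # For this quadratic loss, dw'/dw = 1 - 2*ALPHA.
    return grad_outer * (1.0 - 2.0 * ALPHA)

for step in range(5):
    print(step, meta_gradient(w=1.0, t=0.0, step=step))
```

Early steps return the cheaper approximate gradient; once `step` reaches the threshold, the extra Jacobian factor (here `1 - 2*ALPHA`) is included.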
As for `self.total_loss1`, you are right: it is not used for training. You may ignore that loss and its corresponding optimizer.
Dear Sir,

Amazing work! Congratulations!

I have a question: could you kindly tell me the full checkpoint path where I should place the model trained with large-scale training, so that I can use it as a pre-trained model for meta-transfer training? Currently the code says that there is no checkpoint file.

I'm looking forward to your reply. Thanks in advance.
What is the usage of `SECOND_ORDER_GRAD_ITER=0` and `self.total_loss1`?

As for `SECOND_ORDER_GRAD_ITER=0`: if we have already finished pre-training on large-scale datasets, I think it is useless in this meta-transfer learning step.

As for `self.total_loss1`: in this meta-transfer learning step, `total_loss1` is never used by any optimizer. Is that correct?