Shouldn't we drop on the second dimension in the drop_path function? #143

woodywff · 2020-06-09T04:28:39Z

This is the drop_path function in utils.py:

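import torch
from torch.autograd import Variable

def drop_path(x, drop_prob):
  if drop_prob > 0.:
    keep_prob = 1.-drop_prob
    # One Bernoulli draw per sample: shape (batch, 1, 1, 1) broadcasts over C, H, W
    mask = Variable(torch.cuda.FloatTensor(x.size(0), 1, 1, 1).bernoulli_(keep_prob))
    x.div_(keep_prob)  # rescale so the expected value is unchanged
    x.mul_(mask)       # zero the entire output for the dropped samples
  return x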

Question:
Why is the mask drawn along the batch dimension (the 1st dimension)?
Shouldn't we randomly keep or drop individual filters instead, i.e. draw the mask along the channel dimension (the 2nd dimension)?
Thank you :-)


Jasha10 · 2020-06-11T16:26:21Z

My understanding is that dropping from the second dimension would be an implementation of "drop channel", whereas dropping from the first dimension is "drop path".
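To make the distinction concrete, here is a minimal sketch of both variants side by side. It is not the repository's code: it runs on CPU, avoids the in-place updates, and the `training` flag and the `drop_channel` helper are hypothetical names added for illustration. The only real difference between the two is the mask shape: `(N, 1, 1, 1)` removes the whole output for a sample, while `(N, C, 1, 1)` removes individual channels (similar in spirit to `torch.nn.Dropout2d`).

```python
import torch


def drop_path(x, drop_prob, training=True):
    """Zero the whole output of this path for a random subset of samples."""
    if not training or drop_prob <= 0.0:
        return x
    keep_prob = 1.0 - drop_prob
    # One Bernoulli draw per sample; (N, 1, 1, 1) broadcasts over C, H, W.
    mask = torch.empty(
        x.size(0), 1, 1, 1, dtype=x.dtype, device=x.device
    ).bernoulli_(keep_prob)
    return x / keep_prob * mask


def drop_channel(x, drop_prob, training=True):
    """Hypothetical variant: zero individual channels, independently per sample."""
    if not training or drop_prob <= 0.0:
        return x
    keep_prob = 1.0 - drop_prob
    # One draw per (sample, channel); (N, C, 1, 1) broadcasts over H, W only.
    mask = torch.empty(
        x.size(0), x.size(1), 1, 1, dtype=x.dtype, device=x.device
    ).bernoulli_(keep_prob)
    return x / keep_prob * mask


torch.manual_seed(0)
x = torch.ones(4, 3, 2, 2)  # N=4 samples, C=3 channels, 2x2 feature maps
path_out = drop_path(x, 0.5)
chan_out = drop_channel(x, 0.5)

# drop_path: each sample is either entirely zero or entirely rescaled.
print(path_out.flatten(1).sum(dim=1))
# drop_channel: channels within the same sample can differ.
print(chan_out.flatten(2).sum(dim=2))
```

In a cell that sums the outputs of several operations, the per-sample mask means a given sample either receives that operation's full (rescaled) contribution or none of it, which is why the function is called "drop path" rather than a channel-wise dropout.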
