
Shouldn't we drop on the second dimension in the drop_path function? #143

Open · woodywff opened this issue Jun 9, 2020 · 1 comment

woodywff commented Jun 9, 2020

This is the dropout function in utils.py:

import torch
from torch.autograd import Variable

def drop_path(x, drop_prob):
  if drop_prob > 0.:
    keep_prob = 1. - drop_prob
    # Per-example Bernoulli mask of shape (batch, 1, 1, 1): each sample in
    # the batch is either kept (1) or dropped (0) as a whole.
    mask = Variable(torch.cuda.FloatTensor(x.size(0), 1, 1, 1).bernoulli_(keep_prob))
    x.div_(keep_prob)  # rescale so the expected activation is unchanged
    x.mul_(mask)       # zero out the dropped samples in place
  return x

Question:
Why do we drop along the batch dimension (the 1st dimension)?
Shouldn't we randomly keep and drop some of the channels/filters instead (along the 2nd dimension)?
Thank you :-)


Jasha10 commented Jun 11, 2020

My understanding is that dropping along the second (channel) dimension would be an implementation of "drop channel", whereas dropping along the first (batch) dimension is "drop path": with a per-example mask, a dropped sample has this branch's output zeroed entirely, so that sample effectively skips the path.
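
For contrast, here is a minimal sketch (not code from this repo) of what masking along the 2nd dimension would look like; drop_channel is a hypothetical name, and the torch.full/torch.bernoulli style is an assumption in place of the older Variable/FloatTensor idiom:

import torch

def drop_channel(x, drop_prob):
  # Hypothetical variant: zeroes individual channels per sample, i.e. a mask
  # of shape (batch, channels, 1, 1) instead of (batch, 1, 1, 1).
  if drop_prob > 0.:
    keep_prob = 1. - drop_prob
    mask = torch.bernoulli(
        torch.full((x.size(0), x.size(1), 1, 1), keep_prob, device=x.device))
    x = x / keep_prob * mask
  return x

With drop_path's (batch, 1, 1, 1) mask, a dropped sample loses this branch's entire output; with the (batch, channels, 1, 1) mask above, every sample still uses the branch but loses random channels, which is closer to torch.nn.Dropout2d-style channel dropout than to drop path.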
