inconsistency with the original paper #24

Open
duducheng opened this issue Jan 24, 2019 · 9 comments
Labels
enhancement New feature or request v0.4

Comments

@duducheng

Hello, thanks for your nice code!

I found two inconsistencies with the original paper, and both should be easy to fix:

  1. The gamma: in the original paper, every block in block_mask is a complete square (or cube), since the block centers are only sampled in the central part of the feature map.
  2. The paper says that each channel uses a different mask, while in your implementation all channels share the same one.

I just noticed these. I don't actually know whether they are effective tricks, since the paper does not discuss them in enough detail :)
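For reference, the gamma from the paper's Equation 1 can be sketched in a few lines. This is a minimal illustration and not the repository's code: the function name `dropblock_gamma` and its arguments are hypothetical, and a square feature map is assumed. The second factor rescales the rate because block centers are only sampled from the `(feat_size - block_size + 1)^2` positions where a full block fits.

```python
def dropblock_gamma(drop_prob: float, block_size: int, feat_size: int) -> float:
    """Bernoulli rate for block centers, following Eq. 1 of the DropBlock paper.

    drop_prob is (1 - keep_prob). A square feat_size x feat_size map is assumed.
    """
    if not 1 <= block_size <= feat_size:
        raise ValueError("block_size must be between 1 and feat_size")
    # Number of positions per axis where a whole block fits inside the map.
    valid = feat_size - block_size + 1
    return (drop_prob / block_size ** 2) * (feat_size ** 2 / valid ** 2)
```

With `block_size=1` the valid region is the whole map, so the rate reduces to `drop_prob`, which is ordinary per-unit dropout.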

@miguelvr
Owner

The gamma issue is a minor thing but I can have a look at it.

The channels share the same mask in the paper.

@duducheng
Author

“We experimented with a shared DropBlock mask across different feature channels or each feature
channel has its DropBlock mask. Algorithm 1 corresponds to the latter, which tends to work better in
our experiments.” (page 2 bottom line)
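The difference between the two variants in that quote comes down to the channel dimension of the sampled mask. The following is a rough PyTorch sketch under stated assumptions, not the repository's implementation: `sample_block_mask` and its `shared` flag are hypothetical names, `gamma` is taken as given, the input is `(N, C, H, W)`, and `block_size` is odd and no larger than `H` or `W`. Centers are drawn only where a full block fits, so every dropped block is a complete square.

```python
import torch
import torch.nn.functional as F


def sample_block_mask(x, gamma, block_size, shared=False):
    """Return a keep-mask (1 = keep, 0 = drop) that broadcasts against x (N, C, H, W)."""
    if block_size % 2 == 0:
        raise ValueError("this sketch only handles odd block sizes")
    n, c, h, w = x.shape
    # One mask channel if shared across channels, otherwise one per feature channel.
    mask_channels = 1 if shared else c
    valid_h, valid_w = h - block_size + 1, w - block_size + 1
    # Sample block centers only in the region where a whole block fits.
    centers = (torch.rand(n, mask_channels, valid_h, valid_w, device=x.device) < gamma).to(x.dtype)
    pad = block_size // 2
    # Zero-pad back to H x W, then grow each center into a block_size x block_size square.
    centers = F.pad(centers, (pad, pad, pad, pad))
    dropped = F.max_pool2d(centers, kernel_size=block_size, stride=1, padding=pad)
    return 1.0 - dropped
```

A caller would multiply `x` by the returned mask and rescale by the fraction of kept units, as the paper's Algorithm 1 does; with `shared=True` the single-channel mask simply broadcasts over `C`.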

@miguelvr
Owner

miguelvr commented Jan 24, 2019

Sure, that is easily fixable

Expect it soon

Edit: you can also do a PR if you want

@miguelvr miguelvr added the enhancement New feature or request label Feb 3, 2019
@miguelvr miguelvr added question Further information is requested enhancement New feature or request v0.4 and removed enhancement New feature or request question Further information is requested labels Feb 3, 2019
@huyvnphan

Hi,
Any updates on this?
Best

@miguelvr
Owner

> Hi,
> Any updates on this?
> Best

I haven't had much free time to deal with this, but I will review and accept merge requests

@JarvisKevin

I also found some differences between the paper and the code.

@Eliza-and-black

Eliza-and-black commented Dec 4, 2021

To solve this issue, you could have a look at this fork (only for DropBlock2D)

@miguelvr
Owner

miguelvr commented Dec 4, 2021

> To solve this issue, you could have a look at this fork (only for DropBlock2D)

I would encourage you to do a pull request

@JohnDLee

JohnDLee commented Jan 25, 2022

If you do look at the code linked above, note that mask_center is not created on the input's device, so the nn.ZeroPad2d call runs on the CPU by default. Since I was training on a GPU, this slowed a single forward pass of my model (which uses many DropBlock layers) from 0.15 seconds to 3 seconds.
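The fork's code is not reproduced here, so the following only sketches the general pattern for avoiding this kind of slowdown: create the sampled tensor directly on the input's device. The function name `sample_centers` and its arguments are hypothetical, and `gamma` is taken as given.

```python
import torch


def sample_centers(x, gamma, block_size):
    """Sample block centers on the same device and dtype as x (N, C, H, W)."""
    n, c, h, w = x.shape
    valid = (n, c, h - block_size + 1, w - block_size + 1)
    # Passing device=x.device keeps the padding and pooling that follow on the
    # GPU when x lives there, instead of silently running them on the CPU.
    probs = torch.full(valid, gamma, device=x.device, dtype=x.dtype)
    return torch.bernoulli(probs)
```

For a tensor that already exists on the CPU, moving it with `mask_center = mask_center.to(x.device)` before padding should have a similar effect, at the cost of one host-to-device copy per call.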
