Implementation Discrepancy Relative to Publication #548

Open
Yatagarasu50469 opened this issue Nov 21, 2023 · 0 comments


@JiahuiYu
Salutations,

According to the original publication, the input (I) to a gated convolution is convolved with two separate sets of weights (W_f for features and W_g for gating), the two results are activated with different functions (elu for features, sigmoid for gating), and the activated results are then multiplied together element-wise. For the first gated convolution in the model, given cnum=48, W_g and W_f would each have a depth of 48, and the gated convolution would produce 48 output channels. This interpretation matches both the publication and a comment made in a prior issue's discussion, #62 (comment), stating that gated convolution can be implemented as (Code A):

x1 = self.conv1(x)
x2 = self.conv2(x)
x = sigmoid(x2) * activation(x1)

However, in the gen_conv definition provided, there is a single Conv2D with cnum=48, whose output is split in half into two smaller tensors (x and y) of just 24 channels each; each half is activated and the two are then multiplied together. As implemented (Code B), the gated convolution therefore produces only 24 output channels.

x, y = tf.split(x, 2, 3)
x = activation(x)
y = tf.nn.sigmoid(y)
x = x * y
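
To make the channel arithmetic of Code B concrete, here is a minimal NumPy sketch (not the repository's code). The random tensor `conv_out` is a hypothetical stand-in for the output of the single Conv2D with cnum=48 filters in NHWC layout, the 8x8 spatial size is arbitrary, and `elu` is assumed to be the activation:

```python
import numpy as np

def elu(v):
    # ELU with alpha = 1; the minimum() guards expm1 against overflow
    return np.where(v > 0, v, np.expm1(np.minimum(v, 0)))

def sigmoid(v):
    return 1.0 / (1.0 + np.exp(-v))

cnum = 48
# Hypothetical stand-in for the output of one Conv2D with cnum filters (NHWC)
conv_out = np.random.default_rng(0).standard_normal((1, 8, 8, cnum))

# Mirrors tf.split(x, 2, 3): two equal halves along the channel axis
x, y = np.split(conv_out, 2, axis=3)
gated = elu(x) * sigmoid(y)

print(gated.shape)  # (1, 8, 8, 24)
```

Under these assumptions, the gated output carries cnum/2 = 24 channels rather than 48.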

Similarly, referencing the same prior issue discussion, the other code given there (Code C):

x = self.conv(x)
x1, x2 = split(x, 2) # split along channels 
x = sigmoid(x2) * activation(x1)

would not be equivalent to Code A unless "self.conv" has twice as many filters as each of "self.conv1" and "self.conv2" (assuming, of course, that self.conv1 and self.conv2 have the same number of filters).
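
The condition under which Code A and Code C agree can be checked numerically. The following is only a sketch under simplifying assumptions: 1x1 convolutions without bias are modeled as matrix products over the channel axis, the weights are random, and the names `w_f`, `w_g`, and `w` are illustrative rather than taken from either code base:

```python
import numpy as np

def elu(v):
    # ELU with alpha = 1; the minimum() guards expm1 against overflow
    return np.where(v > 0, v, np.expm1(np.minimum(v, 0)))

def sigmoid(v):
    return 1.0 / (1.0 + np.exp(-v))

rng = np.random.default_rng(0)
c_in, c_out = 3, 48
inputs = rng.standard_normal((2, 8, 8, c_in))  # NHWC

# Code A: two separate convolutions, each with c_out filters
w_f = rng.standard_normal((c_in, c_out))  # feature weights
w_g = rng.standard_normal((c_in, c_out))  # gating weights
out_a = sigmoid(inputs @ w_g) * elu(inputs @ w_f)

# Code C: one convolution with 2 * c_out filters, then a channel split
w = np.concatenate([w_f, w_g], axis=1)    # shape (c_in, 2 * c_out)
x1, x2 = np.split(inputs @ w, 2, axis=-1)
out_c = sigmoid(x2) * elu(x1)

assert out_a.shape == out_c.shape == (2, 8, 8, c_out)
assert np.allclose(out_a, out_c)
```

With the single convolution holding 2 * c_out filters, both variants yield the same c_out channels; with only c_out filters, the split variant would yield c_out / 2.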

Any insight you can provide regarding this apparent discrepancy would be greatly appreciated.
Thank you in advance.
