
About the bias of 3d convolutions in the attention block #36

Open
liucu0135 opened this issue Feb 21, 2020 · 1 comment


@liucu0135

Hello, I noticed that you did not set bias=False in the 1x1 3D convolution layers that implement

phi = W_phi * x + B_phi
g = W_g * x + B_g
theta = W_theta * x + B_theta

I have read some materials and papers, and none of them mention whether bias terms like B_phi, B_g, and B_theta should be present.
I tried my implementation with bias=True, as you did, and it did improve performance.

I just want to ask how you came to the idea of setting bias=True (the default), in case I missed something in my reading.
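For concreteness, here is a minimal sketch of the layers in question, assuming a PyTorch-style non-local block (the channel sizes are placeholders, not values taken from this repo):

```python
import torch
import torch.nn as nn

in_channels, inter_channels = 64, 32  # hypothetical channel sizes

# 1x1x1 3D convolutions for the three embeddings. PyTorch's nn.Conv3d
# defaults to bias=True, so each layer computes W * x + B unless
# bias=False is passed explicitly.
theta = nn.Conv3d(in_channels, inter_channels, kernel_size=1)  # bias=True by default
phi = nn.Conv3d(in_channels, inter_channels, kernel_size=1)    # bias=True by default
g = nn.Conv3d(in_channels, inter_channels, kernel_size=1)      # bias=True by default

x = torch.randn(2, in_channels, 4, 16, 16)  # (N, C, T, H, W)
print(theta(x).shape)  # torch.Size([2, 32, 4, 16, 16])
```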

@AlexHex7
Owner

Hi @liucu0135. In fact, I overlooked this point.
My guess is that bias parameters can improve the fitting ability of a model, while in some algorithms the conv layers omit the bias parameters in order to avoid overfitting.
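One common case where conv layers omit the bias is a conv immediately followed by BatchNorm, since the BN shift makes the conv bias redundant (a generic sketch, not code from this repo):

```python
import torch.nn as nn

# Conv followed by BatchNorm: BN subtracts the per-channel mean and
# re-adds a learned shift (beta), so a conv bias before it is redundant
# and is conventionally disabled.
block = nn.Sequential(
    nn.Conv3d(64, 32, kernel_size=1, bias=False),
    nn.BatchNorm3d(32),
    nn.ReLU(inplace=True),
)
```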
