
Question about output size of D network #38

Open
jychoi118 opened this issue May 14, 2020 · 3 comments


jychoi118 commented May 14, 2020

The output of the original StyleGAN's discriminator is a scalar predicting whether the given image is real or fake. However, the output shape of your D network is batch x (2 * dlatent_size), per the line below.

ALAE/net.py

Line 893 in 5d8362f

outputs = 2 * dlatent_size if i == mapping_layers - 1 else mapping_fmaps
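For context, a minimal sketch of how a mapping network built with that rule ends up with a batch x (2 * dlatent_size) output. The layer sizes are hypothetical and activations are omitted; this is not the repo's actual code, just an illustration of the quoted line:

```python
import torch
import torch.nn as nn

# Hypothetical sizes for illustration only.
mapping_fmaps, dlatent_size, mapping_layers = 256, 512, 3

# Every hidden layer has mapping_fmaps outputs, but the last layer
# widens to 2 * dlatent_size, per the quoted line from net.py.
layers = []
inputs = dlatent_size
for i in range(mapping_layers):
    outputs = 2 * dlatent_size if i == mapping_layers - 1 else mapping_fmaps
    layers.append(nn.Linear(inputs, outputs))
    inputs = outputs
mapping = nn.Sequential(*layers)

x = torch.randn(4, dlatent_size)
print(mapping(x).shape)  # torch.Size([4, 1024]) == batch x (2 * dlatent_size)
```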

Therefore, you select a single element out of the 2 * dlatent_size elements as the final output of the D network (the value fed to the loss function) in the line below (Z_).

ALAE/model.py

Line 111 in 5d8362f

return Z[:, :1], Z_[:, 1, 0]

I'm curious why the output shape of the D network is batch x (2 * dlatent_size), since only one element is used for training and the rest are unused.

Also, I can't understand why the output of the D network is reshaped like this.

ALAE/net.py

Line 903 in 5d8362f

return x.view(x.shape[0], 2, x.shape[2] // 2)
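Taken together, the reshape and the indexing mean only one scalar per sample reaches the loss. A small sketch of the two quoted lines, assuming the mapping output has the shape (batch, 1, 2 * dlatent_size) going into the view (hypothetical sizes):

```python
import torch

batch, dlatent_size = 4, 512

# Hypothetical stand-in for the D/mapping network's raw output.
x = torch.randn(batch, 1, 2 * dlatent_size)

# The quoted reshape: split the last axis into two halves.
Z_ = x.view(x.shape[0], 2, x.shape[2] // 2)  # -> (batch, 2, dlatent_size)

# The quoted indexing: a single scalar per sample is kept for the loss.
d_out = Z_[:, 1, 0]                          # -> (batch,)
print(Z_.shape, d_out.shape)
```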

@jychoi118 jychoi118 changed the title Question about D network Question about output size of D network May 14, 2020

rardz commented May 21, 2020

Seems like a hastily modified version of some of StyleGAN's modules.


6b5d commented Sep 1, 2020

I'm confused, too.

Since mapping_tl does not have an activation function, it seems to be the 3-layer multilinear map mentioned in the paper. The output of mapping_tl is supposed to be used to minimize the code reconstruction error, but in the code the encoder's output style is used to minimize that error, while the output of mapping_tl is used to compute the discriminator loss.
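For reference, a "multilinear map" here would be several linear layers composed with no nonlinearity between them, which mathematically collapses into a single linear map. A minimal sketch with hypothetical sizes, not the repo's actual code:

```python
import torch
import torch.nn as nn

dlatent_size = 512  # hypothetical latent width

# Three linear layers with no activation between them: the composition
# is itself linear (W3 @ W2 @ W1 plus biases), i.e. a multilinear map.
mapping_tl = nn.Sequential(
    nn.Linear(dlatent_size, dlatent_size),
    nn.Linear(dlatent_size, dlatent_size),
    nn.Linear(dlatent_size, dlatent_size),
)

w = torch.randn(2, dlatent_size)
print(mapping_tl(w).shape)  # torch.Size([2, 512])
```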

I wonder if this is a bug, or whether I'm just misunderstanding something.

podgorskiy (Owner) commented Dec 7, 2020

@jychoi118 ,

I'm curious why the output shape of D network is batch x (2 * dlatent_size), since only one element is used for training and the others are useless.

Yes, just one is used; the others should not affect anything.
That's the result of trying many configurations, but now it has to stay that way to remain compatible with the trained models. I'll adjust the code to be a bit clearer here.

@6b5d ,

does not have the activation function

Yes, that's a bug.

podgorskiy pushed a commit that referenced this issue Dec 7, 2020
Fixed memory leaking related to matplotlib #17

Hope I did not break anything. If you have any issues, try a revision before this commit.