
TensorFlow-infogan training objective functions code not consistent with the paper #13

Open
hengshiyu opened this issue Feb 28, 2019 · 0 comments


hengshiyu commented Feb 28, 2019

Hi,

I read the InfoGAN paper (https://arxiv.org/abs/1606.03657), which gives the objective functions for the discriminator (D), the generator (G), and the mutual-information network (Q) in equation (6).

So, in an implementation, D should maximize the original GAN term; G should minimize its own GAN term minus the mutual-information lower bound scaled by a tuning coefficient lambda; and Q should maximize the mutual-information lower bound (the paper optimizes a variational lower bound on the mutual information). I do not think your code implements this.

However, in tensorflow-infogan/infogan/__init__.py I found:

```python
train_mutual_info = generator_solver.minimize(
    neg_mutual_info_objective,
    var_list=generator_variables + discriminator_variables + mutual_info_variables
)
```

So the mutual-information term is not added to the generator's loss; instead, a separate op updates G, D, and Q jointly to maximize it. This is not consistent with the InfoGAN paper's definition in equation (6).
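For illustration, here is a hypothetical sketch of a variable split that would match equation (6). The variable names mirror the snippet above, but the function itself is mine, not code from tensorflow-infogan:

```python
# Hypothetical sketch: which variables each training op should update under
# eq. (6) of the InfoGAN paper. Names mirror the quoted snippet; the function
# is illustrative, not taken from tensorflow-infogan.

def training_var_lists(generator_variables, discriminator_variables,
                       mutual_info_variables):
    """Map each training step to the variables it is allowed to update."""
    return {
        # D's GAN step: discriminator variables only; no MI term in its loss.
        "discriminator": discriminator_variables,
        # G's step (GAN term minus lambda * L_I): generator variables only.
        "generator": generator_variables,
        # Q's step (maximize L_I): Q's variables, never the discriminator's.
        "mutual_info": mutual_info_variables,
    }
```

The point is that the discriminator's variables never appear in the `var_list` of the mutual-information update.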

First, the discriminator loss in the paper's formulation does not contain the mutual-information lower bound, so D should not be used to maximize it.

Second, in equation (6), as in other GANs, G and D always carry opposite signs on the same term. Even if you think D influences the mutual information, D and G should influence it in opposite directions. So from the start, all three networks should not be used jointly to maximize the lower bound.

Third, I do not think the paper means to maximize the mutual-information lower bound in isolation. Rather, G should minimize its GAN objective minus lambda times the lower bound as a single combined objective, while Q maximizes the lower bound. Splitting the term out and maximizing it separately is not what the paper describes.
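To make the point concrete, a minimal sketch of how the three per-network losses could be combined under equation (6). All names here are illustrative, and the scalar inputs stand in for the actual loss tensors:

```python
# Hypothetical sketch of the per-network losses implied by eq. (6) of the
# InfoGAN paper. Names and signature are illustrative, not from the repo.

def infogan_losses(d_loss_gan, g_loss_gan, mi_lower_bound, lam=1.0):
    """Combine the GAN losses with the mutual-information lower bound L_I.

    d_loss: D's loss is the plain GAN term; it carries no MI contribution.
    g_loss: G minimizes its GAN term minus lambda * L_I (one combined loss).
    q_loss: Q maximizes L_I, i.e. minimizes -lambda * L_I.
    """
    d_loss = d_loss_gan
    g_loss = g_loss_gan - lam * mi_lower_bound
    q_loss = -lam * mi_lower_bound
    return d_loss, g_loss, q_loss
```

With this structure, the mutual-information term reaches G through G's own loss, and D's loss is untouched by it.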

Could you help me with this?
