
high res version #10

Open
kidach1 opened this issue Jan 11, 2020 · 6 comments

Comments


kidach1 commented Jan 11, 2020

Thank you for sharing.
Did you try a high-resolution version (like 256x or 512x)?
If not, what difficulties do you anticipate?

utkarshojha (Collaborator) commented

Hi,

Yes we did try one version for 256x256 resolution, and it was working decently well. You can simply add more upsampling layers towards the end of all the generator modules.
We haven't tried it for 512x512, but it might be tricky to directly generate images at the 512x512 resolution (in FineGAN, the resolution remains the same throughout the pipeline). You might need some variant of StackGAN or ProgressiveGAN to reach that resolution.
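For illustration, here is a minimal sketch of the kind of 2x upsampling block that could be appended towards the end of each generator module to go from 128x128 to 256x256. This is an assumption in the spirit of FineGAN's generator blocks, not the repo's actual code; the names and exact layer choices (nearest-neighbour upsample, 3x3 conv, BatchNorm, ReLU) are illustrative.

```python
import torch
import torch.nn as nn

# Hypothetical 2x upsampling block (illustrative, not FineGAN's actual upBlock).
def up_block(in_ch, out_ch):
    return nn.Sequential(
        nn.Upsample(scale_factor=2, mode="nearest"),
        nn.Conv2d(in_ch, out_ch, kernel_size=3, stride=1, padding=1, bias=False),
        nn.BatchNorm2d(out_ch),
        nn.ReLU(inplace=True),
    )

# Appending one such block doubles the spatial resolution: 128 -> 256.
x = torch.randn(1, 64, 128, 128)
y = up_block(64, 32)(x)
print(tuple(y.shape))  # (1, 32, 256, 256)
```

Each extra block doubles the side length, so one block per generator module covers 128 to 256, and two would be needed for 512.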


kidach1 commented Jan 14, 2020

@utkarshojha
thank you for the reply!

> Yes we did try one version for 256x256 resolution, and it was working decently well. You can simply add more upsampling layers towards the end of all the generator modules.

But if the generator's output size doubles, the discriminator inputs and real_imgs should also double in size (otherwise there is a size mismatch error), right? I tried that, but I couldn't get satisfactory results, as shown below.
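For the discriminator side, one way to consume the doubled inputs is to prepend one extra stride-2 block, which maps 256x256 back down to the 128x128 feature map the rest of the discriminator expects. This is a sketch under common DCGAN conventions (4x4 conv, stride 2, LeakyReLU), not the repo's actual code.

```python
import torch
import torch.nn as nn

# Hypothetical extra downsampling block for the discriminator (illustrative).
def down_block(in_ch, out_ch):
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, kernel_size=4, stride=2, padding=1, bias=False),
        nn.LeakyReLU(0.2, inplace=True),
    )

x = torch.randn(1, 3, 256, 256)   # doubled real/fake images
y = down_block(3, 64)(x)
print(tuple(y.shape))  # (1, 64, 128, 128)
```

With one such block in front, the remaining layers of the original 128x128 discriminator can stay unchanged.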

fake_imgs[0] (background stage): [image: count_000076000_fake_samples0]

fake_imgs[1] (parent stage): [image: count_000076000_fake_samples1]

fake_imgs[2] (child stage): [image: count_000076000_fake_samples2]

It seems that the bounding-box processing doesn't work well and the disentanglement of the background fails.
Could you share your code for 256x256, if possible?

utkarshojha (Collaborator) commented Jan 15, 2020

One problem with your implementation of the 256x256 version could be at the background stage. We use a PatchGAN at the background stage, so we need to define the values of some hyperparameters, which are set in lines 381-383 of trainer.py. These parameters are needed to accurately extract patches lying outside the bounding box.

For the 256x256 version, the updated values of those parameters would be:
self.patch_stride = float(8)
self.n_out = 24
self.recp_field = 70
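As a sanity check (my own arithmetic, assuming a pix2pix-style 70x70 PatchGAN built from five 4x4 convolutions with the first three at stride 2), these three values are mutually consistent: the receptive field of such a stack is 70 pixels and its effective stride is 8.

```python
# (kernel, stride) per layer of an assumed pix2pix-style 70x70 PatchGAN.
layers = [(4, 2), (4, 2), (4, 2), (4, 1), (4, 1)]

r, j = 1, 1  # receptive field and cumulative stride, grown layer by layer
for k, s in layers:
    r += (k - 1) * j  # each layer widens the field by (k - 1) * current stride
    j *= s

print(r, j)  # 70 8 -> matches recp_field = 70 and patch_stride = 8
```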
And yes, the real images and the discriminator inputs (and consequently the discriminator itself) would be different; there isn't anything else we do differently for the 256 case, apart from adding more layers to process the higher-resolution inputs.

My version of the 256x256 code isn't cleaned up, so I don't think it would be helpful to you. Try the correction I mentioned and let me know if it works; if not, I can look into it further.


kidach1 commented Jan 15, 2020

@utkarshojha
I tried following your suggestion, but things don't seem to change.

fake_imgs[0] (background stage): [image: count_000012000_fake_samples0]

fake_imgs[1] (parent stage): [image: count_000012000_fake_samples1]

fake_imgs[2] (child stage): [image: count_000012000_fake_samples2]

And these are my changes:
kidach1@e5c8abd

Could you check them?

utkarshojha (Collaborator) commented

Hi kidach1,
Sorry for the late response. I went through the changes you made, and they look fine. The only difference is the CROP_IMG_SIZE parameter: you've set it to 252, while in my version it is 254 (apologies for not mentioning this before). I'm not sure what difference this would make, but you should try it.
I've been very busy these past few weeks, but let me know the result. It's just that I might be a bit late to respond.

Thanks
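One plausible reason the 254 value matters (my own guess, from standard patch-tiling arithmetic, using the hyperparameters quoted earlier in this thread): with a 70-pixel receptive field and a stride of 8, a 254-pixel crop tiles into exactly 24 patches per side, matching n_out = 24, whereas a 252-pixel crop does not align to the stride.

```python
recp_field, stride = 70, 8  # values quoted above for the 256x256 version

def patches_per_side(crop):
    # Number of receptive-field-sized patches the crop tiles into at this stride.
    return (crop - recp_field) // stride + 1

print(patches_per_side(254))        # 24 -> consistent with n_out = 24
print((254 - recp_field) % stride)  # 0  -> tiles exactly
print((252 - recp_field) % stride)  # 6  -> misaligned
```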


kidach1 commented Jan 22, 2020

@utkarshojha
Thank you for the reply despite your busy schedule!
Unfortunately, changing CROP_IMG_SIZE doesn't seem to work (though training has only reached 25 epochs so far, the generated images at each stage are almost the same as in my comments above).

Are the channel sizes of G and D (this and this line) the same as yours?
