train from scratch using openimage failed #35

Open
sddai opened this issue Nov 23, 2021 · 4 comments

Comments

@sddai

sddai commented Nov 23, 2021

Hi Justin,
Thanks for the great repo. I ran into a strange problem. I went through the existing issues but did not find an answer. Details below.
1. I want to reproduce the HiFiC low model with OpenImages, but training from scratch (warmup + GAN) failed: the bpp is much higher than expected, around 0.3, while your pretrained model works fine and gives only 0.078 bpp on the same file.
2. For OpenImages, the training set contains 100,000 images (the first 100,000 images of the original train_0 sub-zip), while the validation set contains 41,620 images (the full original validation set).
3. I see that you trained for only 200K steps (far fewer than the 1M in the paper). I modified the number of epochs so that I train for 500K steps in each phase.
4. Apart from the differences mentioned above, no config is changed.
5. The TensorBoard output below seems to show that the test data and validation data differ a lot in bpp, and the bpp during training seems to vary over a fairly wide range. I also tried a larger batch_size, such as 16, but that failed too.
I would appreciate any suggestions or insights.
Thanks!
[Image: TensorBoard screenshot (004f3fb46c4c9823a14a54663f67747)]
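
For readers comparing the 0.3 and 0.078 figures above: bits per pixel (bpp) is the number of compressed bits divided by the number of pixels. The snippet below is a minimal sketch of that arithmetic only. The function name and the example file size are hypothetical and are not taken from the repo, and it assumes the byte count is the size of the actual compressed bitstream rather than an estimate from an entropy model.

```python
def bits_per_pixel(num_bytes: int, height: int, width: int) -> float:
    """Return compressed bits divided by pixel count.

    Hypothetical helper for illustration; not part of the repo.
    """
    if height <= 0 or width <= 0:
        raise ValueError("image dimensions must be positive")
    return 8.0 * num_bytes / (height * width)


# Hypothetical example: a 768x512 image whose compressed file is 3,834 bytes.
example_bpp = bits_per_pixel(num_bytes=3834, height=512, width=768)
print(f"{example_bpp:.3f} bpp")
```

Checking a few files by hand this way can help tell whether a high bpp reported during training also shows up in the sizes of the files actually written to disk.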

@xyann237

xyann237 commented Dec 4, 2021

Excuse me, I ran into the same problem recently. Have you found the cause or solved it?

@Justin-Tan
Owner

That's strange - are you training completely from scratch? If so, you may need to train the base model (warmup phase) for longer than you currently are. I remember encountering this issue at the start.

@sddai
Author

sddai commented Dec 13, 2021

Thanks for your feedback. No, I trained 500K steps for warmup and 500K steps for GAN. I'll try training for more steps.
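
As a rough sanity check on the step counts discussed in this thread, the number of epochs needed to reach a target step count follows from the dataset size and batch size. This is a sketch under stated assumptions: one optimizer step per batch, and (by default) the last incomplete batch dropped. The function names are hypothetical, and the batch size of 16 is the larger value mentioned in the original post, not a confirmed default of the repo.

```python
import math


def steps_per_epoch(n_images: int, batch_size: int, drop_last: bool = True) -> int:
    """Batches (and so steps) in one pass over the data. Hypothetical helper."""
    if drop_last:
        return n_images // batch_size
    return math.ceil(n_images / batch_size)


def epochs_for_steps(target_steps: int, n_images: int, batch_size: int,
                     drop_last: bool = True) -> int:
    """Smallest whole number of epochs that reaches at least target_steps."""
    per_epoch = steps_per_epoch(n_images, batch_size, drop_last)
    if per_epoch == 0:
        raise ValueError("batch_size is larger than the dataset")
    return math.ceil(target_steps / per_epoch)


# 100,000 training images, batch size 16, 500K steps per phase (from the thread).
print(steps_per_epoch(100_000, 16))            # steps in one epoch
print(epochs_for_steps(500_000, 100_000, 16))  # epochs needed for 500K steps
```

If the step counter shown in TensorBoard does not match this arithmetic, the epoch setting may not be producing the intended 500K steps per phase.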

@yifeipet

I also have this issue, but it's okay as long as the test images look good, and mine do.
