train from scratch using openimage failed #35

sddai · 2021-11-23T08:23:44Z

Hi Justin,
Thanks for the great repo. I ran into a strange problem and went through the existing issues without finding an answer. Details below.
1. I want to reproduce the HiFiC low model with OpenImages, but training from scratch (warmup + GAN) failed: the bpp is much higher than expected, around 0.3, while your pretrained model works fine and gives only 0.078 bpp on the same file.
2. For OpenImages, the training set contains 100,000 images (the first 100,000 images of the original train_0 sub-zip), while the validation set contains 41,620 images (the full original validation set).
3. I see that you trained for only 200K steps (far fewer than the 1M in the paper). I modified the epoch number so that I train 500K steps for each phase.
4. Apart from the differences mentioned above, no config is changed.
5. The TensorBoard output below seems to show that the test data and validation data differ a lot in bpp, and the bpp during training stays in a fairly high range. I also tried a larger batch_size such as 16, but that failed too.
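For reference, the bpp figures above mean compressed size in bits divided by the number of pixels. A minimal sketch of that calculation, assuming the compressed size in bytes and the image dimensions are known; the function name `bits_per_pixel` is hypothetical and not part of the repo, and the byte counts in the example are illustrative, not measurements from my runs:

```python
def bits_per_pixel(compressed_bytes: int, height: int, width: int) -> float:
    """Bits per pixel: total compressed bits divided by the pixel count."""
    if height <= 0 or width <= 0:
        raise ValueError("height and width must be positive")
    return 8.0 * compressed_bytes / (height * width)


# Illustrative numbers only: a 768x512 image at two compressed sizes.
low_rate = bits_per_pixel(3834, height=512, width=768)    # about 0.078 bpp
high_rate = bits_per_pixel(14746, height=512, width=768)  # about 0.3 bpp
print(f"{low_rate:.3f} bpp vs {high_rate:.3f} bpp")
```

On the same image, the gap between 0.078 and 0.3 bpp corresponds to a compressed file nearly four times larger.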
I hope it is not too much trouble to share some suggestions and insights.
Thanks!

xyann237 · 2021-12-04T10:18:49Z

Excuse me, I encountered the same problem recently. Have you found the cause or solved it?

Justin-Tan · 2021-12-05T00:11:56Z

That's strange - are you training completely from scratch? If so, you may need to train the base model (warmup phase) for longer than you currently do. I remember encountering this issue at the start.

sddai · 2021-12-13T01:40:51Z

Thanks for your feedback. No, I trained 500K steps for warmup and 500K steps for GAN. I'll try training for more steps.

yifeipet · 2022-02-11T06:15:56Z

I also have this issue, but it's okay as long as the test images look good, and mine do.
