I modified the network structure and I have no idea if the training is on the right track #527

langka9 · 2022-04-27T03:10:19Z

Hi, @JiahuiYu
I modified your network structure and am now training it. This is also my first time building a model for image inpainting, so I have no idea whether my training is on the right track.

First of all, my dataset is Places365, with about 1.8 million images in total. Because of GPU memory limits, I set the batch size to 16 and the total number of iterations to 5,000,000. One epoch takes 112,500 iterations, so the full run covers about 44 epochs. My training method is: first, train with only content loss (L1 loss), perceptual loss (VGG loss), and style loss (the style loss used in style transfer) so that the inpainting network pre-converges, and then add the GAN loss so that it fully converges. This training schedule follows Globally and Locally Consistent Image Completion.
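As a sanity check on the numbers above, here is a minimal sketch of the iteration arithmetic. The function names are my own for illustration and are not part of the repository.

```python
import math

NUM_IMAGES = 1_800_000        # Places365 training images, as stated above
BATCH_SIZE = 16
TOTAL_ITERATIONS = 5_000_000


def iterations_per_epoch(num_images: int, batch_size: int) -> int:
    """Number of iterations needed to see every image once."""
    return math.ceil(num_images / batch_size)


def epochs_completed(iteration: int, num_images: int, batch_size: int) -> float:
    """Fraction of epochs covered after `iteration` training steps."""
    return iteration / iterations_per_epoch(num_images, batch_size)


iters = iterations_per_epoch(NUM_IMAGES, BATCH_SIZE)
total_epochs = epochs_completed(TOTAL_ITERATIONS, NUM_IMAGES, BATCH_SIZE)

print(iters)                   # 112500
print(round(total_epochs, 1))  # 44.4
```

The same helper can be called with the current iteration count to see how much of the dataset the model has seen so far.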

Training has now run for 22,700 iterations and is still in the pre-convergence stage. The loss functions at this stage are content loss (L1 loss), perceptual loss (VGG loss), and style loss. I periodically check the model's inpainting results on a validation image, but the results are very unsatisfactory.

Content loss (L1 loss between the inpainted result and the ground truth): the loss is decreasing overall, but it fluctuates a lot. I don't know if this is normal.
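To judge the overall trend despite the per-iteration noise, one option is to smooth the logged loss values with an exponential moving average. This is only a sketch under my own assumptions: the function name, the `alpha` value, and the toy loss values are made up for illustration and are not my real training logs.

```python
def ema_smooth(values, alpha=0.1):
    """Exponential moving average of a sequence of loss values.

    alpha is the weight given to the newest value (an assumed setting):
    smaller alpha means a smoother curve that reacts more slowly.
    """
    smoothed = []
    avg = None
    for v in values:
        # The first value initialises the average; later values are blended in.
        avg = v if avg is None else (1.0 - alpha) * avg + alpha * v
        smoothed.append(avg)
    return smoothed


# Toy, made-up values that jump around while trending downward.
toy_losses = [1.0, 0.6, 1.1, 0.5, 0.9, 0.4, 0.8, 0.3]
for raw, smooth in zip(toy_losses, ema_smooth(toy_losses, alpha=0.3)):
    print(f"raw={raw:.2f}  smoothed={smooth:.3f}")
```

The same smoothing can be applied to each of the loss curves separately before comparing them.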

Perceptual loss (it compares the VGG features of the real images with the VGG features of the generated images): this also decreases overall, but it fluctuates even more.

Style loss (the style loss used in style transfer): this also decreases overall, but the fluctuation range is extremely large.
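For reference, the style loss from style transfer compares Gram matrices of feature maps. Below is a minimal sketch in plain Python so that it runs without dependencies; real training code would use the framework's tensor operations on VGG features. The normalisation by C * N and the use of an L1 distance between Gram matrices are assumptions for illustration, not necessarily what my code or the repository does.

```python
def gram_matrix(features):
    """Gram matrix of a feature map.

    features: list of C channels, each a flat list of N = H * W activations.
    Returns a C x C matrix of channel correlations, normalised by C * N
    (the normalisation is an assumption; conventions differ).
    """
    c = len(features)
    n = len(features[0])
    return [
        [sum(a * b for a, b in zip(features[i], features[j])) / (c * n)
         for j in range(c)]
        for i in range(c)
    ]


def style_loss(feat_pred, feat_true):
    """Mean absolute difference between the two Gram matrices (assumed L1)."""
    g_pred = gram_matrix(feat_pred)
    g_true = gram_matrix(feat_true)
    c = len(g_pred)
    total = sum(abs(g_pred[i][j] - g_true[i][j])
                for i in range(c) for j in range(c))
    return total / (c * c)


# Toy 2-channel, 2-pixel feature maps (made-up values).
pred = [[1.0, 0.0], [0.0, 1.0]]
true = [[0.0, 0.0], [0.0, 0.0]]
print(style_loss(pred, pred))  # 0.0
print(style_loss(pred, true))  # 0.125
```

Because the Gram matrix multiplies activations together, small changes in the features can produce large changes in this loss, which is one possible reason its curve swings more than the L1 loss.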

This is the inpainting result on the validation image: the filled region looks like a miniature copy of the middle part of the original image, tiled over and over. I don't know whether this is because the network hasn't converged yet, whether it is normal at this stage of training, or whether there is a problem with my model structure or my code.
