The training does not converge #10

Open
cactuslei opened this issue Feb 23, 2018 · 10 comments

Comments

@cactuslei

Hi, thanks a lot for your great work. Your code is very clear and easy to understand.

However, when I train with your main.py, the network does not converge. The only change I made was setting "SAMPLE" from True to False, because with True only 500 samples are used for training. The training loss always stays around 2.3, and training terminates because there is no improvement after 1000 steps. Could you tell me the best accuracy you achieved with your code? Thanks a lot.
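
For reference, a loss stuck near 2.3 is what cross-entropy gives for a uniform guess over 10 classes, since ln(10) ≈ 2.303. The thread does not state the number of classes, so the value 10 below is an assumption, and `uniform_cross_entropy` is just an illustrative helper:

```python
import math

def uniform_cross_entropy(num_classes):
    """Cross-entropy (in nats) of a uniform prediction over num_classes."""
    return -math.log(1.0 / num_classes)

# num_classes=10 is an assumption about the dataset, not stated in the thread
chance_loss = uniform_cross_entropy(10)
print(round(chance_loss, 4))  # 2.3026
```

If that assumption holds, a loss that never leaves this value suggests the classifier is not learning anything beyond chance.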

@jacobunderlinebenseal

me too

@bit1002lst

me too, and all of the images in samples are empty

@kevinzakka
Owner

Hey guys, I'll take a look at the code when I get the time.

@moormoon

I have the same issue. It looks like it is related to the initialization of the conv and fc layers. I tried using only fc layers for the regression, and the training converged. Remember to initialize the weights to zeros and the bias to the identity transform.

I haven't figured out how to initialize the conv layers yet. If anyone makes progress on this, please let us know.
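
A minimal sketch of the zero-weight, identity-bias initialization mentioned above, in plain Python so it runs without TensorFlow. It assumes the last fc layer regresses the 6 parameters of a 2x3 affine transform; `init_theta_layer` and `predict_theta` are hypothetical names, not functions from this repo:

```python
def init_theta_layer(num_inputs):
    """Return (weights, bias) for the layer that outputs the 6 affine parameters."""
    weights = [[0.0] * 6 for _ in range(num_inputs)]  # all zeros
    bias = [1.0, 0.0, 0.0, 0.0, 1.0, 0.0]  # flattened [[1, 0, 0], [0, 1, 0]]
    return weights, bias

def predict_theta(features, weights, bias):
    """theta = features @ weights + bias, written out in plain Python."""
    return [
        sum(f * w_row[j] for f, w_row in zip(features, weights)) + bias[j]
        for j in range(6)
    ]

weights, bias = init_theta_layer(num_inputs=4)
theta = predict_theta([0.3, -1.2, 5.0, 0.7], weights, bias)
print(theta)  # [1.0, 0.0, 0.0, 0.0, 1.0, 0.0]
```

With this initialization the predicted transform is the identity for any input, so the sampler starts out passing the image through unchanged.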

@BlueWinters

I think the bilinear interpolation in the function bilinear_sampler is wrong. A good example of this process can be found in https://github.com/tensorflow/models/tree/master/research/transformer
```python
# get pixel value at corner coords
Ia = get_pixel_value(img, x0, y0)
Ib = get_pixel_value(img, x0, y1)
Ic = get_pixel_value(img, x1, y0)
Id = get_pixel_value(img, x1, y1)

# recast as float for delta calculation
x0 = tf.cast(x0, 'float32')
x1 = tf.cast(x1, 'float32')
y0 = tf.cast(y0, 'float32')
y1 = tf.cast(y1, 'float32')

# calculate deltas
wa = (x1-x) * (y1-y)
wb = (x1-x) * (y-y0)
wc = (x-x0) * (y1-y)
wd = (x-x0) * (y-y0)

# add dimension for addition
wa = tf.expand_dims(wa, axis=3)
wb = tf.expand_dims(wb, axis=3)
wc = tf.expand_dims(wc, axis=3)
wd = tf.expand_dims(wd, axis=3)

# compute output
out = tf.add_n([wa*Ia, wb*Ib, wc*Ic, wd*Id])
```
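
To sanity-check the weighting scheme outside TensorFlow, here is a small plain-Python sketch of bilinear sampling on a single-channel image. It reuses the corner names Ia..Id and weights wa..wd from the snippet above. `bilinear_sample` is a hypothetical helper, and the border handling (clamping x0 and y0 so that x1 = x0 + 1 always holds) is my own assumption, not necessarily what this repo or the linked example does:

```python
import math

def bilinear_sample(img, x, y):
    """Sample a 2D grid img[row][col] at real-valued pixel coords (x, y).

    Assumes (x, y) lies inside the image and the image is at least 2x2.
    """
    height, width = len(img), len(img[0])
    # keep x0 and y0 one short of the border so x1 = x0 + 1 stays in range
    x0 = min(max(int(math.floor(x)), 0), width - 2)
    y0 = min(max(int(math.floor(y)), 0), height - 2)
    x1, y1 = x0 + 1, y0 + 1

    # corner values: a = top-left, b = bottom-left, c = top-right, d = bottom-right
    Ia, Ib = img[y0][x0], img[y1][x0]
    Ic, Id = img[y0][x1], img[y1][x1]

    # each corner is weighted by the area of the opposite sub-rectangle
    wa = (x1 - x) * (y1 - y)
    wb = (x1 - x) * (y - y0)
    wc = (x - x0) * (y1 - y)
    wd = (x - x0) * (y - y0)
    return wa * Ia + wb * Ib + wc * Ic + wd * Id

img = [[0.0, 10.0],
       [20.0, 30.0]]
center = bilinear_sample(img, 0.5, 0.5)      # average of the four corners
top_right = bilinear_sample(img, 1.0, 0.0)   # lands exactly on a pixel
print(center, top_right)  # 15.0 10.0
```

Because x1 - x0 and y1 - y0 are always 1 in this sketch, the four weights sum to 1 even on the border.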

@lifan9880

me too

@kakashi571

How do I start the training?

@wanziz

wanziz commented Aug 22, 2019

How do I start training?

Excuse me, do you know how to train now?

@wanziz

wanziz commented Aug 22, 2019

Excuse me, I would like to know how to start training the model. Thank you!

@turandai

I think the problem is that the gradients of bilinear sampling cannot be auto-generated properly by TensorFlow. In the original paper, the authors defined special gradients for this process, and this package does not include them yet.
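
For anyone who wants to check this, the partial derivatives of the bilinear output with respect to the sampling coordinates have a simple closed form inside one pixel cell, and they can be compared against a finite-difference estimate. The sketch below is plain Python, not TensorFlow, uses a hypothetical helper `sample_and_grad`, and assumes a single-channel image with (x, y) strictly inside a cell:

```python
import math

def sample_and_grad(img, x, y):
    """Return (value, d/dx, d/dy) of bilinear sampling at (x, y).

    Assumes (x, y) is strictly inside one pixel cell of img[row][col].
    """
    x0, y0 = int(math.floor(x)), int(math.floor(y))
    x1, y1 = x0 + 1, y0 + 1
    Ia, Ib = img[y0][x0], img[y1][x0]
    Ic, Id = img[y0][x1], img[y1][x1]

    out = ((x1 - x) * (y1 - y) * Ia + (x1 - x) * (y - y0) * Ib
           + (x - x0) * (y1 - y) * Ic + (x - x0) * (y - y0) * Id)
    # differentiate the weights with respect to x and y, corners held fixed
    d_dx = (y1 - y) * (Ic - Ia) + (y - y0) * (Id - Ib)
    d_dy = (x1 - x) * (Ib - Ia) + (x - x0) * (Id - Ic)
    return out, d_dx, d_dy

img = [[1.0, 4.0, 2.0],
       [3.0, 9.0, 5.0],
       [7.0, 6.0, 8.0]]
x, y, eps = 0.3, 1.6, 1e-6
out, d_dx, d_dy = sample_and_grad(img, x, y)

# central finite differences as an independent check
fd_dx = (sample_and_grad(img, x + eps, y)[0] - sample_and_grad(img, x - eps, y)[0]) / (2 * eps)
fd_dy = (sample_and_grad(img, x, y + eps)[0] - sample_and_grad(img, x, y - eps)[0]) / (2 * eps)
print(abs(d_dx - fd_dx) < 1e-4, abs(d_dy - fd_dy) < 1e-4)
```

The same comparison could be run against the gradients TensorFlow reports for the sampler to see whether they agree.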
