Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

WARNING:tensorflow:Gradients do not exist for variables ['conv4_3/bias:0',...] when minimizing the loss. #374

Open
jessicametzger opened this issue Feb 4, 2021 · 1 comment

Comments

@jessicametzger
Copy link

jessicametzger commented Feb 4, 2021

I am using ssd_keras with tensorflow2 backend to train the ssd300 model (using this ssd_keras_tf2 repository) and everything "works," except I keep running into this error when I start training:

WARNING:tensorflow:Gradients do not exist for variables ['conv1_1/bias:0', 'conv1_2/bias:0', 'conv2_1/bias:0', 
'conv2_2/bias:0', 'conv3_1/bias:0', 'conv3_2/bias:0', 'conv3_3/bias:0', 'conv4_1/bias:0', 'conv4_2/bias:0', 'conv4_3/bias:0', 
'conv5_1/bias:0', 'conv5_2/bias:0', 'conv5_3/bias:0', 'fc6/bias:0', 'fc7/bias:0', 'conv6_1/bias:0', 'conv6_2/bias:0', 
'conv7_1/bias:0', 'conv7_2/bias:0', 'conv8_1/bias:0', 'conv8_2/bias:0', 'conv9_1/bias:0', 
'conv4_3_norm/conv4_3_norm_gamma:0', 'conv9_2/bias:0', 'conv4_3_norm_mbox_conf/bias:0', 'fc7_mbox_conf/bias:0', 
'conv6_2_mbox_conf/bias:0', 'conv7_2_mbox_conf/bias:0', 'conv8_2_mbox_conf/bias:0', 'conv9_2_mbox_conf/bias:0', 
'conv4_3_norm_mbox_loc/bias:0', 'fc7_mbox_loc/bias:0', 'conv6_2_mbox_loc/bias:0', 'conv7_2_mbox_loc/bias:0', 
'conv8_2_mbox_loc/bias:0', 'conv9_2_mbox_loc/bias:0'] when minimizing the loss.

They ran into the same thing when porting to tf2 in ssd_keras_tf2. Moreover, most parameters aren't changing, and the loss doesn't change after the first couple steps.

System Info

  • Ubuntu 20.04.1
  • tensorflow-gpu 2.2.0
  • keras: using tf.keras through tensorflow-gpu
  • Which commit: latest commit of ssd_keras_tf2 (the error also happens when I manually change the latest ssd_keras code to use tf.keras with tf2 backend)
  • GPU: NVIDIA Corporation TU104GL [Quadro RTX 5000] (the error happens whether I am running on gpu or cpu)

Full error message

Epoch 00001: LearningRateScheduler reducing learning rate to 0.001.
Epoch 1/120
WARNING:tensorflow:Gradients do not exist for variables ['conv1_1/bias:0', 'conv1_2/bias:0', 'conv2_1/bias:0', 'conv2_2/bias:0', 'conv3_1/bias:0', 'conv3_2/bias:0', 'conv3_3/bias:0', 'conv4_1/bias:0', 'conv4_2/bias:0', 'conv4_3/bias:0', 'conv5_1/bias:0', 'conv5_2/bias:0', 'conv5_3/bias:0', 'fc6/bias:0', 'fc7/bias:0', 'conv6_1/bias:0', 'conv6_2/bias:0', 'conv7_1/bias:0', 'conv7_2/bias:0', 'conv8_1/bias:0', 'conv8_2/bias:0', 'conv9_1/bias:0', 'conv4_3_norm/conv4_3_norm_gamma:0', 'conv9_2/bias:0', 'conv4_3_norm_mbox_conf/bias:0', 'fc7_mbox_conf/bias:0', 'conv6_2_mbox_conf/bias:0', 'conv7_2_mbox_conf/bias:0', 'conv8_2_mbox_conf/bias:0', 'conv9_2_mbox_conf/bias:0', 'conv4_3_norm_mbox_loc/bias:0', 'fc7_mbox_loc/bias:0', 'conv6_2_mbox_loc/bias:0', 'conv7_2_mbox_loc/bias:0', 'conv8_2_mbox_loc/bias:0', 'conv9_2_mbox_loc/bias:0'] when minimizing the loss.
WARNING:tensorflow:Gradients do not exist for variables ['conv1_1/bias:0', 'conv1_2/bias:0', 'conv2_1/bias:0', 'conv2_2/bias:0', 'conv3_1/bias:0', 'conv3_2/bias:0', 'conv3_3/bias:0', 'conv4_1/bias:0', 'conv4_2/bias:0', 'conv4_3/bias:0', 'conv5_1/bias:0', 'conv5_2/bias:0', 'conv5_3/bias:0', 'fc6/bias:0', 'fc7/bias:0', 'conv6_1/bias:0', 'conv6_2/bias:0', 'conv7_1/bias:0', 'conv7_2/bias:0', 'conv8_1/bias:0', 'conv8_2/bias:0', 'conv9_1/bias:0', 'conv4_3_norm/conv4_3_norm_gamma:0', 'conv9_2/bias:0', 'conv4_3_norm_mbox_conf/bias:0', 'fc7_mbox_conf/bias:0', 'conv6_2_mbox_conf/bias:0', 'conv7_2_mbox_conf/bias:0', 'conv8_2_mbox_conf/bias:0', 'conv9_2_mbox_conf/bias:0', 'conv4_3_norm_mbox_loc/bias:0', 'fc7_mbox_loc/bias:0', 'conv6_2_mbox_loc/bias:0', 'conv7_2_mbox_loc/bias:0', 'conv8_2_mbox_loc/bias:0', 'conv9_2_mbox_loc/bias:0'] when minimizing the loss.
WARNING:tensorflow:Gradients do not exist for variables ['conv1_1/bias:0', 'conv1_2/bias:0', 'conv2_1/bias:0', 'conv2_2/bias:0', 'conv3_1/bias:0', 'conv3_2/bias:0', 'conv3_3/bias:0', 'conv4_1/bias:0', 'conv4_2/bias:0', 'conv4_3/bias:0', 'conv5_1/bias:0', 'conv5_2/bias:0', 'conv5_3/bias:0', 'fc6/bias:0', 'fc7/bias:0', 'conv6_1/bias:0', 'conv6_2/bias:0', 'conv7_1/bias:0', 'conv7_2/bias:0', 'conv8_1/bias:0', 'conv8_2/bias:0', 'conv9_1/bias:0', 'conv4_3_norm/conv4_3_norm_gamma:0', 'conv9_2/bias:0', 'conv4_3_norm_mbox_conf/bias:0', 'fc7_mbox_conf/bias:0', 'conv6_2_mbox_conf/bias:0', 'conv7_2_mbox_conf/bias:0', 'conv8_2_mbox_conf/bias:0', 'conv9_2_mbox_conf/bias:0', 'conv4_3_norm_mbox_loc/bias:0', 'fc7_mbox_loc/bias:0', 'conv6_2_mbox_loc/bias:0', 'conv7_2_mbox_loc/bias:0', 'conv8_2_mbox_loc/bias:0', 'conv9_2_mbox_loc/bias:0'] when minimizing the loss.
WARNING:tensorflow:Gradients do not exist for variables ['conv1_1/bias:0', 'conv1_2/bias:0', 'conv2_1/bias:0', 'conv2_2/bias:0', 'conv3_1/bias:0', 'conv3_2/bias:0', 'conv3_3/bias:0', 'conv4_1/bias:0', 'conv4_2/bias:0', 'conv4_3/bias:0', 'conv5_1/bias:0', 'conv5_2/bias:0', 'conv5_3/bias:0', 'fc6/bias:0', 'fc7/bias:0', 'conv6_1/bias:0', 'conv6_2/bias:0', 'conv7_1/bias:0', 'conv7_2/bias:0', 'conv8_1/bias:0', 'conv8_2/bias:0', 'conv9_1/bias:0', 'conv4_3_norm/conv4_3_norm_gamma:0', 'conv9_2/bias:0', 'conv4_3_norm_mbox_conf/bias:0', 'fc7_mbox_conf/bias:0', 'conv6_2_mbox_conf/bias:0', 'conv7_2_mbox_conf/bias:0', 'conv8_2_mbox_conf/bias:0', 'conv9_2_mbox_conf/bias:0', 'conv4_3_norm_mbox_loc/bias:0', 'fc7_mbox_loc/bias:0', 'conv6_2_mbox_loc/bias:0', 'conv7_2_mbox_loc/bias:0', 'conv8_2_mbox_loc/bias:0', 'conv9_2_mbox_loc/bias:0'] when minimizing the loss.
   4/1000 [..............................] - ETA: 24:42 - loss: 1.4227

Reproducible example

The error happens whenever I call model.fit(...) or model.fit_generator(...), where model is an ssd300 model, and where the backend is tf2. It happens whether I am using cpu or gpu. E.g. when I run the ssd300_training.ipynb tutorial, I get that error.

I would really appreciate it if someone trying to use tf2 backend found a solution for this, because I'm pretty sure I'm not the only one.

@anashad
Copy link

anashad commented Feb 15, 2021

Hello. As is suggested in this issue #359, the problem comes from the generator that yields a list instead of a tuple. You should be able to cast ret to a tuple in this line
https://github.com/IntranelConsulting/ssd_keras_tf2/blob/183d57b1989fc6d732bf1028b6323bdd5771e78a/data_generator/object_detection_2d_data_generator.py#L1174

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants