
NaN appearing on tf.gradients calculation with tf.where and division by zero on the false branch #20091

Closed
mikefairbank opened this issue Jun 18, 2018 · 3 comments
Assignees
Labels
stat:awaiting tensorflower Status - Awaiting response from tensorflower

Comments


mikefairbank commented Jun 18, 2018

System information

  • Have I written custom code (as opposed to using a stock example script provided in TensorFlow):
    Yes, script is below
  • OS Platform and Distribution (e.g., Linux Ubuntu 16.04):
    Linux 4.15.0-23-generic #25-Ubuntu SMP Wed May 23 18:02:16 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
    VERSION="18.04 LTS (Bionic Beaver)"
  • TensorFlow installed from (source or binary):
    binary
  • TensorFlow version (use command below):
    v1.8.0-0-g93bc2e2072 1.8.0
  • Python version:
    Python 3.6.5
  • Bazel version (if compiling from source): n/a
  • GCC/Compiler version (if compiling from source): n/a
  • CUDA/cuDNN version: n/a, using CPU version
  • GPU model and memory: n/a using CPU
  • Exact command to reproduce: just run "python3 script.py"

Describe the problem

When tf.where is used with a division by zero in one of its two branches, tf.gradients returns a NaN gradient even when the division by zero is on the branch that is not selected.

This seems similar to #2540, but the workarounds suggested there (e.g. using tf.boolean_mask) did not work here.

Source code / logs

import tensorflow as tf

sess = tf.Session()
W1 = tf.Variable([2.0])
W2 = tf.Variable([0.0])
output = tf.where(W1 > 4, W1 / W2, tf.zeros_like(W1))  # forward value is correct (zero), since W1 > 4 is false
gradient = tf.gradients(output, W2)[0]  # should be zero, but it gives NaN
sess.run(tf.global_variables_initializer())
print(sess.run([output, gradient]))

Program output:

#[array([0.], dtype=float32), array([nan], dtype=float32)]
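[Editor's note, not part of the original report: the NaN comes from how tf.where's gradient composes with the division's local gradient. tf.where routes a zero upstream gradient into the unselected branch, but the division's local gradient -W1/W2² is infinite at W2 = 0, and IEEE arithmetic gives 0 × inf = NaN. A minimal NumPy sketch of that arithmetic:]

```python
import numpy as np

W1 = np.array([2.0])
W2 = np.array([0.0])
g = np.array([1.0])  # upstream gradient flowing into `output`

# tf.where's gradient sends zeros into the branch that was not selected:
g_true = np.where(W1 > 4, g, 0.0)  # -> [0.]

with np.errstate(divide='ignore', invalid='ignore'):
    ddiv_dW2 = -W1 / W2 ** 2   # local gradient of W1/W2 w.r.t. W2 -> [-inf]
    g_W2 = g_true * ddiv_dW2   # 0 * -inf -> [nan], the observed NaN

print(g_true, ddiv_dW2, g_W2)
```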


facaiy commented Jun 18, 2018

I agree that this issue is a duplicate of #2540. Have you tried the workaround below, suggested by @anishathalye?

x = tf.placeholder(tf.float32)
# y = tf.where(x > 0, 0., tf.exp(x))

# trick: we're not using the result of safe_exp when x > 0, so we can
# substitute a safe value for x in that case
# it doesn't really matter what we put in here, as long as the backward pass
# returns some finite value
safe_exp = tf.exp(tf.where(x > 0, 1.0, x))
y = tf.where(x > 0, 0., safe_exp)

I think it should solve your problem.

In fact, I have also proposed implementing a new op in #15706 to fix the issue completely; unfortunately, the TensorFlow team has not replied to it.
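[Editor's note: the inner tf.where in this trick only changes what the unused branch computes, so the forward values of the naive and "safe" versions are identical; only the backward pass differs, because exp is never differentiated at the masked-out inputs. A quick sketch of the forward equivalence, using NumPy as a stand-in for the TF ops:]

```python
import numpy as np

x = np.array([-1.0, 2.0])

naive = np.where(x > 0, 0.0, np.exp(x))
safe = np.where(x > 0, 0.0, np.exp(np.where(x > 0, 1.0, x)))

print(naive, safe)  # identical forward values
```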

@mikefairbank (Author)

Thanks. Yes, it did solve my problem.

import tensorflow as tf

sess = tf.Session()

W1 = tf.Variable([2.0])
W2 = tf.Variable([0.0])

safe_W2 = tf.where(W1 > 4, W2, [1.0])  # substitute a harmless denominator on the unselected branch
output = tf.where(W1 > 4, W1 / safe_W2, tf.zeros_like(W1))
gradient = tf.gradients(output, W2)[0]
sess.run(tf.global_variables_initializer())

print(sess.run([output, gradient]))  # prints [0.], [0.], i.e. the correct answers now
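[Editor's note: the double-where pattern above can be packaged as a small helper. A hypothetical safe_div sketch (the name and signature are the editor's, not from this thread), shown with NumPy for illustration; a TF 1.x version would swap np.where for tf.where:]

```python
import numpy as np

def safe_div(num, den, cond):
    """Compute num/den where cond is True, 0 elsewhere, without ever
    evaluating a division by zero (hypothetical helper, editor's sketch)."""
    safe_den = np.where(cond, den, 1.0)  # harmless denominator on unused lanes
    return np.where(cond, num / safe_den, 0.0)

W1 = np.array([2.0])
W2 = np.array([0.0])
print(safe_div(W1, W2, W1 > 4))  # -> [0.]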

@skye skye added the stat:awaiting tensorflower Status - Awaiting response from tensorflower label Jul 1, 2018
@tensorflowbutler (Member)

Nagging Assignee @skye: It has been 44 days with no activity and this issue has an assignee. Please update the label and/or status accordingly.
