
TF-Keras mixed precision training leads to autograph errors #66374

Open
lgeiger opened this issue Apr 24, 2024 · 2 comments
Labels: comp:keras (Keras related issues), TF 2.16, type:bug (Bug)

Comments

@lgeiger
Contributor

lgeiger commented Apr 24, 2024

Issue type

Bug

Have you reproduced the bug with TensorFlow Nightly?

Yes

Source

binary

TensorFlow version

2.16.1, 2.17.0.dev20240423

Custom code

No

OS platform and distribution

Linux, Colab

Mobile device

No response

Python version

3.10, 3.11

Bazel version

No response

GCC/compiler version

No response

CUDA/cuDNN version

12.3 / 8.9.7.29

GPU model and memory

NVIDIA Tesla T4

Current behavior?

Since TF 2.16, mixed precision training fails to compile with AutoGraph and throws warnings when running a minimal mixed precision training example:

import os

# Must be set before importing TensorFlow so `tf.keras` resolves to the legacy TF-Keras package.
os.environ["TF_USE_LEGACY_KERAS"] = "1"

import tensorflow as tf
from tensorflow import keras

keras.mixed_precision.set_global_policy('mixed_float16')
inputs = keras.Input(shape=(784,))
x = keras.layers.Dense(10)(inputs)
# Keep the final activation in float32 for numeric stability under mixed precision.
outputs = keras.layers.Activation('softmax', dtype='float32')(x)

model = keras.Model(inputs=inputs, outputs=outputs)
model.compile(loss='sparse_categorical_crossentropy', optimizer=keras.optimizers.RMSprop())

(x_train, y_train), _ = keras.datasets.mnist.load_data()
x_train = x_train.reshape(60000, 784).astype('float32') / 255

_ = model.fit(x_train, y_train, batch_size=128, epochs=1, steps_per_epoch=1, verbose=0)

TF Autograph doesn't transform the create_autocast_variable function and throws the following warnings:

WARNING:tensorflow:AutoGraph could not transform <function create_autocast_variable at 0x7a263a673400> and will run it as-is.
Please report this to the TensorFlow team. When filing the bug, set the verbosity to 10 (on Linux, `export AUTOGRAPH_VERBOSITY=10`) and attach the full output.
Cause: <gast.gast.Expr object at 0x7a25b0be2e00>
To silence this warning, decorate the function with @tf.autograph.experimental.do_not_convert
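For reference, the silencing mechanism the warning suggests can be sketched as follows. This is not a fix: `scale` and `step` below are hypothetical stand-ins, since applying `do_not_convert` to the internal `create_autocast_variable` would mean patching tf_keras itself.

```python
import tensorflow as tf

# The warning suggests decorating a non-convertible function with
# do_not_convert so AutoGraph runs it as-is instead of transforming it.
@tf.autograph.experimental.do_not_convert
def scale(x):
    # Hypothetical stand-in for a function AutoGraph cannot transform.
    return x * 2.0

@tf.function
def step(x):
    # Calling scale() inside a tf.function skips AutoGraph conversion
    # for it, so no "could not transform" warning is emitted.
    return scale(x)

print(step(tf.constant(1.0)))
```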

AutoGraph should be able to transform this function without throwing any warnings, as it did in TF 2.15.
Unfortunately, we're currently unable to upgrade to Keras 3 due to other issues, so it would be good to get this patched in TF-Keras as well.

I'm not sure whether this issue is caused by AutoGraph or by TF-Keras.

Standalone code to reproduce the issue

See the following notebooks: this wasn't an issue in TF 2.15, but it fails in TF 2.16 and still fails in TF nightly.

Relevant log output

No response

@sushreebarsa
Contributor

@lgeiger I was not able to replicate the issue on Colab using TF v2.16.1 or tf-nightly. Please have a look at the attached gists and confirm.
Thank you!

@sushreebarsa sushreebarsa added the stat:awaiting response Status - Awaiting response from author label Apr 25, 2024
@lgeiger
Copy link
Contributor Author

lgeiger commented Apr 25, 2024

@sushreebarsa As mentioned in the issue, this is an issue with TF-Keras. I added os.environ["TF_USE_LEGACY_KERAS"] = "1" to the example code to match the Colab, which should enable you to reproduce the issue.

@google-ml-butler google-ml-butler bot removed the stat:awaiting response Status - Awaiting response from author label Apr 25, 2024