TensorFlow 2.16 / Keras 3 have undocumented breaking API changes #63792

pandrey-fr · 2024-03-15T10:12:53Z

Issue type

Documentation Feature Request

Have you reproduced the bug with TensorFlow Nightly?

Yes

Source

source

TensorFlow version

2.16.1

Custom code

No

OS platform and distribution

No response

Mobile device

No response

Python version

No response

Bazel version

No response

GCC/compiler version

No response

CUDA/cuDNN version

No response

GPU model and memory

No response

Current behavior?

With the (documented) switch to Keras 3, TensorFlow 2.16 has introduced undocumented API-breaking changes that depart from SemVer and result in unexpected code maintenance costs for dependent projects.

While I understand the interest of switching to Keras 3 (and acknowledge the possibility to keep using Keras 2 as backend, albeit non-trivial to impose on end-users of a third-party solution that uses TensorFlow as a dependency), I would have expected some effort to document the parts of the API that changed due to the switch. This Keras issue and this guide are presented as listing breaking changes, but do not seem to cover everything that changed, especially when calling Keras through TensorFlow rather than using it as a primary, backend-agnostic endpoint for setting up models.

As I put effort in updating custom code that calls TensorFlow / Keras code, I notably noticed that:

tf.keras.Model.weights (and its variants) as well as tf.keras.optimizers.Optimizer.variables no longer return a list of tf.Variable instances, but instead return a list of tf.keras.Variable that themselves wrap tf.Variable instances.
Some losses were dropped / renamed, e.g. tf.keras.losses.deserialize("mse") no longer works, but was to the best of my knowledge never reported to be deprecated in previous versions.
Calling tf.keras.optimizers.Optimizer.variables() no longer works; the variables must now be accessed as an attribute, tf.keras.optimizers.Optimizer.variables.
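
For illustration, here is a minimal sketch of how the accessor change in the last item might be absorbed in third-party code. The helper name `get_optimizer_variables` is hypothetical, and the two stand-in classes are not Keras classes: they only mimic the callable and attribute accessor styles described above, so that the snippet runs without TensorFlow installed.

```python
def get_optimizer_variables(optimizer):
    """Return the optimizer's variables as a list, for either accessor style."""
    variables = optimizer.variables
    # If the accessor is callable (the older style reported above), call it;
    # otherwise treat it as an iterable attribute (the newer style).
    return list(variables() if callable(variables) else variables)


class MethodStyleOptimizer:
    """Stand-in (not a real Keras class) mimicking `optim.variables()`."""

    def variables(self):
        return ["accumulator_0", "accumulator_1"]


class AttributeStyleOptimizer:
    """Stand-in (not a real Keras class) mimicking `optim.variables`."""

    variables = ["accumulator_0", "accumulator_1"]


old_style = get_optimizer_variables(MethodStyleOptimizer())
new_style = get_optimizer_variables(AttributeStyleOptimizer())
```

With a real optimizer, the same helper would be called as `get_optimizer_variables(optim)`, which avoids a version check for this particular change.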

These changes can be adjusted for, but require a lot of if hasattr(tf.keras, "version") and tf.keras.version().startswith("3"): branching to keep supporting not-so-old versions while adding support for newer TF versions, and, most importantly, require some tedious experiments to find out what has changed and how. As such, I categorize this issue as "Documentation Feature Request" as I obviously do not expect changes to be rolled back, but believe that it would be a sensible gesture towards the community to publish a clearer and more exhaustive guide as to what has changed with the introduction of Keras 3, and how to accommodate third-party code that does not want to impose on end-users the choice of a single TF version.
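
The branching condition quoted above could be centralized in a single helper rather than repeated at every call site. The following is only a sketch: `uses_keras3` is a hypothetical name, the version string "3.0.5" is an illustrative value, and the two namespace objects are stand-ins for `tf.keras` so that the snippet runs without TensorFlow installed.

```python
from types import SimpleNamespace


def uses_keras3(keras_module):
    """Return True if the module exposes `version()` and it reports 3.x."""
    version = getattr(keras_module, "version", None)
    # Same logic as the hasattr(...) and version().startswith("3") check.
    return callable(version) and str(version()).startswith("3")


# Stand-ins for `tf.keras` (assumptions for the demo, not real modules).
keras3_like = SimpleNamespace(version=lambda: "3.0.5")
keras2_like = SimpleNamespace()  # no `version` attribute at all

is_new = uses_keras3(keras3_like)
is_old = uses_keras3(keras2_like)
```

In real code the argument would be `tf.keras`, and the result could be computed once at import time and reused wherever behavior differs.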

Standalone code to reproduce the issue

As a single example of what I am talking about:

This works with TensorFlow 2.15:

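import tensorflow as tf

# Build an Adagrad optimizer over two dummy variables named "0" and "1".
optim = tf.keras.optimizers.Adagrad()
grads = [tf.zeros((32, 1)), tf.zeros((1,))]
optim.build([tf.Variable(tf.zeros_like(g), name=f"{i}") for i, g in enumerate(grads)])
# Collect the optimizer state, keyed by variable name.
state = {v.name: v.value() for v in optim.variables()}
assert len(state) == 2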

But it fails for numerous reasons in TensorFlow 2.16.1 (with Keras 3 backend):

state = {v.name: v.value() for v in optim.variables()} should be replaced with state = {v.path: v.value.value() for v in optim.variables}...
... and in fact this will not work because the names have not been propagated and all accumulator variables have the same name/path; hence, to put them in a dict, I have to set names deterministically, e.g. based on indices, which drops some information.
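
The index-based workaround mentioned in the last point might look like the sketch below. It is an assumption-laden illustration, not the code actually used: `build_state` is a hypothetical helper, the path "adagrad/accumulator" is a made-up label, and the namespace objects are stand-ins for optimizer variables so that the snippet runs without TensorFlow installed.

```python
from types import SimpleNamespace


def build_state(variables, read=lambda var: var):
    """Key each variable by its position, keeping its reported label as a suffix."""
    state = {}
    for index, var in enumerate(variables):
        # Prefer `path` when present, else fall back to `name`.
        label = getattr(var, "path", None) or getattr(var, "name", "")
        state[f"{index}/{label}"] = read(var)
    return state


# Stand-ins for two accumulator variables that share the same path.
duplicated = [
    SimpleNamespace(path="adagrad/accumulator", data=0.0),
    SimpleNamespace(path="adagrad/accumulator", data=1.0),
]

# Keying by path alone silently collapses the two entries into one.
naive = {var.path: var.data for var in duplicated}
# Keying by index keeps both, at the cost of depending on variable order.
state = build_state(duplicated, read=lambda var: var.data)
```

The `read` callback is where the version-specific value access would go; the keys stay unique, but they no longer say which model variable each accumulator belongs to.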

Relevant log output

No response


pandrey-fr · 2024-03-18T14:28:46Z

Sorry, but this is as unhelpful an answer as it gets. Points 1, 2, 4 and 5 are merely repetitions of my issue description, and your conclusion literally prompts me to do what I just did by opening this issue.

As for point 3, the whole point of my opening this issue is to point out that there is a lack of documentation regarding these changes, which means everyone has to spend time figuring out what is broken or not, and how to adjust to API changes. I managed to do it in my specific case with a few hours' effort, but the parts I use in my software only cover a small part of the entire TensorFlow/Keras API, hence I expect others will not only encounter the same issues as me, but probably more undocumented issues.

Hence my call for a broader migration guide, that would help developers adjust their code to the recent changes with limited effort.

mihaimaruseac · 2024-03-18T16:51:37Z

(it's an answer generated by an LLM, just spam)

google-ml-butler bot added the labels "type:docs-feature" (Doc issues for new feature, or clarifications about functionality) and "type:feature" (Feature requests) Mar 15, 2024

google-ml-butler bot assigned Venkat6871 Mar 15, 2024

Venkat6871 added the labels "comp:keras" (Keras related issues) and "TF 2.16" Mar 18, 2024

Venkat6871 assigned SuryanarayanaY and unassigned Venkat6871 Mar 18, 2024
