
Running the same model in TF and TFLiteMicro produces different outputs #66358

Open
kratmanski-logi opened this issue Apr 24, 2024 · 2 comments
Assignees
Labels
comp:lite TF Lite related issues comp:micro Related to TensorFlow Lite Microcontrollers type:bug Bug

Comments

@kratmanski-logi

Issue type

Bug

Have you reproduced the bug with TensorFlow Nightly?

Yes

Source

source

TensorFlow version

latest

Custom code

No

OS platform and distribution

both Linux and Windows

Mobile device

No response

Python version

latest

Bazel version

No response

GCC/compiler version

latest

CUDA/cuDNN version

No response

GPU model and memory

No response

Current behavior?

I trained a floating-point TF model and converted it to an 8-bit .tflite model. I run this .tflite model on the same test input in two environments, both on the same Linux PC:

  1. in TensorFlow, installed in the usual way on Linux via "pip install tensorflow" and imported in Python via "import tensorflow as tf"
  2. with GCC on Linux, after converting the .tflite flatbuffer for TFLite Micro, where the C reference kernels are in https://github.com/tensorflow/tflite-micro/tree/main/tensorflow/lite/micro/kernels

Problem: Running the same .tflite model on the same input in these two environments produces different outputs. Given that the (very small and simple) model is already heavily quantized, I expect bit-exact outputs, but I observe a difference that is well beyond any rounding error. While debugging, I narrowed it down to one kernel that produces a non-bit-exact result: FullyConnected. The source code for this kernel in environment (2) lives here (line 65):
https://github.com/tensorflow/tflite-micro/blob/main/tensorflow/lite/kernels/internal/reference/fully_connected.h
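
For context, the quantized FullyConnected arithmetic in that reference file follows a fixed sequence: accumulate (input - input_zero_point) * (weight - weight_zero_point) in 32 bits, add the 32-bit bias, rescale to the output scale, re-center on the output zero point, and clamp. A simplified pure-Python sketch of that sequence (using a float rescale instead of the kernel's fixed-point multiplier, so it is illustrative, not bit-exact):

```python
def fully_connected_int8(inputs, weights, bias,
                         input_zp, weight_zp, output_zp, scale):
    """Simplified int8 fully-connected. Not bit-exact with TFLite:
    the rescale here uses a float `scale` instead of the kernel's
    fixed-point multiplier + shift."""
    out = []
    for row, b in zip(weights, bias):
        # 1) accumulate in int32, offsetting by the zero points
        acc = sum((x - input_zp) * (w - weight_zp)
                  for x, w in zip(inputs, row))
        # 2) add the 32-bit bias
        acc += b
        # 3) rescale to the output scale and re-center on output_zp
        y = round(acc * scale) + output_zp
        # 4) clamp to the int8 range
        out.append(max(-128, min(127, y)))
    return out
```

With per-tensor quantization this is essentially the whole op; per-channel quantized weights only turn `scale` (and possibly `weight_zp`) into per-row values.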

Main Question: Can you point me to the source code of the fully_connected computation for environment (1), so I can compare it with the computation for environment (2) and figure out myself where the output difference comes from? I understand that in (1) Python calls precompiled C/C++ kernels, and I would like to see the source code of fully_connected (and other standard kernels such as conv2d) from which those kernels were compiled.

Note: I think the second link below (cpu_backend_gemm_ruy.h) is being used in environment (1); however, it is still not at a deep enough level for me to clearly see the matrix multiplication -> add bias -> rescale sequence of fully_connected operations that I could compare with the source code above for environment (2).
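
For what it's worth, both code paths typically implement that final "rescale" with gemmlowp-style fixed-point arithmetic rather than a float multiply. Below is a pure-Python approximation of those helpers (the names mirror the reference code, but treat this as a sketch of the rounding behavior under my assumptions, not the exact implementation):

```python
def saturating_rounding_doubling_high_mul(a, b):
    # (a * b) / 2^31 with round-to-nearest, ties away from zero
    prod = a * b
    nudge = (1 << 30) if prod >= 0 else (1 - (1 << 30))
    num = prod + nudge
    q = abs(num) >> 31            # truncate |num| / 2^31 ...
    return q if num >= 0 else -q  # ... toward zero, like C division

def rounding_divide_by_pot(x, exponent):
    # x / 2^exponent with round-to-nearest, ties away from zero
    mask = (1 << exponent) - 1
    remainder = x & mask  # two's-complement low bits (Python ints behave so)
    threshold = (mask >> 1) + (1 if x < 0 else 0)
    return (x >> exponent) + (1 if remainder > threshold else 0)

def multiply_by_quantized_multiplier(x, quantized_multiplier, shift):
    # Rescale an int32 accumulator by (quantized_multiplier / 2^31) * 2^shift
    left = max(shift, 0)
    right = -min(shift, 0)
    return rounding_divide_by_pot(
        saturating_rounding_doubling_high_mul(x << left, quantized_multiplier),
        right)
```

For example, a multiplier of 1 << 30 represents 0.5, so multiply_by_quantized_multiplier(100, 1 << 30, 0) gives 50. Differences between two runtimes often come down to which rounding variant (or an optimized approximation of it) each build uses.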

Standalone code to reproduce the issue

https://github.com/tensorflow/tflite-micro/blob/main/tensorflow/lite/kernels/internal/reference/fully_connected.h
https://github.com/sourcecode369/tensorflow-1/blob/master/tensorflow/lite/kernels/cpu_backend_gemm_ruy.h

Relevant log output

No response

@google-ml-butler google-ml-butler bot added the type:bug Bug label Apr 24, 2024
@Venkat6871 Venkat6871 added the comp:lite TF Lite related issues label Apr 25, 2024
@Venkat6871 Venkat6871 assigned sawantkumar and unassigned Venkat6871 Apr 25, 2024
@sawantkumar sawantkumar added the comp:micro Related to TensorFlow Lite Microcontrollers label Apr 29, 2024
@sawantkumar

hi @kratmanski-logi ,

Can you check this link and let me know if it is the one you were looking for?

@sawantkumar sawantkumar added the stat:awaiting response Status - Awaiting response from author label Apr 29, 2024
@kratmanski-logi
Author

Hi,
I checked your link; core.py does seem to have a Dense class, but it does not show how exactly the 8x8-bit multiplication (data x weights), the 32-bit bias addition, and the rescaling happen, at the same level of detail as the link I provided for the Micro side (line 65 for fully_connected). My model is fixed point: 8x8-bit data x weights with a 32-bit bias, and since it is fixed point there is rescaling/zero-point centering before passing to the next layer. I need to see the reference fixed-point computation. Thanks
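
As an aside on the rescaling mentioned above: in the quantized kernels, the real-valued multiplier input_scale * weight_scale / output_scale is decomposed ahead of time into a 31-bit integer multiplier plus a power-of-two shift, which is then applied with the fixed-point routines. A small sketch of that decomposition (an approximation of the reference helper; the exact rounding is an assumption here):

```python
import math

def quantize_multiplier(real_multiplier):
    """Decompose a non-negative real multiplier into (q31, shift) such
    that real_multiplier ~= (q31 / 2^31) * 2^shift."""
    if real_multiplier == 0.0:
        return 0, 0
    mantissa, shift = math.frexp(real_multiplier)  # mantissa in [0.5, 1)
    q31 = int(round(mantissa * (1 << 31)))
    if q31 == (1 << 31):  # rounding pushed the mantissa up to exactly 1.0
        q31 //= 2
        shift += 1
    return q31, shift
```

If the two runtimes compute or apply this multiplier slightly differently (per-channel vs per-tensor, or a different rounding in the decomposition), small per-layer output differences like the one reported here are exactly what you would see.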

@google-ml-butler google-ml-butler bot removed the stat:awaiting response Status - Awaiting response from author label Apr 29, 2024