
Running the same model in TF and TFLiteMicro produces different outputs #66358

Open
kratmanski-logi opened this issue Apr 24, 2024 · 2 comments
Assignees
Labels
comp:lite TF Lite related issues comp:micro Related to TensorFlow Lite Microcontrollers type:bug Bug

Comments

@kratmanski-logi

Issue type

Bug

Have you reproduced the bug with TensorFlow Nightly?

Yes

Source

source

TensorFlow version

latest

Custom code

No

OS platform and distribution

both Linux and Windows

Mobile device

No response

Python version

latest

Bazel version

No response

GCC/compiler version

latest

CUDA/cuDNN version

No response

GPU model and memory

No response

Current behavior?

I trained a floating-point TF model and converted it to an 8-bit .tflite model. I run this .tflite model on the same test input in two environments, both on the same Linux PC:

  1. in TensorFlow, installed in the usual way on Linux via "pip install tensorflow" and imported in Python via "import tensorflow as tf"
  2. with GCC on Linux, after converting the .tflite flatbuffer for TFLite Micro, where the C reference kernels are in https://github.com/tensorflow/tflite-micro/tree/main/tensorflow/lite/micro/kernels

Problem: Running the same .tflite model on the same input in these two environments produces different outputs. Given that the (very small and simple) model is already heavily quantized, I expect bit-exact outputs, but I observe a difference that is well beyond any rounding error. While debugging, I narrowed it down to one kernel that produces a non-bit-exact result: FullyConnected. The source code for this kernel in environment (2) lives here (line 65):
https://github.com/tensorflow/tflite-micro/blob/main/tensorflow/lite/kernels/internal/reference/fully_connected.h
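
For context, the quantized FullyConnected arithmetic in that reference file follows a fixed sequence: accumulate (input - input_zero_point) * (weight - weight_zero_point) in 32 bits, add the 32-bit bias, rescale to the output scale, re-center on the output zero point, and clamp. A simplified pure-Python sketch of that sequence (using a float rescale instead of the kernel's fixed-point multiplier, so it is illustrative, not bit-exact):

```python
def fully_connected_int8(inputs, weights, bias,
                         input_zp, weight_zp, output_zp, scale):
    """Simplified int8 fully-connected. Not bit-exact with TFLite:
    the rescale here uses a float `scale` instead of the kernel's
    fixed-point multiplier + shift."""
    out = []
    for row, b in zip(weights, bias):
        # 1) accumulate in int32, offsetting by the zero points
        acc = sum((x - input_zp) * (w - weight_zp)
                  for x, w in zip(inputs, row))
        # 2) add the 32-bit bias
        acc += b
        # 3) rescale to the output scale and re-center on output_zp
        y = round(acc * scale) + output_zp
        # 4) clamp to the int8 range
        out.append(max(-128, min(127, y)))
    return out
```

With per-tensor quantization this is essentially the whole op; per-channel quantized weights only turn `scale` (and possibly `weight_zp`) into per-row values.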

Main Question: Can you point me to the source code of the fully_connected computation for environment (1), so I can compare it with the computation for environment (2) and figure out myself where the output difference comes from? I understand that in (1) Python calls precompiled C/C++ kernels, and I would like to see the source code of fully_connected (and other standard kernels such as conv2d) from which those kernels were compiled.

Note: I think the second link below (cpu_backend_gemm_ruy.h) is being used in environment (1); however, it is still not at a deep enough level for me to clearly see the matrix multiplication -> add bias -> rescale sequence of fully_connected operations that I could compare with the source code above for environment (2).
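
For what it's worth, both code paths typically implement that final "rescale" with gemmlowp-style fixed-point arithmetic rather than a float multiply. Below is a pure-Python approximation of those helpers (the names mirror the reference code, but treat this as a sketch of the rounding behavior under my assumptions, not the exact implementation):

```python
def saturating_rounding_doubling_high_mul(a, b):
    # (a * b) / 2^31 with round-to-nearest, ties away from zero
    prod = a * b
    nudge = (1 << 30) if prod >= 0 else (1 - (1 << 30))
    num = prod + nudge
    q = abs(num) >> 31            # truncate |num| / 2^31 ...
    return q if num >= 0 else -q  # ... toward zero, like C division

def rounding_divide_by_pot(x, exponent):
    # x / 2^exponent with round-to-nearest, ties away from zero
    mask = (1 << exponent) - 1
    remainder = x & mask  # two's-complement low bits (Python ints behave so)
    threshold = (mask >> 1) + (1 if x < 0 else 0)
    return (x >> exponent) + (1 if remainder > threshold else 0)

def multiply_by_quantized_multiplier(x, quantized_multiplier, shift):
    # Rescale an int32 accumulator by (quantized_multiplier / 2^31) * 2^shift
    left = max(shift, 0)
    right = -min(shift, 0)
    return rounding_divide_by_pot(
        saturating_rounding_doubling_high_mul(x << left, quantized_multiplier),
        right)
```

For example, a multiplier of 1 << 30 represents 0.5, so multiply_by_quantized_multiplier(100, 1 << 30, 0) gives 50. Differences between two runtimes often come down to which rounding variant (or an optimized approximation of it) each build uses.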

Standalone code to reproduce the issue

https://github.com/tensorflow/tflite-micro/blob/main/tensorflow/lite/kernels/internal/reference/fully_connected.h
https://github.com/sourcecode369/tensorflow-1/blob/master/tensorflow/lite/kernels/cpu_backend_gemm_ruy.h

Relevant log output

No response

@google-ml-butler google-ml-butler bot added the type:bug Bug label Apr 24, 2024
@Venkat6871 Venkat6871 added the comp:lite TF Lite related issues label Apr 25, 2024
@Venkat6871 Venkat6871 assigned sawantkumar and unassigned Venkat6871 Apr 25, 2024
@sawantkumar sawantkumar added the comp:micro Related to TensorFlow Lite Microcontrollers label Apr 29, 2024
@sawantkumar

hi @kratmanski-logi ,

Can you check this link and let me know if it is the one you were looking for?

@sawantkumar sawantkumar added the stat:awaiting response Status - Awaiting response from author label Apr 29, 2024
@kratmanski-logi
Author

Hi,
I checked your link; core.py does seem to have a Dense class, but it does not show how exactly the 8x8-bit multiplication (data x weights), the 32-bit bias addition, and the rescaling happen, at the same level of detail as the link I provided for the Micro side (line 65 for fully_connected). My model is fixed point: 8x8-bit data x weights with a 32-bit bias, and since it is fixed point there is rescaling/zero-point centering before passing to the next layer. I need to see the reference fixed-point computation. Thanks
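
As an aside on the rescaling mentioned above: in the quantized kernels, the real-valued multiplier input_scale * weight_scale / output_scale is decomposed ahead of time into a 31-bit integer multiplier plus a power-of-two shift, which is then applied with the fixed-point routines. A small sketch of that decomposition (an approximation of the reference helper; the exact rounding is an assumption here):

```python
import math

def quantize_multiplier(real_multiplier):
    """Decompose a non-negative real multiplier into (q31, shift) such
    that real_multiplier ~= (q31 / 2^31) * 2^shift."""
    if real_multiplier == 0.0:
        return 0, 0
    mantissa, shift = math.frexp(real_multiplier)  # mantissa in [0.5, 1)
    q31 = int(round(mantissa * (1 << 31)))
    if q31 == (1 << 31):  # rounding pushed the mantissa up to exactly 1.0
        q31 //= 2
        shift += 1
    return q31, shift
```

If the two runtimes compute or apply this multiplier slightly differently (per-channel vs per-tensor, or a different rounding in the decomposition), small per-layer output differences like the one reported here are exactly what you would see.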

@google-ml-butler google-ml-butler bot removed the stat:awaiting response Status - Awaiting response from author label Apr 29, 2024