Different results between inference on CPU and TPU #117

Closed
CyberViking997 opened this issue May 2, 2024 · 1 comment

CyberViking997 commented May 2, 2024

Description

I was trying to implement the code needed to run the MLPerf-Tiny benchmark suite's networks on the Coral Dev Board Micro.

After working around the serial-communication problem described in issue #116, I was able to write the code, send the input values from the PC to the board, and run inference.
I found that while the inference results on the CPU are correct, the results on the TPU are very wrong: they don't just vary slightly, the benchmark accuracy drops from about 80% to 10%.
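
For reference, the TPU inference path is set up roughly like this (a minimal sketch modeled on the coralmicro examples; the model path and arena size are placeholders):

```cpp
#include <vector>

#include "libs/base/filesystem.h"
#include "libs/tpu/edgetpu_manager.h"
#include "libs/tpu/edgetpu_op.h"
#include "third_party/tflite-micro/tensorflow/lite/micro/micro_error_reporter.h"
#include "third_party/tflite-micro/tensorflow/lite/micro/micro_interpreter.h"
#include "third_party/tflite-micro/tensorflow/lite/micro/micro_mutable_op_resolver.h"

constexpr int kTensorArenaSize = 1024 * 1024;  // placeholder value
alignas(16) static uint8_t tensor_arena[kTensorArenaSize];

void RunTpuInference() {
  // Load the compiled model from flash ("/models/..." is a placeholder path).
  std::vector<uint8_t> model;
  coralmicro::LfsReadFile("/models/resnet_edgetpu.tflite", &model);

  // Power on the Edge TPU; the returned context must stay in scope for
  // every Invoke() that touches the TPU.
  auto tpu_context = coralmicro::EdgeTpuManager::GetSingleton()->OpenDevice();

  // The fully compiled graph is a single edgetpu-custom-op, so only the
  // custom op has to be registered.
  tflite::MicroMutableOpResolver<1> resolver;
  resolver.AddCustom(coralmicro::kCustomOp, coralmicro::RegisterCustomOp());

  tflite::MicroErrorReporter error_reporter;
  tflite::MicroInterpreter interpreter(tflite::GetModel(model.data()), resolver,
                                       tensor_arena, kTensorArenaSize,
                                       &error_reporter);
  interpreter.AllocateTensors();
  // ... copy the image bytes received over serial into interpreter.input(0) ...
  interpreter.Invoke();
}
```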

Using the ResNet from the ImageClassification benchmark as an example:
The original network is (image generated using Netron):
[image: Netron diagram of the original network]

This network was created, trained, and quantized using the scripts provided in the MLPerf-Tiny benchmark repo's training folder.

The network, after compilation with the edgetpu_compiler, is:
[image: Netron diagram of the compiled network]

Using the first test image file (an emu) as an example:
[image: Figure_1, the test image]
The results are:
CPU: [0.000,0.000,0.664,0.281,0.051,0.000,0.004,0.000,0.000,0.000]
TPU: [0.000,0.000,0.000,0.004,0.000,0.621,0.000,0.371,0.000,0.000]

label_names: ['airplane', 'automobile', 'bird', 'cat', 'deer', 'dog', 'frog', 'horse', 'ship', 'truck']

So the CPU recognizes that the image depicts a bird, while the TPU returns very different, wrong results.
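
For reference, the scores above presumably come from dequantizing the output tensor. A sketch of how that readout would look, assuming the output tensor type is kTfLiteInt8:

```cpp
// Dequantize the int8 scores into the floats printed above.
TfLiteTensor* out = interpreter.output(0);
const float scale = out->params.scale;
const int32_t zero_point = out->params.zero_point;
for (int i = 0; i < 10; ++i) {
  printf("%.3f,", scale * (out->data.int8[i] - zero_point));
}
```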

I found other issues describing the same problem, and I decided to run some layers of the network on the CPU to find out which one causes these different results.
I tried splitting the model using the compiler's -i flag at the output tensors of both Add layers (see the example below). Both times the inference completed, but again with wrong results.
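
The splits are done by passing tensor names to the compiler, roughly like this (the tensor and file names here are placeholders; the real tensor names can be read off the Netron graph):

```
edgetpu_compiler -i "functional_1/add/add" resnet_quant.tflite
```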

I then tried to split the model before the first Add layer, specifying the output of the first Conv2D layer as the split point, obtaining the network below.
[image: Netron diagram of the split network]

This time the inference hung and no result was produced.
(The problem is in the inference phase: I turn on the user LED before calling interpreter.Invoke() and turn it off as soon as the inference ends, as shown below.)
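
The instrumentation around the call is just (a sketch using the coralmicro LED API):

```cpp
#include "libs/base/led.h"

// If the user LED stays on, Invoke() never returned.
coralmicro::LedSet(coralmicro::Led::kUser, true);
TfLiteStatus invoke_status = interpreter.Invoke();
coralmicro::LedSet(coralmicro::Led::kUser, false);
```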

Additional information:

  • Since the code that parses the inputs coming from the serial interface and writes them into the input tensor is the same for both the CPU and TPU networks, I don't think the issue can be there.
  • The same goes for the code that reads back the results.
  • For the CPU model I added the 7 required operations to the MicroMutableOpResolver. For the TPU model I tried both adding only the custom op and adding the custom op plus all 7 operations used by the CPU (see the sketch after this list).
  • With the split model I added the custom op and the other 7 operations to the resolver.
  • kTensorArenaSize is sufficiently large to run the full model on the CPU. For the split network I kept the same value.
  • The TPU is turned on and the tpu_context does not go out of scope; in fact, the white LED stays on until I reboot the board.
  • No error message is printed on the serial interface, probably because the code crashes somehow and the consoleTask stops running.
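
For completeness, the three resolver setups mentioned above look roughly like this (a sketch; the exact list of 7 CPU ops is illustrative and should be read off the Netron graph):

```cpp
// CPU-only model: register the ops the graph actually uses.
// (The 7 ops below are illustrative.)
tflite::MicroMutableOpResolver<7> cpu_resolver;
cpu_resolver.AddConv2D();
cpu_resolver.AddAdd();
cpu_resolver.AddAveragePool2D();
cpu_resolver.AddReshape();
cpu_resolver.AddFullyConnected();
cpu_resolver.AddSoftmax();
cpu_resolver.AddQuantize();

// Fully offloaded TPU model: a single edgetpu-custom-op.
tflite::MicroMutableOpResolver<1> tpu_resolver;
tpu_resolver.AddCustom(coralmicro::kCustomOp, coralmicro::RegisterCustomOp());

// Split model: the custom op plus the CPU ops for the layers left on the CPU.
tflite::MicroMutableOpResolver<8> split_resolver;
split_resolver.AddCustom(coralmicro::kCustomOp, coralmicro::RegisterCustomOp());
// ... same AddXxx() calls as the CPU resolver ...
```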

Issue Type

Support

Operating System

Ubuntu

Coral Device

Dev Board Micro

Other Devices

No response

Programming Language

C++

Relevant Log Output

No response
