Different results between inference on CPU and TPU #117

Closed
CyberViking997 opened this issue May 2, 2024 · 1 comment

CyberViking997 commented May 2, 2024

Description

I was trying to implement the code needed to run the MLPerf-Tiny benchmark suite's networks on the Coral Dev Board Micro.

After working around the serial-communication problem described in issue #116, I was able to write the code, send the input values from the PC to the board, and run inference.
I found that while the inference results on the CPU are correct, the results on the TPU are very wrong: they don't just vary slightly, the benchmark accuracy drops from about 80% to 10%.
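
For reference, the TPU inference path is set up roughly like this (a minimal sketch modeled on the coralmicro examples; the model path and arena size are placeholders):

```cpp
#include <vector>

#include "libs/base/filesystem.h"
#include "libs/tpu/edgetpu_manager.h"
#include "libs/tpu/edgetpu_op.h"
#include "third_party/tflite-micro/tensorflow/lite/micro/micro_error_reporter.h"
#include "third_party/tflite-micro/tensorflow/lite/micro/micro_interpreter.h"
#include "third_party/tflite-micro/tensorflow/lite/micro/micro_mutable_op_resolver.h"

constexpr int kTensorArenaSize = 1024 * 1024;  // placeholder value
alignas(16) static uint8_t tensor_arena[kTensorArenaSize];

void RunTpuInference() {
  // Load the compiled model from flash ("/models/..." is a placeholder path).
  std::vector<uint8_t> model;
  coralmicro::LfsReadFile("/models/resnet_edgetpu.tflite", &model);

  // Power on the Edge TPU; the returned context must stay in scope for
  // every Invoke() that touches the TPU.
  auto tpu_context = coralmicro::EdgeTpuManager::GetSingleton()->OpenDevice();

  // The fully compiled graph is a single edgetpu-custom-op, so only the
  // custom op has to be registered.
  tflite::MicroMutableOpResolver<1> resolver;
  resolver.AddCustom(coralmicro::kCustomOp, coralmicro::RegisterCustomOp());

  tflite::MicroErrorReporter error_reporter;
  tflite::MicroInterpreter interpreter(tflite::GetModel(model.data()), resolver,
                                       tensor_arena, kTensorArenaSize,
                                       &error_reporter);
  interpreter.AllocateTensors();
  // ... copy the image bytes received over serial into interpreter.input(0) ...
  interpreter.Invoke();
}
```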

Using the ResNet from the ImageClassification benchmark as an example:
The original network is (image generated using Netron):
[image: Netron diagram of the original network]

This network was created, trained, and quantized using the scripts provided in the MLPerf-Tiny benchmark repo's training folder.

The network, after compilation with the edgetpu_compiler, is:
[image: Netron diagram of the compiled network]

Using the first test image file (an emu) as an example:
[image: Figure_1, the test image]
The results are:
CPU: [0.000,0.000,0.664,0.281,0.051,0.000,0.004,0.000,0.000,0.000]
TPU: [0.000,0.000,0.000,0.004,0.000,0.621,0.000,0.371,0.000,0.000]

label_names: ['airplane', 'automobile', 'bird', 'cat', 'deer', 'dog', 'frog', 'horse', 'ship', 'truck']

So the CPU recognizes that the image depicts a bird, while the TPU returns very different, wrong results.
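
For reference, the scores above presumably come from dequantizing the output tensor. A sketch of how that readout would look, assuming the output tensor type is kTfLiteInt8:

```cpp
// Dequantize the int8 scores into the floats printed above.
TfLiteTensor* out = interpreter.output(0);
const float scale = out->params.scale;
const int32_t zero_point = out->params.zero_point;
for (int i = 0; i < 10; ++i) {
  printf("%.3f,", scale * (out->data.int8[i] - zero_point));
}
```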

I found other issues describing the same problem, and I decided to run some layers of the network on the CPU to find out which one causes these different results.
I tried splitting the model using the compiler's -i flag at the output tensors of both Add layers (see the example below). Both times the inference completed, but again with wrong results.
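
The splits are done by passing tensor names to the compiler, roughly like this (the tensor and file names here are placeholders; the real tensor names can be read off the Netron graph):

```
edgetpu_compiler -i "functional_1/add/add" resnet_quant.tflite
```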

I then tried to split the model before the first Add layer, specifying the output of the first Conv2D layer as the split point, obtaining the network below.
[image: Netron diagram of the split network]

This time the inference hung and no result was produced.
(The problem is in the inference phase: I turn on the user LED before calling interpreter.Invoke() and turn it off as soon as the inference ends, as shown below.)
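
The instrumentation around the call is just (a sketch using the coralmicro LED API):

```cpp
#include "libs/base/led.h"

// If the user LED stays on, Invoke() never returned.
coralmicro::LedSet(coralmicro::Led::kUser, true);
TfLiteStatus invoke_status = interpreter.Invoke();
coralmicro::LedSet(coralmicro::Led::kUser, false);
```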

Additional information:

  • Since the code that parses the inputs coming from the serial interface and writes them into the input tensor is the same for both the CPU and TPU networks, I don't think the issue can be there.
  • The same goes for the code that reads back the results.
  • For the CPU model I added the 7 required operations to the MicroMutableOpResolver. For the TPU model I tried both adding only the custom op and adding the custom op plus all 7 operations used by the CPU (see the sketch after this list).
  • With the split model I added the custom op and the other 7 operations to the resolver.
  • kTensorArenaSize is sufficiently large to run the full model on the CPU. For the split network I kept the same value.
  • The TPU is turned on and the tpu_context does not go out of scope; in fact, the white LED stays on until I reboot the board.
  • No error message is printed on the serial interface, probably because the code crashes somehow and the consoleTask stops running.
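
For completeness, the three resolver setups mentioned above look roughly like this (a sketch; the exact list of 7 CPU ops is illustrative and should be read off the Netron graph):

```cpp
// CPU-only model: register the ops the graph actually uses.
// (The 7 ops below are illustrative.)
tflite::MicroMutableOpResolver<7> cpu_resolver;
cpu_resolver.AddConv2D();
cpu_resolver.AddAdd();
cpu_resolver.AddAveragePool2D();
cpu_resolver.AddReshape();
cpu_resolver.AddFullyConnected();
cpu_resolver.AddSoftmax();
cpu_resolver.AddQuantize();

// Fully offloaded TPU model: a single edgetpu-custom-op.
tflite::MicroMutableOpResolver<1> tpu_resolver;
tpu_resolver.AddCustom(coralmicro::kCustomOp, coralmicro::RegisterCustomOp());

// Split model: the custom op plus the CPU ops for the layers left on the CPU.
tflite::MicroMutableOpResolver<8> split_resolver;
split_resolver.AddCustom(coralmicro::kCustomOp, coralmicro::RegisterCustomOp());
// ... same AddXxx() calls as the CPU resolver ...
```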

Issue Type

Support

Operating System

Ubuntu

Coral Device

Dev Board Micro

Other Devices

No response

Programming Language

C++

Relevant Log Output

No response
