Benchmark_model provided seems ineffective on gpu #77

Open
ireneMsm2020 opened this issue Aug 18, 2022 · 3 comments
@ireneMsm2020

Hi,
When I profile GPU latency on a Snapdragon 888+ with the TF 2.7 benchmark_model you provided, the reported latency is always zero. Do you have any idea what might be wrong?

Thank you in advance!

@JiahangXu
Collaborator

Hi, it seems something is going wrong during profiling. You could debug the benchmark model by running commands like these:

# push the model to device
adb [-s <device-serial>] push <path-of-your-model> <remote-model-path-to-push>

# run the benchmark model
adb [-s <device-serial>] shell <path-of-your-benchmark-model> --num_threads=1 --num_runs=50 --warmup_runs=10 --graph=<remote-model-path> --enable_op_profiling=true --use_gpu=false
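
If you want to script the command above, for example to compare a CPU run with a GPU run, a minimal Python sketch could look like the following. The function name and the example paths are hypothetical and only for illustration; the flags are the ones from the command above, and `--use_gpu=true` is assumed to be the switch that selects the GPU path.

```python
def build_benchmark_command(benchmark_path, model_path, serial=None,
                            use_gpu=False, num_runs=50, warmup_runs=10):
    """Build the adb argument list for one benchmark_model run.

    benchmark_path and model_path are paths on the device (hypothetical here).
    """
    cmd = ["adb"]
    if serial:
        # Select a specific device when more than one is attached
        cmd += ["-s", serial]
    cmd += [
        "shell",
        benchmark_path,
        "--num_threads=1",
        f"--num_runs={num_runs}",
        f"--warmup_runs={warmup_runs}",
        f"--graph={model_path}",
        "--enable_op_profiling=true",
        f"--use_gpu={'true' if use_gpu else 'false'}",
    ]
    return cmd


if __name__ == "__main__":
    # Example paths are assumptions; replace them with your own.
    print(" ".join(build_benchmark_command(
        "/data/local/tmp/benchmark_model",
        "/data/local/tmp/model.tflite",
        use_gpu=True,
    )))
```

The returned list can be passed to `subprocess.run(cmd, capture_output=True, text=True)` to collect the output for inspection.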

If the benchmark model works correctly, the output will contain the latency of each node, followed by a summary like this:

Timings (microseconds): count=222 first=3897 curr=3924 min=3858 max=4031 avg=3925.67 std=29
Memory (bytes): count=0
133 nodes observed

@JiahangXu reopened this on Jan 6, 2023
@lorena527

I ran into the same problem. When I tested it myself, I got the following output:

Timings (microseconds): count=100 first=913862 curr=914318 min=877044 max=926036 avg=911992 std=8223
Memory (bytes): count=0
1 nodes observed

However, the regular-expression match on this output fails. @JiahangXu
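
For anyone debugging a similar match failure, here is a minimal sketch of a tolerant parser for the `Timings (microseconds)` summary line. It is not the parser used in this project; the names `parse_timings` and `PAIR_RE` are made up for illustration. One thing worth noticing is that `avg` appears as a decimal (`3925.67`) in one summary quoted in this thread and as an integer (`911992`) in the other, so a pattern that requires a decimal point would miss the second one. Whether that is the actual cause here is only a guess.

```python
import re

# key=value pairs; the value may be an integer, a decimal, or in scientific notation
PAIR_RE = re.compile(r"(\w+)=([-+]?\d+(?:\.\d+)?(?:[eE][-+]?\d+)?)")

MARKER = "Timings (microseconds):"


def parse_timings(output):
    """Return the fields of the first Timings summary line as floats."""
    for line in output.splitlines():
        if MARKER in line:
            # Keep only the text after the marker, so any prefix is ignored
            _, _, rest = line.partition(MARKER)
            return {key: float(value) for key, value in PAIR_RE.findall(rest)}
    raise ValueError("no 'Timings (microseconds):' line found in output")


# The two summaries quoted in this thread
SAMPLE_DECIMAL_AVG = (
    "Timings (microseconds): count=222 first=3897 curr=3924 "
    "min=3858 max=4031 avg=3925.67 std=29\n"
    "Memory (bytes): count=0\n"
    "133 nodes observed\n"
)
SAMPLE_INTEGER_AVG = (
    "Timings (microseconds): count=100 first=913862 curr=914318 "
    "min=877044 max=926036 avg=911992 std=8223\n"
    "Memory (bytes): count=0\n"
    "1 nodes observed\n"
)

if __name__ == "__main__":
    print(parse_timings(SAMPLE_DECIMAL_AVG))
    print(parse_timings(SAMPLE_INTEGER_AVG))
```

Parsing generic `key=value` pairs instead of one long fixed pattern means a change in field order or number formatting does not break the match.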

@lorena527

I found a solution in dev/profile-in-local, thank you.
