What do timems, averageMs and percentage mean and how are they calculated? #3842

Open
18liumin opened this issue May 6, 2024 · 1 comment

Comments

@18liumin

18liumin commented May 6, 2024

[image attachment, not preserved]

@zerollzeng
Collaborator

It's the per-layer profile, like this:

[05/08/2024-14:05:14] [I] === Profile (2094 iterations ) ===
[05/08/2024-14:05:14] [I]    Time(ms)     Avg.(ms)   Median(ms)   Time(%)   Layer
[05/08/2024-14:05:14] [I]       16.15       0.0077       0.0075       0.5   Reformatting CopyNode for Input Tensor 0 to node_of_gpu_0/conv1_1 + node_of_gpu_0/res_conv1_bn_1 + node_of_gpu_0/res_conv1_bn_2
[05/08/2024-14:05:14] [I]       47.82       0.0228       0.0225       1.6   node_of_gpu_0/conv1_1 + node_of_gpu_0/res_conv1_bn_1 + node_of_gpu_0/res_conv1_bn_2
[05/08/2024-14:05:14] [I]       14.02       0.0067       0.0069       0.5   node_of_gpu_0/pool1_1
[05/08/2024-14:05:14] [I]       14.86       0.0071       0.0072       0.5   node_of_gpu_0/res2_0_branch2a_1 + node_of_gpu_0/res2_0_branch2a_bn_1 + node_of_gpu_0/res2_0_branch2a_bn_2
[05/08/2024-14:05:14] [I]       39.67       0.0189       0.0191       1.3   node_of_gpu_0/res2_0_branch2b_1 + node_of_gpu_0/res2_0_branch2b_bn_1 + node_of_gpu_0/res2_0_branch2b_bn_2
[05/08/2024-14:05:14] [I]       22.46       0.0107       0.0105       0.7   node_of_gpu_0/res2_0_branch1_1 + node_of_gpu_0/res2_0_branch1_bn_1
[05/08/2024-14:05:14] [I]       24.87       0.0119       0.0123       0.8   node_of_gpu_0/res2_0_branch2c_1 + node_of_gpu_0/res2_0_branch2c_bn_1 + node_of_gpu_0/res2_0_branch2c_bn_2 + node_of_gpu_0/res2_0_branch2c_bn_3
[05/08/2024-14:05:14] [I]       20.51       0.0098       0.0101       0.7   node_of_gpu_0/res2_1_branch2a_1 + node_of_gpu_0/res2_1_branch2a_bn_1 + node_of_gpu_0/res2_1_branch2a_bn_2
[05/08/2024-14:05:14] [I]       39.29       0.0188       0.0184       1.3   node_of_gpu_0/res2_1_branch2b_1 + node_of_gpu_0/res2_1_branch2b_bn_1 + node_of_gpu_0/res2_1_branch2b_bn_2
[05/08/2024-14:05:14] [I]       23.73       0.0113       0.0113       0.8   node_of_gpu_0/res2_1_branch2c_1 + node_of_gpu_0/res2_1_branch2c_bn_1 + node_of_gpu_0/res2_1_branch2c_bn_2 + node_of_gpu_0/res2_1_branch2c_bn_3
[05/08/2024-14:05:14] [I]       21.41       0.0102       0.0102       0.7   node_of_gpu_0/res2_2_branch2a_1 + node_of_gpu_0/res2_2_branch2a_bn_1 + node_of_gpu_0/res2_2_branch2a_bn_2
[05/08/2024-14:05:14] [I]       38.78       0.0185       0.0184       1.3   node_of_gpu_0/res2_2_branch2b_1 + node_of_gpu_0/res2_2_branch2b_bn_1 + node_of_gpu_0/res2_2_branch2b_bn_2
[05/08/2024-14:05:14] [I]       24.15       0.0115       0.0113       0.8   node_of_gpu_0/res2_2_branch2c_1 + node_of_gpu_0/res2_2_branch2c_bn_1 + node_of_gpu_0/res2_2_branch2c_bn_2 + node_of_gpu_0/res2_2_branch2c_bn_3
[05/08/2024-14:05:14] [I]       26.66       0.0127       0.0124       0.9   node_of_gpu_0/res3_0_branch2a_1 + node_of_gpu_0/res3_0_branch2a_bn_1 + node_of_gpu_0/res3_0_branch2a_bn_2
[05/08/2024-14:05:14] [I]       82.39       0.0393       0.0390       2.7   node_of_gpu_0/res3_0_branch2b_1 + node_of_gpu_0/res3_0_branch2b_bn_1 + node_of_gpu_0/res3_0_branch2b_bn_2
[05/08/2024-14:05:14] [I]       38.64       0.0185       0.0184       1.3   node_of_gpu_0/res3_0_branch1_1 + node_of_gpu_0/res3_0_branch1_bn_1
[05/08/2024-14:05:14] [I]       22.68       0.0108       0.0113       0.7   node_of_gpu_0/res3_0_branch2c_1 + node_of_gpu_0/res3_0_branch2c_bn_1 + node_of_gpu_0/res3_0_branch2c_bn_2 + node_of_gpu_0/res3_0_branch2c_bn_3
[05/08/2024-14:05:14] [I]       22.93       0.0110       0.0113       0.8   node_of_gpu_0/res3_1_branch2a_1 + node_of_gpu_0/res3_1_branch2a_bn_1 + node_of_gpu_0/res3_1_branch2a_bn_2
[05/08/2024-14:05:14] [I]        8.37       0.0040       0.0041       0.3   Reformatting CopyNode for Input Tensor 0 to node_of_gpu_0/res3_1_branch2b_1 + node_of_gpu_0/res3_1_branch2b_bn_1 + node_of_gpu_0/res3_1_branch2b_bn_2
[05/08/2024-14:05:14] [I]       58.21       0.0278       0.0276       1.9   node_of_gpu_0/res3_1_branch2b_1 + node_of_gpu_0/res3_1_branch2b_bn_1 + node_of_gpu_0/res3_1_branch2b_bn_2
[05/08/2024-14:05:14] [I]        8.40       0.0040       0.0041       0.3   Reformatting CopyNode for Input Tensor 0 to node_of_gpu_0/res3_1_branch2c_1 + node_of_gpu_0/res3_1_branch2c_bn_1 + node_of_gpu_0/res3_1_branch2c_bn_2 + node_of_gpu_0/res3_1_branch2c_bn_3
[05/08/2024-14:05:14] [I]       22.44       0.0107       0.0105       0.7   node_of_gpu_0/res3_1_branch2c_1 + node_of_gpu_0/res3_1_branch2c_bn_1 + node_of_gpu_0/res3_1_branch2c_bn_2 + node_of_gpu_0/res3_1_branch2c_bn_3
[05/08/2024-14:05:14] [I]       22.44       0.0107       0.0105       0.7   node_of_gpu_0/res3_2_branch2a_1 + node_of_gpu_0/res3_2_branch2a_bn_1 + node_of_gpu_0/res3_2_branch2a_bn_2
[05/08/2024-14:05:14] [I]        8.02       0.0038       0.0041       0.3   Reformatting CopyNode for Input Tensor 0 to node_of_gpu_0/res3_2_branch2b_1 + node_of_gpu_0/res3_2_branch2b_bn_1 + node_of_gpu_0/res3_2_branch2b_bn_2
[05/08/2024-14:05:14] [I]       57.47       0.0274       0.0276       1.9   node_of_gpu_0/res3_2_branch2b_1 + node_of_gpu_0/res3_2_branch2b_bn_1 + node_of_gpu_0/res3_2_branch2b_bn_2
[05/08/2024-14:05:14] [I]        8.47       0.0040       0.0041       0.3   Reformatting CopyNode for Input Tensor 0 to node_of_gpu_0/res3_2_branch2c_1 + node_of_gpu_0/res3_2_branch2c_bn_1 + node_of_gpu_0/res3_2_branch2c_bn_2 + node_of_gpu_0/res3_2_branch2c_bn_3
[05/08/2024-14:05:14] [I]       22.79       0.0109       0.0113       0.7   node_of_gpu_0/res3_2_branch2c_1 + node_of_gpu_0/res3_2_branch2c_bn_1 + node_of_gpu_0/res3_2_branch2c_bn_2 + node_of_gpu_0/res3_2_branch2c_bn_3
[05/08/2024-14:05:14] [I]       22.51       0.0108       0.0109       0.7   node_of_gpu_0/res3_3_branch2a_1 + node_of_gpu_0/res3_3_branch2a_bn_1 + node_of_gpu_0/res3_3_branch2a_bn_2
[05/08/2024-14:05:14] [I]        8.03       0.0038       0.0041       0.3   Reformatting CopyNode for Input Tensor 0 to node_of_gpu_0/res3_3_branch2b_1 + node_of_gpu_0/res3_3_branch2b_bn_1 + node_of_gpu_0/res3_3_branch2b_bn_2
[05/08/2024-14:05:14] [I]       57.45       0.0274       0.0276       1.9   node_of_gpu_0/res3_3_branch2b_1 + node_of_gpu_0/res3_3_branch2b_bn_1 + node_of_gpu_0/res3_3_branch2b_bn_2
[05/08/2024-14:05:14] [I]        8.40       0.0040       0.0041       0.3   Reformatting CopyNode for Input Tensor 0 to node_of_gpu_0/res3_3_branch2c_1 + node_of_gpu_0/res3_3_branch2c_bn_1 + node_of_gpu_0/res3_3_branch2c_bn_2 + node_of_gpu_0/res3_3_branch2c_bn_3
[05/08/2024-14:05:14] [I]       22.24       0.0106       0.0102       0.7   node_of_gpu_0/res3_3_branch2c_1 + node_of_gpu_0/res3_3_branch2c_bn_1 + node_of_gpu_0/res3_3_branch2c_bn_2 + node_of_gpu_0/res3_3_branch2c_bn_3
[05/08/2024-14:05:14] [I]       28.93       0.0138       0.0136       0.9   node_of_gpu_0/res4_0_branch2a_1 + node_of_gpu_0/res4_0_branch2a_bn_1 + node_of_gpu_0/res4_0_branch2a_bn_2
[05/08/2024-14:05:14] [I]      126.68       0.0605       0.0604       4.1   node_of_gpu_0/res4_0_branch2b_1 + node_of_gpu_0/res4_0_branch2b_bn_1 + node_of_gpu_0/res4_0_branch2b_bn_2
[05/08/2024-14:05:14] [I]       59.32       0.0283       0.0287       1.9   node_of_gpu_0/res4_0_branch1_1 + node_of_gpu_0/res4_0_branch1_bn_1
[05/08/2024-14:05:14] [I]       27.70       0.0132       0.0133       0.9   node_of_gpu_0/res4_0_branch2c_1 + node_of_gpu_0/res4_0_branch2c_bn_1 + node_of_gpu_0/res4_0_branch2c_bn_2 + node_of_gpu_0/res4_0_branch2c_bn_3
[05/08/2024-14:05:14] [I]       26.84       0.0128       0.0131       0.9   node_of_gpu_0/res4_1_branch2a_1 + node_of_gpu_0/res4_1_branch2a_bn_1 + node_of_gpu_0/res4_1_branch2a_bn_2
[05/08/2024-14:05:14] [I]        8.02       0.0038       0.0041       0.3   Reformatting CopyNode for Input Tensor 0 to node_of_gpu_0/res4_1_branch2b_1 + node_of_gpu_0/res4_1_branch2b_bn_1 + node_of_gpu_0/res4_1_branch2b_bn_2
[05/08/2024-14:05:14] [I]       82.49       0.0394       0.0392       2.7   node_of_gpu_0/res4_1_branch2b_1 + node_of_gpu_0/res4_1_branch2b_bn_1 + node_of_gpu_0/res4_1_branch2b_bn_2
[05/08/2024-14:05:14] [I]        8.08       0.0039       0.0041       0.3   Reformatting CopyNode for Input Tensor 0 to node_of_gpu_0/res4_1_branch2c_1 + node_of_gpu_0/res4_1_branch2c_bn_1 + node_of_gpu_0/res4_1_branch2c_bn_2 + node_of_gpu_0/res4_1_branch2c_bn_3
[05/08/2024-14:05:14] [I]       26.99       0.0129       0.0132       0.9   node_of_gpu_0/res4_1_branch2c_1 + node_of_gpu_0/res4_1_branch2c_bn_1 + node_of_gpu_0/res4_1_branch2c_bn_2 + node_of_gpu_0/res4_1_branch2c_bn_3
[05/08/2024-14:05:14] [I]       25.67       0.0123       0.0123       0.8   node_of_gpu_0/res4_2_branch2a_1 + node_of_gpu_0/res4_2_branch2a_bn_1 + node_of_gpu_0/res4_2_branch2a_bn_2
[05/08/2024-14:05:14] [I]        7.80       0.0037       0.0041       0.3   Reformatting CopyNode for Input Tensor 0 to node_of_gpu_0/res4_2_branch2b_1 + node_of_gpu_0/res4_2_branch2b_bn_1 + node_of_gpu_0/res4_2_branch2b_bn_2
[05/08/2024-14:05:14] [I]       82.20       0.0393       0.0389       2.7   node_of_gpu_0/res4_2_branch2b_1 + node_of_gpu_0/res4_2_branch2b_bn_1 + node_of_gpu_0/res4_2_branch2b_bn_2
[05/08/2024-14:05:14] [I]        8.00       0.0038       0.0041       0.3   Reformatting CopyNode for Input Tensor 0 to node_of_gpu_0/res4_2_branch2c_1 + node_of_gpu_0/res4_2_branch2c_bn_1 + node_of_gpu_0/res4_2_branch2c_bn_2 + node_of_gpu_0/res4_2_branch2c_bn_3
[05/08/2024-14:05:14] [I]       26.40       0.0126       0.0123       0.9   node_of_gpu_0/res4_2_branch2c_1 + node_of_gpu_0/res4_2_branch2c_bn_1 + node_of_gpu_0/res4_2_branch2c_bn_2 + node_of_gpu_0/res4_2_branch2c_bn_3
[05/08/2024-14:05:14] [I]       25.79       0.0123       0.0123       0.8   node_of_gpu_0/res4_3_branch2a_1 + node_of_gpu_0/res4_3_branch2a_bn_1 + node_of_gpu_0/res4_3_branch2a_bn_2
[05/08/2024-14:05:14] [I]        7.66       0.0037       0.0040       0.3   Reformatting CopyNode for Input Tensor 0 to node_of_gpu_0/res4_3_branch2b_1 + node_of_gpu_0/res4_3_branch2b_bn_1 + node_of_gpu_0/res4_3_branch2b_bn_2
[05/08/2024-14:05:14] [I]       81.69       0.0390       0.0389       2.7   node_of_gpu_0/res4_3_branch2b_1 + node_of_gpu_0/res4_3_branch2b_bn_1 + node_of_gpu_0/res4_3_branch2b_bn_2
[05/08/2024-14:05:14] [I]        8.11       0.0039       0.0041       0.3   Reformatting CopyNode for Input Tensor 0 to node_of_gpu_0/res4_3_branch2c_1 + node_of_gpu_0/res4_3_branch2c_bn_1 + node_of_gpu_0/res4_3_branch2c_bn_2 + node_of_gpu_0/res4_3_branch2c_bn_3
[05/08/2024-14:05:14] [I]       27.09       0.0129       0.0132       0.9   node_of_gpu_0/res4_3_branch2c_1 + node_of_gpu_0/res4_3_branch2c_bn_1 + node_of_gpu_0/res4_3_branch2c_bn_2 + node_of_gpu_0/res4_3_branch2c_bn_3
[05/08/2024-14:05:14] [I]       25.72       0.0123       0.0123       0.8   node_of_gpu_0/res4_4_branch2a_1 + node_of_gpu_0/res4_4_branch2a_bn_1 + node_of_gpu_0/res4_4_branch2a_bn_2
[05/08/2024-14:05:14] [I]        7.77       0.0037       0.0041       0.3   Reformatting CopyNode for Input Tensor 0 to node_of_gpu_0/res4_4_branch2b_1 + node_of_gpu_0/res4_4_branch2b_bn_1 + node_of_gpu_0/res4_4_branch2b_bn_2
[05/08/2024-14:05:14] [I]       82.44       0.0394       0.0391       2.7   node_of_gpu_0/res4_4_branch2b_1 + node_of_gpu_0/res4_4_branch2b_bn_1 + node_of_gpu_0/res4_4_branch2b_bn_2
[05/08/2024-14:05:14] [I]        7.98       0.0038       0.0041       0.3   Reformatting CopyNode for Input Tensor 0 to node_of_gpu_0/res4_4_branch2c_1 + node_of_gpu_0/res4_4_branch2c_bn_1 + node_of_gpu_0/res4_4_branch2c_bn_2 + node_of_gpu_0/res4_4_branch2c_bn_3
[05/08/2024-14:05:14] [I]       28.69       0.0137       0.0133       0.9   node_of_gpu_0/res4_4_branch2c_1 + node_of_gpu_0/res4_4_branch2c_bn_1 + node_of_gpu_0/res4_4_branch2c_bn_2 + node_of_gpu_0/res4_4_branch2c_bn_3
[05/08/2024-14:05:14] [I]       25.81       0.0123       0.0123       0.8   node_of_gpu_0/res4_5_branch2a_1 + node_of_gpu_0/res4_5_branch2a_bn_1 + node_of_gpu_0/res4_5_branch2a_bn_2
[05/08/2024-14:05:14] [I]        7.67       0.0037       0.0041       0.3   Reformatting CopyNode for Input Tensor 0 to node_of_gpu_0/res4_5_branch2b_1 + node_of_gpu_0/res4_5_branch2b_bn_1 + node_of_gpu_0/res4_5_branch2b_bn_2
[05/08/2024-14:05:14] [I]       84.49       0.0404       0.0400       2.8   node_of_gpu_0/res4_5_branch2b_1 + node_of_gpu_0/res4_5_branch2b_bn_1 + node_of_gpu_0/res4_5_branch2b_bn_2
[05/08/2024-14:05:14] [I]        8.05       0.0038       0.0041       0.3   Reformatting CopyNode for Input Tensor 0 to node_of_gpu_0/res4_5_branch2c_1 + node_of_gpu_0/res4_5_branch2c_bn_1 + node_of_gpu_0/res4_5_branch2c_bn_2 + node_of_gpu_0/res4_5_branch2c_bn_3
[05/08/2024-14:05:14] [I]       27.10       0.0129       0.0132       0.9   node_of_gpu_0/res4_5_branch2c_1 + node_of_gpu_0/res4_5_branch2c_bn_1 + node_of_gpu_0/res4_5_branch2c_bn_2 + node_of_gpu_0/res4_5_branch2c_bn_3
[05/08/2024-14:05:14] [I]       40.56       0.0194       0.0195       1.3   node_of_gpu_0/res5_0_branch2a_1 + node_of_gpu_0/res5_0_branch2a_bn_1 + node_of_gpu_0/res5_0_branch2a_bn_2
[05/08/2024-14:05:14] [I]      270.12       0.1290       0.1290       8.8   node_of_gpu_0/res5_0_branch2b_1 + node_of_gpu_0/res5_0_branch2b_bn_1 + node_of_gpu_0/res5_0_branch2b_bn_2
[05/08/2024-14:05:14] [I]      114.63       0.0547       0.0545       3.8   node_of_gpu_0/res5_0_branch1_1 + node_of_gpu_0/res5_0_branch1_bn_1
[05/08/2024-14:05:14] [I]       49.27       0.0235       0.0236       1.6   node_of_gpu_0/res5_0_branch2c_1 + node_of_gpu_0/res5_0_branch2c_bn_1 + node_of_gpu_0/res5_0_branch2c_bn_2 + node_of_gpu_0/res5_0_branch2c_bn_3
[05/08/2024-14:05:14] [I]       53.11       0.0254       0.0256       1.7   node_of_gpu_0/res5_1_branch2a_1 + node_of_gpu_0/res5_1_branch2a_bn_1 + node_of_gpu_0/res5_1_branch2a_bn_2
[05/08/2024-14:05:14] [I]        8.11       0.0039       0.0041       0.3   Reformatting CopyNode for Input Tensor 0 to node_of_gpu_0/res5_1_branch2b_1 + node_of_gpu_0/res5_1_branch2b_bn_1 + node_of_gpu_0/res5_1_branch2b_bn_2
[05/08/2024-14:05:14] [I]      184.84       0.0883       0.0881       6.0   node_of_gpu_0/res5_1_branch2b_1 + node_of_gpu_0/res5_1_branch2b_bn_1 + node_of_gpu_0/res5_1_branch2b_bn_2
[05/08/2024-14:05:14] [I]        8.17       0.0039       0.0041       0.3   Reformatting CopyNode for Input Tensor 0 to node_of_gpu_0/res5_1_branch2c_1 + node_of_gpu_0/res5_1_branch2c_bn_1 + node_of_gpu_0/res5_1_branch2c_bn_2 + node_of_gpu_0/res5_1_branch2c_bn_3
[05/08/2024-14:05:14] [I]       51.69       0.0247       0.0246       1.7   node_of_gpu_0/res5_1_branch2c_1 + node_of_gpu_0/res5_1_branch2c_bn_1 + node_of_gpu_0/res5_1_branch2c_bn_2 + node_of_gpu_0/res5_1_branch2c_bn_3
[05/08/2024-14:05:14] [I]       52.93       0.0253       0.0256       1.7   node_of_gpu_0/res5_2_branch2a_1 + node_of_gpu_0/res5_2_branch2a_bn_1 + node_of_gpu_0/res5_2_branch2a_bn_2
[05/08/2024-14:05:14] [I]        7.66       0.0037       0.0041       0.3   Reformatting CopyNode for Input Tensor 0 to node_of_gpu_0/res5_2_branch2b_1 + node_of_gpu_0/res5_2_branch2b_bn_1 + node_of_gpu_0/res5_2_branch2b_bn_2
[05/08/2024-14:05:14] [I]      183.24       0.0875       0.0872       6.0   node_of_gpu_0/res5_2_branch2b_1 + node_of_gpu_0/res5_2_branch2b_bn_1 + node_of_gpu_0/res5_2_branch2b_bn_2
[05/08/2024-14:05:14] [I]        8.08       0.0039       0.0041       0.3   Reformatting CopyNode for Input Tensor 0 to node_of_gpu_0/res5_2_branch2c_1 + node_of_gpu_0/res5_2_branch2c_bn_1 + node_of_gpu_0/res5_2_branch2c_bn_2 + node_of_gpu_0/res5_2_branch2c_bn_3
[05/08/2024-14:05:14] [I]       49.10       0.0234       0.0236       1.6   node_of_gpu_0/res5_2_branch2c_1 + node_of_gpu_0/res5_2_branch2c_bn_1 + node_of_gpu_0/res5_2_branch2c_bn_2 + node_of_gpu_0/res5_2_branch2c_bn_3
[05/08/2024-14:05:14] [I]       11.31       0.0054       0.0051       0.4   node_of_gpu_0/pool5_1
[05/08/2024-14:05:14] [I]       76.69       0.0366       0.0369       2.5   node_of_gpu_0/pred_1 + (Unnamed Layer* 178) [ElementWise]
[05/08/2024-14:05:14] [I]        9.48       0.0045       0.0042       0.3   node_of_gpu_0/softmax_1
[05/08/2024-14:05:14] [I]     3056.46       1.4596       1.4544     100.0   Total
[05/08/2024-14:05:14] [I]
&&&& PASSED TensorRT.trtexec [TensorRT v100001] # trtexec --onnx=ResNet50.onnx --dumpProfile
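
The numbers in the table above suggest how the columns relate, though this is inferred from the log rather than confirmed by the maintainers. `Time(ms)` looks like the layer's time summed over all 2094 iterations, `Avg.(ms)` like `Time(ms)` divided by the iteration count, and `Time(%)` like the layer's share of the `Total` row. `Median(ms)` cannot be recomputed from the table because it needs the per-iteration samples. A minimal Python sketch that checks this reading against a few rows copied from the log (the `derive` helper and the shortened layer labels are hypothetical, for illustration only):

```python
# Values copied from the trtexec profile above.
ITERATIONS = 2094      # from "=== Profile (2094 iterations ) ==="
TOTAL_MS = 3056.46     # Time(ms) of the "Total" row

# (shortened layer label, Time(ms), Avg.(ms), Time(%)) as printed in the log.
ROWS = [
    ("conv1_1 + bn", 47.82, 0.0228, 1.6),
    ("pool1_1", 14.02, 0.0067, 0.5),
    ("res5_0_branch2b + bn", 270.12, 0.1290, 8.8),
    ("Total", 3056.46, 1.4596, 100.0),
]


def derive(time_ms, iterations=ITERATIONS, total_ms=TOTAL_MS):
    """Recompute Avg.(ms) and Time(%) from Time(ms) under the assumed formulas."""
    avg_ms = time_ms / iterations        # assumed: total time / iteration count
    pct = 100.0 * time_ms / total_ms     # assumed: share of the Total row
    return round(avg_ms, 4), round(pct, 1)


def matches_log(rows=ROWS):
    """Return True if every recomputed value agrees with the printed one."""
    for _label, time_ms, avg_logged, pct_logged in rows:
        avg_ms, pct = derive(time_ms)
        # The log prints rounded values, so compare with a small tolerance.
        if abs(avg_ms - avg_logged) > 1e-4 or abs(pct - pct_logged) > 0.1:
            return False
    return True


if __name__ == "__main__":
    for label, time_ms, _avg, _pct in ROWS:
        avg_ms, pct = derive(time_ms)
        print(f"{label:24s} avg={avg_ms:.4f} ms  share={pct:.1f} %")
    print("consistent with log:", matches_log())
```

Because the log values are rounded, the check uses a tolerance instead of exact equality. If the rows did not match, that would indicate the assumed formulas are wrong for this trtexec version.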

@zerollzeng zerollzeng self-assigned this May 8, 2024
@zerollzeng zerollzeng added the triaged Issue has been triaged by maintainers label May 8, 2024