profiling_1000.txt
Reading Profile files in profile.*
NODE 0;CONTEXT 0;THREAD 0:
---------------------------------------------------------------------------------------
%Time Exclusive Inclusive #Call #Subrs Inclusive Name
msec total msec usec/call
---------------------------------------------------------------------------------------
100.0 5 15:34.549 1 1 934549803 .TAU application
100.0 0.558 15:34.544 1 3 934544055 main
99.9 1 15:33.216 1 211 933216504 run_sim
95.8 0.073 14:55.062 50 50 17901247 central2d_run
95.8 4 14:55.062 50 6060 17901246 central2d_xrun
95.4 2,486 14:51.202 2424 104849 367658 central2d_step
47.9 1,509 7:27.775 2424 4.94496E+06 184726 central2d_predict
47.3 7:22.306 7:22.335 4.92314E+06 49980 90 limited_derivk
47.3 7:22.234 7:22.264 4.92314E+06 50021 90 limited_deriv1
47.1 2,192 7:20.567 2424 5.00133E+06 181752 central2d_correct
2.0 0.334 18,910 51 459 370800 gather_sol
1.9 17,793 17,793 459 0 38767 copy_u
1.8 129 16,897 408 816 41415 recv_full_u
1.5 13,580 13,580 51 0 266283 solution_check
0.6 5,550 5,550 51 0 108842 viz_frame
0.4 8 3,621 1212 33936 2988 central2d_periodic
0.4 3,528 3,528 4848 0 728 MPI_Sendrecv()
0.1 1,301 1,301 1 0 1301798 MPI_Init()
0.1 986 986 408 0 2419 MPI_Recv()
0.0 34 372 100001 100001 4 shallow2d_flux [THROTTLED]
0.0 338 338 100001 0 3 shallow2dv_flux [THROTTLED]
0.0 1 209 1212 1212 173 shallow2d_speed
0.0 208 208 1212 0 172 shallow2dv_speed
0.0 84 84 29088 0 3 copy_subgrid
0.0 55 70 1 100001 70546 lua_init_sim
0.0 43 58 100001 100001 1 limdiff [THROTTLED]
0.0 42 42 100001 0 0 central2d_correct_sd [THROTTLED]
0.0 36 36 1 0 36135 viz_close
0.0 25 25 1 0 25195 MPI_Finalize()
0.0 23 23 1212 0 19 MPI_Allreduce()
0.0 15 15 100001 0 0 xmin2s [THROTTLED]
0.0 14 14 100001 0 0 central2d_offset [THROTTLED]
0.0 2 2 1 0 2043 viz_open
0.0 1 1 2 0 873 central2d_free
0.0 0.028 0.028 1 0 28 MPI_Barrier()
0.0 0.015 0.016 1 2 16 central2d_init
0.0 0.008 0.008 1 0 8 copy_basic_info
0.0 0.001 0.001 1 0 1 MPI_Comm_size()
0.0 0 0 1 0 0 MPI_Comm_rank()
---------------------------------------------------------------------------------------
USER EVENTS Profile :NODE 0, CONTEXT 0, THREAD 0
---------------------------------------------------------------------------------------
NumSamples MaxValue MinValue MeanValue Std. Dev. Event Name
---------------------------------------------------------------------------------------
1212 4 4 4 0 Message size for all-reduce
---------------------------------------------------------------------------------------
NODE 1;CONTEXT 0;THREAD 0:
---------------------------------------------------------------------------------------
%Time Exclusive Inclusive #Call #Subrs Inclusive Name
msec total msec usec/call
---------------------------------------------------------------------------------------
100.0 5 15:34.549 1 1 934549723 .TAU application
100.0 0.478 15:34.543 1 3 934543933 main
99.8 0.624 15:32.476 1 108 932476951 run_sim
99.4 0.082 15:29.401 50 50 18588036 central2d_run
99.4 4 15:29.401 50 6060 18588034 central2d_xrun
95.1 2,402 14:48.633 2424 104849 366598 central2d_step
47.4 7:23.206 7:23.235 4.92314E+06 49980 90 limited_derivk
47.4 1,505 7:23.156 2424 4.94496E+06 182820 central2d_predict
47.4 2,216 7:22.712 2424 5.00133E+06 182637 central2d_correct
47.0 7:18.840 7:18.869 4.92314E+06 50021 89 limited_deriv1
4.3 9 40,528 1212 33936 33439 central2d_periodic
4.3 40,428 40,428 4848 0 8339 MPI_Sendrecv()
0.2 0.087 2,285 51 51 44809 gather_sol
0.2 0.087 2,285 51 51 44807 send_full_u
0.2 2,285 2,285 51 0 44805 MPI_Send()
0.1 1,302 1,302 1 0 1302093 MPI_Init()
0.1 764 764 1 0 764411 MPI_Finalize()
0.1 718 718 1 0 718044 MPI_Barrier()
0.0 34 362 100001 100001 4 shallow2d_flux [THROTTLED]
0.0 327 327 100001 0 3 shallow2dv_flux [THROTTLED]
0.0 0.949 212 1212 1212 175 shallow2d_speed
0.0 211 211 1212 0 174 shallow2dv_speed
0.0 90 90 29088 0 3 copy_subgrid
0.0 56 70 1 100001 70743 lua_init_sim
0.0 42 57 100001 100001 1 limdiff [THROTTLED]
0.0 41 41 100001 0 0 central2d_correct_sd [THROTTLED]
0.0 23 23 1212 0 20 MPI_Allreduce()
0.0 15 15 100001 0 0 xmin2s [THROTTLED]
0.0 14 14 100001 0 0 central2d_offset [THROTTLED]
0.0 0.503 0.503 2 0 252 central2d_free
0.0 0.016 0.017 1 2 17 central2d_init
0.0 0.007 0.007 1 0 7 copy_basic_info
0.0 0.001 0.001 1 0 1 MPI_Comm_rank()
0.0 0 0 1 0 0 MPI_Comm_size()
0.0 0 0 1 0 0 viz_open
---------------------------------------------------------------------------------------
USER EVENTS Profile :NODE 1, CONTEXT 0, THREAD 0
---------------------------------------------------------------------------------------
NumSamples MaxValue MinValue MeanValue Std. Dev. Event Name
---------------------------------------------------------------------------------------
1212 4 4 4 0 Message size for all-reduce
---------------------------------------------------------------------------------------
NODE 2;CONTEXT 0;THREAD 0:
---------------------------------------------------------------------------------------
%Time Exclusive Inclusive #Call #Subrs Inclusive Name
msec total msec usec/call
---------------------------------------------------------------------------------------
100.0 5 15:34.550 1 1 934550022 .TAU application
100.0 0.474 15:34.544 1 3 934544134 main
99.8 0.688 15:32.518 1 108 932518462 run_sim
99.2 0.1 15:27.216 50 50 18544331 central2d_run
99.2 4 15:27.216 50 6060 18544329 central2d_xrun
94.7 2,299 14:44.732 2424 104849 364988 central2d_step
47.6 1,507 7:24.631 2424 4.94496E+06 183429 central2d_predict
47.0 7:19.240 7:19.269 4.92314E+06 50024 89 limited_deriv1
47.0 7:19.080 7:19.109 4.92314E+06 49977 89 limited_derivk
46.8 2,155 7:17.449 2424 5.00133E+06 180466 central2d_correct
4.5 9 42,249 1212 33936 34860 central2d_periodic
4.5 42,149 42,149 4848 0 8694 MPI_Sendrecv()
0.5 0.092 4,555 51 51 89322 gather_sol
0.5 0.102 4,555 51 51 89320 send_full_u
0.5 4,555 4,555 51 0 89318 MPI_Send()
0.1 1,302 1,302 1 0 1302215 MPI_Init()
0.1 722 722 1 0 722983 MPI_Finalize()
0.1 675 675 1 0 675092 MPI_Barrier()
0.0 34 351 100001 100001 4 shallow2d_flux [THROTTLED]
0.0 316 316 100001 0 3 shallow2dv_flux [THROTTLED]
0.0 1 210 1212 1212 174 shallow2d_speed
0.0 209 209 1212 0 173 shallow2dv_speed
0.0 91 91 29088 0 3 copy_subgrid
0.0 55 70 1 100001 70201 lua_init_sim
0.0 43 58 100001 100001 1 limdiff [THROTTLED]
0.0 39 39 100001 0 0 central2d_correct_sd [THROTTLED]
0.0 19 19 1212 0 16 MPI_Allreduce()
0.0 15 15 100001 0 0 xmin2s [THROTTLED]
0.0 14 14 100001 0 0 central2d_offset [THROTTLED]
0.0 0.508 0.508 2 0 254 central2d_free
0.0 0.016 0.017 1 2 17 central2d_init
0.0 0.007 0.007 1 0 7 copy_basic_info
0.0 0.001 0.001 1 0 1 MPI_Comm_size()
0.0 0 0 1 0 0 MPI_Comm_rank()
0.0 0 0 1 0 0 viz_open
---------------------------------------------------------------------------------------
USER EVENTS Profile :NODE 2, CONTEXT 0, THREAD 0
---------------------------------------------------------------------------------------
NumSamples MaxValue MinValue MeanValue Std. Dev. Event Name
---------------------------------------------------------------------------------------
1212 4 4 4 0 Message size for all-reduce
---------------------------------------------------------------------------------------
NODE 3;CONTEXT 0;THREAD 0:
---------------------------------------------------------------------------------------
%Time Exclusive Inclusive #Call #Subrs Inclusive Name
msec total msec usec/call
---------------------------------------------------------------------------------------
100.0 5 15:34.549 1 1 934549237 .TAU application
100.0 0.473 15:34.543 1 3 934543427 main
99.8 0.688 15:32.559 1 108 932559514 run_sim
99.0 0.089 15:25.217 50 50 18504352 central2d_run
99.0 4 15:25.217 50 6060 18504350 central2d_xrun
94.7 2,299 14:44.878 2424 104849 365049 central2d_step
47.3 1,516 7:21.636 2424 4.94496E+06 182193 central2d_predict
47.1 2,240 7:20.590 2424 5.00133E+06 181762 central2d_correct
47.1 7:20.194 7:20.223 4.92314E+06 49980 89 limited_derivk
46.9 7:18.176 7:18.205 4.92314E+06 50021 89 limited_deriv1
4.3 9 40,098 1212 33936 33085 central2d_periodic
4.3 40,000 40,000 4848 0 8251 MPI_Sendrecv()
0.7 0.096 6,637 51 51 130156 gather_sol
0.7 0.128 6,637 51 51 130154 send_full_u
0.7 6,637 6,637 51 0 130151 MPI_Send()
0.1 1,302 1,302 1 0 1302187 MPI_Init()
0.1 681 681 1 0 681253 MPI_Finalize()
0.1 632 632 1 0 632479 MPI_Barrier()
0.0 34 351 100001 100001 4 shallow2d_flux [THROTTLED]
0.0 317 317 100001 0 3 shallow2dv_flux [THROTTLED]
0.0 0.982 212 1212 1212 175 shallow2d_speed
0.0 211 211 1212 0 174 shallow2dv_speed
0.0 89 89 29088 0 3 copy_subgrid
0.0 55 70 1 100001 70280 lua_init_sim
0.0 42 58 100001 100001 1 limdiff [THROTTLED]
0.0 42 42 100001 0 0 central2d_correct_sd [THROTTLED]
0.0 23 23 1212 0 19 MPI_Allreduce()
0.0 15 15 100001 0 0 xmin2s [THROTTLED]
0.0 14 14 100001 0 0 central2d_offset [THROTTLED]
0.0 0.503 0.503 2 0 252 central2d_free
0.0 0.016 0.017 1 2 17 central2d_init
0.0 0.007 0.007 1 0 7 copy_basic_info
0.0 0.001 0.001 1 0 1 MPI_Comm_size()
0.0 0 0 1 0 0 MPI_Comm_rank()
0.0 0 0 1 0 0 viz_open
---------------------------------------------------------------------------------------
USER EVENTS Profile :NODE 3, CONTEXT 0, THREAD 0
---------------------------------------------------------------------------------------
NumSamples MaxValue MinValue MeanValue Std. Dev. Event Name
---------------------------------------------------------------------------------------
1212 4 4 4 0 Message size for all-reduce
---------------------------------------------------------------------------------------
NODE 4;CONTEXT 0;THREAD 0:
---------------------------------------------------------------------------------------
%Time Exclusive Inclusive #Call #Subrs Inclusive Name
msec total msec usec/call
---------------------------------------------------------------------------------------
100.0 5 15:34.548 1 1 934548409 .TAU application
100.0 0.342 15:34.542 1 3 934542607 main
99.8 0.635 15:32.600 1 108 932600755 run_sim
98.8 0.09 15:23.303 50 50 18466070 central2d_run
98.8 4 15:23.303 50 6060 18466068 central2d_xrun
95.1 2,378 14:48.362 2424 104849 366486 central2d_step
47.4 7:23.108 7:23.137 4.92314E+06 49980 90 limited_derivk
47.4 1,502 7:23.110 2424 4.94496E+06 182801 central2d_predict
47.4 2,205 7:22.515 2424 5.00133E+06 182556 central2d_correct
46.9 7:18.708 7:18.737 4.92314E+06 50021 89 limited_deriv1
3.7 9 34,699 1212 33936 28630 central2d_periodic
3.7 34,605 34,605 4848 0 7138 MPI_Sendrecv()
0.9 0.103 8,638 51 51 169377 gather_sol
0.9 0.109 8,638 51 51 169375 send_full_u
0.9 8,638 8,638 51 0 169373 MPI_Send()
0.1 1,302 1,302 1 0 1302141 MPI_Init()
0.1 639 639 1 0 639369 MPI_Finalize()
0.1 587 587 1 0 587093 MPI_Barrier()
0.0 34 358 100001 100001 4 shallow2d_flux [THROTTLED]
0.0 324 324 100001 0 3 shallow2dv_flux [THROTTLED]
0.0 0.97 210 1212 1212 173 shallow2d_speed
0.0 209 209 1212 0 173 shallow2dv_speed
0.0 84 84 29088 0 3 copy_subgrid
0.0 55 70 1 100001 70763 lua_init_sim
0.0 42 58 100001 100001 1 limdiff [THROTTLED]
0.0 42 42 100001 0 0 central2d_correct_sd [THROTTLED]
0.0 26 26 1212 0 22 MPI_Allreduce()
0.0 15 15 100001 0 0 xmin2s [THROTTLED]
0.0 15 15 100001 0 0 central2d_offset [THROTTLED]
0.0 0.481 0.481 2 0 240 central2d_free
0.0 0.018 0.019 1 2 19 central2d_init
0.0 0.012 0.012 1 0 12 copy_basic_info
0.0 0.001 0.001 1 0 1 MPI_Comm_size()
0.0 0 0 1 0 0 MPI_Comm_rank()
0.0 0 0 1 0 0 viz_open
---------------------------------------------------------------------------------------
USER EVENTS Profile :NODE 4, CONTEXT 0, THREAD 0
---------------------------------------------------------------------------------------
NumSamples MaxValue MinValue MeanValue Std. Dev. Event Name
---------------------------------------------------------------------------------------
1212 4 4 4 0 Message size for all-reduce
---------------------------------------------------------------------------------------
NODE 5;CONTEXT 0;THREAD 0:
---------------------------------------------------------------------------------------
%Time Exclusive Inclusive #Call #Subrs Inclusive Name
msec total msec usec/call
---------------------------------------------------------------------------------------
100.0 5 15:34.549 1 1 934549996 .TAU application
100.0 0.409 15:34.544 1 3 934544087 main
99.8 0.708 15:32.642 1 108 932642054 run_sim
98.6 0.089 15:21.454 50 50 18429083 central2d_run
98.6 4 15:21.454 50 6060 18429081 central2d_xrun
95.4 2,477 14:51.909 2424 104849 367950 central2d_step
47.7 1,607 7:25.890 2424 4.94496E+06 183948 central2d_predict
47.5 7:23.778 7:23.807 4.92314E+06 49977 90 limited_derivk
47.4 2,375 7:23.165 2424 5.00133E+06 182824 central2d_correct
47.2 7:21.190 7:21.220 4.92314E+06 50024 90 limited_deriv1
3.1 9 29,308 1212 33936 24182 central2d_periodic
3.1 29,214 29,214 4848 0 6026 MPI_Sendrecv()
1.1 0.105 10,571 51 51 207282 gather_sol
1.1 0.128 10,571 51 51 207280 send_full_u
1.1 10,571 10,571 51 0 207277 MPI_Send()
0.1 1,302 1,302 1 0 1302548 MPI_Init()
0.1 599 599 1 0 599076 MPI_Finalize()
0.1 544 544 1 0 544197 MPI_Barrier()
0.0 35 376 100001 100001 4 shallow2d_flux [THROTTLED]
0.0 341 341 100001 0 3 shallow2dv_flux [THROTTLED]
0.0 1 204 1212 1212 169 shallow2d_speed
0.0 203 203 1212 0 168 shallow2dv_speed
0.0 83 83 29088 0 3 copy_subgrid
0.0 56 71 1 100001 71081 lua_init_sim
0.0 43 58 100001 100001 1 limdiff [THROTTLED]
0.0 45 45 100001 0 0 central2d_correct_sd [THROTTLED]
0.0 26 26 1212 0 22 MPI_Allreduce()
0.0 15 15 100001 0 0 xmin2s [THROTTLED]
0.0 14 14 100001 0 0 central2d_offset [THROTTLED]
0.0 0.51 0.51 2 0 255 central2d_free
0.0 0.015 0.015 1 2 15 central2d_init
0.0 0.015 0.015 1 0 15 copy_basic_info
0.0 0 0 1 0 0 MPI_Comm_rank()
0.0 0 0 1 0 0 MPI_Comm_size()
0.0 0 0 1 0 0 viz_open
---------------------------------------------------------------------------------------
USER EVENTS Profile :NODE 5, CONTEXT 0, THREAD 0
---------------------------------------------------------------------------------------
NumSamples MaxValue MinValue MeanValue Std. Dev. Event Name
---------------------------------------------------------------------------------------
1212 4 4 4 0 Message size for all-reduce
---------------------------------------------------------------------------------------
NODE 6;CONTEXT 0;THREAD 0:
---------------------------------------------------------------------------------------
%Time Exclusive Inclusive #Call #Subrs Inclusive Name
msec total msec usec/call
---------------------------------------------------------------------------------------
100.0 5 15:34.550 1 1 934550218 .TAU application
100.0 0.466 15:34.544 1 3 934544255 main
99.8 0.676 15:32.683 1 108 932683215 run_sim
98.4 0.097 15:19.452 50 50 18389042 central2d_run
98.4 4 15:19.452 50 6060 18389040 central2d_xrun
95.5 2,497 14:52.724 2424 104849 368286 central2d_step
47.8 1,658 7:26.268 2424 4.91587E+06 184104 central2d_predict
47.5 7:23.923 7:23.952 4.89406E+06 49980 91 limited_derivk
47.5 2,499 7:23.575 2424 4.97224E+06 182993 central2d_correct
47.3 7:21.653 7:21.682 4.89406E+06 50021 90 limited_deriv1
2.8 9 26,493 1212 33936 21860 central2d_periodic
2.8 26,392 26,392 4848 0 5444 MPI_Sendrecv()
1.4 0.147 12,657 51 51 248180 gather_sol
1.4 0.119 12,657 51 51 248177 send_full_u
1.4 12,656 12,656 51 0 248175 MPI_Send()
0.1 1,302 1,302 1 0 1302216 MPI_Init()
0.1 558 558 1 0 558358 MPI_Finalize()
0.1 501 501 1 0 501567 MPI_Barrier()
0.0 34 382 100001 100001 4 shallow2d_flux [THROTTLED]
0.0 348 348 100001 0 3 shallow2dv_flux [THROTTLED]
0.0 1 195 1212 1212 161 shallow2d_speed
0.0 194 194 1212 0 161 shallow2dv_speed
0.0 91 91 29088 0 3 copy_subgrid
0.0 56 71 1 100001 71160 lua_init_sim
0.0 43 59 100001 100001 1 limdiff [THROTTLED]
0.0 50 50 100001 0 1 central2d_correct_sd [THROTTLED]
0.0 33 33 1212 0 28 MPI_Allreduce()
0.0 15 15 100001 0 0 xmin2s [THROTTLED]
0.0 14 14 100001 0 0 central2d_offset [THROTTLED]
0.0 0.521 0.521 2 0 260 central2d_free
0.0 0.016 0.017 1 2 17 central2d_init
0.0 0.008 0.008 1 0 8 copy_basic_info
0.0 0.001 0.001 1 0 1 MPI_Comm_rank()
0.0 0 0 1 0 0 MPI_Comm_size()
0.0 0 0 1 0 0 viz_open
---------------------------------------------------------------------------------------
USER EVENTS Profile :NODE 6, CONTEXT 0, THREAD 0
---------------------------------------------------------------------------------------
NumSamples MaxValue MinValue MeanValue Std. Dev. Event Name
---------------------------------------------------------------------------------------
1212 4 4 4 0 Message size for all-reduce
---------------------------------------------------------------------------------------
NODE 7;CONTEXT 0;THREAD 0:
---------------------------------------------------------------------------------------
%Time Exclusive Inclusive #Call #Subrs Inclusive Name
msec total msec usec/call
---------------------------------------------------------------------------------------
100.0 5 15:34.549 1 1 934549856 .TAU application
100.0 0.472 15:34.544 1 3 934544064 main
99.8 0.728 15:32.724 1 108 932724268 run_sim
98.1 0.062 15:16.880 50 50 18337616 central2d_run
98.1 4 15:16.880 50 6060 18337614 central2d_xrun
94.1 2,206 14:39.814 2424 104849 362960 central2d_step
47.1 1,504 7:20.118 2424 4.91587E+06 181567 central2d_predict
46.9 7:18.380 7:18.409 4.89406E+06 49980 90 limited_derivk
46.8 2,187 7:17.152 2424 4.97224E+06 180343 central2d_correct
46.6 7:15.096 7:15.125 4.89406E+06 50021 89 limited_deriv1
3.9 9 36,827 1212 33936 30386 central2d_periodic
3.9 36,730 36,730 4848 0 7576 MPI_Sendrecv()
1.6 0.118 15,314 51 51 300288 gather_sol
1.6 0.127 15,314 51 51 300286 send_full_u
1.6 15,314 15,314 51 0 300284 MPI_Send()
0.1 1,301 1,301 1 0 1301867 MPI_Init()
0.1 517 517 1 0 517457 MPI_Finalize()
0.0 456 456 1 0 456433 MPI_Barrier()
0.0 34 338 100001 100001 3 shallow2d_flux [THROTTLED]
0.0 304 304 100001 0 3 shallow2dv_flux [THROTTLED]
0.0 0.98 211 1212 1212 174 shallow2d_speed
0.0 210 210 1212 0 173 shallow2dv_speed
0.0 88 88 29088 0 3 copy_subgrid
0.0 56 71 1 100001 71099 lua_init_sim
0.0 42 58 100001 100001 1 limdiff [THROTTLED]
0.0 42 42 100001 0 0 central2d_correct_sd [THROTTLED]
0.0 22 22 1212 0 18 MPI_Allreduce()
0.0 15 15 100001 0 0 xmin2s [THROTTLED]
0.0 14 14 100001 0 0 central2d_offset [THROTTLED]
0.0 0.497 0.497 2 0 248 central2d_free
0.0 0.017 0.018 1 2 18 central2d_init
0.0 0.007 0.007 1 0 7 copy_basic_info
0.0 0.001 0.001 1 0 1 MPI_Comm_size()
0.0 0 0 1 0 0 MPI_Comm_rank()
0.0 0 0 1 0 0 viz_open
---------------------------------------------------------------------------------------
USER EVENTS Profile :NODE 7, CONTEXT 0, THREAD 0
---------------------------------------------------------------------------------------
NumSamples MaxValue MinValue MeanValue Std. Dev. Event Name
---------------------------------------------------------------------------------------
1212 4 4 4 0 Message size for all-reduce
---------------------------------------------------------------------------------------
NODE 8;CONTEXT 0;THREAD 0:
---------------------------------------------------------------------------------------
%Time Exclusive Inclusive #Call #Subrs Inclusive Name
msec total msec usec/call
---------------------------------------------------------------------------------------
100.0 6 15:34.551 1 1 934551333 .TAU application
100.0 0.479 15:34.545 1 3 934545262 main
99.8 0.625 15:32.765 1 108 932765316 run_sim
98.0 0.087 15:15.441 50 50 18308824 central2d_run
98.0 4 15:15.441 50 6060 18308823 central2d_xrun
95.6 2,396 14:53.719 2424 104849 368696 central2d_step
48.8 1,475 7:36.351 2424 4.91587E+06 188264 central2d_predict
47.7 7:25.675 7:25.704 4.89406E+06 50024 91 limited_deriv1
47.3 7:21.656 7:21.686 4.89406E+06 49977 90 limited_derivk
46.5 2,056 7:14.608 2424 4.97224E+06 179294 central2d_correct
2.3 8 21,491 1212 33936 17732 central2d_periodic
2.3 21,400 21,400 4848 0 4414 MPI_Sendrecv()
1.8 0.095 16,839 51 51 330177 gather_sol
1.8 0.12 16,838 51 51 330176 send_full_u
1.8 16,838 16,838 51 0 330173 MPI_Send()
0.1 1,302 1,302 1 0 1302079 MPI_Init()
0.1 477 477 1 0 477388 MPI_Finalize()
0.0 413 413 1 0 413811 MPI_Barrier()
0.0 34 362 100001 100001 4 shallow2d_flux [THROTTLED]
0.0 327 327 100001 0 3 shallow2dv_flux [THROTTLED]
0.0 1 179 1212 1212 148 shallow2d_speed
0.0 178 178 1212 0 147 shallow2dv_speed
0.0 82 82 29088 0 3 copy_subgrid
0.0 55 70 1 100001 70108 lua_init_sim
0.0 43 59 100001 100001 1 limdiff [THROTTLED]
0.0 46 46 1212 0 38 MPI_Allreduce()
0.0 37 37 100001 0 0 central2d_correct_sd [THROTTLED]
0.0 15 15 100001 0 0 xmin2s [THROTTLED]
0.0 14 14 100001 0 0 central2d_offset [THROTTLED]
0.0 0.484 0.484 2 0 242 central2d_free
0.0 0.018 0.018 1 2 18 central2d_init
0.0 0.008 0.008 1 0 8 copy_basic_info
0.0 0 0 1 0 0 MPI_Comm_rank()
0.0 0 0 1 0 0 MPI_Comm_size()
0.0 0 0 1 0 0 viz_open
---------------------------------------------------------------------------------------
USER EVENTS Profile :NODE 8, CONTEXT 0, THREAD 0
---------------------------------------------------------------------------------------
NumSamples MaxValue MinValue MeanValue Std. Dev. Event Name
---------------------------------------------------------------------------------------
1212 4 4 4 0 Message size for all-reduce
---------------------------------------------------------------------------------------
FUNCTION SUMMARY (total):
---------------------------------------------------------------------------------------
%Time Exclusive Inclusive #Call #Subrs Inclusive Name
msec total msec usec/call
---------------------------------------------------------------------------------------
100.0 52 2:20:10.948 9 9 934549844 .TAU application
100.0 4 2:20:10.895 9 27 934543980 main
99.8 6 2:19:54.187 9 1075 932687449 run_sim
98.4 0.769 2:17:53.430 450 450 18385400 central2d_run
98.4 41 2:17:53.429 450 54540 18385398 central2d_xrun
95.1 21,442 2:13:15.977 21816 943641 366519 central2d_step
47.7 13,786 1:06:48.940 21816 4.44174E+07 183761 central2d_predict
47.3 1:06:15.635 1:06:15.897 4.4221E+07 449811 90 limited_derivk
47.1 20,128 1:06:02.336 21816 4.49247E+07 181625 central2d_correct
47.1 1:06:00.815 1:06:01.079 4.4221E+07 450198 90 limited_deriv1
3.3 83 4:35.319 10908 305424 25240 central2d_periodic
3.3 4:34.450 4:34.450 43632 0 6290 MPI_Sendrecv()
1.1 1 1:36.409 459 867 210043 gather_sol
0.9 0.92 1:17.498 408 408 189947 send_full_u
0.9 1:17.497 1:17.497 408 0 189945 MPI_Send()
0.2 17,793 17,793 459 0 38767 copy_u
0.2 129 16,897 408 816 41415 recv_full_u
0.2 13,580 13,580 51 0 266283 solution_check
0.1 11,719 11,719 9 0 1302127 MPI_Init()
0.1 5,550 5,550 51 0 108842 viz_frame
0.1 4,985 4,985 9 0 553943 MPI_Finalize()
0.1 4,528 4,528 9 0 503194 MPI_Barrier()
0.0 311 3,257 900009 900009 4 shallow2d_flux [THROTTLED]
0.0 2,946 2,946 900009 0 3 shallow2dv_flux [THROTTLED]
0.0 9 1,846 10908 10908 169 shallow2d_speed
0.0 1,836 1,836 10908 0 168 shallow2dv_speed
0.0 986 986 408 0 2419 MPI_Recv()
0.0 785 785 261792 0 3 copy_subgrid
0.0 503 635 9 900009 70665 lua_init_sim
0.0 389 526 900009 900009 1 limdiff [THROTTLED]
0.0 385 385 900009 0 0 central2d_correct_sd [THROTTLED]
0.0 244 244 10908 0 22 MPI_Allreduce()
0.0 136 136 900009 0 0 xmin2s [THROTTLED]
0.0 132 132 900009 0 0 central2d_offset [THROTTLED]
0.0 36 36 1 0 36135 viz_close
0.0 5 5 18 0 320 central2d_free
0.0 2 2 9 0 227 viz_open
0.0 0.147 0.154 9 18 17 central2d_init
0.0 0.079 0.079 9 0 9 copy_basic_info
0.0 0.005 0.005 9 0 1 MPI_Comm_size()
0.0 0.002 0.002 9 0 0 MPI_Comm_rank()
FUNCTION SUMMARY (mean):
---------------------------------------------------------------------------------------
%Time Exclusive Inclusive #Call #Subrs Inclusive Name
msec total msec usec/call
---------------------------------------------------------------------------------------
100.0 5 15:34.549 1 1 934549844 .TAU application
100.0 0.461 15:34.543 1 3 934543980 main
99.8 0.763 15:32.687 1 119.444 932687449 run_sim
98.4 0.0854 15:19.270 50 50 18385400 central2d_run
98.4 4 15:19.269 50 6060 18385398 central2d_xrun
95.1 2,382 14:48.441 2424 104849 366519 central2d_step
47.7 1,531 7:25.437 2424 4.93526E+06 183761 central2d_predict
47.3 7:21.737 7:21.766 4.91345E+06 49979 90 limited_derivk
47.1 2,236 7:20.259 2424 4.99163E+06 181625 central2d_correct
47.1 7:20.090 7:20.119 4.91345E+06 50022 90 limited_deriv1
3.3 9 30,591 1212 33936 25240 central2d_periodic
3.3 30,494 30,494 4848 0 6290 MPI_Sendrecv()
1.1 0.131 10,712 51 96.3333 210043 gather_sol
0.9 0.102 8,610 45.3333 45.3333 189947 send_full_u
0.9 8,610 8,610 45.3333 0 189945 MPI_Send()
0.2 1,977 1,977 51 0 38767 copy_u
0.2 14 1,877 45.3333 90.6667 41415 recv_full_u
0.2 1,508 1,508 5.66667 0 266283 solution_check
0.1 1,302 1,302 1 0 1302127 MPI_Init()
0.1 616 616 5.66667 0 108842 viz_frame
0.1 553 553 1 0 553943 MPI_Finalize()
0.1 503 503 1 0 503194 MPI_Barrier()
0.0 34 361 100001 100001 4 shallow2d_flux [THROTTLED]
0.0 327 327 100001 0 3 shallow2dv_flux [THROTTLED]
0.0 1 205 1212 1212 169 shallow2d_speed
0.0 204 204 1212 0 168 shallow2dv_speed
0.0 109 109 45.3333 0 2419 MPI_Recv()
0.0 87 87 29088 0 3 copy_subgrid
0.0 55 70 1 100001 70665 lua_init_sim
0.0 43 58 100001 100001 1 limdiff [THROTTLED]
0.0 42 42 100001 0 0 central2d_correct_sd [THROTTLED]
0.0 27 27 1212 0 22 MPI_Allreduce()
0.0 15 15 100001 0 0 xmin2s [THROTTLED]
0.0 14 14 100001 0 0 central2d_offset [THROTTLED]
0.0 4 4 0.111111 0 36135 viz_close
0.0 0.639 0.639 2 0 320 central2d_free
0.0 0.227 0.227 1 0 227 viz_open
0.0 0.0163 0.0171 1 2 17 central2d_init
0.0 0.00878 0.00878 1 0 9 copy_basic_info
0.0 0.000556 0.000556 1 0 1 MPI_Comm_size()
0.0 0.000222 0.000222 1 0 0 MPI_Comm_rank()