forked from cs5220-f20/shallow-water
profiling_1000_t3.txt
Reading Profile files in profile_1000_tstep3/profile.*
NODE 0;CONTEXT 0;THREAD 0:
---------------------------------------------------------------------------------------
%Time  Exclusive [msec]  Inclusive [total msec]  #Call  #Subrs  Inclusive [usec/call]  Name
---------------------------------------------------------------------------------------
100.0 5 2:53.127 1 1 173127783 .TAU application
100.0 0.871 2:53.122 1 3 173122133 main
98.3 2 2:50.111 1 211 170111700 run_sim
76.3 0.044 2:12.025 50 50 2640500 central2d_run
76.3 1 2:12.024 50 5161 2640500 central2d_xrun
73.5 187 2:07.279 2382 104765 53434 central2d_step
39.3 447 1:08.099 2382 1.91513E+06 28589 central2d_predict
36.5 1:03.078 1:03.107 1.77935E+06 49982 35 limited_derivk
36.3 1:02.870 1:02.900 1.77935E+06 50019 35 limited_deriv1
34.0 506 58,884 2382 1.74358E+06 24721 central2d_correct
10.8 1 18,702 51 4131 366718 gather_sol
10.7 4 18,453 4080 8160 4523 recv_full_u
10.4 18,031 18,040 4131 62369 4367 copy_u
7.8 13,576 13,576 51 0 266203 solution_check
3.2 5,586 5,586 51 0 109534 viz_frame
1.9 3,317 3,317 1191 0 2786 MPI_Allreduce()
1.4 2,366 2,366 1 0 2366233 MPI_Init()
0.8 2 1,405 397 11116 3540 central2d_periodic
0.8 1,391 1,391 1588 0 876 MPI_Sendrecv()
0.4 656 656 4080 0 161 MPI_Recv()
0.4 643 643 1 0 643329 MPI_Finalize()
0.1 172 172 1 0 172193 viz_close
0.1 34 107 100001 100001 1 shallow2d_flux [THROTTLED]
0.0 73 73 100001 0 1 shallow2dv_flux [THROTTLED]
0.0 43 58 100001 100001 1 limdiff [THROTTLED]
0.0 29 29 1 0 29874 viz_open
0.0 22 22 100001 0 0 central2d_correct_sd [THROTTLED]
0.0 0.51 20 1191 1191 17 shallow2d_speed
0.0 19 19 1191 0 17 shallow2dv_speed
0.0 15 15 100001 0 0 xmin2s [THROTTLED]
0.0 14 14 100001 0 0 central2d_offset [THROTTLED]
0.0 9 14 1 37632 14649 lua_init_sim
0.0 10 10 9528 0 1 copy_subgrid
0.0 1 1 2 0 877 central2d_free
0.0 0.024 0.024 1 0 24 MPI_Barrier()
0.0 0.015 0.016 1 2 16 central2d_init
0.0 0.007 0.007 1 0 7 copy_basic_info
0.0 0.001 0.001 1 0 1 MPI_Comm_size()
0.0 0 0 1 0 0 MPI_Comm_rank()
---------------------------------------------------------------------------------------
USER EVENTS Profile :NODE 0, CONTEXT 0, THREAD 0
---------------------------------------------------------------------------------------
NumSamples MaxValue MinValue MeanValue Std. Dev. Event Name
---------------------------------------------------------------------------------------
1191 4 4 4 0 Message size for all-reduce
---------------------------------------------------------------------------------------
NODE 1;CONTEXT 0;THREAD 0:
---------------------------------------------------------------------------------------
%Time  Exclusive [msec]  Inclusive [total msec]  #Call  #Subrs  Inclusive [usec/call]  Name
---------------------------------------------------------------------------------------
100.0 5 2:53.127 1 1 173127138 .TAU application
100.0 0.895 2:53.121 1 3 173121536 main
97.7 0.594 2:49.199 1 108 169199710 run_sim
97.1 0.047 2:48.119 50 50 3362390 central2d_run
97.1 1 2:48.119 50 5161 3362389 central2d_xrun
73.1 186 2:06.640 2382 104765 53166 central2d_step
39.2 438 1:07.830 2382 1.91513E+06 28476 central2d_predict
36.3 1:02.764 1:02.793 1.77935E+06 49982 35 limited_derivk
36.2 1:02.556 1:02.585 1.77935E+06 50019 35 limited_deriv1
33.8 506 58,517 2382 1.74358E+06 24566 central2d_correct
21.8 2 37,734 397 11116 95048 central2d_periodic
21.8 37,719 37,719 1588 0 23753 MPI_Sendrecv()
2.1 3,721 3,721 1191 0 3125 MPI_Allreduce()
1.4 2,369 2,369 1 0 2369195 MPI_Init()
0.9 1,551 1,551 1 0 1551736 MPI_Finalize()
0.4 745 745 1 0 745086 MPI_Barrier()
0.2 0.052 319 51 51 6268 gather_sol
0.2 0.041 319 51 51 6267 send_full_u
0.2 319 319 51 0 6266 MPI_Send()
0.1 34 107 100001 100001 1 shallow2d_flux [THROTTLED]
0.0 73 73 100001 0 1 shallow2dv_flux [THROTTLED]
0.0 42 58 100001 100001 1 limdiff [THROTTLED]
0.0 23 23 100001 0 0 central2d_correct_sd [THROTTLED]
0.0 0.557 20 1191 1191 17 shallow2d_speed
0.0 20 20 1191 0 17 shallow2dv_speed
0.0 15 15 100001 0 0 xmin2s [THROTTLED]
0.0 9 14 1 37632 14654 lua_init_sim
0.0 11 11 9528 0 1 copy_subgrid
0.0 5 5 37632 0 0 central2d_offset
0.0 0.155 0.155 2 0 78 central2d_free
0.0 0.012 0.014 1 2 14 central2d_init
0.0 0.01 0.01 1 0 10 copy_basic_info
0.0 0.001 0.001 1 0 1 MPI_Comm_rank()
0.0 0.001 0.001 1 0 1 MPI_Comm_size()
0.0 0 0 1 0 0 viz_open
---------------------------------------------------------------------------------------
USER EVENTS Profile :NODE 1, CONTEXT 0, THREAD 0
---------------------------------------------------------------------------------------
NumSamples MaxValue MinValue MeanValue Std. Dev. Event Name
---------------------------------------------------------------------------------------
1191 4 4 4 0 Message size for all-reduce
---------------------------------------------------------------------------------------
NODE 2;CONTEXT 0;THREAD 0:
---------------------------------------------------------------------------------------
%Time  Exclusive [msec]  Inclusive [total msec]  #Call  #Subrs  Inclusive [usec/call]  Name
---------------------------------------------------------------------------------------
100.0 5 2:53.125 1 1 173125489 .TAU application
100.0 0.855 2:53.119 1 3 173119914 main
97.7 0.462 2:49.204 1 108 169204246 run_sim
97.0 0.048 2:47.901 50 50 3358030 central2d_run
97.0 1 2:47.901 50 5161 3358029 central2d_xrun
73.2 187 2:06.806 2382 104765 53235 central2d_step
39.3 443 1:08.035 2382 1.91513E+06 28562 central2d_predict
36.4 1:02.927 1:02.956 1.77935E+06 49982 35 limited_derivk
36.1 1:02.538 1:02.567 1.77935E+06 50019 35 limited_deriv1
33.8 519 58,474 2382 1.74358E+06 24548 central2d_correct
13.7 23,739 23,739 1191 0 19932 MPI_Allreduce()
10.0 2 17,332 397 11116 43658 central2d_periodic
10.0 17,318 17,318 1588 0 10906 MPI_Sendrecv()
1.4 2,366 2,366 1 0 2366264 MPI_Init()
0.9 1,548 1,548 1 0 1548549 MPI_Finalize()
0.4 737 737 1 0 737218 MPI_Barrier()
0.3 0.044 550 51 51 10789 gather_sol
0.3 0.047 550 51 51 10788 send_full_u
0.3 550 550 51 0 10787 MPI_Send()
0.1 33 108 100001 100001 1 shallow2d_flux [THROTTLED]
0.0 74 74 100001 0 1 shallow2dv_flux [THROTTLED]
0.0 43 58 100001 100001 1 limdiff [THROTTLED]
0.0 23 23 100001 0 0 central2d_correct_sd [THROTTLED]
0.0 0.519 21 1191 1191 18 shallow2d_speed
0.0 21 21 1191 0 18 shallow2dv_speed
0.0 14 14 100001 0 0 xmin2s [THROTTLED]
0.0 9 14 1 37632 14675 lua_init_sim
0.0 10 10 9528 0 1 copy_subgrid
0.0 5 5 37632 0 0 central2d_offset
0.0 0.133 0.133 2 0 66 central2d_free
0.0 0.014 0.015 1 2 15 central2d_init
0.0 0.008 0.008 1 0 8 copy_basic_info
0.0 0.001 0.001 1 0 1 MPI_Comm_rank()
0.0 0.001 0.001 1 0 1 viz_open
0.0 0 0 1 0 0 MPI_Comm_size()
---------------------------------------------------------------------------------------
USER EVENTS Profile :NODE 2, CONTEXT 0, THREAD 0
---------------------------------------------------------------------------------------
NumSamples MaxValue MinValue MeanValue Std. Dev. Event Name
---------------------------------------------------------------------------------------
1191 4 4 4 0 Message size for all-reduce
---------------------------------------------------------------------------------------
NODE 3;CONTEXT 0;THREAD 0:
---------------------------------------------------------------------------------------
%Time  Exclusive [msec]  Inclusive [total msec]  #Call  #Subrs  Inclusive [usec/call]  Name
---------------------------------------------------------------------------------------
100.0 5 2:53.126 1 1 173126104 .TAU application
100.0 0.864 2:53.120 1 3 173120529 main
97.7 0.493 2:49.208 1 108 169208853 run_sim
96.9 0.046 2:47.673 50 50 3353465 central2d_run
96.9 1 2:47.673 50 5161 3353464 central2d_xrun
73.2 187 2:06.769 2382 104765 53220 central2d_step
39.3 442 1:08.028 2382 1.91513E+06 28559 central2d_predict
36.4 1:02.919 1:02.948 1.77935E+06 49982 35 limited_derivk
36.1 1:02.533 1:02.562 1.77935E+06 50019 35 limited_deriv1
33.8 498 58,444 2382 1.74358E+06 24536 central2d_correct
13.6 23,553 23,553 1191 0 19776 MPI_Allreduce()
10.0 2 17,327 397 11116 43645 central2d_periodic
10.0 17,313 17,313 1588 0 10903 MPI_Sendrecv()
1.4 2,366 2,366 1 0 2366290 MPI_Init()
0.9 1,544 1,544 1 0 1544522 MPI_Finalize()
0.5 0.047 787 51 51 15448 gather_sol
0.5 0.045 787 51 51 15447 send_full_u
0.5 787 787 51 0 15446 MPI_Send()
0.4 732 732 1 0 732361 MPI_Barrier()
0.1 33 108 100001 100001 1 shallow2d_flux [THROTTLED]
0.0 74 74 100001 0 1 shallow2dv_flux [THROTTLED]
0.0 42 58 100001 100001 1 limdiff [THROTTLED]
0.0 0.519 21 1191 1191 18 shallow2d_speed
0.0 21 21 100001 0 0 central2d_correct_sd [THROTTLED]
0.0 21 21 1191 0 18 shallow2dv_speed
0.0 15 15 100001 0 0 xmin2s [THROTTLED]
0.0 9 14 1 37632 14734 lua_init_sim
0.0 10 10 9528 0 1 copy_subgrid
0.0 5 5 37632 0 0 central2d_offset
0.0 0.149 0.149 2 0 74 central2d_free
0.0 0.014 0.014 1 2 14 central2d_init
0.0 0.007 0.007 1 0 7 copy_basic_info
0.0 0 0 1 0 0 MPI_Comm_rank()
0.0 0 0 1 0 0 MPI_Comm_size()
0.0 0 0 1 0 0 viz_open
---------------------------------------------------------------------------------------
USER EVENTS Profile :NODE 3, CONTEXT 0, THREAD 0
---------------------------------------------------------------------------------------
NumSamples MaxValue MinValue MeanValue Std. Dev. Event Name
---------------------------------------------------------------------------------------
1191 4 4 4 0 Message size for all-reduce
---------------------------------------------------------------------------------------
NODE 4;CONTEXT 0;THREAD 0:
---------------------------------------------------------------------------------------
%Time  Exclusive [msec]  Inclusive [total msec]  #Call  #Subrs  Inclusive [usec/call]  Name
---------------------------------------------------------------------------------------
100.0 5 2:53.132 1 1 173132416 .TAU application
100.0 0.851 2:53.126 1 3 173126727 main
97.7 0.473 2:49.213 1 108 169213351 run_sim
96.7 0.038 2:47.456 50 50 3349140 central2d_run
96.7 1 2:47.456 50 5161 3349139 central2d_xrun
73.3 186 2:06.879 2382 104765 53266 central2d_step
39.3 443 1:08.037 2382 1.91513E+06 28563 central2d_predict
36.4 1:02.999 1:03.028 1.77935E+06 49982 35 limited_derivk
36.1 1:02.542 1:02.572 1.77935E+06 50019 35 limited_deriv1
33.8 517 58,547 2382 1.74358E+06 24579 central2d_correct
13.5 23,309 23,309 1191 0 19571 MPI_Allreduce()
10.0 2 17,246 397 11116 43441 central2d_periodic
10.0 17,233 17,233 1588 0 10852 MPI_Sendrecv()
1.4 2,366 2,366 1 0 2366297 MPI_Init()
0.9 1,546 1,546 1 0 1546228 MPI_Finalize()
0.6 0.045 1,013 51 51 19871 gather_sol
0.6 0.04 1,013 51 51 19871 send_full_u
0.6 1,013 1,013 51 0 19870 MPI_Send()
0.4 727 727 1 0 727535 MPI_Barrier()
0.1 33 107 100001 100001 1 shallow2d_flux [THROTTLED]
0.0 73 73 100001 0 1 shallow2dv_flux [THROTTLED]
0.0 42 58 100001 100001 1 limdiff [THROTTLED]
0.0 23 23 100001 0 0 central2d_correct_sd [THROTTLED]
0.0 0.501 20 1191 1191 18 shallow2d_speed
0.0 20 20 1191 0 17 shallow2dv_speed
0.0 15 15 100001 0 0 xmin2s [THROTTLED]
0.0 9 14 1 37632 14763 lua_init_sim
0.0 10 10 9528 0 1 copy_subgrid
0.0 5 5 37632 0 0 central2d_offset
0.0 0.133 0.133 2 0 66 central2d_free
0.0 0.012 0.014 1 2 14 central2d_init
0.0 0.007 0.007 1 0 7 copy_basic_info
0.0 0.001 0.001 1 0 1 MPI_Comm_rank()
0.0 0.001 0.001 1 0 1 MPI_Comm_size()
0.0 0 0 1 0 0 viz_open
---------------------------------------------------------------------------------------
USER EVENTS Profile :NODE 4, CONTEXT 0, THREAD 0
---------------------------------------------------------------------------------------
NumSamples MaxValue MinValue MeanValue Std. Dev. Event Name
---------------------------------------------------------------------------------------
1191 4 4 4 0 Message size for all-reduce
---------------------------------------------------------------------------------------
NODE 5;CONTEXT 0;THREAD 0:
---------------------------------------------------------------------------------------
%Time  Exclusive [msec]  Inclusive [total msec]  #Call  #Subrs  Inclusive [usec/call]  Name
---------------------------------------------------------------------------------------
100.0 5 2:53.131 1 1 173131154 .TAU application
100.0 0.871 2:53.125 1 3 173125568 main
97.7 0.561 2:49.217 1 108 169217918 run_sim
96.6 0.044 2:47.240 50 50 3344808 central2d_run
96.6 1 2:47.240 50 5161 3344807 central2d_xrun
73.3 187 2:06.984 2382 104765 53310 central2d_step
39.4 436 1:08.250 2382 1.91513E+06 28652 central2d_predict
36.4 1:02.918 1:02.947 1.77935E+06 49982 35 limited_derivk
36.3 1:02.761 1:02.790 1.77935E+06 50019 35 limited_deriv1
33.8 493 58,440 2382 1.74358E+06 24534 central2d_correct
13.3 23,086 23,086 1191 0 19384 MPI_Allreduce()
9.9 2 17,146 397 11116 43191 central2d_periodic
9.9 17,134 17,134 1588 0 10790 MPI_Sendrecv()
1.4 2,369 2,369 1 0 2369261 MPI_Init()
0.9 1,537 1,537 1 0 1537518 MPI_Finalize()
0.7 0.055 1,239 51 51 24302 gather_sol
0.7 0.041 1,239 51 51 24301 send_full_u
0.7 1,239 1,239 51 0 24300 MPI_Send()
0.4 722 722 1 0 722701 MPI_Barrier()
0.1 34 107 100001 100001 1 shallow2d_flux [THROTTLED]
0.0 73 73 100001 0 1 shallow2dv_flux [THROTTLED]
0.0 42 58 100001 100001 1 limdiff [THROTTLED]
0.0 21 21 100001 0 0 central2d_correct_sd [THROTTLED]
0.0 0.49 20 1191 1191 17 shallow2d_speed
0.0 19 19 1191 0 16 shallow2dv_speed
0.0 15 15 100001 0 0 xmin2s [THROTTLED]
0.0 9 14 1 37632 14703 lua_init_sim
0.0 10 10 9528 0 1 copy_subgrid
0.0 5 5 37632 0 0 central2d_offset
0.0 0.141 0.141 2 0 70 central2d_free
0.0 0.015 0.016 1 2 16 central2d_init
0.0 0.008 0.008 1 0 8 copy_basic_info
0.0 0.001 0.001 1 0 1 MPI_Comm_size()
0.0 0 0 1 0 0 MPI_Comm_rank()
0.0 0 0 1 0 0 viz_open
---------------------------------------------------------------------------------------
USER EVENTS Profile :NODE 5, CONTEXT 0, THREAD 0
---------------------------------------------------------------------------------------
NumSamples MaxValue MinValue MeanValue Std. Dev. Event Name
---------------------------------------------------------------------------------------
1191 4 4 4 0 Message size for all-reduce
---------------------------------------------------------------------------------------
NODE 6;CONTEXT 0;THREAD 0:
---------------------------------------------------------------------------------------
%Time  Exclusive [msec]  Inclusive [total msec]  #Call  #Subrs  Inclusive [usec/call]  Name
---------------------------------------------------------------------------------------
100.0 6 2:53.135 1 1 173135983 .TAU application
100.0 0.873 2:53.129 1 3 173129050 main
97.7 0.737 2:49.222 1 108 169222413 run_sim
96.5 0.046 2:47.147 50 50 3342948 central2d_run
96.5 1 2:47.147 50 5161 3342947 central2d_xrun
75.1 187 2:10.044 2382 104765 54595 central2d_step
39.3 438 1:08.010 2382 1.91513E+06 28552 central2d_predict
37.2 1:04.462 1:04.491 1.77935E+06 50019 36 limited_deriv1
37.1 1:04.270 1:04.299 1.77935E+06 49982 36 limited_derivk
35.7 499 1:01.740 2382 1.74358E+06 25919 central2d_correct
12.0 20,836 20,836 1191 0 17495 MPI_Allreduce()
9.4 2 16,249 397 11116 40931 central2d_periodic
9.4 16,236 16,236 1588 0 10225 MPI_Sendrecv()
1.4 2,372 2,372 1 0 2372860 MPI_Init()
0.9 1,532 1,532 1 0 1532904 MPI_Finalize()
0.8 0.065 1,341 51 51 26306 gather_sol
0.8 0.031 1,341 51 51 26304 send_full_u
0.8 1,341 1,341 51 0 26304 MPI_Send()
0.4 717 717 1 0 717848 MPI_Barrier()
0.1 34 107 100001 100001 1 shallow2d_flux [THROTTLED]
0.0 72 72 100001 0 1 shallow2dv_flux [THROTTLED]
0.0 42 57 100001 100001 1 limdiff [THROTTLED]
0.0 21 21 100001 0 0 central2d_correct_sd [THROTTLED]
0.0 15 15 100001 0 0 xmin2s [THROTTLED]
0.0 0.516 14 1191 1191 13 shallow2d_speed
0.0 9 14 1 37632 14655 lua_init_sim
0.0 14 14 1191 0 12 shallow2dv_speed
0.0 9 9 9528 0 1 copy_subgrid
0.0 5 5 37632 0 0 central2d_offset
0.0 0.144 0.144 2 0 72 central2d_free
0.0 0.013 0.014 1 2 14 central2d_init
0.0 0.008 0.008 1 0 8 copy_basic_info
0.0 0.001 0.001 1 0 1 MPI_Comm_rank()
0.0 0.001 0.001 1 0 1 viz_open
0.0 0 0 1 0 0 MPI_Comm_size()
---------------------------------------------------------------------------------------
USER EVENTS Profile :NODE 6, CONTEXT 0, THREAD 0
---------------------------------------------------------------------------------------
NumSamples MaxValue MinValue MeanValue Std. Dev. Event Name
---------------------------------------------------------------------------------------
1191 4 4 4 0 Message size for all-reduce
---------------------------------------------------------------------------------------
NODE 7;CONTEXT 0;THREAD 0:
---------------------------------------------------------------------------------------
%Time  Exclusive [msec]  Inclusive [total msec]  #Call  #Subrs  Inclusive [usec/call]  Name
---------------------------------------------------------------------------------------
100.0 5 2:53.128 1 1 173128399 .TAU application
100.0 0.855 2:53.122 1 3 173122724 main
97.7 0.459 2:49.226 1 108 169226821 run_sim
96.3 0.042 2:46.794 50 50 3335895 central2d_run
96.3 1 2:46.794 50 5161 3335894 central2d_xrun
73.3 189 2:06.844 2382 104765 53251 central2d_step
39.3 444 1:08.037 2382 1.91513E+06 28563 central2d_predict
36.4 1:02.956 1:02.985 1.77935E+06 49982 35 limited_derivk
36.1 1:02.545 1:02.574 1.77935E+06 50019 35 limited_deriv1
33.8 518 58,508 2382 1.74358E+06 24563 central2d_correct
13.2 22,767 22,767 1191 0 19116 MPI_Allreduce()
9.9 2 17,161 397 11116 43228 central2d_periodic
9.9 17,147 17,147 1588 0 10798 MPI_Sendrecv()
1.4 2,366 2,366 1 0 2366244 MPI_Init()
1.0 0.05 1,703 51 51 33405 gather_sol
1.0 0.043 1,703 51 51 33404 send_full_u
1.0 1,703 1,703 51 0 33403 MPI_Send()
0.9 1,528 1,528 1 0 1528804 MPI_Finalize()
0.4 712 712 1 0 712999 MPI_Barrier()
0.1 33 108 100001 100001 1 shallow2d_flux [THROTTLED]
0.0 74 74 100001 0 1 shallow2dv_flux [THROTTLED]
0.0 42 58 100001 100001 1 limdiff [THROTTLED]
0.0 23 23 100001 0 0 central2d_correct_sd [THROTTLED]
0.0 0.481 20 1191 1191 17 shallow2d_speed
0.0 19 19 1191 0 16 shallow2dv_speed
0.0 15 15 100001 0 0 xmin2s [THROTTLED]
0.0 9 14 1 37632 14802 lua_init_sim
0.0 10 10 9528 0 1 copy_subgrid
0.0 5 5 37632 0 0 central2d_offset
0.0 0.131 0.131 2 0 66 central2d_free
0.0 0.015 0.016 1 2 16 central2d_init
0.0 0.007 0.007 1 0 7 copy_basic_info
0.0 0.001 0.001 1 0 1 MPI_Comm_size()
0.0 0 0 1 0 0 MPI_Comm_rank()
0.0 0 0 1 0 0 viz_open
---------------------------------------------------------------------------------------
USER EVENTS Profile :NODE 7, CONTEXT 0, THREAD 0
---------------------------------------------------------------------------------------
NumSamples MaxValue MinValue MeanValue Std. Dev. Event Name
---------------------------------------------------------------------------------------
1191 4 4 4 0 Message size for all-reduce
---------------------------------------------------------------------------------------
NODE 8;CONTEXT 0;THREAD 0:
---------------------------------------------------------------------------------------
%Time  Exclusive [msec]  Inclusive [total msec]  #Call  #Subrs  Inclusive [usec/call]  Name
---------------------------------------------------------------------------------------
100.0 5 2:53.131 1 1 173131126 .TAU application
100.0 0.858 2:53.125 1 3 173125496 main
97.7 0.72 2:49.231 1 108 169231416 run_sim
96.0 0.06 2:46.258 50 50 3325173 central2d_run
96.0 2 2:46.258 50 5161 3325172 central2d_xrun
68.9 201 1:59.364 2382 104765 50111 central2d_step
37.0 448 1:04.072 2382 1.91513E+06 26898 central2d_predict
34.2 59,196 59,225 1.77935E+06 49979 33 limited_derivk
34.0 58,821 58,850 1.77935E+06 50022 33 limited_deriv1
31.8 503 54,977 2382 1.74358E+06 23080 central2d_correct
22.1 2 38,285 397 11116 96438 central2d_periodic
22.1 38,269 38,269 1588 0 24099 MPI_Sendrecv()
5.0 8,586 8,586 1191 0 7209 MPI_Allreduce()
1.4 2,369 2,369 1 0 2369293 MPI_Init()
1.3 0.065 2,250 51 51 44118 gather_sol
1.3 0.047 2,249 51 51 44116 send_full_u
1.3 2,249 2,249 51 0 44116 MPI_Send()
0.9 1,523 1,523 1 0 1523929 MPI_Finalize()
0.4 708 708 1 0 708164 MPI_Barrier()
0.1 34 114 100001 100001 1 shallow2d_flux [THROTTLED]
0.0 79 79 100001 0 1 shallow2dv_flux [THROTTLED]
0.0 42 57 100001 100001 1 limdiff [THROTTLED]
0.0 21 21 100001 0 0 central2d_correct_sd [THROTTLED]
0.0 0.543 19 1191 1191 17 shallow2d_speed
0.0 19 19 1191 0 16 shallow2dv_speed
0.0 15 15 100001 0 0 xmin2s [THROTTLED]
0.0 8 13 1 34944 13718 lua_init_sim
0.0 13 13 9528 0 1 copy_subgrid
0.0 5 5 34944 0 0 central2d_offset
0.0 0.143 0.143 2 0 72 central2d_free
0.0 0.015 0.016 1 2 16 central2d_init
0.0 0.009 0.009 1 0 9 copy_basic_info
0.0 0.001 0.001 1 0 1 MPI_Comm_size()
0.0 0 0 1 0 0 MPI_Comm_rank()
0.0 0 0 1 0 0 viz_open
---------------------------------------------------------------------------------------
USER EVENTS Profile :NODE 8, CONTEXT 0, THREAD 0
---------------------------------------------------------------------------------------
NumSamples MaxValue MinValue MeanValue Std. Dev. Event Name
---------------------------------------------------------------------------------------
1191 4 4 4 0 Message size for all-reduce
---------------------------------------------------------------------------------------
NODE 9;CONTEXT 0;THREAD 0:
---------------------------------------------------------------------------------------
%Time  Exclusive [msec]  Inclusive [total msec]  #Call  #Subrs  Inclusive [usec/call]  Name
---------------------------------------------------------------------------------------
100.0 5 2:53.131 1 1 173131388 .TAU application
100.0 0.868 2:53.125 1 3 173125819 main
97.7 0.589 2:49.235 1 108 169235637 run_sim
96.2 0.056 2:46.577 50 50 3331552 central2d_run
96.2 1 2:46.577 50 5161 3331550 central2d_xrun
76.2 178 2:11.922 2382 104765 55383 central2d_step
42.3 441 1:13.189 2382 1.91513E+06 30726 central2d_predict
38.2 1:06.079 1:06.109 1.77935E+06 50019 37 limited_deriv1
37.3 1:04.539 1:04.569 1.77935E+06 49982 36 limited_derivk
33.8 497 58,450 2382 1.74358E+06 24538 central2d_correct
19.9 2 34,416 397 11116 86692 central2d_periodic
19.9 34,402 34,402 1588 0 21664 MPI_Sendrecv()
1.4 2,369 2,369 1 0 2369320 MPI_Init()
1.1 0.061 1,939 51 51 38020 gather_sol
1.1 0.036 1,938 51 51 38019 send_full_u
1.1 1,938 1,938 51 0 38018 MPI_Send()
0.9 1,519 1,519 1 0 1519994 MPI_Finalize()
0.4 703 703 1 0 703613 MPI_Barrier()
0.1 220 220 1191 0 185 MPI_Allreduce()
0.1 34 104 100001 100001 1 shallow2d_flux [THROTTLED]
0.0 70 70 100001 0 1 shallow2dv_flux [THROTTLED]
0.0 44 59 100001 100001 1 limdiff [THROTTLED]
0.0 21 21 100001 0 0 central2d_correct_sd [THROTTLED]
0.0 0.515 15 1191 1191 13 shallow2d_speed
0.0 15 15 1191 0 13 shallow2dv_speed
0.0 15 15 100001 0 0 xmin2s [THROTTLED]
0.0 9 14 1 37632 14659 lua_init_sim
0.0 11 11 9528 0 1 copy_subgrid
0.0 5 5 37632 0 0 central2d_offset
0.0 0.143 0.143 2 0 72 central2d_free
0.0 0.012 0.014 1 2 14 central2d_init
0.0 0.007 0.007 1 0 7 copy_basic_info
0.0 0.001 0.001 1 0 1 MPI_Comm_rank()
0.0 0.001 0.001 1 0 1 MPI_Comm_size()
0.0 0 0 1 0 0 viz_open
---------------------------------------------------------------------------------------
USER EVENTS Profile :NODE 9, CONTEXT 0, THREAD 0
---------------------------------------------------------------------------------------
NumSamples MaxValue MinValue MeanValue Std. Dev. Event Name
---------------------------------------------------------------------------------------
1191 4 4 4 0 Message size for all-reduce
---------------------------------------------------------------------------------------
NODE 10;CONTEXT 0;THREAD 0:
---------------------------------------------------------------------------------------
%Time  Exclusive [msec]  Inclusive [total msec]  #Call  #Subrs  Inclusive [usec/call]  Name
---------------------------------------------------------------------------------------
100.0 5 2:53.130 1 1 173130972 .TAU application
100.0 0.866 2:53.125 1 3 173125394 main
97.8 0.567 2:49.240 1 108 169240306 run_sim
96.0 0.054 2:46.154 50 50 3323092 central2d_run
96.0 1 2:46.154 50 5161 3323091 central2d_xrun
73.5 187 2:07.321 2382 104765 53451 central2d_step
39.4 446 1:08.130 2382 1.91513E+06 28602 central2d_predict
36.5 1:03.135 1:03.164 1.77935E+06 49982 35 limited_derivk
36.3 1:02.855 1:02.884 1.77935E+06 50019 35 limited_deriv1
34.0 507 58,894 2382 1.74358E+06 24725 central2d_correct
20.5 2 35,543 397 11116 89531 central2d_periodic
20.5 35,529 35,529 1588 0 22374 MPI_Sendrecv()
1.9 3,267 3,267 1191 0 2743 MPI_Allreduce()
1.4 0.06 2,371 51 51 46509 gather_sol
1.4 0.044 2,371 51 51 46508 send_full_u
1.4 2,371 2,371 51 0 46507 MPI_Send()
1.4 2,369 2,369 1 0 2369248 MPI_Init()
0.9 1,514 1,514 1 0 1514974 MPI_Finalize()
0.4 697 697 1 0 697908 MPI_Barrier()
0.1 34 108 100001 100001 1 shallow2d_flux [THROTTLED]
0.0 73 73 100001 0 1 shallow2dv_flux [THROTTLED]
0.0 42 58 100001 100001 1 limdiff [THROTTLED]
0.0 22 22 100001 0 0 central2d_correct_sd [THROTTLED]
0.0 0.491 20 1191 1191 17 shallow2d_speed
0.0 20 20 1191 0 17 shallow2dv_speed
0.0 9 15 1 37632 15114 lua_init_sim
0.0 15 15 100001 0 0 xmin2s [THROTTLED]
0.0 11 11 9528 0 1 copy_subgrid
0.0 5 5 37632 0 0 central2d_offset
0.0 0.143 0.143 2 0 72 central2d_free
0.0 0.014 0.015 1 2 15 central2d_init
0.0 0.008 0.008 1 0 8 copy_basic_info
0.0 0.001 0.001 1 0 1 MPI_Comm_size()
0.0 0 0 1 0 0 MPI_Comm_rank()
0.0 0 0 1 0 0 viz_open
---------------------------------------------------------------------------------------
USER EVENTS Profile :NODE 10, CONTEXT 0, THREAD 0
---------------------------------------------------------------------------------------
NumSamples MaxValue MinValue MeanValue Std. Dev. Event Name
---------------------------------------------------------------------------------------
1191 4 4 4 0 Message size for all-reduce
---------------------------------------------------------------------------------------
NODE 11;CONTEXT 0;THREAD 0:
---------------------------------------------------------------------------------------
%Time  Exclusive [msec]  Inclusive [total msec]  #Call  #Subrs  Inclusive [usec/call]  Name
---------------------------------------------------------------------------------------
100.0 5 2:53.127 1 1 173127251 .TAU application
100.0 0.854 2:53.121 1 3 173121635 main
97.8 0.503 2:49.244 1 108 169244841 run_sim
95.8 0.054 2:45.902 50 50 3318042 central2d_run
95.8 2 2:45.902 50 5161 3318041 central2d_xrun
73.3 189 2:06.850 2382 104765 53254 central2d_step
39.3 440 1:08.068 2382 1.91513E+06 28576 central2d_predict
36.4 1:02.941 1:02.970 1.77935E+06 49982 35 limited_derivk
36.2 1:02.589 1:02.618 1.77935E+06 50019 35 limited_deriv1
33.8 498 58,482 2382 1.74358E+06 24552 central2d_correct
20.6 35,718 35,718 1191 0 29990 MPI_Allreduce()
1.9 2 3,310 397 11116 8338 central2d_periodic
1.9 3,296 3,296 1588 0 2076 MPI_Sendrecv()
1.5 0.054 2,634 51 51 51654 gather_sol
1.5 0.046 2,634 51 51 51652 send_full_u
1.5 2,634 2,634 51 0 51652 MPI_Send()
1.4 2,366 2,366 1 0 2366199 MPI_Init()
0.9 1,509 1,509 1 0 1509741 MPI_Finalize()
0.4 693 693 1 0 693053 MPI_Barrier()
0.1 34 109 100001 100001 1 shallow2d_flux [THROTTLED]
0.0 75 75 100001 0 1 shallow2dv_flux [THROTTLED]
0.0 42 57 100001 100001 1 limdiff [THROTTLED]
0.0 21 21 100001 0 0 central2d_correct_sd [THROTTLED]
0.0 0.527 20 1191 1191 17 shallow2d_speed
0.0 20 20 1191 0 17 shallow2dv_speed
0.0 14 14 100001 0 0 xmin2s [THROTTLED]
0.0 9 14 1 37632 14708 lua_init_sim
0.0 10 10 9528 0 1 copy_subgrid
0.0 5 5 37632 0 0 central2d_offset
0.0 0.13 0.13 2 0 65 central2d_free
0.0 0.016 0.016 1 2 16 central2d_init
0.0 0.008 0.008 1 0 8 copy_basic_info
0.0 0 0 1 0 0 MPI_Comm_rank()
0.0 0 0 1 0 0 MPI_Comm_size()
0.0 0 0 1 0 0 viz_open
---------------------------------------------------------------------------------------
USER EVENTS Profile :NODE 11, CONTEXT 0, THREAD 0
---------------------------------------------------------------------------------------
NumSamples MaxValue MinValue MeanValue Std. Dev. Event Name
---------------------------------------------------------------------------------------
1191 4 4 4 0 Message size for all-reduce
---------------------------------------------------------------------------------------
NODE 12;CONTEXT 0;THREAD 0:
---------------------------------------------------------------------------------------
%Time Exclusive Inclusive #Call #Subrs Inclusive Name
msec total msec usec/call
---------------------------------------------------------------------------------------
100.0 5 2:53.125 1 1 173125904 .TAU application
100.0 0.998 2:53.120 1 3 173120186 main
97.8 0.556 2:49.249 1 108 169249473 run_sim
95.7 0.043 2:45.674 50 50 3313481 central2d_run
95.7 1 2:45.674 50 5161 3313480 central2d_xrun
73.2 189 2:06.740 2382 104765 53208 central2d_step
39.3 437 1:08.013 2382 1.91513E+06 28553 central2d_predict
36.4 1:02.910 1:02.938 1.77935E+06 49982 35 limited_derivk
36.1 1:02.519 1:02.548 1.77935E+06 50019 35 limited_deriv1
33.7 495 58,429 2382 1.74358E+06 24529 central2d_correct
20.5 35,562 35,562 1191 0 29859 MPI_Allreduce()
1.9 2 3,348 397 11116 8435 central2d_periodic
1.9 3,333 3,333 1588 0 2099 MPI_Sendrecv()
1.7 0.054 2,871 51 51 56305 gather_sol
1.7 0.039 2,871 51 51 56304 send_full_u
1.7 2,871 2,871 51 0 56303 MPI_Send()
1.4 2,366 2,366 1 0 2366239 MPI_Init()
0.9 1,503 1,503 1 0 1503476 MPI_Finalize()
0.4 688 688 1 0 688166 MPI_Barrier()
0.1 33 109 100001 100001 1 shallow2d_flux [THROTTLED]
0.0 75 75 100001 0 1 shallow2dv_flux [THROTTLED]
0.0 42 57 100001 100001 1 limdiff [THROTTLED]
0.0 21 21 100001 0 0 central2d_correct_sd [THROTTLED]
0.0 0.522 20 1191 1191 17 shallow2d_speed
0.0 20 20 1191 0 17 shallow2dv_speed
0.0 9 14 1 37632 14967 lua_init_sim
0.0 14 14 100001 0 0 xmin2s [THROTTLED]
0.0 12 12 9528 0 1 copy_subgrid
0.0 5 5 37632 0 0 central2d_offset
0.0 0.148 0.148 2 0 74 central2d_free
0.0 0.017 0.017 1 0 17 copy_basic_info
0.0 0.015 0.016 1 2 16 central2d_init
0.0 0.001 0.001 1 0 1 MPI_Comm_rank()
0.0 0.001 0.001 1 0 1 viz_open
0.0 0 0 1 0 0 MPI_Comm_size()
---------------------------------------------------------------------------------------
USER EVENTS Profile :NODE 12, CONTEXT 0, THREAD 0
---------------------------------------------------------------------------------------
NumSamples MaxValue MinValue MeanValue Std. Dev. Event Name
---------------------------------------------------------------------------------------
1191 4 4 4 0 Message size for all-reduce
---------------------------------------------------------------------------------------
NODE 13;CONTEXT 0;THREAD 0:
---------------------------------------------------------------------------------------
%Time Exclusive Inclusive #Call #Subrs Inclusive Name
msec total msec usec/call
---------------------------------------------------------------------------------------
100.0 5 2:53.126 1 1 173126588 .TAU application
100.0 0.876 2:53.120 1 3 173120860 main
97.8 0.522 2:49.254 1 108 169254145 run_sim
95.6 0.03 2:45.456 50 50 3309128 central2d_run
95.6 1 2:45.456 50 5161 3309128 central2d_xrun
73.3 187 2:06.875 2382 104765 53264 central2d_step
39.3 438 1:08.080 2382 1.91513E+06 28581 central2d_predict
36.4 1:02.981 1:03.010 1.77935E+06 49982 35 limited_derivk
36.2 1:02.572 1:02.601 1.77935E+06 50019 35 limited_deriv1
33.8 505 58,498 2382 1.74358E+06 24559 central2d_correct
20.4 35,294 35,294 1191 0 29634 MPI_Allreduce()
1.9 2 3,264 397 11116 8223 central2d_periodic
1.9 3,249 3,249 1588 0 2046 MPI_Sendrecv()
1.8 0.05 3,098 51 51 60756 gather_sol
1.8 0.031 3,098 51 51 60755 send_full_u
1.8 3,098 3,098 51 0 60755 MPI_Send()
1.4 2,366 2,366 1 0 2366267 MPI_Init()
0.9 1,499 1,499 1 0 1499572 MPI_Finalize()
0.4 683 683 1 0 683323 MPI_Barrier()
0.1 34 108 100001 100001 1 shallow2d_flux [THROTTLED]
0.0 74 74 100001 0 1 shallow2dv_flux [THROTTLED]
0.0 42 58 100001 100001 1 limdiff [THROTTLED]
0.0 22 22 100001 0 0 central2d_correct_sd [THROTTLED]
0.0 0.47 20 1191 1191 17 shallow2d_speed
0.0 20 20 1191 0 17 shallow2dv_speed
0.0 15 15 100001 0 0 xmin2s [THROTTLED]
0.0 9 15 1 37632 15152 lua_init_sim
0.0 12 12 9528 0 1 copy_subgrid
0.0 5 5 37632 0 0 central2d_offset
0.0 0.143 0.143 2 0 72 central2d_free
0.0 0.014 0.015 1 2 15 central2d_init
0.0 0.007 0.007 1 0 7 copy_basic_info
0.0 0.001 0.001 1 0 1 MPI_Comm_rank()
0.0 0.001 0.001 1 0 1 viz_open
0.0 0 0 1 0 0 MPI_Comm_size()
---------------------------------------------------------------------------------------
USER EVENTS Profile :NODE 13, CONTEXT 0, THREAD 0
---------------------------------------------------------------------------------------
NumSamples MaxValue MinValue MeanValue Std. Dev. Event Name
---------------------------------------------------------------------------------------
1191 4 4 4 0 Message size for all-reduce
---------------------------------------------------------------------------------------
NODE 14;CONTEXT 0;THREAD 0:
---------------------------------------------------------------------------------------
%Time Exclusive Inclusive #Call #Subrs Inclusive Name
msec total msec usec/call
---------------------------------------------------------------------------------------
100.0 5 2:53.130 1 1 173130723 .TAU application
100.0 0.868 2:53.125 1 3 173125091 main
97.8 0.44 2:49.258 1 108 169258600 run_sim
95.4 0.035 2:45.226 50 50 3304534 central2d_run
95.4 1 2:45.226 50 5161 3304533 central2d_xrun
73.2 185 2:06.696 2382 104765 53189 central2d_step
39.3 439 1:07.989 2382 1.91513E+06 28543 central2d_predict
36.3 1:02.884 1:02.913 1.77935E+06 49982 35 limited_derivk
36.1 1:02.503 1:02.532 1.77935E+06 50019 35 limited_deriv1
33.7 496 58,412 2382 1.74358E+06 24523 central2d_correct
19.2 33,208 33,208 1191 0 27883 MPI_Allreduce()
3.1 2 5,299 397 11116 13348 central2d_periodic
3.1 5,283 5,283 1588 0 3327 MPI_Sendrecv()
1.9 0.043 3,338 51 51 65455 gather_sol
1.9 0.034 3,338 51 51 65454 send_full_u
1.9 3,338 3,338 51 0 65453 MPI_Send()
1.4 2,369 2,369 1 0 2369329 MPI_Init()
0.9 1,496 1,496 1 0 1496294 MPI_Finalize()
0.4 678 678 1 0 678465 MPI_Barrier()
0.1 33 107 100001 100001 1 shallow2d_flux [THROTTLED]
0.0 73 73 100001 0 1 shallow2dv_flux [THROTTLED]
0.0 42 58 100001 100001 1 limdiff [THROTTLED]
0.0 21 21 100001 0 0 central2d_correct_sd [THROTTLED]
0.0 0.478 21 1191 1191 18 shallow2d_speed
0.0 20 20 1191 0 17 shallow2dv_speed
0.0 15 15 100001 0 0 xmin2s [THROTTLED]
0.0 9 14 1 37632 14659 lua_init_sim
0.0 12 12 9528 0 1 copy_subgrid
0.0 5 5 37632 0 0 central2d_offset
0.0 0.13 0.13 2 0 65 central2d_free
0.0 0.015 0.015 1 2 15 central2d_init
0.0 0.011 0.011 1 0 11 copy_basic_info
0.0 0 0 1 0 0 MPI_Comm_rank()
0.0 0 0 1 0 0 MPI_Comm_size()
0.0 0 0 1 0 0 viz_open
---------------------------------------------------------------------------------------
USER EVENTS Profile :NODE 14, CONTEXT 0, THREAD 0
---------------------------------------------------------------------------------------
NumSamples MaxValue MinValue MeanValue Std. Dev. Event Name
---------------------------------------------------------------------------------------
1191 4 4 4 0 Message size for all-reduce
---------------------------------------------------------------------------------------
NODE 15;CONTEXT 0;THREAD 0:
---------------------------------------------------------------------------------------
%Time Exclusive Inclusive #Call #Subrs Inclusive Name
msec total msec usec/call
---------------------------------------------------------------------------------------
100.0 5 2:53.118 1 1 173118700 .TAU application
100.0 0.868 2:53.113 1 3 173113092 main
97.8 0.436 2:49.263 1 108 169263151 run_sim
95.3 0.03 2:45.007 50 50 3300150 central2d_run
95.3 1 2:45.007 50 5161 3300149 central2d_xrun
73.2 189 2:06.796 2382 104765 53231 central2d_step
39.3 443 1:08.004 2382 1.91513E+06 28549 central2d_predict
36.4 1:02.926 1:02.955 1.77935E+06 49982 35 limited_derivk
36.1 1:02.525 1:02.554 1.77935E+06 50019 35 limited_deriv1
33.8 518 58,492 2382 1.74358E+06 24556 central2d_correct
19.0 32,912 32,912 1191 0 27635 MPI_Allreduce()
3.0 2 5,272 397 11116 13281 central2d_periodic
3.0 5,255 5,255 1588 0 3310 MPI_Sendrecv()
2.1 0.041 3,566 51 51 69931 gather_sol
2.1 0.036 3,566 51 51 69931 send_full_u
2.1 3,566 3,566 51 0 69930 MPI_Send()
1.4 2,369 2,369 1 0 2369319 MPI_Init()
0.9 1,479 1,479 1 0 1479754 MPI_Finalize()
0.4 673 673 1 0 673609 MPI_Barrier()
0.1 33 110 100001 100001 1 shallow2d_flux [THROTTLED]
0.0 76 76 100001 0 1 shallow2dv_flux [THROTTLED]
0.0 42 57 100001 100001 1 limdiff [THROTTLED]
0.0 24 24 100001 0 0 central2d_correct_sd [THROTTLED]
0.0 0.509 23 1191 1191 20 shallow2d_speed
0.0 23 23 1191 0 20 shallow2dv_speed
0.0 15 15 100001 0 0 xmin2s [THROTTLED]
0.0 9 14 1 37632 14966 lua_init_sim
0.0 14 14 9528 0 1 copy_subgrid
0.0 5 5 37632 0 0 central2d_offset
0.0 0.131 0.131 2 0 66 central2d_free
0.0 0.015 0.015 1 2 15 central2d_init
0.0 0.008 0.008 1 0 8 copy_basic_info
0.0 0 0 1 0 0 MPI_Comm_rank()
0.0 0 0 1 0 0 MPI_Comm_size()
0.0 0 0 1 0 0 viz_open
---------------------------------------------------------------------------------------
USER EVENTS Profile :NODE 15, CONTEXT 0, THREAD 0
---------------------------------------------------------------------------------------
NumSamples MaxValue MinValue MeanValue Std. Dev. Event Name
---------------------------------------------------------------------------------------
1191 4 4 4 0 Message size for all-reduce
---------------------------------------------------------------------------------------
NODE 16;CONTEXT 0;THREAD 0:
---------------------------------------------------------------------------------------
%Time Exclusive Inclusive #Call #Subrs Inclusive Name
msec total msec usec/call
---------------------------------------------------------------------------------------
100.0 5 2:52.892 1 1 172892937 .TAU application
100.0 0.922 2:52.887 1 3 172887429 main
97.9 0.491 2:49.267 1 108 169267682 run_sim
95.3 0.035 2:44.802 50 50 3296043 central2d_run
95.3 1 2:44.802 50 5161 3296042 central2d_xrun
73.6 189 2:07.311 2382 104765 53447 central2d_step
39.4 441 1:08.130 2382 1.91513E+06 28602 central2d_predict
36.6 1:03.172 1:03.201 1.77935E+06 49982 36 limited_derivk
36.3 1:02.812 1:02.841 1.77935E+06 50019 35 limited_deriv1
34.1 507 58,882 2382 1.74358E+06 24720 central2d_correct
13.0 22,418 22,418 1191 0 18823 MPI_Allreduce()
8.7 2 15,049 397 11116 37909 central2d_periodic
8.7 15,037 15,037 1588 0 9470 MPI_Sendrecv()
2.2 0.041 3,781 51 51 74147 gather_sol
2.2 0.046 3,781 51 51 74146 send_full_u
2.2 3,781 3,781 51 0 74145 MPI_Send()
1.2 2,131 2,131 1 0 2131115 MPI_Init()
0.9 1,487 1,487 1 0 1487710 MPI_Finalize()
0.4 668 668 1 0 668704 MPI_Barrier()
0.1 33 108 100001 100001 1 shallow2d_flux [THROTTLED]
0.0 74 74 100001 0 1 shallow2dv_flux [THROTTLED]
0.0 42 58 100001 100001 1 limdiff [THROTTLED]
0.0 21 21 100001 0 0 central2d_correct_sd [THROTTLED]
0.0 0.51 20 1191 1191 18 shallow2d_speed
0.0 20 20 1191 0 17 shallow2dv_speed
0.0 15 15 100001 0 0 xmin2s [THROTTLED]
0.0 9 14 1 37632 14657 lua_init_sim
0.0 9 9 9528 0 1 copy_subgrid
0.0 5 5 37632 0 0 central2d_offset
0.0 0.175 0.175 2 0 88 central2d_free
0.0 0.011 0.011 1 2 11 central2d_init
0.0 0.006 0.006 1 0 6 copy_basic_info
0.0 0 0 1 0 0 MPI_Comm_rank()
0.0 0 0 1 0 0 MPI_Comm_size()
0.0 0 0 1 0 0 viz_open
---------------------------------------------------------------------------------------
USER EVENTS Profile :NODE 16, CONTEXT 0, THREAD 0
---------------------------------------------------------------------------------------
NumSamples MaxValue MinValue MeanValue Std. Dev. Event Name
---------------------------------------------------------------------------------------
1191 4 4 4 0 Message size for all-reduce
---------------------------------------------------------------------------------------
NODE 17;CONTEXT 0;THREAD 0:
---------------------------------------------------------------------------------------
%Time Exclusive Inclusive #Call #Subrs Inclusive Name
msec total msec usec/call
---------------------------------------------------------------------------------------
100.0 5 2:52.888 1 1 172888017 .TAU application
100.0 0.892 2:52.882 1 3 172882492 main
97.9 0.467 2:49.272 1 108 169272230 run_sim
95.1 0.043 2:44.349 50 50 3286998 central2d_run
95.1 1 2:44.349 50 5161 3286997 central2d_xrun
70.5 192 2:01.826 2382 104765 51145 central2d_step
37.0 440 1:04.018 2382 1.91513E+06 26876 central2d_predict
35.0 1:00.455 1:00.484 1.77935E+06 49979 34 limited_derivk
34.7 1:00.041 1:00.070 1.77935E+06 50022 34 limited_deriv1
33.3 505 57,505 2382 1.74358E+06 24142 central2d_correct
20.6 2 35,558 397 11116 89569 central2d_periodic
20.6 35,545 35,545 1588 0 22384 MPI_Sendrecv()
4.0 6,942 6,942 1191 0 5829 MPI_Allreduce()
2.5 0.039 4,243 51 51 83213 gather_sol
2.5 0.036 4,243 51 51 83212 send_full_u
2.5 4,243 4,243 51 0 83211 MPI_Send()
1.2 2,131 2,131 1 0 2131333 MPI_Init()
0.9 1,478 1,478 1 0 1478037 MPI_Finalize()
0.4 664 664 1 0 664193 MPI_Barrier()
0.1 34 110 100001 100001 1 shallow2d_flux [THROTTLED]
0.0 76 76 100001 0 1 shallow2dv_flux [THROTTLED]
0.0 42 58 100001 100001 1 limdiff [THROTTLED]
0.0 21 21 100001 0 0 central2d_correct_sd [THROTTLED]
0.0 0.495 19 1191 1191 17 shallow2d_speed
0.0 19 19 1191 0 16 shallow2dv_speed
0.0 15 15 100001 0 0 xmin2s [THROTTLED]
0.0 8 13 1 34944 13649 lua_init_sim
0.0 10 10 9528 0 1 copy_subgrid
0.0 5 5 34944 0 0 central2d_offset
0.0 0.176 0.176 2 0 88 central2d_free
0.0 0.01 0.011 1 2 11 central2d_init
0.0 0.008 0.008 1 0 8 copy_basic_info
0.0 0.001 0.001 1 0 1 MPI_Comm_rank()
0.0 0 0 1 0 0 MPI_Comm_size()
0.0 0 0 1 0 0 viz_open
---------------------------------------------------------------------------------------
USER EVENTS Profile :NODE 17, CONTEXT 0, THREAD 0
---------------------------------------------------------------------------------------
NumSamples MaxValue MinValue MeanValue Std. Dev. Event Name
---------------------------------------------------------------------------------------
1191 4 4 4 0 Message size for all-reduce
---------------------------------------------------------------------------------------
NODE 18;CONTEXT 0;THREAD 0:
---------------------------------------------------------------------------------------
%Time Exclusive Inclusive #Call #Subrs Inclusive Name
msec total msec usec/call
---------------------------------------------------------------------------------------
100.0 6 2:52.892 1 1 172892977 .TAU application
100.0 0.867 2:52.886 1 3 172886204 main
97.9 0.422 2:49.276 1 108 169276469 run_sim
95.1 0.034 2:44.346 50 50 3286938 central2d_run
95.1 1 2:44.346 50 5161 3286937 central2d_xrun
73.3 184 2:06.761 2382 104765 53217 central2d_step
39.3 441 1:08.019 2382 1.91513E+06 28556 central2d_predict
36.4 1:02.915 1:02.944 1.77935E+06 49982 35 limited_derivk
36.2 1:02.526 1:02.555 1.77935E+06 50019 35 limited_deriv1
33.8 505 58,449 2382 1.74358E+06 24538 central2d_correct
19.6 2 33,924 397 11116 85452 central2d_periodic
19.6 33,912 33,912 1588 0 21355 MPI_Sendrecv()
2.5 0.039 4,253 51 51 83411 gather_sol
2.5 0.044 4,253 51 51 83410 send_full_u
2.5 4,253 4,253 51 0 83409 MPI_Send()
2.1 3,637 3,637 1191 0 3054 MPI_Allreduce()
1.2 2,135 2,135 1 0 2135156 MPI_Init()
0.9 1,473 1,473 1 0 1473712 MPI_Finalize()
0.4 659 659 1 0 659956 MPI_Barrier()
0.1 34 106 100001 100001 1 shallow2d_flux [THROTTLED]
0.0 72 72 100001 0 1 shallow2dv_flux [THROTTLED]
0.0 42 57 100001 100001 1 limdiff [THROTTLED]
0.0 22 22 100001 0 0 central2d_correct_sd [THROTTLED]
0.0 0.486 20 1191 1191 17 shallow2d_speed
0.0 20 20 1191 0 17 shallow2dv_speed
0.0 9 15 1 37632 15082 lua_init_sim
0.0 15 15 100001 0 0 xmin2s [THROTTLED]
0.0 9 9 9528 0 1 copy_subgrid
0.0 5 5 37632 0 0 central2d_offset
0.0 0.16 0.16 2 0 80 central2d_free
0.0 0.011 0.012 1 2 12 central2d_init
0.0 0.008 0.008 1 0 8 copy_basic_info
0.0 0.001 0.001 1 0 1 MPI_Comm_size()
0.0 0 0 1 0 0 MPI_Comm_rank()
0.0 0 0 1 0 0 viz_open
---------------------------------------------------------------------------------------
USER EVENTS Profile :NODE 18, CONTEXT 0, THREAD 0
---------------------------------------------------------------------------------------
NumSamples MaxValue MinValue MeanValue Std. Dev. Event Name
---------------------------------------------------------------------------------------
1191 4 4 4 0 Message size for all-reduce
---------------------------------------------------------------------------------------
NODE 19;CONTEXT 0;THREAD 0:
---------------------------------------------------------------------------------------
%Time Exclusive Inclusive #Call #Subrs Inclusive Name
msec total msec usec/call
---------------------------------------------------------------------------------------
100.0 6 2:52.894 1 1 172894379 .TAU application
100.0 0.875 2:52.887 1 3 172887737 main
97.9 0.463 2:49.281 1 108 169281191 run_sim
94.9 0.039 2:44.122 50 50 3282454 central2d_run
94.9 1 2:44.122 50 5161 3282453 central2d_xrun
73.5 185 2:06.994 2382 104765 53314 central2d_step
39.4 444 1:08.077 2382 1.91513E+06 28580 central2d_predict
36.5 1:03.084 1:03.113 1.77935E+06 49982 35 limited_derivk
36.2 1:02.563 1:02.593 1.77935E+06 50019 35 limited_deriv1
33.9 524 58,624 2382 1.74358E+06 24611 central2d_correct
19.5 2 33,650 397 11116 84761 central2d_periodic
19.5 33,638 33,638 1588 0 21183 MPI_Sendrecv()
2.6 0.043 4,488 51 51 88009 gather_sol
2.6 0.039 4,488 51 51 88008 send_full_u
2.6 4,488 4,488 51 0 88008 MPI_Send()
2.0 3,454 3,454 1191 0 2901 MPI_Allreduce()
1.2 2,135 2,135 1 0 2135186 MPI_Init()
0.9 1,470 1,470 1 0 1470485 MPI_Finalize()
0.4 654 654 1 0 654562 MPI_Barrier()
0.1 33 107 100001 100001 1 shallow2d_flux [THROTTLED]
0.0 73 73 100001 0 1 shallow2dv_flux [THROTTLED]
0.0 43 58 100001 100001 1 limdiff [THROTTLED]
0.0 24 24 100001 0 0 central2d_correct_sd [THROTTLED]
0.0 0.477 20 1191 1191 17 shallow2d_speed
0.0 20 20 1191 0 17 shallow2dv_speed
0.0 15 15 100001 0 0 xmin2s [THROTTLED]
0.0 9 14 1 37632 14822 lua_init_sim
0.0 9 9 9528 0 1 copy_subgrid
0.0 5 5 37632 0 0 central2d_offset
0.0 0.146 0.146 2 0 73 central2d_free
0.0 0.014 0.016 1 2 16 central2d_init
0.0 0.007 0.007 1 0 7 copy_basic_info
0.0 0.001 0.001 1 0 1 MPI_Comm_rank()
0.0 0.001 0.001 1 0 1 MPI_Comm_size()
0.0 0 0 1 0 0 viz_open
---------------------------------------------------------------------------------------
USER EVENTS Profile :NODE 19, CONTEXT 0, THREAD 0
---------------------------------------------------------------------------------------
NumSamples MaxValue MinValue MeanValue Std. Dev. Event Name
---------------------------------------------------------------------------------------
1191 4 4 4 0 Message size for all-reduce
---------------------------------------------------------------------------------------
NODE 20;CONTEXT 0;THREAD 0:
---------------------------------------------------------------------------------------
%Time Exclusive Inclusive #Call #Subrs Inclusive Name
msec total msec usec/call
---------------------------------------------------------------------------------------
100.0 5 2:52.890 1 1 172890434 .TAU application
100.0 0.993 2:52.884 1 3 172884819 main
97.9 0.551 2:49.285 1 108 169285714 run_sim
94.8 0.039 2:43.889 50 50 3277791 central2d_run
94.8 1 2:43.889 50 5161 3277790 central2d_xrun
73.4 182 2:06.872 2382 104765 53263 central2d_step
39.3 444 1:07.969 2382 1.91513E+06 28535 central2d_predict
36.4 1:02.837 1:02.867 1.77935E+06 49982 35 limited_derivk
36.3 1:02.715 1:02.744 1.77935E+06 50019 35 limited_deriv1