forked from cs5220-f20/shallow-water
profiling_1000_sub.txt
Reading Profile files in profile.*
NODE 0;CONTEXT 0;THREAD 0:
---------------------------------------------------------------------------------------
%Time Exclusive Inclusive #Call #Subrs Inclusive Name
msec total msec usec/call
---------------------------------------------------------------------------------------
100.0 7 2:32.998 1 1 152998969 .TAU application
100.0 267 2:32.991 1 3 152991735 main
97.4 3 2:29.094 1 211 149094235 run_sim
72.4 0.06 1:50.751 50 50 2215036 central2d_run
72.4 2 1:50.751 50 6060 2215035 central2d_xrun
70.0 145 1:47.099 2424 104849 44183 central2d_step
35.3 394 53,932 2424 1.71619E+06 22249 central2d_predict
34.7 53,042 53,071 1.69438E+06 49969 31 limited_derivk
34.6 505 52,922 2424 1.77256E+06 21833 central2d_correct
34.6 52,832 52,861 1.69438E+06 50032 31 limited_deriv1
12.3 1 18,825 51 4131 369126 gather_sol
12.1 4 18,579 4080 8160 4554 recv_full_u
12.0 18,301 18,311 4131 62369 4433 copy_u
8.9 13,583 13,583 51 0 266338 solution_check
3.7 5,591 5,591 51 0 109638 viz_frame
2.0 2,997 2,997 1 0 2997232 MPI_Init()
1.5 7 2,276 1212 33936 1878 central2d_periodic
1.5 2,251 2,251 4848 0 464 MPI_Sendrecv()
0.9 1,336 1,336 1212 0 1103 MPI_Allreduce()
0.4 632 632 1 0 632444 MPI_Finalize()
0.3 508 508 4080 0 125 MPI_Recv()
0.2 321 321 1 0 321839 viz_close
0.1 34 98 100001 100001 1 shallow2d_flux [THROTTLED]
0.0 64 64 100001 0 1 shallow2dv_flux [THROTTLED]
0.0 42 57 100001 100001 1 limdiff [THROTTLED]
0.0 0.521 36 1212 1212 30 shallow2d_speed
0.0 36 36 1212 0 30 shallow2dv_speed
0.0 21 21 100001 0 0 central2d_correct_sd [THROTTLED]
0.0 16 16 29088 0 1 copy_subgrid
0.0 14 14 100001 0 0 xmin2s [THROTTLED]
0.0 14 14 100001 0 0 central2d_offset [THROTTLED]
0.0 9 14 1 37632 14666 lua_init_sim
0.0 1 1 2 0 808 central2d_free
0.0 0.945 0.945 1 0 945 viz_open
0.0 0.023 0.023 1 0 23 MPI_Barrier()
0.0 0.014 0.014 1 2 14 central2d_init
0.0 0.008 0.008 1 0 8 copy_basic_info
0.0 0 0 1 0 0 MPI_Comm_rank()
0.0 0 0 1 0 0 MPI_Comm_size()
---------------------------------------------------------------------------------------
USER EVENTS Profile :NODE 0, CONTEXT 0, THREAD 0
---------------------------------------------------------------------------------------
NumSamples MaxValue MinValue MeanValue Std. Dev. Event Name
---------------------------------------------------------------------------------------
1212 4 4 4 0 Message size for all-reduce
---------------------------------------------------------------------------------------
NODE 1;CONTEXT 0;THREAD 0:
---------------------------------------------------------------------------------------
%Time Exclusive Inclusive #Call #Subrs Inclusive Name
msec total msec usec/call
---------------------------------------------------------------------------------------
100.0 6 2:32.999 1 1 152999553 .TAU application
100.0 255 2:32.992 1 3 152992936 main
96.8 0.57 2:28.043 1 108 148043461 run_sim
96.1 0.055 2:27.028 50 50 2940567 central2d_run
96.1 2 2:27.028 50 6060 2940565 central2d_xrun
70.1 145 1:47.308 2424 104849 44269 central2d_step
35.6 397 54,506 2424 1.71619E+06 22486 central2d_predict
34.7 53,026 53,055 1.69438E+06 49969 31 limited_derivk
34.7 53,026 53,055 1.69438E+06 50032 31 limited_deriv1
34.4 532 52,558 2424 1.77256E+06 21682 central2d_correct
25.1 7 38,342 1212 33936 31635 central2d_periodic
25.0 38,315 38,315 4848 0 7903 MPI_Sendrecv()
2.0 2,997 2,997 1 0 2997111 MPI_Init()
1.1 1,697 1,697 1 0 1697054 MPI_Finalize()
0.9 1,344 1,344 1212 0 1109 MPI_Allreduce()
0.5 742 742 1 0 742626 MPI_Barrier()
0.2 0.047 257 51 51 5042 gather_sol
0.2 0.041 257 51 51 5041 send_full_u
0.2 257 257 51 0 5040 MPI_Send()
0.1 33 98 100001 100001 1 shallow2d_flux [THROTTLED]
0.0 64 64 100001 0 1 shallow2dv_flux [THROTTLED]
0.0 43 58 100001 100001 1 limdiff [THROTTLED]
0.0 0.588 30 1212 1212 25 shallow2d_speed
0.0 30 30 1212 0 25 shallow2dv_speed
0.0 23 23 100001 0 0 central2d_correct_sd [THROTTLED]
0.0 18 18 29088 0 1 copy_subgrid
0.0 15 15 100001 0 0 xmin2s [THROTTLED]
0.0 9 14 1 37632 14671 lua_init_sim
0.0 5 5 37632 0 0 central2d_offset
0.0 0.1 0.1 2 0 50 central2d_free
0.0 0.014 0.014 1 2 14 central2d_init
0.0 0.008 0.008 1 0 8 copy_basic_info
0.0 0 0 1 0 0 MPI_Comm_rank()
0.0 0 0 1 0 0 MPI_Comm_size()
0.0 0 0 1 0 0 viz_open
---------------------------------------------------------------------------------------
USER EVENTS Profile :NODE 1, CONTEXT 0, THREAD 0
---------------------------------------------------------------------------------------
NumSamples MaxValue MinValue MeanValue Std. Dev. Event Name
---------------------------------------------------------------------------------------
1212 4 4 4 0 Message size for all-reduce
---------------------------------------------------------------------------------------
NODE 2;CONTEXT 0;THREAD 0:
---------------------------------------------------------------------------------------
%Time Exclusive Inclusive #Call #Subrs Inclusive Name
msec total msec usec/call
---------------------------------------------------------------------------------------
100.0 6 2:33.002 1 1 153002051 .TAU application
100.0 255 2:32.995 1 3 152995702 main
96.8 0.516 2:28.048 1 108 148048085 run_sim
95.9 0.054 2:26.786 50 50 2935726 central2d_run
95.9 2 2:26.786 50 6060 2935725 central2d_xrun
69.8 143 1:46.867 2424 104849 44087 central2d_step
35.3 395 54,016 2424 1.71619E+06 22284 central2d_predict
34.6 52,883 52,912 1.69438E+06 49969 31 limited_derivk
34.5 52,734 52,763 1.69438E+06 50032 31 limited_deriv1
34.4 531 52,610 2424 1.77256E+06 21704 central2d_correct
13.6 20,794 20,794 1212 0 17157 MPI_Allreduce()
12.5 7 19,086 1212 33936 15748 central2d_periodic
12.5 19,059 19,059 4848 0 3931 MPI_Sendrecv()
2.0 2,996 2,996 1 0 2996872 MPI_Init()
1.1 1,695 1,695 1 0 1695443 MPI_Finalize()
0.5 734 734 1 0 734827 MPI_Barrier()
0.3 0.055 511 51 51 10024 gather_sol
0.3 0.04 511 51 51 10023 send_full_u
0.3 511 511 51 0 10022 MPI_Send()
0.1 34 97 100001 100001 1 shallow2d_flux [THROTTLED]
0.0 62 62 100001 0 1 shallow2dv_flux [THROTTLED]
0.0 43 57 100001 100001 1 limdiff [THROTTLED]
0.0 0.488 35 1212 1212 30 shallow2d_speed
0.0 35 35 1212 0 29 shallow2dv_speed
0.0 23 23 100001 0 0 central2d_correct_sd [THROTTLED]
0.0 18 18 29088 0 1 copy_subgrid
0.0 9 15 1 37632 15104 lua_init_sim
0.0 14 14 100001 0 0 xmin2s [THROTTLED]
0.0 5 5 37632 0 0 central2d_offset
0.0 0.087 0.087 2 0 44 central2d_free
0.0 0.014 0.015 1 2 15 central2d_init
0.0 0.008 0.008 1 0 8 copy_basic_info
0.0 0.001 0.001 1 0 1 MPI_Comm_rank()
0.0 0 0 1 0 0 MPI_Comm_size()
0.0 0 0 1 0 0 viz_open
---------------------------------------------------------------------------------------
USER EVENTS Profile :NODE 2, CONTEXT 0, THREAD 0
---------------------------------------------------------------------------------------
NumSamples MaxValue MinValue MeanValue Std. Dev. Event Name
---------------------------------------------------------------------------------------
1212 4 4 4 0 Message size for all-reduce
---------------------------------------------------------------------------------------
NODE 3;CONTEXT 0;THREAD 0:
---------------------------------------------------------------------------------------
%Time Exclusive Inclusive #Call #Subrs Inclusive Name
msec total msec usec/call
---------------------------------------------------------------------------------------
100.0 7 2:33.002 1 1 153002844 .TAU application
100.0 255 2:32.995 1 3 152995805 main
96.8 0.463 2:28.052 1 108 148052642 run_sim
95.8 0.051 2:26.550 50 50 2931014 central2d_run
95.8 2 2:26.550 50 6060 2931013 central2d_xrun
69.7 141 1:46.634 2424 104849 43991 central2d_step
35.2 394 53,872 2424 1.71619E+06 22224 central2d_predict
34.6 52,884 52,913 1.69438E+06 49969 31 limited_derivk
34.4 52,532 52,561 1.69438E+06 50032 31 limited_deriv1
34.3 505 52,524 2424 1.77256E+06 21668 central2d_correct
13.5 20,590 20,590 1212 0 16989 MPI_Allreduce()
12.6 7 19,287 1212 33936 15914 central2d_periodic
12.6 19,261 19,261 4848 0 3973 MPI_Sendrecv()
2.0 2,996 2,996 1 0 2996768 MPI_Init()
1.1 1,691 1,691 1 0 1691109 MPI_Finalize()
0.5 0.035 756 51 51 14836 gather_sol
0.5 0.045 756 51 51 14836 send_full_u
0.5 756 756 51 0 14835 MPI_Send()
0.5 730 730 1 0 730039 MPI_Barrier()
0.1 34 96 100001 100001 1 shallow2d_flux [THROTTLED]
0.0 62 62 100001 0 1 shallow2dv_flux [THROTTLED]
0.0 42 58 100001 100001 1 limdiff [THROTTLED]
0.0 0.511 36 1212 1212 30 shallow2d_speed
0.0 35 35 1212 0 29 shallow2dv_speed
0.0 21 21 100001 0 0 central2d_correct_sd [THROTTLED]
0.0 17 17 29088 0 1 copy_subgrid
0.0 15 15 100001 0 0 xmin2s [THROTTLED]
0.0 9 14 1 37632 14692 lua_init_sim
0.0 5 5 37632 0 0 central2d_offset
0.0 0.083 0.083 2 0 42 central2d_free
0.0 0.013 0.014 1 2 14 central2d_init
0.0 0.012 0.012 1 0 12 copy_basic_info
0.0 0.001 0.001 1 0 1 MPI_Comm_rank()
0.0 0 0 1 0 0 MPI_Comm_size()
0.0 0 0 1 0 0 viz_open
---------------------------------------------------------------------------------------
USER EVENTS Profile :NODE 3, CONTEXT 0, THREAD 0
---------------------------------------------------------------------------------------
NumSamples MaxValue MinValue MeanValue Std. Dev. Event Name
---------------------------------------------------------------------------------------
1212 4 4 4 0 Message size for all-reduce
---------------------------------------------------------------------------------------
NODE 4;CONTEXT 0;THREAD 0:
---------------------------------------------------------------------------------------
%Time Exclusive Inclusive #Call #Subrs Inclusive Name
msec total msec usec/call
---------------------------------------------------------------------------------------
100.0 7 2:32.997 1 1 152997350 .TAU application
100.0 255 2:32.990 1 3 152990074 main
96.8 0.539 2:28.057 1 108 148057202 run_sim
95.6 0.047 2:26.335 50 50 2926713 central2d_run
95.6 2 2:26.335 50 6060 2926712 central2d_xrun
69.8 139 1:46.820 2424 104849 44068 central2d_step
35.1 395 53,714 2424 1.71619E+06 22160 central2d_predict
34.6 52,848 52,877 1.69438E+06 49969 31 limited_derivk
34.6 506 52,870 2424 1.77256E+06 21811 central2d_correct
34.5 52,754 52,783 1.69438E+06 50032 31 limited_deriv1
13.3 20,389 20,389 1212 0 16823 MPI_Allreduce()
12.5 7 19,094 1212 33936 15755 central2d_periodic
12.5 19,069 19,069 4848 0 3934 MPI_Sendrecv()
2.0 2,997 2,997 1 0 2997270 MPI_Init()
1.1 1,680 1,680 1 0 1680253 MPI_Finalize()
0.6 0.057 981 51 51 19236 gather_sol
0.6 0.04 980 51 51 19235 send_full_u
0.6 980 980 51 0 19234 MPI_Send()
0.5 725 725 1 0 725265 MPI_Barrier()
0.1 34 95 100001 100001 1 shallow2d_flux [THROTTLED]
0.0 61 61 100001 0 1 shallow2dv_flux [THROTTLED]
0.0 42 58 100001 100001 1 limdiff [THROTTLED]
0.0 0.522 29 1212 1212 24 shallow2d_speed
0.0 28 28 1212 0 24 shallow2dv_speed
0.0 21 21 100001 0 0 central2d_correct_sd [THROTTLED]
0.0 17 17 29088 0 1 copy_subgrid
0.0 15 15 100001 0 0 xmin2s [THROTTLED]
0.0 9 14 1 37632 14663 lua_init_sim
0.0 5 5 37632 0 0 central2d_offset
0.0 0.058 0.058 2 0 29 central2d_free
0.0 0.011 0.011 1 2 11 central2d_init
0.0 0.006 0.006 1 0 6 copy_basic_info
0.0 0.001 0.001 1 0 1 viz_open
0.0 0 0 1 0 0 MPI_Comm_rank()
0.0 0 0 1 0 0 MPI_Comm_size()
---------------------------------------------------------------------------------------
USER EVENTS Profile :NODE 4, CONTEXT 0, THREAD 0
---------------------------------------------------------------------------------------
NumSamples MaxValue MinValue MeanValue Std. Dev. Event Name
---------------------------------------------------------------------------------------
1212 4 4 4 0 Message size for all-reduce
---------------------------------------------------------------------------------------
NODE 5;CONTEXT 0;THREAD 0:
---------------------------------------------------------------------------------------
%Time Exclusive Inclusive #Call #Subrs Inclusive Name
msec total msec usec/call
---------------------------------------------------------------------------------------
100.0 6 2:33.003 1 1 153003464 .TAU application
100.0 255 2:32.997 1 3 152997134 main
96.8 0.436 2:28.061 1 108 148061783 run_sim
95.5 0.053 2:26.106 50 50 2922137 central2d_run
95.5 2 2:26.106 50 6060 2922136 central2d_xrun
69.7 146 1:46.658 2424 104849 44001 central2d_step
35.2 390 53,870 2424 1.71619E+06 22224 central2d_predict
34.6 52,884 52,913 1.69438E+06 49969 31 limited_derivk
34.4 52,540 52,569 1.69438E+06 50032 31 limited_deriv1
34.3 515 52,541 2424 1.77256E+06 21676 central2d_correct
13.7 20,950 20,950 1212 0 17286 MPI_Allreduce()
12.1 7 18,467 1212 33936 15237 central2d_periodic
12.1 18,441 18,441 4848 0 3804 MPI_Sendrecv()
2.0 2,997 2,997 1 0 2997051 MPI_Init()
1.1 1,683 1,683 1 0 1683007 MPI_Finalize()
0.8 0.066 1,218 51 51 23900 gather_sol
0.8 0.034 1,218 51 51 23899 send_full_u
0.8 1,218 1,218 51 0 23898 MPI_Send()
0.5 720 720 1 0 720471 MPI_Barrier()
0.1 34 99 100001 100001 1 shallow2d_flux [THROTTLED]
0.0 65 65 100001 0 1 shallow2dv_flux [THROTTLED]
0.0 42 57 100001 100001 1 limdiff [THROTTLED]
0.0 0.486 28 1212 1212 24 shallow2d_speed
0.0 28 28 1212 0 23 shallow2dv_speed
0.0 23 23 100001 0 0 central2d_correct_sd [THROTTLED]
0.0 17 17 29088 0 1 copy_subgrid
0.0 9 14 1 37632 14999 lua_init_sim
0.0 14 14 100001 0 0 xmin2s [THROTTLED]
0.0 5 5 37632 0 0 central2d_offset
0.0 0.082 0.082 2 0 41 central2d_free
0.0 0.015 0.015 1 2 15 central2d_init
0.0 0.006 0.006 1 0 6 copy_basic_info
0.0 0 0 1 0 0 MPI_Comm_rank()
0.0 0 0 1 0 0 MPI_Comm_size()
0.0 0 0 1 0 0 viz_open
---------------------------------------------------------------------------------------
USER EVENTS Profile :NODE 5, CONTEXT 0, THREAD 0
---------------------------------------------------------------------------------------
NumSamples MaxValue MinValue MeanValue Std. Dev. Event Name
---------------------------------------------------------------------------------------
1212 4 4 4 0 Message size for all-reduce
---------------------------------------------------------------------------------------
NODE 6;CONTEXT 0;THREAD 0:
---------------------------------------------------------------------------------------
%Time Exclusive Inclusive #Call #Subrs Inclusive Name
msec total msec usec/call
---------------------------------------------------------------------------------------
100.0 5 2:33.001 1 1 153001963 .TAU application
100.0 255 2:32.996 1 3 152996357 main
96.8 0.488 2:28.066 1 108 148066296 run_sim
95.3 0.046 2:25.877 50 50 2917542 central2d_run
95.3 2 2:25.877 50 6060 2917541 central2d_xrun
69.7 148 1:46.575 2424 104849 43967 central2d_step
35.1 398 53,773 2424 1.71619E+06 22184 central2d_predict
34.5 52,827 52,856 1.69438E+06 49969 31 limited_derivk
34.3 530 52,553 2424 1.77256E+06 21680 central2d_correct
34.3 52,487 52,516 1.69438E+06 50032 31 limited_deriv1
14.7 22,567 22,567 1212 0 18620 MPI_Allreduce()
10.9 7 16,720 1212 33936 13795 central2d_periodic
10.9 16,694 16,694 4848 0 3444 MPI_Sendrecv()
2.0 2,996 2,996 1 0 2996522 MPI_Init()
1.1 1,678 1,678 1 0 1678202 MPI_Finalize()
1.0 0.057 1,458 51 51 28590 gather_sol
1.0 0.033 1,458 51 51 28589 send_full_u
1.0 1,457 1,457 51 0 28588 MPI_Send()
0.5 715 715 1 0 715704 MPI_Barrier()
0.1 34 100 100001 100001 1 shallow2d_flux [THROTTLED]
0.0 66 66 100001 0 1 shallow2dv_flux [THROTTLED]
0.0 42 58 100001 100001 1 limdiff [THROTTLED]
0.0 24 24 100001 0 0 central2d_correct_sd [THROTTLED]
0.0 17 17 29088 0 1 copy_subgrid
0.0 15 15 100001 0 0 xmin2s [THROTTLED]
0.0 9 14 1 37632 14814 lua_init_sim
0.0 0.518 12 1212 1212 10 shallow2d_speed
0.0 11 11 1212 0 10 shallow2dv_speed
0.0 5 5 37632 0 0 central2d_offset
0.0 0.079 0.079 2 0 40 central2d_free
0.0 0.036 0.036 1 2 36 central2d_init
0.0 0.009 0.009 1 0 9 copy_basic_info
0.0 0 0 1 0 0 MPI_Comm_rank()
0.0 0 0 1 0 0 MPI_Comm_size()
0.0 0 0 1 0 0 viz_open
---------------------------------------------------------------------------------------
USER EVENTS Profile :NODE 6, CONTEXT 0, THREAD 0
---------------------------------------------------------------------------------------
NumSamples MaxValue MinValue MeanValue Std. Dev. Event Name
---------------------------------------------------------------------------------------
1212 4 4 4 0 Message size for all-reduce
---------------------------------------------------------------------------------------
NODE 7;CONTEXT 0;THREAD 0:
---------------------------------------------------------------------------------------
%Time Exclusive Inclusive #Call #Subrs Inclusive Name
msec total msec usec/call
---------------------------------------------------------------------------------------
100.0 5 2:33.006 1 1 153006138 .TAU application
100.0 255 2:33.000 1 3 153000302 main
96.8 0.453 2:28.070 1 108 148070879 run_sim
95.2 0.055 2:25.658 50 50 2913170 central2d_run
95.2 2 2:25.658 50 6060 2913169 central2d_xrun
69.7 147 1:46.657 2424 104849 44000 central2d_step
35.2 391 53,850 2424 1.71619E+06 22216 central2d_predict
34.6 52,898 52,927 1.69438E+06 49969 31 limited_derivk
34.4 516 52,558 2424 1.77256E+06 21683 central2d_correct
34.3 52,520 52,550 1.69438E+06 50032 31 limited_deriv1
14.6 22,367 22,367 1212 0 18455 MPI_Allreduce()
10.9 7 16,619 1212 33936 13712 central2d_periodic
10.8 16,594 16,594 4848 0 3423 MPI_Sendrecv()
2.0 2,996 2,996 1 0 2996561 MPI_Init()
1.1 0.063 1,686 51 51 33063 gather_sol
1.1 0.037 1,686 51 51 33062 send_full_u
1.1 1,686 1,686 51 0 33061 MPI_Send()
1.1 1,677 1,677 1 0 1677543 MPI_Finalize()
0.5 710 710 1 0 710941 MPI_Barrier()
0.1 34 100 100001 100001 1 shallow2d_flux [THROTTLED]
0.0 65 65 100001 0 1 shallow2dv_flux [THROTTLED]
0.0 43 58 100001 100001 1 limdiff [THROTTLED]
0.0 23 23 100001 0 0 central2d_correct_sd [THROTTLED]
0.0 16 16 29088 0 1 copy_subgrid
0.0 15 15 100001 0 0 xmin2s [THROTTLED]
0.0 9 14 1 37632 14668 lua_init_sim
0.0 0.482 12 1212 1212 10 shallow2d_speed
0.0 11 11 1212 0 10 shallow2dv_speed
0.0 5 5 37632 0 0 central2d_offset
0.0 0.08 0.08 2 0 40 central2d_free
0.0 0.014 0.015 1 2 15 central2d_init
0.0 0.007 0.007 1 0 7 copy_basic_info
0.0 0.001 0.001 1 0 1 MPI_Comm_size()
0.0 0.001 0.001 1 0 1 viz_open
0.0 0 0 1 0 0 MPI_Comm_rank()
---------------------------------------------------------------------------------------
USER EVENTS Profile :NODE 7, CONTEXT 0, THREAD 0
---------------------------------------------------------------------------------------
NumSamples MaxValue MinValue MeanValue Std. Dev. Event Name
---------------------------------------------------------------------------------------
1212 4 4 4 0 Message size for all-reduce
---------------------------------------------------------------------------------------
NODE 8;CONTEXT 0;THREAD 0:
---------------------------------------------------------------------------------------
%Time Exclusive Inclusive #Call #Subrs Inclusive Name
msec total msec usec/call
---------------------------------------------------------------------------------------
100.0 6 2:33.001 1 1 153001854 .TAU application
100.0 255 2:32.994 1 3 152994935 main
96.8 0.697 2:28.075 1 108 148075515 run_sim
95.0 0.073 2:25.292 50 50 2905846 central2d_run
95.0 2 2:25.292 50 6060 2905844 central2d_xrun
67.4 139 1:43.192 2424 104849 42571 central2d_step
35.3 396 53,982 2424 1.71619E+06 22270 central2d_predict
33.7 51,547 51,576 1.69438E+06 50050 30 limited_deriv1
33.0 50,427 50,457 1.69438E+06 49951 30 limited_derivk
32.0 508 48,976 2424 1.77256E+06 20205 central2d_correct
25.2 8 38,552 1212 33936 31809 central2d_periodic
25.2 38,526 38,526 4848 0 7947 MPI_Sendrecv()
2.3 3,516 3,516 1212 0 2902 MPI_Allreduce()
2.0 2,997 2,997 1 0 2997290 MPI_Init()
1.3 0.053 2,062 51 51 40445 gather_sol
1.3 0.053 2,062 51 51 40443 send_full_u
1.3 2,062 2,062 51 0 40442 MPI_Send()
1.1 1,666 1,666 1 0 1666794 MPI_Finalize()
0.5 706 706 1 0 706136 MPI_Barrier()
0.1 33 94 100001 100001 1 shallow2d_flux [THROTTLED]
0.0 60 60 100001 0 1 shallow2dv_flux [THROTTLED]
0.0 43 59 100001 100001 1 limdiff [THROTTLED]
0.0 0.473 27 1212 1212 23 shallow2d_speed
0.0 27 27 1212 0 23 shallow2dv_speed
0.0 20 20 100001 0 0 central2d_correct_sd [THROTTLED]
0.0 18 18 29088 0 1 copy_subgrid
0.0 15 15 100001 0 0 xmin2s [THROTTLED]
0.0 8 13 1 34944 13606 lua_init_sim
0.0 5 5 34944 0 0 central2d_offset
0.0 0.062 0.062 2 0 31 central2d_free
0.0 0.038 0.039 1 2 39 central2d_init
0.0 0.008 0.008 1 0 8 copy_basic_info
0.0 0.001 0.001 1 0 1 MPI_Comm_size()
0.0 0 0 1 0 0 MPI_Comm_rank()
0.0 0 0 1 0 0 viz_open
---------------------------------------------------------------------------------------
USER EVENTS Profile :NODE 8, CONTEXT 0, THREAD 0
---------------------------------------------------------------------------------------
NumSamples MaxValue MinValue MeanValue Std. Dev. Event Name
---------------------------------------------------------------------------------------
1212 4 4 4 0 Message size for all-reduce
---------------------------------------------------------------------------------------
NODE 9;CONTEXT 0;THREAD 0:
---------------------------------------------------------------------------------------
%Time Exclusive Inclusive #Call #Subrs Inclusive Name
msec total msec usec/call
---------------------------------------------------------------------------------------
100.0 6 2:33.003 1 1 153003955 .TAU application
100.0 255 2:32.997 1 3 152997194 main
96.8 0.447 2:28.079 1 108 148079755 run_sim
94.9 0.046 2:25.227 50 50 2904549 central2d_run
94.9 2 2:25.227 50 6060 2904548 central2d_xrun
69.7 145 1:46.705 2424 104849 44020 central2d_step
35.2 398 53,866 2424 1.71619E+06 22222 central2d_predict
34.6 52,889 52,918 1.69438E+06 49969 31 limited_derivk
34.4 528 52,594 2424 1.77256E+06 21697 central2d_correct
34.4 52,561 52,590 1.69438E+06 50032 31 limited_deriv1
24.3 7 37,169 1212 33936 30668 central2d_periodic
24.3 37,141 37,141 4848 0 7661 MPI_Sendrecv()
2.0 2,997 2,997 1 0 2997117 MPI_Init()
1.4 0.05 2,135 51 51 41871 gather_sol
1.4 0.04 2,135 51 51 41870 send_full_u
1.4 2,135 2,135 51 0 41869 MPI_Send()
1.1 1,665 1,665 1 0 1665027 MPI_Finalize()
0.9 1,319 1,319 1212 0 1089 MPI_Allreduce()
0.5 701 701 1 0 701666 MPI_Barrier()
0.1 34 99 100001 100001 1 shallow2d_flux [THROTTLED]
0.0 64 64 100001 0 1 shallow2dv_flux [THROTTLED]
0.0 42 58 100001 100001 1 limdiff [THROTTLED]
0.0 0.471 30 1212 1212 26 shallow2d_speed
0.0 30 30 1212 0 25 shallow2dv_speed
0.0 24 24 100001 0 0 central2d_correct_sd [THROTTLED]
0.0 19 19 29088 0 1 copy_subgrid
0.0 15 15 100001 0 0 xmin2s [THROTTLED]
0.0 9 14 1 37632 14671 lua_init_sim
0.0 5 5 37632 0 0 central2d_offset
0.0 0.077 0.077 2 0 38 central2d_free
0.0 0.016 0.016 1 2 16 central2d_init
0.0 0.008 0.008 1 0 8 copy_basic_info
0.0 0 0 1 0 0 MPI_Comm_rank()
0.0 0 0 1 0 0 MPI_Comm_size()
0.0 0 0 1 0 0 viz_open
---------------------------------------------------------------------------------------
USER EVENTS Profile :NODE 9, CONTEXT 0, THREAD 0
---------------------------------------------------------------------------------------
NumSamples MaxValue MinValue MeanValue Std. Dev. Event Name
---------------------------------------------------------------------------------------
1212 4 4 4 0 Message size for all-reduce
---------------------------------------------------------------------------------------
NODE 10;CONTEXT 0;THREAD 0:
---------------------------------------------------------------------------------------
%Time Exclusive Inclusive #Call #Subrs Inclusive Name
msec total msec usec/call
---------------------------------------------------------------------------------------
100.0 7 2:33.004 1 1 153004044 .TAU application
100.0 255 2:32.996 1 3 152996878 main
96.8 0.579 2:28.084 1 108 148084565 run_sim
94.8 0.062 2:25.107 50 50 2902158 central2d_run
94.8 2 2:25.107 50 6060 2902157 central2d_xrun
71.5 146 1:49.369 2424 104849 45120 central2d_step
36.1 545 55,205 2424 1.77256E+06 22775 central2d_correct
35.4 54,180 54,209 1.69438E+06 50032 32 limited_deriv1
35.3 53,915 53,944 1.69438E+06 49969 32 limited_derivk
35.2 400 53,919 2424 1.71619E+06 22244 central2d_predict
22.5 7 34,385 1212 33936 28371 central2d_periodic
22.5 34,360 34,360 4848 0 7088 MPI_Sendrecv()
2.0 2,997 2,997 1 0 2997236 MPI_Init()
1.5 0.05 2,265 51 51 44417 gather_sol
1.5 0.056 2,265 51 51 44416 send_full_u
1.5 2,265 2,265 51 0 44415 MPI_Send()
1.1 1,659 1,659 1 0 1659728 MPI_Finalize()
0.9 1,335 1,335 1212 0 1102 MPI_Allreduce()
0.5 696 696 1 0 696063 MPI_Barrier()
0.1 34 98 100001 100001 1 shallow2d_flux [THROTTLED]
0.0 64 64 100001 0 1 shallow2dv_flux [THROTTLED]
0.0 43 58 100001 100001 1 limdiff [THROTTLED]
0.0 24 24 100001 0 0 central2d_correct_sd [THROTTLED]
0.0 17 17 29088 0 1 copy_subgrid
0.0 15 15 100001 0 0 xmin2s [THROTTLED]
0.0 9 14 1 37632 14633 lua_init_sim
0.0 0.558 14 1212 1212 12 shallow2d_speed
0.0 13 13 1212 0 11 shallow2dv_speed
0.0 5 5 37632 0 0 central2d_offset
0.0 0.083 0.083 2 0 42 central2d_free
0.0 0.016 0.016 1 2 16 central2d_init
0.0 0.007 0.007 1 0 7 copy_basic_info
0.0 0 0 1 0 0 MPI_Comm_rank()
0.0 0 0 1 0 0 MPI_Comm_size()
0.0 0 0 1 0 0 viz_open
---------------------------------------------------------------------------------------
USER EVENTS Profile :NODE 10, CONTEXT 0, THREAD 0
---------------------------------------------------------------------------------------
NumSamples MaxValue MinValue MeanValue Std. Dev. Event Name
---------------------------------------------------------------------------------------
1212 4 4 4 0 Message size for all-reduce
---------------------------------------------------------------------------------------
NODE 11;CONTEXT 0;THREAD 0:
---------------------------------------------------------------------------------------
%Time Exclusive Inclusive #Call #Subrs Inclusive Name
msec total msec usec/call
---------------------------------------------------------------------------------------
100.0 6 2:32.997 1 1 152997117 .TAU application
100.0 255 2:32.990 1 3 152990383 main
96.8 0.52 2:28.089 1 108 148089328 run_sim
94.6 0.054 2:24.772 50 50 2895450 central2d_run
94.6 2 2:24.772 50 6060 2895449 central2d_xrun
69.7 144 1:46.678 2424 104849 44009 central2d_step
35.2 399 53,840 2424 1.71619E+06 22211 central2d_predict
34.6 52,866 52,895 1.69438E+06 49969 31 limited_derivk
34.4 531 52,595 2424 1.77256E+06 21698 central2d_correct
34.4 52,556 52,585 1.69438E+06 50032 31 limited_deriv1
21.5 32,872 32,872 1212 0 27122 MPI_Allreduce()
3.4 7 5,189 1212 33936 4282 central2d_periodic
3.4 5,163 5,163 4848 0 1065 MPI_Sendrecv()
2.0 2,997 2,997 1 0 2997108 MPI_Init()
1.7 0.041 2,610 51 51 51179 gather_sol
1.7 0.049 2,610 51 51 51178 send_full_u
1.7 2,610 2,610 51 0 51177 MPI_Send()
1.1 1,648 1,648 1 0 1648634 MPI_Finalize()
0.5 691 691 1 0 691271 MPI_Barrier()
0.1 33 97 100001 100001 1 shallow2d_flux [THROTTLED]
0.0 63 63 100001 0 1 shallow2dv_flux [THROTTLED]
0.0 42 58 100001 100001 1 limdiff [THROTTLED]
0.0 0.483 30 1212 1212 25 shallow2d_speed
0.0 30 30 1212 0 25 shallow2dv_speed
0.0 24 24 100001 0 0 central2d_correct_sd [THROTTLED]
0.0 18 18 29088 0 1 copy_subgrid
0.0 15 15 100001 0 0 xmin2s [THROTTLED]
0.0 9 14 1 37632 14776 lua_init_sim
0.0 5 5 37632 0 0 central2d_offset
0.0 0.088 0.088 2 0 44 central2d_free
0.0 0.014 0.016 1 2 16 central2d_init
0.0 0.008 0.008 1 0 8 copy_basic_info
0.0 0.001 0.001 1 0 1 MPI_Comm_rank()
0.0 0.001 0.001 1 0 1 MPI_Comm_size()
0.0 0.001 0.001 1 0 1 viz_open
---------------------------------------------------------------------------------------
USER EVENTS Profile :NODE 11, CONTEXT 0, THREAD 0
---------------------------------------------------------------------------------------
NumSamples MaxValue MinValue MeanValue Std. Dev. Event Name
---------------------------------------------------------------------------------------
1212 4 4 4 0 Message size for all-reduce
---------------------------------------------------------------------------------------
NODE 12;CONTEXT 0;THREAD 0:
---------------------------------------------------------------------------------------
%Time Exclusive Inclusive #Call #Subrs Inclusive Name
msec total msec usec/call
---------------------------------------------------------------------------------------
100.0 7 2:33.003 1 1 153003838 .TAU application
100.0 255 2:32.996 1 3 152996686 main
96.8 0.503 2:28.093 1 108 148093983 run_sim
94.6 0.056 2:24.668 50 50 2893368 central2d_run
94.6 2 2:24.668 50 6060 2893367 central2d_xrun
71.7 140 1:49.646 2424 104849 45234 central2d_step
37.2 394 56,926 2424 1.71619E+06 23484 central2d_predict
35.8 54,795 54,824 1.69438E+06 50032 32 limited_deriv1
35.1 53,624 53,653 1.69438E+06 49969 32 limited_derivk
34.3 515 52,484 2424 1.77256E+06 21652 central2d_correct
21.3 32,645 32,645 1212 0 26935 MPI_Allreduce()
2.0 2,997 2,997 1 0 2997277 MPI_Init()
1.8 0.043 2,723 51 51 53404 gather_sol
1.8 0.046 2,723 51 51 53403 send_full_u
1.8 2,723 2,723 51 0 53402 MPI_Send()
1.5 7 2,361 1212 33936 1948 central2d_periodic
1.5 2,335 2,335 4848 0 482 MPI_Sendrecv()
1.1 1,650 1,650 1 0 1650105 MPI_Finalize()
0.4 686 686 1 0 686468 MPI_Barrier()
0.1 34 96 100001 100001 1 shallow2d_flux [THROTTLED]
0.0 61 61 100001 0 1 shallow2dv_flux [THROTTLED]
0.0 43 58 100001 100001 1 limdiff [THROTTLED]
0.0 23 23 100001 0 0 central2d_correct_sd [THROTTLED]
0.0 18 18 29088 0 1 copy_subgrid
0.0 15 15 100001 0 0 xmin2s [THROTTLED]
0.0 9 14 1 37632 14869 lua_init_sim
0.0 0.49 12 1212 1212 10 shallow2d_speed
0.0 11 11 1212 0 10 shallow2dv_speed
0.0 5 5 37632 0 0 central2d_offset
0.0 0.086 0.086 2 0 43 central2d_free
0.0 0.031 0.031 1 2 31 central2d_init
0.0 0.011 0.011 1 0 11 copy_basic_info
0.0 0 0 1 0 0 MPI_Comm_rank()
0.0 0 0 1 0 0 MPI_Comm_size()
0.0 0 0 1 0 0 viz_open
---------------------------------------------------------------------------------------
USER EVENTS Profile :NODE 12, CONTEXT 0, THREAD 0
---------------------------------------------------------------------------------------
NumSamples MaxValue MinValue MeanValue Std. Dev. Event Name
---------------------------------------------------------------------------------------
1212 4 4 4 0 Message size for all-reduce
---------------------------------------------------------------------------------------
NODE 13;CONTEXT 0;THREAD 0:
---------------------------------------------------------------------------------------
%Time Exclusive Inclusive #Call #Subrs Inclusive Name
msec total msec usec/call
---------------------------------------------------------------------------------------
100.0 6 2:33.003 1 1 153003439 .TAU application
100.0 255 2:32.996 1 3 152996648 main
96.8 0.43 2:28.098 1 108 148098595 run_sim
94.4 0.052 2:24.409 50 50 2888193 central2d_run
94.4 2 2:24.409 50 6060 2888192 central2d_xrun
71.1 142 1:48.848 2424 104849 44905 central2d_step
35.9 54,829 54,858 1.69438E+06 50032 32 limited_deriv1
35.8 399 54,811 2424 1.71619E+06 22612 central2d_predict
35.2 539 53,796 2424 1.77256E+06 22193 central2d_correct
34.5 52,758 52,787 1.69438E+06 49969 31 limited_derivk
21.2 32,429 32,429 1212 0 26757 MPI_Allreduce()
2.0 7 3,117 1212 33936 2572 central2d_periodic
2.0 3,089 3,089 4848 0 637 MPI_Sendrecv()
2.0 2,997 2,997 1 0 2997269 MPI_Init()
2.0 0.045 2,992 51 51 58669 gather_sol
2.0 0.041 2,992 51 51 58669 send_full_u
2.0 2,992 2,992 51 0 58668 MPI_Send()
1.1 1,645 1,645 1 0 1645440 MPI_Finalize()
0.4 681 681 1 0 681642 MPI_Barrier()
0.1 34 96 100001 100001 1 shallow2d_flux [THROTTLED]
0.0 62 62 100001 0 1 shallow2dv_flux [THROTTLED]
0.0 43 58 100001 100001 1 limdiff [THROTTLED]
0.0 24 24 100001 0 0 central2d_correct_sd [THROTTLED]
0.0 19 19 29088 0 1 copy_subgrid
0.0 15 15 100001 0 0 xmin2s [THROTTLED]
0.0 9 14 1 37632 14644 lua_init_sim
0.0 0.501 12 1212 1212 10 shallow2d_speed
0.0 11 11 1212 0 10 shallow2dv_speed
0.0 5 5 37632 0 0 central2d_offset
0.0 0.084 0.084 2 0 42 central2d_free
0.0 0.013 0.015 1 2 15 central2d_init
0.0 0.007 0.007 1 0 7 copy_basic_info
0.0 0.001 0.001 1 0 1 MPI_Comm_rank()
0.0 0.001 0.001 1 0 1 MPI_Comm_size()
0.0 0 0 1 0 0 viz_open
---------------------------------------------------------------------------------------
USER EVENTS Profile :NODE 13, CONTEXT 0, THREAD 0
---------------------------------------------------------------------------------------
NumSamples MaxValue MinValue MeanValue Std. Dev. Event Name
---------------------------------------------------------------------------------------
1212 4 4 4 0 Message size for all-reduce
---------------------------------------------------------------------------------------
NODE 14;CONTEXT 0;THREAD 0:
---------------------------------------------------------------------------------------
%Time Exclusive Inclusive #Call #Subrs Inclusive Name
msec total msec usec/call
---------------------------------------------------------------------------------------
100.0 7 2:33.004 1 1 153004617 .TAU application
100.0 255 2:32.997 1 3 152997604 main
96.8 0.448 2:28.103 1 108 148103253 run_sim
94.2 0.041 2:24.087 50 50 2881746 central2d_run
94.2 1 2:24.087 50 6060 2881745 central2d_xrun
69.6 144 1:46.566 2424 104849 43963 central2d_step
35.2 396 53,812 2424 1.71619E+06 22200 central2d_predict
34.5 52,833 52,862 1.69438E+06 49969 31 limited_derivk
34.3 52,491 52,520 1.69438E+06 50032 31 limited_deriv1
34.3 520 52,512 2424 1.77256E+06 21663 central2d_correct
21.3 32,581 32,581 1212 0 26882 MPI_Allreduce()
3.2 7 4,906 1212 33936 4048 central2d_periodic
3.2 4,879 4,879 4848 0 1006 MPI_Sendrecv()
2.2 0.036 3,323 51 51 65164 gather_sol
2.2 0.031 3,323 51 51 65164 send_full_u
2.2 3,323 3,323 51 0 65163 MPI_Send()
2.0 2,997 2,997 1 0 2997273 MPI_Init()
1.1 1,641 1,641 1 0 1641761 MPI_Finalize()
0.4 676 676 1 0 676865 MPI_Barrier()
0.1 33 98 100001 100001 1 shallow2d_flux [THROTTLED]
0.0 64 64 100001 0 1 shallow2dv_flux [THROTTLED]
0.0 43 58 100001 100001 1 limdiff [THROTTLED]
0.0 0.493 30 1212 1212 25 shallow2d_speed
0.0 30 30 1212 0 25 shallow2dv_speed
0.0 23 23 100001 0 0 central2d_correct_sd [THROTTLED]
0.0 19 19 29088 0 1 copy_subgrid
0.0 9 15 1 37632 15176 lua_init_sim
0.0 15 15 100001 0 0 xmin2s [THROTTLED]
0.0 5 5 37632 0 0 central2d_offset
0.0 0.088 0.088 2 0 44 central2d_free
0.0 0.015 0.015 1 2 15 central2d_init
0.0 0.007 0.007 1 0 7 copy_basic_info
0.0 0.001 0.001 1 0 1 viz_open
0.0 0 0 1 0 0 MPI_Comm_rank()
0.0 0 0 1 0 0 MPI_Comm_size()
---------------------------------------------------------------------------------------
USER EVENTS Profile :NODE 14, CONTEXT 0, THREAD 0
---------------------------------------------------------------------------------------
NumSamples MaxValue MinValue MeanValue Std. Dev. Event Name
---------------------------------------------------------------------------------------
1212 4 4 4 0 Message size for all-reduce
---------------------------------------------------------------------------------------
NODE 15;CONTEXT 0;THREAD 0:
---------------------------------------------------------------------------------------
%Time Exclusive Inclusive #Call #Subrs Inclusive Name
msec total msec usec/call
---------------------------------------------------------------------------------------
100.0 6 2:33.001 1 1 153001701 .TAU application
100.0 255 2:32.995 1 3 152995215 main
96.8 0.495 2:28.107 1 108 148107938 run_sim
94.0 0.05 2:23.857 50 50 2877155 central2d_run
94.0 1 2:23.857 50 6060 2877154 central2d_xrun
69.6 146 1:46.450 2424 104849 43915 central2d_step
35.1 397 53,674 2424 1.71619E+06 22143 central2d_predict
34.5 52,682 52,711 1.69438E+06 49969 31 limited_derivk
34.3 52,511 52,540 1.69438E+06 50032 31 limited_deriv1
34.3 530 52,530 2424 1.77256E+06 21671 central2d_correct
21.2 32,375 32,375 1212 0 26713 MPI_Allreduce()
3.3 7 5,000 1212 33936 4126 central2d_periodic
3.3 4,972 4,972 4848 0 1026 MPI_Sendrecv()
2.3 0.04 3,562 51 51 69854 gather_sol
2.3 0.037 3,562 51 51 69853 send_full_u
2.3 3,562 3,562 51 0 69852 MPI_Send()
2.0 2,997 2,997 1 0 2997070 MPI_Init()
1.1 1,634 1,634 1 0 1634887 MPI_Finalize()
0.4 672 672 1 0 672059 MPI_Barrier()
0.1 34 99 100001 100001 1 shallow2d_flux [THROTTLED]
0.0 65 65 100001 0 1 shallow2dv_flux [THROTTLED]
0.0 42 58 100001 100001 1 limdiff [THROTTLED]
0.0 0.499 28 1212 1212 24 shallow2d_speed
0.0 28 28 1212 0 23 shallow2dv_speed
0.0 23 23 100001 0 0 central2d_correct_sd [THROTTLED]
0.0 19 19 29088 0 1 copy_subgrid
0.0 15 15 100001 0 0 xmin2s [THROTTLED]
0.0 9 14 1 37632 14979 lua_init_sim
0.0 5 5 37632 0 0 central2d_offset
0.0 0.082 0.082 2 0 41 central2d_free
0.0 0.015 0.016 1 2 16 central2d_init
0.0 0.008 0.008 1 0 8 copy_basic_info
0.0 0.001 0.001 1 0 1 MPI_Comm_rank()
0.0 0 0 1 0 0 MPI_Comm_size()
0.0 0 0 1 0 0 viz_open
---------------------------------------------------------------------------------------
USER EVENTS Profile :NODE 15, CONTEXT 0, THREAD 0
---------------------------------------------------------------------------------------
NumSamples MaxValue MinValue MeanValue Std. Dev. Event Name
---------------------------------------------------------------------------------------
1212 4 4 4 0 Message size for all-reduce
---------------------------------------------------------------------------------------
NODE 16;CONTEXT 0;THREAD 0:
---------------------------------------------------------------------------------------
%Time Exclusive Inclusive #Call #Subrs Inclusive Name
msec total msec usec/call
---------------------------------------------------------------------------------------
100.0 7 2:32.464 1 1 152464035 .TAU application
100.0 255 2:32.456 1 3 152456782 main
97.1 0.48 2:28.112 1 108 148112566 run_sim
94.2 0.029 2:23.646 50 50 2872935 central2d_run
94.2 2 2:23.646 50 6060 2872935 central2d_xrun
70.1 148 1:46.838 2424 104849 44075 central2d_step
35.4 400 53,912 2424 1.71619E+06 22241 central2d_predict
34.8 53,006 53,036 1.69438E+06 49969 31 limited_derivk
34.6 519 52,676 2424 1.77256E+06 21731 central2d_correct
34.5 52,582 52,612 1.69438E+06 50032 31 limited_deriv1
21.3 32,497 32,497 1212 0 26813 MPI_Allreduce()
2.8 8 4,280 1212 33936 3531 central2d_periodic
2.8 4,254 4,254 4848 0 878 MPI_Sendrecv()
2.5 0.038 3,783 51 51 74179 gather_sol
2.5 0.045 3,783 51 51 74179 send_full_u
2.5 3,783 3,783 51 0 74178 MPI_Send()
1.6 2,453 2,453 1 0 2453220 MPI_Init()
1.1 1,635 1,635 1 0 1635670 MPI_Finalize()
0.4 667 667 1 0 667226 MPI_Barrier()
0.1 34 100 100001 100001 1 shallow2d_flux [THROTTLED]
0.0 66 66 100001 0 1 shallow2dv_flux [THROTTLED]
0.0 43 58 100001 100001 1 limdiff [THROTTLED]
0.0 0.482 28 1212 1212 24 shallow2d_speed
0.0 28 28 1212 0 23 shallow2dv_speed
0.0 22 22 100001 0 0 central2d_correct_sd [THROTTLED]
0.0 17 17 29088 0 1 copy_subgrid
0.0 15 15 100001 0 0 xmin2s [THROTTLED]
0.0 9 14 1 37632 14813 lua_init_sim
0.0 5 5 37632 0 0 central2d_offset
0.0 0.118 0.118 2 0 59 central2d_free
0.0 0.014 0.015 1 2 15 central2d_init
0.0 0.007 0.007 1 0 7 copy_basic_info
0.0 0.001 0.001 1 0 1 MPI_Comm_size()
0.0 0.001 0.001 1 0 1 viz_open
0.0 0 0 1 0 0 MPI_Comm_rank()
---------------------------------------------------------------------------------------
USER EVENTS Profile :NODE 16, CONTEXT 0, THREAD 0
---------------------------------------------------------------------------------------
NumSamples MaxValue MinValue MeanValue Std. Dev. Event Name
---------------------------------------------------------------------------------------
1212 4 4 4 0 Message size for all-reduce
---------------------------------------------------------------------------------------
NODE 17;CONTEXT 0;THREAD 0:
---------------------------------------------------------------------------------------
%Time Exclusive Inclusive #Call #Subrs Inclusive Name
msec total msec usec/call
---------------------------------------------------------------------------------------
100.0 7 2:32.458 1 1 152458357 .TAU application
100.0 255 2:32.451 1 3 152451316 main
97.2 0.535 2:28.117 1 108 148117152 run_sim
93.9 0.052 2:23.117 50 50 2862353 central2d_run
93.9 2 2:23.117 50 6060 2862352 central2d_xrun
65.2 138 1:39.443 2424 104849 41025 central2d_step
32.9 391 50,193 2424 1.71619E+06 20707 central2d_predict
32.3 49,274 49,303 1.69438E+06 49951 29 limited_derivk
32.2 508 49,015 2424 1.77256E+06 20221 central2d_correct
32.1 48,954 48,983 1.69438E+06 50050 29 limited_deriv1
26.4 8 40,189 1212 33936 33159 central2d_periodic
26.3 40,164 40,164 4848 0 8285 MPI_Sendrecv()
2.8 0.045 4,322 51 51 84752 gather_sol
2.8 0.052 4,322 51 51 84751 send_full_u
2.8 4,322 4,322 51 0 84750 MPI_Send()
2.3 3,454 3,454 1212 0 2850 MPI_Allreduce()
1.6 2,453 2,453 1 0 2453146 MPI_Init()
1.1 1,625 1,625 1 0 1625692 MPI_Finalize()
0.4 662 662 1 0 662749 MPI_Barrier()
0.1 34 95 100001 100001 1 shallow2d_flux [THROTTLED]
0.0 61 61 100001 0 1 shallow2dv_flux [THROTTLED]
0.0 42 58 100001 100001 1 limdiff [THROTTLED]
0.0 0.484 28 1212 1212 23 shallow2d_speed
0.0 27 27 1212 0 23 shallow2dv_speed
0.0 21 21 100001 0 0 central2d_correct_sd [THROTTLED]
0.0 16 16 29088 0 1 copy_subgrid
0.0 15 15 100001 0 0 xmin2s [THROTTLED]
0.0 8 13 1 34944 13707 lua_init_sim
0.0 5 5 34944 0 0 central2d_offset
0.0 0.103 0.103 2 0 52 central2d_free
0.0 0.016 0.017 1 2 17 central2d_init
0.0 0.008 0.008 1 0 8 copy_basic_info
0.0 0.001 0.001 1 0 1 MPI_Comm_size()
0.0 0 0 1 0 0 MPI_Comm_rank()
0.0 0 0 1 0 0 viz_open
---------------------------------------------------------------------------------------
USER EVENTS Profile :NODE 17, CONTEXT 0, THREAD 0
---------------------------------------------------------------------------------------
NumSamples MaxValue MinValue MeanValue Std. Dev. Event Name
---------------------------------------------------------------------------------------
1212 4 4 4 0 Message size for all-reduce
---------------------------------------------------------------------------------------
NODE 18;CONTEXT 0;THREAD 0:
---------------------------------------------------------------------------------------
%Time Exclusive Inclusive #Call #Subrs Inclusive Name
msec total msec usec/call
---------------------------------------------------------------------------------------
100.0 6 2:32.460 1 1 152460175 .TAU application
100.0 255 2:32.453 1 3 152453402 main
97.2 0.507 2:28.121 1 108 148121528 run_sim
93.9 0.033 2:23.219 50 50 2864391 central2d_run
93.9 2 2:23.219 50 6060 2864390 central2d_xrun
70.2 142 1:47.050 2424 104849 44163 central2d_step
35.6 394 54,346 2424 1.71619E+06 22420 central2d_predict
34.8 52,956 52,985 1.69438E+06 50032 31 limited_deriv1
34.7 52,873 52,902 1.69438E+06 49969 31 limited_derivk
34.4 507 52,464 2424 1.77256E+06 21644 central2d_correct
19.7 30,021 30,021 1212 0 24770 MPI_Allreduce()
4.0 7 6,116 1212 33936 5047 central2d_periodic
4.0 6,092 6,092 4848 0 1257 MPI_Sendrecv()
2.8 0.041 4,227 51 51 82899 gather_sol
2.8 0.034 4,227 51 51 82898 send_full_u
2.8 4,227 4,227 51 0 82898 MPI_Send()
1.6 2,453 2,453 1 0 2453275 MPI_Init()
1.1 1,623 1,623 1 0 1623291 MPI_Finalize()
0.4 658 658 1 0 658549 MPI_Barrier()
0.1 34 97 100001 100001 1 shallow2d_flux [THROTTLED]
0.0 63 63 100001 0 1 shallow2dv_flux [THROTTLED]
0.0 42 58 100001 100001 1 limdiff [THROTTLED]
0.0 0.49 28 1212 1212 24 shallow2d_speed
0.0 28 28 1212 0 23 shallow2dv_speed
0.0 21 21 100001 0 0 central2d_correct_sd [THROTTLED]
0.0 15 15 29088 0 1 copy_subgrid
0.0 15 15 100001 0 0 xmin2s [THROTTLED]
0.0 9 14 1 37632 14928 lua_init_sim
0.0 5 5 37632 0 0 central2d_offset
0.0 0.111 0.111 2 0 56 central2d_free
0.0 0.012 0.012 1 2 12 central2d_init
0.0 0.008 0.008 1 0 8 copy_basic_info
0.0 0.001 0.001 1 0 1 viz_open
0.0 0 0 1 0 0 MPI_Comm_rank()
0.0 0 0 1 0 0 MPI_Comm_size()
---------------------------------------------------------------------------------------
USER EVENTS Profile :NODE 18, CONTEXT 0, THREAD 0
---------------------------------------------------------------------------------------
NumSamples MaxValue MinValue MeanValue Std. Dev. Event Name
---------------------------------------------------------------------------------------
1212 4 4 4 0 Message size for all-reduce
---------------------------------------------------------------------------------------
NODE 19;CONTEXT 0;THREAD 0:
---------------------------------------------------------------------------------------
%Time Exclusive Inclusive #Call #Subrs Inclusive Name
msec total msec usec/call
---------------------------------------------------------------------------------------
100.0 6 2:32.458 1 1 152458931 .TAU application
100.0 255 2:32.451 1 3 152451933 main
97.2 0.455 2:28.126 1 108 148126268 run_sim
93.8 0.038 2:22.971 50 50 2859438 central2d_run
93.8 2 2:22.971 50 6060 2859437 central2d_xrun
70.0 139 1:46.649 2424 104849 43997 central2d_step
35.3 398 53,848 2424 1.71619E+06 22215 central2d_predict
34.7 52,915 52,944 1.69438E+06 49969 31 limited_derivk
34.5 511 52,564 2424 1.77256E+06 21685 central2d_correct
34.5 52,508 52,537 1.69438E+06 50032 31 limited_deriv1
20.4 31,091 31,091 1212 0 25653 MPI_Allreduce()
3.4 8 5,197 1212 33936 4288 central2d_periodic
3.4 5,172 5,172 4848 0 1067 MPI_Sendrecv()
2.9 0.034 4,485 51 51 87953 gather_sol
2.9 0.049 4,485 51 51 87952 send_full_u
2.9 4,485 4,485 51 0 87951 MPI_Send()
1.6 2,453 2,453 1 0 2453148 MPI_Init()
1.1 1,617 1,617 1 0 1617197 MPI_Finalize()
0.4 653 653 1 0 653219 MPI_Barrier()
0.1 34 96 100001 100001 1 shallow2d_flux [THROTTLED]
0.0 62 62 100001 0 1 shallow2dv_flux [THROTTLED]
0.0 43 58 100001 100001 1 limdiff [THROTTLED]
0.0 0.496 32 1212 1212 26 shallow2d_speed
0.0 31 31 1212 0 26 shallow2dv_speed
0.0 21 21 100001 0 0 central2d_correct_sd [THROTTLED]
0.0 16 16 29088 0 1 copy_subgrid
0.0 15 15 100001 0 0 xmin2s [THROTTLED]
0.0 9 15 1 37632 15009 lua_init_sim
0.0 5 5 37632 0 0 central2d_offset
0.0 0.087 0.087 2 0 44 central2d_free
0.0 0.012 0.013 1 2 13 central2d_init
0.0 0.008 0.008 1 0 8 copy_basic_info
0.0 0.001 0.001 1 0 1 MPI_Comm_size()
0.0 0 0 1 0 0 MPI_Comm_rank()
0.0 0 0 1 0 0 viz_open
---------------------------------------------------------------------------------------
USER EVENTS Profile :NODE 19, CONTEXT 0, THREAD 0
---------------------------------------------------------------------------------------
NumSamples MaxValue MinValue MeanValue Std. Dev. Event Name
---------------------------------------------------------------------------------------
1212 4 4 4 0 Message size for all-reduce
---------------------------------------------------------------------------------------
NODE 20;CONTEXT 0;THREAD 0:
---------------------------------------------------------------------------------------
%Time Exclusive Inclusive #Call #Subrs Inclusive Name
msec total msec usec/call
---------------------------------------------------------------------------------------
100.0 6 2:32.460 1 1 152460253 .TAU application
100.0 255 2:32.453 1 3 152453609 main
97.2 0.435 2:28.130 1 108 148130924 run_sim
93.6 0.025 2:22.769 50 50 2855393 central2d_run
93.6 2 2:22.769 50 6060 2855393 central2d_xrun
70.3 136 1:47.239 2424 104849 44241 central2d_step
35.8 392 54,510 2424 1.71619E+06 22488 central2d_predict
34.8 53,052 53,081 1.69438E+06 49969 31 limited_derivk
34.8 52,979 53,008 1.69438E+06 50032 31 limited_deriv1