flang-new array expression performance #91847

Open
jeffhammond opened this issue May 11, 2024 · 0 comments
Labels: flang (Flang issues not falling into any other category)

Comments


jeffhammond commented May 11, 2024

Array expression performance is ~5x worse than sequential loops for the Mul, Add, and Triad kernels. Copy (simple assignment) and the Dot reduction perform about the same in both implementations.

I'm running on an AMD Zen4.

Build

jehammond@oppenheimer:~/BabelStream/src/fortran$ make COMPILER=flang IMPLEMENTATION=Array
flang-new -O3 -DVERSION_STRING="5.0" -DUSE_ARRAY  -c BabelStreamTypes.F90
flang-new -O3 -DVERSION_STRING="5.0" -DUSE_ARRAY  -c ArrayStream.F90
flang-new -O3 -DVERSION_STRING="5.0" -DUSE_ARRAY  main.F90 ArrayStream.o BabelStreamTypes.o -o BabelStream.flang.Array
jehammond@oppenheimer:~/BabelStream/src/fortran$ make COMPILER=flang IMPLEMENTATION=Sequential
flang-new -O3 -DVERSION_STRING="5.0" -DUSE_SEQUENTIAL  -c BabelStreamTypes.F90
flang-new -O3 -DVERSION_STRING="5.0" -DUSE_SEQUENTIAL  -c SequentialStream.F90
flang-new -O3 -DVERSION_STRING="5.0" -DUSE_SEQUENTIAL  main.F90 SequentialStream.o BabelStreamTypes.o -o BabelStream.flang.Sequential

Run

$ ./BabelStream.flang.Array -n 10 ; ./BabelStream.flang.Sequential -n 10
BabelStream Fortran
Version:  5.0
Implementation: Array
Running kernels 10 times
Precision: REAL64
Array size:     268.4MB
Total size:     805.3MB
Init:       0.2s (=   3837.6 MBytes/sec)
Read:       0.2s (=   3752.9 MBytes/sec)
Function    MBytes/sec  Min (sec)   Max         Average
Copy        32182.843   0.01668     0.01720     0.01706
Mul         5511.874    0.09740     0.10650     0.10047
Add         7305.358    0.11024     0.11597     0.11346
Triad       7277.179    0.11066     0.12223     0.11406
Dot         23582.615   0.02277     0.02403     0.02346
BabelStream Fortran
Version:  5.0
Implementation: Sequential
Running kernels 10 times
Precision: REAL64
Array size:     268.4MB
Total size:     805.3MB
Init:       0.2s (=   3903.8 MBytes/sec)
Read:       0.2s (=   3583.2 MBytes/sec)
Function    MBytes/sec  Min (sec)   Max         Average
Copy        31461.320   0.01706     0.01799     0.01772
Mul         30492.747   0.01761     0.01913     0.01822
Add         34026.178   0.02367     0.02494     0.02437
Triad       32883.429   0.02449     0.02493     0.02474
Dot         25825.380   0.02079     0.02106     0.02091
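
The per-kernel gap can be read off the two tables above. The following is a small Python sketch, not part of the original report, that only does the arithmetic on the reported MBytes/sec figures. The names `array_mbps`, `sequential_mbps`, and `slowdown` are made up for illustration.

```python
# Bandwidth figures (MBytes/sec) copied from the two runs above.
array_mbps = {
    "Copy": 32182.843,
    "Mul": 5511.874,
    "Add": 7305.358,
    "Triad": 7277.179,
    "Dot": 23582.615,
}
sequential_mbps = {
    "Copy": 31461.320,
    "Mul": 30492.747,
    "Add": 34026.178,
    "Triad": 32883.429,
    "Dot": 25825.380,
}


def slowdown(array, sequential):
    """Return the Sequential/Array bandwidth ratio per kernel.

    A ratio above 1 means the Array implementation is slower.
    """
    return {kernel: sequential[kernel] / array[kernel] for kernel in array}


ratios = slowdown(array_mbps, sequential_mbps)
for kernel, ratio in ratios.items():
    print(f"{kernel:6s} {ratio:5.2f}x")
```

On these numbers, Mul comes out at roughly 5.5x slower with array expressions, Add and Triad at roughly 4.5x to 4.7x, while Copy and Dot are within about 10% of the sequential version.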

Sources

https://github.com/UoB-HPC/BabelStream/blob/main/src/fortran/main.F90
https://github.com/UoB-HPC/BabelStream/blob/main/src/fortran/BabelStreamTypes.F90
https://github.com/UoB-HPC/BabelStream/blob/main/src/fortran/ArrayStream.F90
https://github.com/UoB-HPC/BabelStream/blob/main/src/fortran/SequentialStream.F90

jeffhammond added the flang label on May 11, 2024