f-exx-opt: Results are dependent on compiler/library versions and # of MPI ranks #350

connoraird · 2024-05-14T13:36:34Z

Description

In the f-exx-opt branch, the energies reported in Conquest_out depend on the compiler and library versions and on the number of MPI ranks used.

Compiler and library versions and their results

The results below are for the benchmark test_EXX_isol_C2H4_4proc_PBE0ERI_fullSZP_0.4_SCF.

Mac, Apple ARM

On an M2 Mac, the following Homebrew versions of each dependency were used:

fftw v3.3.10
scalapack v2.2.0_1
openblas v0.3.27
libxc v6.2.2
lapack v3.12.0
openmpi v5.0.3
gcc (gfortran) v14.1.0

With the above, the Harris-Foulkes energies obtained were:
np-2, nt-1 - |* Harris-Foulkes energy = -13.443098419459020 Ha
np-4, nt-1 - |* Harris-Foulkes energy = -14.028830939450383 Ha

Linux

On a UCL cluster (Myriad), the following versions were used:

fftw v3.3.8
scalapack v2.1.0
openblas v0.3.7
libxc v6.2.2 compiled myself with gcc v9.2.0
lapack from the above openblas
openmpi v3.1.5
gcc (gfortran) v9.2.0

With the above, the Harris-Foulkes energies obtained were:
np-2, nt-1 - |* Harris-Foulkes energy = -13.300253717139059 Ha
np-4, nt-1 - |* Harris-Foulkes energy = -13.561798008220549 Ha
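
To make comparisons like the ones above less error-prone than copying numbers by hand, the energy can be pulled out of the output text with a small script. The following is a minimal sketch, assuming the energy line always has the form quoted in this issue; the function name `harris_foulkes_energy` and the run labels are made up for illustration and are not part of CONQUEST.

```python
import re

# Pattern based on the energy lines quoted in this issue (an assumption:
# the exact Conquest_out layout may differ between versions).
ENERGY_RE = re.compile(r"Harris-Foulkes energy\s*=\s*(-?\d+\.\d+)\s*Ha")


def harris_foulkes_energy(text):
    """Return the last Harris-Foulkes energy (in Ha) found in text, or None."""
    matches = ENERGY_RE.findall(text)
    return float(matches[-1]) if matches else None


# Energy lines copied from the GCC runs reported above.
# Keys are (hypothetical label, number of MPI ranks).
runs = {
    ("mac-gcc", 2): "|* Harris-Foulkes energy = -13.443098419459020 Ha",
    ("mac-gcc", 4): "|* Harris-Foulkes energy = -14.028830939450383 Ha",
    ("myriad-gcc", 2): "|* Harris-Foulkes energy = -13.300253717139059 Ha",
    ("myriad-gcc", 4): "|* Harris-Foulkes energy = -13.561798008220549 Ha",
}

energies = {key: harris_foulkes_energy(line) for key, line in runs.items()}

for (label, ranks), energy in sorted(energies.items()):
    print(f"{label:12s} np={ranks}  E_HF = {energy:.15f} Ha")
```

In practice the string passed to `harris_foulkes_energy` would be the full contents of a Conquest_out file; taking the last match is a guess at picking the final SCF value when the line appears more than once.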


connoraird · 2024-05-14T15:41:42Z

Profiling results for test_EXX_isol_C2H4_4proc_PBE0ERI_fullSZP_0.4_SCF on Myriad using the Intel compiler and the following modules:

Currently Loaded Modulefiles:
 1) beta-modules               5) libxc/6.2.2/intel-2022   9) git/2.41.0-lfs-3.3.0
 2) gcc-libs/10.2.0            6) cmake/3.21.1            10) emacs/28.1
 3) compilers/intel/2022.2     7) python/3.8.6            11) userscripts/1.4.0
 4) mpi/intel/2021.6.0/intel   8) gerun

The results of these runs are:

np-2:
      |* Harris-Foulkes energy   =       -13.038957181266369 Ha
np-4:
      |* Harris-Foulkes energy   =       -12.967703483818312 Ha
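
For reference, the size of the disagreement across all runs reported in this issue can be quantified with a short check. This is only a sketch: the environment labels and the tolerance `TOLERANCE_HA` are assumptions chosen for illustration, not values used by the CONQUEST test suite.

```python
# Harris-Foulkes energies (Ha) as reported in this issue,
# keyed by (hypothetical environment label, number of MPI ranks).
reported = {
    ("mac-gcc-14.1.0", 2): -13.443098419459020,
    ("mac-gcc-14.1.0", 4): -14.028830939450383,
    ("myriad-gcc-9.2.0", 2): -13.300253717139059,
    ("myriad-gcc-9.2.0", 4): -13.561798008220549,
    ("myriad-intel-2022.2", 2): -13.038957181266369,
    ("myriad-intel-2022.2", 4): -12.967703483818312,
}

# Hypothetical threshold for "the same result"; not a project setting.
TOLERANCE_HA = 1e-6


def spread(values):
    """Difference between the largest and smallest value."""
    values = list(values)
    return max(values) - min(values)


def consistent(values, tol=TOLERANCE_HA):
    """True if all values agree to within tol."""
    return spread(values) <= tol


# Compare rank counts within each environment, then everything together.
for env in sorted({env for env, _ in reported}):
    per_env = [e for (name, _), e in reported.items() if name == env]
    print(f"{env}: spread over rank counts = {spread(per_env):.6f} Ha, "
          f"consistent = {consistent(per_env)}")

print(f"all runs: spread = {spread(reported.values()):.6f} Ha")
```

A deterministic calculation would be expected to pass such a check for any reasonable tolerance, so a failure on every environment points at the code path rather than at one particular toolchain.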

When running with 4 ranks and 2 threads (np-4), about 25% of the time for rank 0's primary thread is spent at a call to MPI_Wait.

When running with 2 ranks and 4 threads (np-2), no time is spent in this call for the primary thread of rank 0.

connoraird added the type: bug label May 14, 2024

connoraird added a commit that referenced this issue May 30, 2024

Fix for #350

6cb0a02

connoraird added a commit that referenced this issue May 30, 2024

Fix for #350

a6a6add
