This issue documents the MPI test failures on the `s390x` architecture, in the unlikely case that a Red Hat engineer fancies having a look. If upstream wishes to debug it, `s390x` is available on Copr (as is `ppc`, if they wish to officially support it) to replicate the issue and collect more tracebacks.
The issue was discovered in https://src.fedoraproject.org/rpms/cp2k/pull-request/6 and should be reproducible from the final form of that PR by removing `ExcludeArch: s390x`. The error varies between runs: sometimes 1/6 unit tests fail, sometimes 3/6, but it seems to be an MPI issue.
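The reproduction step amounts to dropping one line from the spec file. A minimal sketch of that edit, assuming the spec contains a line of exactly the form `ExcludeArch: s390x`; the helper name `drop_exclude_arch` and the sample spec fragment are hypothetical and not taken from the actual `cp2k.spec`:

```python
import re


def drop_exclude_arch(spec_text: str, arch: str = "s390x") -> str:
    """Return spec_text without any `ExcludeArch:` line that lists only `arch`.

    Hypothetical helper: lines excluding several architectures are left alone.
    """
    pattern = re.compile(r"^\s*ExcludeArch:\s*" + re.escape(arch) + r"\s*$")
    kept = [line for line in spec_text.splitlines() if not pattern.match(line)]
    return "\n".join(kept) + "\n"


# Made-up spec fragment, for illustration only.
sample_spec = "Name: cp2k\nVersion: 2024.1\nExcludeArch: s390x\n"
patched_spec = drop_exclude_arch(sample_spec)
print(patched_spec)
```

The patched spec can then be built for `s390x` in whatever build environment is at hand.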
Error traceback
------------------------------- Errors ---------------------------------
xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
/builddir/build/BUILD/cp2k-2024.1/regtesting/local_mpich/psmp/TEST-local_mpich-psmp-2024-04-22_21-01-01/UNIT/libcp2k_unittest/libcp2k_unittest.out
Unit test starts ...
Testing cp_c_get_version(): CP2K version 2024.1.
Unit test starts ...
Testing cp_c_get_version(): CP2K version 2024.1.
**** **** ****** ** PROGRAM STARTED AT 2024-04-22 21:01:03.378
***** ** *** *** ** PROGRAM STARTED ON fa7470e99a474cfa89624bb3b90776
** **** ****** PROGRAM STARTED BY mockbuild
***** ** ** ** ** PROGRAM PROCESS ID 59164
**** ** ******* ** PROGRAM STARTED IN /builddir/build/BUILD/cp2k-2024.1/reg
testing/local_mpich/psmp/TEST-local_m
pich-psmp-2024-04-22_21-01-01/UNIT/li
bcp2k_unittest
CP2K| version string: CP2K version 2024.1
CP2K| source code revision number:
CP2K| cp2kflags: omp fftw3 libxc parallel mpi_f08 scalapack spglib
CP2K| is freely available from https://www.cp2k.org/
CP2K| Program compiled at 2024-04-22 00:00:00
CP2K| Program compiled on
CP2K| Program compiled for s390x
CP2K| Data directory path /builddir/build/BUILDROOT/cp2k-2024.1-5.fc41.s390x
CP2K| Input file name H2.inp
*******************************************************************************
* MPI error 340866319 in mpi_bcast @ mp_bcast_iv_src : Other MPI *
* error, error stack:
internal_Bcast(7723).......................: *
* ___ MPI_Bcast(buffer=0x2aa255108c0, count=1000, MPI_INTEGER, 0, *
* / \ MPI_COMM_WORLD) failed
MPID_Bcast(272)............................: *
* [ABORT]
MPIDI_Bcast_allcomm_composition_json(225)..: *
* \___/
MPIDI_POSIX_mpi_bcast(238).................: *
* |
MPIDI_POSIX_mpi_release_gather_release(218): message sizes do not *
* O/| match across processes in the collective routine: Received 0 but *
* /| | expected 4000 *
* / \ message_passing.F:1305 *
*******************************************************************************
===== Routine Calling Stack =====
6 broadcast_input_information
5 parser_read_line_low
4 parser_read_line
3 section_vals_parse
2 read_input
1 CP2K
Abort(1) on node 1 (rank 1 in comm 0): application called MPI_Abort(MPI_COMM_WORLD, 1) - process 1
STOP 1
Runtime failure with code 1.
------------------------------- Timings --------------------------------
Plot: name="timings", title="Timing Distribution", ylabel="time [s]"
PlotPoint: name="100th_percentile", plot="timings", label="100th %ile", y=1.22, yerr=0.0
PlotPoint: name="99th_percentile", plot="timings", label="99th %ile", y=1.20, yerr=0.0
PlotPoint: name="98th_percentile", plot="timings", label="98th %ile", y=1.18, yerr=0.0
PlotPoint: name="95th_percentile", plot="timings", label="95th %ile", y=1.11, yerr=0.0
PlotPoint: name="90th_percentile", plot="timings", label="90th %ile", y=1.01, yerr=0.0
PlotPoint: name="80th_percentile", plot="timings", label="80th %ile", y=0.80, yerr=0.0
----------------------------- Slow Tests -------------------------------
Duration threshold (2x 95th %ile): 2.22 sec
Found 0 slow tests (0 suppressed):
------------------------------- Summary --------------------------------
Number of FAILED tests 1
Number of WRONG tests 0
Number of CORRECT tests 5
Total number of tests 6
Summary: correct: 5 / 6; failed: 1; 0min
Status: FAILED
*************************** Testing ended ******************************
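The numbers in the error stack are internally consistent: the broadcast requests `count=1000` elements of `MPI_INTEGER`, and the failing rank expected 4000 bytes but received 0. A small sketch of that arithmetic, assuming a 4-byte `MPI_INTEGER` (the usual default Fortran integer size, and what the 4000 in the log implies); the regular expressions and variable names are illustrative only:

```python
import re

# Relevant fragments of the error stack, rejoined from the wrapped log lines.
error_text = (
    "MPI_Bcast(buffer=0x2aa255108c0, count=1000, MPI_INTEGER, 0, "
    "MPI_COMM_WORLD) failed "
    "message sizes do not match across processes in the collective routine: "
    "Received 0 but expected 4000"
)

MPI_INTEGER_BYTES = 4  # assumption: default 4-byte Fortran INTEGER

count = int(re.search(r"count=(\d+)", error_text).group(1))
sizes = re.search(r"Received (\d+) but expected (\d+)", error_text)
received_bytes = int(sizes.group(1))
expected_bytes = int(sizes.group(2))

# The request posted by the failing rank (count * 4 bytes) equals the
# expected size, so the mismatch lies in the 0 bytes reported as received.
size_is_consistent = count * MPI_INTEGER_BYTES == expected_bytes
print(count, received_bytes, expected_bytes, size_is_consistent)
```

This only shows that the call in `broadcast_input_information` posted a sensible request on the failing rank; it does not by itself identify why the received size was 0.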