
Erroneous MPI_Reduce Results with Data Sets Larger than 2KB in Multi-Server Configurations #6893

Closed

jc-word opened this issue Feb 5, 2024 · 8 comments

jc-word commented Feb 5, 2024

Description
I am experiencing an issue where MPI_Reduce with the MPI_SUM operation yields incorrect results for data sets larger than 2KB. The problem only manifests in a distributed environment spanning multiple servers, under the condition that the root process (rank 0) runs alone on its own server and at least one of the other servers hosts more than one MPI process.

Environment
MPICH version: 4.1
Operating System: CentOS 7.4.1708

Configuration and code


#include <iostream>
#include <mpi.h>
using namespace std;

int main(int argc, char* argv[])
{
    MPI_Init(&argc, &argv);

    int myrank;
    int gsize;

    MPI_Comm_rank(MPI_COMM_WORLD, &myrank);
    MPI_Comm_size(MPI_COMM_WORLD, &gsize);

    int *sendBuf = new int[4000];
    int *recvBuf = new int[4000];

    int root = 0;

    for (int y = 0; y < 5; y++)
    {
        for (int x = 0; x < 5; x++)
        {
            // Initialize every element that the reduction below touches.
            for (size_t i = 0; i < 513; i++)
            {
                sendBuf[i] = 12;
                recvBuf[i] = 0;
                // if (myrank == 0) recvBuf[i] = 12; // After uncommenting, the program produces correct results
            }

            // 513 * sizeof(int) = 2052 bytes, just over the 2KB threshold.
            MPI_Reduce(sendBuf, recvBuf, 513, MPI_INT, MPI_SUM, root, MPI_COMM_WORLD);

            if (myrank == root)
            {
                for (size_t i = 0; i < 513; i++)
                {
                    cout << " " << recvBuf[i];
                }
                cout << endl;
            }
        }
    }

    delete[] sendBuf;
    delete[] recvBuf;

    MPI_Finalize();
    return 0;
}

Machine file

smdpu04:1
smdpu01:2

Program Execution Command
mpirun -machinefile machinefile ./mpiexe > test.txt

hzhou commented Feb 5, 2024

What is your output?

jc-word commented Feb 6, 2024

test.txt contains the output. The correct value should be 36, but the output is 24. In my testing, I found an issue at the root: root's sendBuf did not participate in the addition. Instead, root's recvBuf was added to the sendBufs from the other processes, and the result was written to recvBuf.
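
To make the arithmetic concrete for this three-process run (a worked restatement of the observation above, not new data):

    expected: recvBuf[i] = 12 (rank 0) + 12 (rank 1) + 12 (rank 2) = 36
    observed: recvBuf[i] =  0 (root's initial recvBuf) + 12 (rank 1) + 12 (rank 2) = 24

This also explains the commented-out line in the reproducer: presetting root's recvBuf[i] to 12 substitutes for the missing rank-0 contribution.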

Through testing, I found two ways to achieve the correct output:

1. Ensure MPI_Reduce does not process more than 2KB of data per call (see the chunked-reduce sketch after this list). For example:

   `MPI_Reduce(sendBuf, recvBuf, 512, MPI_INT, MPI_SUM, root, MPI_COMM_WORLD);`

2. Avoid letting process 0 monopolize a server. For example:

   smdpu04:2
   smdpu01:1
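
As a rough illustration of workaround 1 (not from the original report), here is a minimal sketch that splits a large reduction into chunks of at most 512 ints, so each MPI_Reduce call stays at or under 2KB; chunked_reduce_sum is a hypothetical helper name:

    #include <mpi.h>
    #include <algorithm>

    // Hypothetical helper: reduce `count` ints in chunks of at most 512
    // elements (512 * sizeof(int) = 2048 bytes), so that no single
    // MPI_Reduce call moves more than 2KB.
    static void chunked_reduce_sum(const int* sendBuf, int* recvBuf, int count,
                                   int root, MPI_Comm comm)
    {
        const int kChunk = 512;
        for (int offset = 0; offset < count; offset += kChunk) {
            int n = std::min(kChunk, count - offset);
            MPI_Reduce(sendBuf + offset, recvBuf + offset, n,
                       MPI_INT, MPI_SUM, root, comm);
        }
    }

In the reproducer above, the single 513-element call would become chunked_reduce_sum(sendBuf, recvBuf, 513, root, MPI_COMM_WORLD);.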

hzhou commented Feb 6, 2024

I am not able to reproduce your results yet. What is the output of mpichversion?

jc-word commented Feb 6, 2024

MPICH Version: 4.1
MPICH Release date: Fri Jan 27 13:54:44 CST 2023
MPICH ABI: 14:4:2
MPICH Device: ch4:ucx
MPICH configure: -prefix=/udata/user/workSpace/MPI/MPI_Install --with-ucx=embedded
MPICH CC: gcc -std=gnu99 -O2
MPICH CXX: g++ -O2
MPICH F77: gfortran -O2
MPICH FC: gfortran -O2

hzhou commented Feb 6, 2024

I reproduced the bug in mpich-4.1. The current main branch doesn't have this issue, so it appears we fixed it at some point. I'll try to determine where.

hzhou commented Feb 6, 2024

This is the fix: f623723
The upcoming 4.2 series should contain this patch.
@raffenet Are we going to make a 4.1.3 release? If so, consider including this patch.

raffenet commented Feb 7, 2024

You should be able to use this environment variable to disable the problematic reduce algorithm.

export MPIR_CVAR_DEVICE_COLLECTIVES=none
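
For instance, with the launch command from the report, the variable can also be set inline for a single run (assuming a Bourne-style shell):

    MPIR_CVAR_DEVICE_COLLECTIVES=none mpirun -machinefile machinefile ./mpiexe > test.txt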

Still, I will pick up the fix for 4.1.3 if/when that release is made.

raffenet commented

The 4.1.3 release included the fix for this issue.
