I am implementing a communication pattern where GPUs exchange parts of their local data vector. The exchanged vector entries are 'unstructured' (arbitrary indices) with a block size of ~8 KB: for each communication index, the sender GPU sends ~1k contiguous doubles.
I implemented this pattern with MPI_Type_indexed + MPI_Isend and it works. My question is: which of the following implementations is expected to be the most efficient?
1. Use MPI_Isend directly, without packing, with the newly defined indexed type (I guess Open MPI allocates some internal buffer?)
2. Use MPI_Pack with the newly defined type, copying the data to a GPU buffer, then MPI_Isend on the packed buffer
3. Use MPI_Pack with the newly defined type, copying the data to a CPU buffer, then MPI_Isend on the packed buffer
Or is there another, better way to implement such a scenario?
Thank you!