Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

#2249: Add perf test for raw MPI Send/Recv and ObjGroup's Send #2251

Conversation

JacobDomagala
Copy link
Contributor

@JacobDomagala JacobDomagala commented Feb 26, 2024

Fixes #2249

BuildStats issue - DARMA-tasking/build-stats#31

Copy link

github-actions bot commented Feb 26, 2024

Pipelines results

PR tests (gcc-12, ubuntu, mpich)

Build for 2fde58d (2024-04-19 13:48:17 UTC)

Compilation - successful

Testing - passed

Build log


PR tests (gcc-12, ubuntu, mpich, verbose)

Build for 33955ec (2024-05-29 08:18:42 UTC)

Compilation - successful

Testing - passed

Build log


PR tests (clang-9, ubuntu, mpich)

Build for 33955ec (2024-05-29 08:18:42 UTC)

Compilation - successful

Testing - passed

Build log


PR tests (clang-13, alpine, mpich)

Build for 33955ec (2024-05-29 08:18:42 UTC)

Compilation - successful

Testing - passed

Build log


PR tests (gcc-8, ubuntu, mpich, address sanitizer)

Build for 33955ec (2024-05-29 08:18:42 UTC)

Compilation - successful

Testing - passed

Build log


PR tests (clang-13, ubuntu, mpich)

Build for 33955ec (2024-05-29 08:18:42 UTC)

Compilation - successful

Testing - passed

Build log


PR tests (clang-12, ubuntu, mpich)

Build for 33955ec (2024-05-29 08:18:42 UTC)

Compilation - successful

Testing - passed

Build log


PR tests (clang-11, ubuntu, mpich)

Build for 33955ec (2024-05-29 08:18:42 UTC)

Compilation - successful

Testing - passed

Build log


PR tests (gcc-10, ubuntu, openmpi, no LB)

Build for 33955ec (2024-05-29 08:18:42 UTC)

Compilation - successful

Testing - passed

Build log


PR tests (gcc-9, ubuntu, mpich, zoltan, json schema test)

Build for 33955ec (2024-05-29 08:18:42 UTC)

Compilation - successful

Testing - passed

Build log


PR tests (gcc-11, ubuntu, mpich, trace runtime, coverage)

Build for 33955ec (2024-05-29 08:18:42 UTC)

Compilation - successful

Testing - passed

Build log


PR tests (intel icpx, ubuntu, mpich, verbose)

Build for 33955ec (2024-05-29 08:18:42 UTC)

Compilation - successful

Testing - passed

Build log


PR tests (clang-14, ubuntu, mpich, verbose)

Build for 33955ec (2024-05-29 08:18:42 UTC)

Compilation - successful

Testing - passed

Build log


PR tests (clang-10, ubuntu, mpich)

Build for 33955ec (2024-05-29 08:18:42 UTC)

Compilation - successful

Testing - passed

Build log


PR tests (intel icpc, ubuntu, mpich)

Build for 33955ec (2024-05-29 08:18:42 UTC)

remark #11074: Inlining inhibited by limit max-total-size 
remark #11076: To get full report use -qopt-report=4 -qopt-report-phase ipo
remark #11074: Inlining inhibited by limit max-size 
remark #11074: Inlining inhibited by limit max-total-size 
remark #11076: To get full report use -qopt-report=4 -qopt-report-phase ipo
remark #11074: Inlining inhibited by limit max-size 
remark #11074: Inlining inhibited by limit max-total-size 
remark #11076: To get full report use -qopt-report=4 -qopt-report-phase ipo
remark #11074: Inlining inhibited by limit max-size 
remark #11074: Inlining inhibited by limit max-total-size 
remark #11076: To get full report use -qopt-report=4 -qopt-report-phase ipo
remark #11074: Inlining inhibited by limit max-size 
remark #11074: Inlining inhibited by limit max-total-size 
remark #11076: To get full report use -qopt-report=4 -qopt-report-phase ipo
remark #11074: Inlining inhibited by limit max-size 
remark #11074: Inlining inhibited by limit max-total-size 
remark #11076: To get full report use -qopt-report=4 -qopt-report-phase ipo
remark #11074: Inlining inhibited by limit max-size 
remark #11074: Inlining inhibited by limit max-total-size 
remark #11076: To get full report use -qopt-report=4 -qopt-report-phase ipo
remark #11074: Inlining inhibited by limit max-total-size 
remark #11076: To get full report use -qopt-report=4 -qopt-report-phase ipo
remark #11074: Inlining inhibited by limit max-total-size 
remark #11076: To get full report use -qopt-report=4 -qopt-report-phase ipo
remark #11074: Inlining inhibited by limit max-total-size 
remark #11076: To get full report use -qopt-report=4 -qopt-report-phase ipo
remark #11074: Inlining inhibited by limit max-size 
remark #11074: Inlining inhibited by limit max-total-size 
remark #11076: To get full report use -qopt-report=4 -qopt-report-phase ipo
remark #11074: Inlining inhibited by limit max-size 
remark #11074: Inlining inhibited by limit max-total-size 
remark #11076: To get full report use -qopt-report=4 -qopt-report-phase ipo
remark #11074: Inlining inhibited by limit max-size 
remark #11074: Inlining inhibited by limit max-total-size 
remark #11076: To get full report use -qopt-report=4 -qopt-report-phase ipo
remark #11074: Inlining inhibited by limit max-size 
remark #11074: Inlining inhibited by limit max-total-size 
remark #11076: To get full report use -qopt-report=4 -qopt-report-phase ipo
remark #11074: Inlining inhibited by limit max-size 
remark #11074: Inlining inhibited by limit max-total-size 
remark #11076: To get full report use -qopt-report=4 -qopt-report-phase ipo
remark #11074: Inlining inhibited by limit max-size 
remark #11074: Inlining inhibited by limit max-total-size 
remark #11076: To get %0D%0A%0D%0A%0D%0A ==> And there is more. Read log. <==

Build log


PR tests (nvidia cuda 11.2, gcc-9, ubuntu, mpich)

Build for 33955ec (2024-05-29 08:18:42 UTC)

/vt/src/vt/pipe/pipe_manager.impl.h(135): warning: missing return statement at end of non-void function "vt::pipe::PipeManager::makeSend<f,Target>(Target) [with f=&vt::vrt::collection::lb::GreedyLB::collectHandler, Target=vt::objgroup::proxy::ProxyElm<vt::vrt::collection::lb::GreedyLB>]"
          detected during:
            instantiation of "auto vt::pipe::PipeManager::makeSend<f,Target>(Target) [with f=&vt::vrt::collection::lb::GreedyLB::collectHandler, Target=vt::objgroup::proxy::ProxyElm<vt::vrt::collection::lb::GreedyLB>]" 
/vt/src/vt/objgroup/proxy/proxy_objgroup.impl.h(204): here
            instantiation of "vt::objgroup::proxy::Proxy<ObjT>::PendingSendType vt::objgroup::proxy::Proxy<ObjT>::reduce<f,Op,Target,Args...>(Target, Args &&...) const [with ObjT=vt::vrt::collection::lb::GreedyLB, f=&vt::vrt::collection::lb::GreedyLB::collectHandler, Op=vt::collective::PlusOp, Target=vt::objgroup::proxy::ProxyElm<vt::vrt::collection::lb::GreedyLB>, Args=<vt::vrt::collection::lb::GreedyPayload>]" 
/vt/src/vt/vrt/collection/balance/greedylb/greedylb.cc(222): here

/vt/src/vt/pipe/pipe_manager.impl.h(135): warning: missing return statement at end of non-void function "vt::pipe::PipeManager::makeSend<f,Target>(Target) [with f=&MyObj::handler, Target=vt::objgroup::proxy::ProxyElm<MyObj>]"
          detected during instantiation of "auto vt::pipe::PipeManager::makeSend<f,Target>(Target) [with f=&MyObj::handler, Target=vt::objgroup::proxy::ProxyElm<MyObj>]" 
/vt/examples/callback/callback.cc(147): here

/vt/src/vt/pipe/pipe_manager.impl.h(135): warning: missing return statement at end of non-void function "vt::pipe::PipeManager::makeSend<f,Target>(Target) [with f=&colHan, Target=vt::vrt::collection::VrtElmProxy<MyCol, vt::Index1D>]"
          detected during instantiation of "auto vt::pipe::PipeManager::makeSend<f,Target>(Target) [with f=&colHan, Target=vt::vrt::collection::VrtElmProxy<MyCol, vt::Index1D>]" 
/vt/examples/callback/callback.cc(153): here

/vt/src/vt/pipe/pipe_manager.impl.h(135): warning: missing return statement at end of non-void function "vt::pipe::PipeManager::makeSend<f,Target>(Target) [with f=&MyObj::handler, Target=vt::objgroup::proxy::ProxyElm<MyObj>]"
          detected during instantiation of "auto vt::pipe::PipeManager::makeSend<f,Target>(Target) [with f=&MyObj::handler, Target=vt::objgroup::proxy::ProxyElm<MyObj>]" 
/vt/examples/callback/callback.cc(147): here

/vt/src/vt/pipe/pipe_manager.impl.h(135): warning: missing return statement at end of non-void function "vt::pipe::PipeManager::makeSend<f,Target>(Target) [with f=&colHan, Target=vt::vrt::collection::VrtElmProxy<MyCol, vt::Index1D>]"
          detected during instantiation of "auto vt::pipe::PipeManager::makeSend<f,Target>(Target) [with f=&colHan, Target=vt::vrt::collection::VrtElmProxy<MyCol, vt::Index1D>]" 
/vt/examples/callback/callback.cc(153%0D%0A%0D%0A%0D%0A ==> And there is more. Read log. <==

Build log


PR tests (nvidia cuda 12.2.0, gcc-9, ubuntu, mpich, verbose)

Build for 33955ec (2024-05-29 08:18:42 UTC)

/vt/lib/CLI/CLI/CLI11.hpp(1029): warning #2361-D: invalid narrowing conversion from "double" to "unsigned long"
          TT { std::declval<CC>() }
               ^
          detected during:
            instantiation of "vt::CLI::detail::is_direct_constructible<T, C>::test [with T=std::vector<std::string, std::allocator<std::string>>, C=double]" based on template arguments <std::vector<std::string, std::allocator<std::string>>, double> at line 1041
            instantiation of class "vt::CLI::detail::is_direct_constructible<T, C> [with T=std::vector<std::string, std::allocator<std::string>>, C=double]" at line 5005
            instantiation of "void vt::CLI::Option::results(T &) const [with T=std::vector<std::string, std::allocator<std::string>>]" at line 5034
            instantiation of "T vt::CLI::Option::as<T>() const [with T=std::vector<std::string, std::allocator<std::string>>]" at line 7315

Remark: The warnings can be suppressed with "-diag-suppress <warning-number>"

/vt/lib/CLI/CLI/CLI11.hpp(1029): warning #2361-D: invalid narrowing conversion from "int" to "unsigned long"
          TT { std::declval<CC>() }
               ^
          detected during:
            instantiation of "vt::CLI::detail::is_direct_constructible<T, C>::test [with T=std::vector<std::string, std::allocator<std::string>>, C=int]" based on template arguments <std::vector<std::string, std::allocator<std::string>>, int> at line 1041
            instantiation of class "vt::CLI::detail::is_direct_constructible<T, C> [with T=std::vector<std::string, std::allocator<std::string>>, C=int]" at line 5005
            instantiation of "void vt::CLI::Option::results(T &) const [with T=std::vector<std::string, std::allocator<std::string>>]" at line 5034
            instantiation of "T vt::CLI::Option::as<T>() const [with T=std::vector<std::string, std::allocator<std::string>>]" at line 7315

/vt/tests/perf/send_cost.cc(169): warning #177-D: variable "prevNode" was declared but never referenced
    auto const prevNode = (thisNode - 1 + num_nodes_) % num_nodes_;
               ^

Remark: The warnings can be suppressed with "-diag-suppress <warning-number>"

Testing - passed

Build log


@JacobDomagala JacobDomagala force-pushed the 2249-write-benchmark-to-compare-raw-mpi-send-to-objgroup-send-and-collection-send branch 3 times, most recently from 68c939b to 8e435ff Compare February 28, 2024 23:00
@JacobDomagala JacobDomagala force-pushed the 2249-write-benchmark-to-compare-raw-mpi-send-to-objgroup-send-and-collection-send branch 3 times, most recently from 2fde58d to 869b5b7 Compare April 23, 2024 16:17
@JacobDomagala
Copy link
Contributor Author

With the most recent commit:

Raw MPI

Screenshot from 2024-04-25 15-03-34

Objgroup send with payload wrapped in shared_ptr

Screenshot from 2024-04-25 15-03-53

Objgroup send with payload as const&

Screenshot from 2024-04-25 15-05-30

@JacobDomagala JacobDomagala force-pushed the 2249-write-benchmark-to-compare-raw-mpi-send-to-objgroup-send-and-collection-send branch from a32f160 to 720784e Compare April 25, 2024 13:06
@JacobDomagala JacobDomagala marked this pull request as ready for review May 7, 2024 11:25
@JacobDomagala JacobDomagala force-pushed the 2249-write-benchmark-to-compare-raw-mpi-send-to-objgroup-send-and-collection-send branch from 720784e to 309c87a Compare May 7, 2024 11:25

StartTimer(fmt::format("Payload size {}", size));

std::vector<int32_t> recvData(size, data);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is "inside" timer block to match the allocation on vt's receiving side. I think this is fair.

@JacobDomagala
Copy link
Contributor Author

Raw MPI

image

ObjGroup send

image

Collection send

image

@JacobDomagala JacobDomagala force-pushed the 2249-write-benchmark-to-compare-raw-mpi-send-to-objgroup-send-and-collection-send branch from 309c87a to d88ae28 Compare May 7, 2024 11:30
@JacobDomagala
Copy link
Contributor Author

JacobDomagala commented May 7, 2024

Next step would be to update `BuildStats' to plot the results Created DARMA-tasking/build-stats#31

@JacobDomagala JacobDomagala force-pushed the 2249-write-benchmark-to-compare-raw-mpi-send-to-objgroup-send-and-collection-send branch 3 times, most recently from d68673a to 43cf917 Compare May 16, 2024 10:11
@JacobDomagala JacobDomagala force-pushed the 2249-write-benchmark-to-compare-raw-mpi-send-to-objgroup-send-and-collection-send branch from 43cf917 to 5511597 Compare May 28, 2024 16:41
@JacobDomagala JacobDomagala force-pushed the 2249-write-benchmark-to-compare-raw-mpi-send-to-objgroup-send-and-collection-send branch from 5511597 to 33955ec Compare May 29, 2024 08:18
@lifflander
Copy link
Collaborator

@JacobDomagala This look good to me!

@lifflander lifflander merged commit 6617858 into develop May 29, 2024
26 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Write benchmark to compare raw MPI send to objgroup send and collection send
2 participants