coll/tuned: Extend the collective tuning file to be topology-aware #12321

Open
wants to merge 1 commit into base: main

jiaxiyan
Contributor

@jiaxiyan jiaxiyan commented Feb 8, 2024

TUNED collective selection should account for communicator topology, as HAN does. Selecting an algorithm from communicator size and message size alone is no longer sufficient for optimal performance when HAN is used: tuning results show that, for the same communicator size and message size, the best algorithm differs between inter-node and intra-node communicators.

This commit adds a communicator topology dimension to both the TUNED collective tuning file rules and the algorithm selection logic. The topological level can be single node, disjoint, or default (mixed).
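To make the extra dimension concrete, here is a minimal sketch of topology-aware rule lookup. It is not the code in this PR: the type and function names (`topo_level_t`, `msg_rule_t`, `select_algorithm`) are hypothetical, and the fallback order (topology-specific rules first, then default rules) and the use of 0 for "no rule matched" are assumptions made for illustration.

```c
#include <stddef.h>

/* Hypothetical topology levels; names are illustrative, not from the patch. */
typedef enum { TOPO_DEFAULT = 0, TOPO_SINGLE_NODE, TOPO_DISJOINT } topo_level_t;

/* One message-size rule for a given communicator size (hypothetical layout). */
typedef struct {
    size_t       msg_size;   /* rule applies to messages of at least this size */
    topo_level_t topo;       /* topology level the rule is restricted to */
    int          algorithm;  /* algorithm ID to use */
} msg_rule_t;

/* Among rules tagged with `level`, return the one with the largest
 * msg_size threshold that does not exceed the actual message size. */
static const msg_rule_t *best_rule_for_level(const msg_rule_t *rules, size_t n,
                                             size_t msg_size, topo_level_t level)
{
    const msg_rule_t *best = NULL;
    for (size_t i = 0; i < n; i++) {
        if (rules[i].topo != level || rules[i].msg_size > msg_size) {
            continue;
        }
        if (best == NULL || rules[i].msg_size >= best->msg_size) {
            best = &rules[i];
        }
    }
    return best;
}

/* Assumed policy: prefer a rule for the communicator's own topology level,
 * then fall back to default (mixed) rules. Returns 0 if nothing matches. */
static int select_algorithm(const msg_rule_t *rules, size_t n,
                            size_t msg_size, topo_level_t comm_topo)
{
    const msg_rule_t *rule = best_rule_for_level(rules, n, msg_size, comm_topo);
    if (rule == NULL && comm_topo != TOPO_DEFAULT) {
        rule = best_rule_for_level(rules, n, msg_size, TOPO_DEFAULT);
    }
    return rule != NULL ? rule->algorithm : 0;
}
```

With this policy, a communicator whose level has no applicable rule behaves exactly as it would with an untagged file, which is one way the old format could stay valid.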

Specify @single_node or @disjoint after the message size in the dynamic file rules. The suffix is optional, so existing files in the old format still work. See the example file in coll_tuned_dynamic_file.h.
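As a rough illustration of how an optional suffix on the message size could be parsed, the sketch below splits a token such as `8192@single_node` into a size and a topology level. The function name `parse_msg_size_token`, the enum, and the assumption that the suffix is attached directly to the number are all hypothetical; the authoritative format is the example in coll_tuned_dynamic_file.h.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical topology levels; the names in the actual patch may differ. */
typedef enum {
    TOPO_DEFAULT = 0,   /* no suffix: rule applies to any (mixed) layout */
    TOPO_SINGLE_NODE,   /* "@single_node" suffix */
    TOPO_DISJOINT       /* "@disjoint" suffix */
} topo_level_t;

/*
 * Parse a message-size token such as "8192" or "8192@single_node".
 * On success stores the size and topology level and returns 0.
 * Returns -1 on malformed input and leaves the outputs untouched.
 */
static int parse_msg_size_token(const char *token, size_t *msg_size,
                                topo_level_t *topo)
{
    assert(token != NULL && msg_size != NULL && topo != NULL);

    /* Require a leading digit so signs and whitespace are rejected. */
    if (token[0] < '0' || token[0] > '9') {
        return -1;
    }

    char *end = NULL;
    unsigned long long value = strtoull(token, &end, 10);

    topo_level_t level;
    if (*end == '\0') {
        level = TOPO_DEFAULT;            /* old format: no suffix */
    } else if (strcmp(end, "@single_node") == 0) {
        level = TOPO_SINGLE_NODE;
    } else if (strcmp(end, "@disjoint") == 0) {
        level = TOPO_DISJOINT;
    } else {
        return -1;                       /* unknown suffix */
    }

    *msg_size = (size_t) value;
    *topo = level;
    return 0;
}
```

A token without a suffix maps to the default level, so a parser of this shape would accept files written before the topology suffix existed.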

ompi/mca/coll/base/coll_base_util.c (outdated, resolved)
ompi/mca/coll/base/coll_base_util.c (outdated, resolved)
ompi/mca/coll/tuned/coll_tuned_dynamic_rules.c (outdated, resolved)
ompi/mca/coll/tuned/coll_tuned_dynamic_file.h (outdated, resolved)
ompi/mca/coll/tuned/coll_tuned_dynamic_rules.c (outdated, resolved)
ompi/mca/coll/tuned/coll_tuned_dynamic_rules.c (outdated, resolved)
ompi/mca/coll/tuned/coll_tuned_dynamic_file.c (outdated, resolved)
TUNED collective selection should account for communicator topology,
as HAN does. Selecting an algorithm from communicator size and message
size alone is no longer sufficient for optimal performance when HAN is
used: tuning results show that, for the same communicator size and
message size, the best algorithm differs between inter-node and
intra-node communicators.

This commit adds a topology dimension to both the TUNED collective
tuning file rules and the algorithm selection logic. The topological
level can be intra-node, inter-node, or default (mixed).

Specify @inter_node or @intra_node after the message size in the
dynamic file rules. The suffix is optional, so existing files in the
old format still work. See the example file in coll_tuned_dynamic_file.h.

Signed-off-by: Jessie Yang <jiaxiyan@amazon.com>
@wenduwan
Contributor

wenduwan commented Feb 9, 2024

@bosilca @devreal Could you please review this PR?
