mca/coll: Add any radix k for alltoall bruck algorithm #12453
base: main
Conversation
Looks good @jiaxiyan. A few small suggestions.
Have you checked that the default faninout will be 2 for the dynamic rules, so that performance does not change?
Thanks
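For reference, a minimal way to exercise the tuned Bruck alltoall path. The two MCA parameters below exist in coll_tuned, and algorithm 3 is the Bruck variant in its alltoall list (check ompi_info --all for the exact numbering in your build); whether the radix k rides on the dynamic-rules faninout field, as the comment above suggests, or on a dedicated parameter should be confirmed against the PR itself:

mpirun -np 64 --mca coll_tuned_use_dynamic_rules 1 \
    --mca coll_tuned_alltoall_algorithm 3 ./osu_alltoall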
if (err != MPI_SUCCESS) { line = __LINE__; goto err_hndl; }

/* Sendreceive */
err = ompi_coll_base_sendrecv ( tmpbuf, 1, new_ddt, sendto,
These can be replaced with non-blocking send/recv for better performance.
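A rough sketch of that replacement, reusing the tmpbuf/new_ddt/sendto/recvfrom context from the diff excerpt above; the request array and the pairwise wait are illustrative, not the PR's actual code:

ompi_request_t *reqs[2];

/* post the receive before the send to avoid unexpected messages */
err = MCA_PML_CALL(irecv(rbuf, 1, new_ddt, recvfrom,
                         MCA_COLL_BASE_TAG_ALLTOALL, comm, &reqs[0]));
if (MPI_SUCCESS != err) { line = __LINE__; goto err_hndl; }

err = MCA_PML_CALL(isend(tmpbuf, 1, new_ddt, sendto,
                         MCA_COLL_BASE_TAG_ALLTOALL,
                         MCA_PML_BASE_SEND_STANDARD, comm, &reqs[1]));
if (MPI_SUCCESS != err) { line = __LINE__; goto err_hndl; }

/* tmpbuf must stay untouched until both requests complete; repacking it
 * too early is a plausible source of segfaults */
err = ompi_request_wait_all(2, reqs, MPI_STATUSES_IGNORE);
if (MPI_SUCCESS != err) { line = __LINE__; goto err_hndl; }

Note that the payoff only materializes when all k-1 partner exchanges of a round are posted before a single wait_all, rather than waiting pairwise as sketched here.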
I tried replacing it with non-blocking send/recv, but it segfaults in osu_alltoall.
You won't get any performance boost with a large fanout if you are using blocking send/receives. We can set up a meeting next week to see why it crashes.
You do get some improvement because you decrease the depth of the tree, but you don't get the multi-rail benefit.
Talking about multi-rail: I would expect the p2p communication layer to take advantage of multi-rail, at least past a certain message size. If that's the case, when the upper and lower levels of the communication stack both try to generate multi-rail traffic, things can't go well. How do we protect against that? Is the plan to enable multi-rail in collectives only for some ranges of message sizes?
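To make the depth point concrete: radix-k Bruck needs ceil(log_k(P)) rounds, each with up to k-1 concurrent exchanges, so on 64 ranks raising k from 2 to 4 halves the round count from 6 to 3. A minimal sketch of the count:

static int bruck_rounds(int nprocs, int radix)
{
    /* number of rounds = ceil(log_radix(nprocs));
     * e.g. nprocs = 64: radix 2 -> 6 rounds, radix 4 -> 3 rounds */
    int rounds = 0;
    for (int span = 1; span < nprocs; span *= radix) {
        rounds++;
    }
    return rounds;
}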
Jiaxiyan is on an international trip. I will follow up with her to resolve the crash she mentioned.
I was planning to review it after they figure out the issue with replacing the blocking ompi_coll_base_sendrecv with non-blocking communications.
This method extends ompi_coll_base_alltoall_intra_bruck to handle any radix k. Signed-off-by: Jessie Yang <jiaxiyan@amazon.com>
@bosilca I don't follow this comment:
Are you suggesting that collectives shouldn't be making PML isend/irecv calls directly?
No, I was wondering how a multi-rail enabled collective component (which will indeed post multiple isend requests) would interact with a multi-rail enabled low-level communication library (which will split all large messages across multiple lanes).