New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ch4/isend-irecv: add parent reques completion in isend/irecv #6640
base: main
Are you sure you want to change the base?
Conversation
adds support for meta request (parent request relying on the completion of subrequest) in the Isend/Irecv path. A parent request is becoming a used pattern (continuation, partitioned communication). Here we add a completion mechanism to the isend/irecv path to notify the parent request of completion of one of the child request.
The parent function are similar to the MPID_Isend/Irecv function except that they have a completion counter pointer as an additional argument. Those functions are needed by the partitioned communications to track progress of children requests
goto fn_exit; | ||
} | ||
|
||
MPL_STATIC_INLINE_PREFIX int MPID_Isend_coll(const void *buf, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this accidentally added?
int rank, | ||
int tag, | ||
MPIR_Comm * comm, int attr, | ||
MPIR_cc_t * parent_cc_ptr, MPIR_Request ** request) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why don't we pass the parent request instead of the cc_ptr?
if (parent_cc_ptr) { | ||
if (MPIR_Request_is_complete(*request)) { | ||
/* if the request is already completed, decrement the parent counter */ | ||
MPIR_cc_dec(parent_cc_ptr); | ||
} else { | ||
/* if the request is not done yet, assign the completion pointer to the parent one and it will be decremented later */ | ||
(*request)->dev.completion_notification = parent_cc_ptr; | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can this be done by the caller instead? it might be cleaner, assuming we have access to the parent request when each operation is issued. something like:
MPIDI_NM_mpi_isend(..., &childreq);
if (childreq && !MPIR_Request_is_complete(childreq)) {
childreq->dev.completion_notification = parentreq->cc_ptr;
} else {
MPID_Request_complete(parentreq);
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If I understand your suggestion correctly, we would exit the lock around the child request creation which might not work when async progress is enabled (as you suggested a few months ago :-) )
The reason is that if you are not in the same lock section, you have no guarantee the completion notification will be triggered: you might complete the request between the release of the lock at creation and the assignment of childreq->dev.completion_notification = parentreq->cc_ptr;
Pull Request Description
[Partitioned communication] Add support for meta request completion as part of the isend/irecv path.
Meta requests are common in partitioned communication and continuation API.
This PR allows a request to complete an isend/irecv call and notify a parent's completion counter when doing so.
This is needed for tracking down the completion of partitioned communication requests at low cost
Author Checklist
Particularly focus on why, not what. Reference background, issues, test failures, xfail entries, etc.
Commits are self-contained and do not do two things at once.
Commit message is of the form:
module: short description
Commit message explains what's in the commit.
Whitespace checker. Warnings test. Additional tests via comments.
For non-Argonne authors, check contribution agreement.
If necessary, request an explicit comment from your companies PR approval manager.