libfabric-2.0: separate max_msg_size for RMA #9968
Comments
We could allow the RMA max_msg_size to be 0, meaning that it takes its value from the msg max_msg_size.
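A minimal sketch of how that fallback rule might look from the application side. The struct below is a hypothetical stand-in for the relevant part of struct fi_ep_attr, and max_rma_size is an assumed field name for illustration; neither the name nor the helper comes from this thread.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the relevant fields of struct fi_ep_attr.
 * max_rma_size is an assumed name, not something this thread confirms. */
struct ep_limits {
    size_t max_msg_size;  /* limit for a transport message */
    size_t max_rma_size;  /* assumed: 0 means "use max_msg_size" */
};

/* Return the size limit to apply to RMA transfers under the proposed rule. */
static size_t effective_rma_size(const struct ep_limits *attr)
{
    assert(attr != NULL);
    return attr->max_rma_size ? attr->max_rma_size : attr->max_msg_size;
}
```

With this rule, a provider that has no separate RMA limit leaves the field at 0 and existing applications see no change in behavior.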
OFIWG meeting notes: Sounds good.
Note, 'msg' in max_msg_size refers to the definition of a 'transport message', not 'msg' as in FI_MSG. The limit for a specific transport operation may actually be lower than max_msg_size, as is usually the case for atomics. If you separate the size of RMA operations out from max_msg_size, you're potentially redefining what max_msg_size means, such that it no longer applies to the transport message but to an API capability. This is especially true if the RMA size will be larger than the message size. FWIW, this change is being driven from the hardware up rather than from the application down, and it pushes the burden of dealing with the differences into every application rather than isolating the change in the provider.
We could argue that the transport msg sizes can have two different limits since the send/recv and RMA protocols may be very different.
The same is true for atomics and collectives. Even memory registration may have a separate size limit. Collectives and atomics define separate query operations for more precise limits. Tagged and untagged transfers typically use different protocols as well. One of the points behind 2.0 was to remove differences between providers and simplify apps. This is going in the opposite direction. If we're arguing that RMA needs its own size, then split tagged and untagged as well. Then every capability has its own size. Or, keep things simple for apps and make the providers deal with their own HW implementation nonsense.
struct fi_ep_attr defines max_msg_size, which applies to both send/recv and RMA. Some transports may want a different size for RMA to better align with hardware and driver features.