HG: safe mechanism to deregister an RPC while handles for that RPC are still in use #534

Open
carns opened this issue Dec 16, 2021 · 3 comments

carns commented Dec 16, 2021

Is your feature request related to a problem? Please describe.

Imagine a hypothetical scenario in which a service is periodically receiving a particular RPC type. The service then begins to shut down (without coordinating with clients) and deregisters that RPC as part of the shutdown process.

In this case, the service could have already begun executing handlers for the RPC, and those handlers will continue to execute despite deregistration. Margo includes a workaround for this that seems to cover most cases by simply checking whether the registered data associated with a given RPC is NULL when it is retrieved (mochi-hpc/mochi-margo#170).

Describe the solution you'd like

It may be cleaner if Mercury had a way to avoid impacting existing handles on a given RPC ID when deregistering. For example, it could deny new RPCs on that ID immediately, but use reference counting to defer full deregistration until all in-flight handles associated with the ID are closed. There are probably other solutions; that's just one option.
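
To make the reference-counting idea concrete, here is a minimal, single-threaded sketch of what deferred deregistration could look like. None of these names (`rpc_entry_t`, `rpc_handle_open`, `rpc_deregister`, and so on) are Mercury APIs; they are hypothetical and only illustrate the proposal. A real implementation would also need atomic counters or locking, which this sketch omits.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical registry entry for one RPC ID (not a Mercury type). */
typedef struct rpc_entry {
    void *data;               /* user data registered with the RPC */
    void (*free_cb)(void *);  /* optional destructor for the data */
    unsigned refcount;        /* 1 for the registration + 1 per open handle */
    bool accepting;           /* false once deregistration was requested */
} rpc_entry_t;

/* Example destructor: counts how many times it ran via an int counter. */
void count_free(void *p)
{
    ++*(int *) p;
}

rpc_entry_t *rpc_entry_create(void *data, void (*free_cb)(void *))
{
    rpc_entry_t *e = malloc(sizeof(*e));
    if (e == NULL)
        return NULL;
    e->data = data;
    e->free_cb = free_cb;
    e->refcount = 1; /* reference held by the registration itself */
    e->accepting = true;
    return e;
}

/* Drop one reference; tear the entry down when the last one goes away. */
static void rpc_entry_release(rpc_entry_t *e)
{
    if (--e->refcount == 0) {
        if (e->free_cb != NULL)
            e->free_cb(e->data);
        free(e);
    }
}

/* A new handle is created for this RPC; refused after deregistration. */
bool rpc_handle_open(rpc_entry_t *e)
{
    if (!e->accepting)
        return false;
    e->refcount++;
    return true;
}

/* A handle for this RPC is closed. */
void rpc_handle_close(rpc_entry_t *e)
{
    rpc_entry_release(e);
}

/* Deny new RPCs immediately; the data is freed only once all handles close. */
void rpc_deregister(rpc_entry_t *e)
{
    if (e->accepting) {
        e->accepting = false;
        rpc_entry_release(e); /* drop the registration's own reference */
    }
}
```

With this layout, a handler that opened its handle before `rpc_deregister` can keep reading `e->data` until it calls `rpc_handle_close`, while any handle opened afterwards is rejected.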

Describe alternatives you've considered

So far it seems like in-flight RPCs aren't particularly harmed unless they rely on registered data associated with the RPC, but we are still testing.

@shanedsnyder

The Margo fix that Phil mentions is only part of the solution, as it just applies to some Margo boilerplate logic that runs before user RPC handler code. It looks like service RPC handlers themselves have to be careful not to assume they will be able to retrieve data registered with the RPC. It's not a huge deal to add safety checks there, but it would be nice if Mercury could provide some stricter guarantees on the lifetime of registered data for RPC handlers that are already executing.
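
For illustration, the kind of safety check a handler needs today might look roughly like the sketch below. It uses a small mock registry so that it is self-contained; `mock_register`, `mock_deregister`, `mock_registered_data`, `service_state_t`, and `example_handler` are all hypothetical names, not Mercury or Margo functions. In a real handler the lookup would go through the library's own call for retrieving registered data.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_RPCS 8 /* arbitrary size for the mock registry */

/* Mock registry: slot i holds the data registered for RPC ID i, or NULL. */
static void *mock_registry[MAX_RPCS];

void mock_register(unsigned id, void *data)
{
    if (id < MAX_RPCS)
        mock_registry[id] = data;
}

/* After deregistration, lookups for this ID return NULL. */
void mock_deregister(unsigned id)
{
    if (id < MAX_RPCS)
        mock_registry[id] = NULL;
}

void *mock_registered_data(unsigned id)
{
    return (id < MAX_RPCS) ? mock_registry[id] : NULL;
}

/* Hypothetical per-service state attached to the RPC at registration. */
typedef struct service_state {
    int requests_served;
} service_state_t;

/* Returns 0 on success, -1 if the RPC was deregistered underneath us. */
int example_handler(unsigned id)
{
    service_state_t *state = mock_registered_data(id);

    /* Never assume the registered data is still there. */
    if (state == NULL)
        return -1;

    state->requests_served++;
    return 0;
}
```

Note that this only covers a lookup made after deregistration; it does not protect a handler that fetched the pointer earlier and is still using it, which is the gap the stricter lifetime guarantee would close.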

@soumagne soumagne changed the title safe mechanism to deregister an RPC while handles for that RPC are still in use HG: safe mechanism to deregister an RPC while handles for that RPC are still in use Dec 22, 2021
@soumagne soumagne added this to the mercury-2.1.1 milestone Dec 22, 2021
@soumagne soumagne modified the milestones: mercury-2.2.0, mercury-2.3.0 Dec 9, 2022
@soumagne soumagne modified the milestones: mercury-2.3.0, future Jun 6, 2023

mdorier commented Apr 10, 2024

Has this problem been solved in mercury 2.3.0?

@soumagne

No, this has not been implemented yet.
