Change client resolution semantics to match what's needed for 3PH #494

Open
zenhack opened this issue Mar 29, 2023 · 12 comments

@zenhack
Contributor

zenhack commented Mar 29, 2023

Background from rpc.capnp:

# *The Tribble 4-way Race Condition*
#
# Any implementation of promise resolution and embargos must be aware of what we call the
# "Tribble 4-way race condition", after Dean Tribble, who explained the problem in a lively
# Friam meeting.
#
# Embargos are designed to work in the case where a two-hop path is being shortened to one hop.
# But sometimes there are more hops. Imagine that Alice has a reference to a remote promise P1
# that eventually resolves to _another_ remote promise P2 (in a third vat), which _at the same
# time_ happens to resolve to Bob (in a fourth vat). In this case, we're shortening from a 3-hop
# path (with four parties) to a 1-hop path (Alice -> Bob).
#
# Extending the embargo/disembargo protocol to be able to shorten multiple hops at once seems
# difficult. Instead, we make a rule that prevents this case from coming up:
#
# Once a promise P has been resolved to a remote object reference R, then all further messages
# received addressed to P will be forwarded strictly to R. Even if it turns out later that R is
# itself a promise, and has resolved to some other object Q, messages sent to P will still be
# forwarded to R, not directly to Q (R will of course further forward the messages to Q).
#
# This rule does not cause a significant performance burden because once P has resolved to R, it
# is expected that people sending messages to P will shortly start sending them to R instead and
# drop P. P is at end-of-life anyway, so it doesn't matter if it ignores chances to further
# optimize its path.

Right now, when a capability is resolved, we overwrite the internal clientHook. This has a couple implications:

  1. The fact that the client was a promise at some point in the past is forgotten.
  2. We eventually end up shortening any potentially multi-hop paths that arise.

(2) sounds like a good thing initially, but per the docs above this will likely be problematic in a level 3 implementation. Also, (1) is probably not ok either, since if we are to avoid multi-hop shortening, sendCap() needs to continue to encode the capability as the original promise.

My current thinking is that we need to refactor the internals of Client so that on resolution, the initial clientHook is not dropped, and after resolution we always invoke the first resolution, rather than doing what resolveHook() does and walking as deep into the chain as it can. I think this will work for basic correctness, but I see two downsides:

  • I can envision a scenario where:

    1. Alice is a promise in vat A that resolves to bob in vat B, which is in turn a promise that resolves to carol in vat C. If the app is calling methods on alice, it seems like calls would needlessly continue to route through bob. I suppose we could change the way Client.Resolve works so that it will actually give you the final client, but it's a shame this can't be transparent to the app. But maybe this is just a limitation of the protocol that we have to live with.
    2. If the caller doesn't explicitly drop the promise, we could build up chains of promises routing calls around the network uselessly. If it were done automatically by the library this would not be a concern, but I don't love requiring the app to deal with it.

I want to think a bit more to see if there's a way we can keep this transparent; I have some ideas to investigate that may not pan out.
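To make the difference concrete, here is a minimal sketch of the two resolution strategies discussed above. The `hook` type and the `deepest` and `firstHop` functions are hypothetical stand-ins written for illustration; they are not go-capnp's actual clientHook or resolveHook(), and they model only the chain-walking behaviour.

```go
package main

import "fmt"

// hook is a hypothetical stand-in for a clientHook: a named target that may
// have resolved to another hook. It is not the real go-capnp type.
type hook struct {
	name       string
	resolvedTo *hook // nil if this hook has not resolved to anything else
}

// deepest walks the whole resolution chain, which is the behaviour the issue
// attributes to resolveHook() today.
func deepest(h *hook) *hook {
	for h.resolvedTo != nil {
		h = h.resolvedTo
	}
	return h
}

// firstHop follows at most one resolution, which is the proposed behaviour:
// after resolution, always invoke the first resolution.
func firstHop(h *hook) *hook {
	if h.resolvedTo != nil {
		return h.resolvedTo
	}
	return h
}

func main() {
	carol := &hook{name: "carol"}
	bob := &hook{name: "bob", resolvedTo: carol}
	alice := &hook{name: "alice", resolvedTo: bob}

	fmt.Println(deepest(alice).name)  // carol
	fmt.Println(firstHop(alice).name) // bob
}
```

With `firstHop`, calls on alice keep going through bob even after bob has resolved to carol, which is exactly the first downside listed above.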

@lthibault interested in your thoughts.

@zenhack
Contributor Author

zenhack commented Mar 29, 2023

One notion I have is that it might help to introduce more asymmetry between SendCall and RecvCall. Initially these were just a question of who allocated the memory, with Recv being more appropriate for calls coming from the network, and Send more appropriate for calls going to it. Then, when we added flow control support, we made it so that Send respects the FlowController, but Recv does not, per the different use cases. Maybe we can get away with doing a similar thing here: when app code makes calls (using Send) we deep-shorten (using resolveHook internally), but when the RPC system makes calls (using Recv) we don't, and just pass it down the chain naively. I need to think this through but this might give us what we want.
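A rough sketch of that asymmetry, under the assumption that a resolved promise can be modelled as a linked chain of targets. The `target` type and the `sendRoute` and `recvRoute` functions are hypothetical names for illustration; they only record which hops a call would visit and do not reflect the real SendCall or RecvCall signatures.

```go
package main

import "fmt"

// target is a hypothetical node in a resolution chain.
type target struct {
	name string
	next *target // what this target has resolved to, or nil
}

// sendRoute models a call made by application code: the chain is shortened
// all the way, so the call is delivered straight to the final target.
func sendRoute(t *target) []string {
	for t.next != nil {
		t = t.next
	}
	return []string{t.name}
}

// recvRoute models a call arriving from the network: each hop naively
// forwards to its own resolution, so the call visits every target in order.
func recvRoute(t *target) []string {
	route := []string{t.name}
	for t.next != nil {
		t = t.next
		route = append(route, t.name)
	}
	return route
}

func main() {
	q := &target{name: "q"}
	r := &target{name: "r", next: q}
	p := &target{name: "p", next: r}

	fmt.Println(sendRoute(p)) // [q]
	fmt.Println(recvRoute(p)) // [p r q]
}
```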

@lthibault
Collaborator

lthibault commented Mar 29, 2023

There is something I don't quite understand.

 # This rule does not cause a significant performance burden because once P has resolved to R, it 
 # is expected that people sending messages to P will shortly start sending them to R instead and 
 # drop P. P is at end-of-life anyway, so it doesn't matter if it ignores chances to further 
 # optimize its path. 

My understanding is that we are dealing with a chain P -> R -> Q where R is an alias of promise P2, which resolves to Q.

The argument seems to be that performance isn't degraded (much?) because P has resolved to R, leaving us with R -> Q. Is the point that one hop is better than two? Or rather, is it that independently of P, R will also perform path shortening, and that messages will soon be routed to Q directly (i.e. the one-hop path is also short-lived)?

Having typed all this out, I am slightly more confident that it is the latter, but then this passage would contradict that conclusion:

Alice is a promise in vat A that resolves to bob in vat B, which is in turn a promise that resolves to carol in vat C. If the app is calling methods on alice, it seems like calls would needlessly continue to route through bob. I suppose we could change the way Client.Resolve works so that it will actually give you the final client, but it's a shame this can't be transparent to the app. But maybe this is just a limitation of the protocol that we have to live with.

@lthibault
Collaborator

P.S.: maybe it would make sense for @kentonv to weigh in here?

@zenhack
Contributor Author

zenhack commented Mar 29, 2023

Yeah, that's kinda what I'm puzzling over as well; it seems like the docs essentially assume something at a higher level of abstraction is going to do the "deep" shortening, and I'm fuzzy on what that is supposed to be. Would definitely appreciate @kentonv's thoughts.

Also, from talking to ocapn folks, I have learned that E didn't have anything like disembargos -- it just punted and took the latency hit of having to queue calls locally and wait for all prior calls on the promise to return before actually sending them. It is more obvious to me how that would work.

@kentonv
Member

kentonv commented Mar 31, 2023

Sorry, I don't follow what is the point of confusion here.

If you have a path: P -> Q -> R -> S

Now you want to resolve Q -> R, to make it: P -> R -> S

The rule here is, once you have decided this, you cannot subsequently decide that you want to resolve Q directly to S instead. Q is now permanently a proxy that relays to R. But this is OK because Q won't be around much longer anyway; as soon as P has received the message that Q should be resolved to R, then P will stop sending messages to Q. Moreover, P can then discover that R should further resolve to S, and then P can begin talking directly to S.

FWIW, this issue applies even in two-party scenarios, when you have promise chains pointing back and forth between the two parties.

@lthibault
Collaborator

@kentonv I think you've just cleared it up (for me at least). Thank you!

@kentonv
Member

kentonv commented Mar 31, 2023

I think a point of confusion here is, @zenhack is imagining that Q has clients other than P. That is not the case here. I am using Q to designate a specific export on a specific connection, where the other end of that connection is P's vat.

@kentonv
Member

kentonv commented Mar 31, 2023

Let's label the edges:

P -q-> Q -r-> R -s-> S

Here, capital letters are objects, lower-case letters are specific exports over specific connections connecting those objects.

Q was originally a promise but has resolved to R. It informs P of this replacement.

Now, the rule is: all future messages arriving over the edge q must be forwarded directly to r.
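As an illustration only, the rule can be pictured as an export table entry that is pinned to its first resolution. The `exportEntry` type and its `resolve` method below are hypothetical and are not taken from either the C++ or the Go implementation; the strings simply reuse the edge labels from the diagram above.

```go
package main

import "fmt"

// exportEntry is a hypothetical export table slot for the edge q.
type exportEntry struct {
	target   string // where messages arriving over this edge are forwarded
	resolved bool   // true once the peer has been told about a resolution
}

// resolve pins the entry to its first resolution. Any later attempt to
// re-point the same edge is refused, so q keeps forwarding to r even if R
// itself later resolves to S.
func (e *exportEntry) resolve(to string) bool {
	if e.resolved {
		return false
	}
	e.target = to
	e.resolved = true
	return true
}

func main() {
	q := &exportEntry{target: "Q"}

	fmt.Println(q.resolve("r")) // true: Q resolved to R, P is informed
	fmt.Println(q.resolve("s")) // false: q may not be shortened again
	fmt.Println(q.target)       // r
}
```

P is still free to learn separately that R resolves to S and then talk to S directly; only the old edge q stays fixed.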

@zenhack
Contributor Author

zenhack commented Mar 31, 2023

Ok, yeah I think part of what I'm snagging on is that the distinction between objects and specific exports in go-capnp makes this hard, and the implementation will probably need to be tweaked to facilitate it. The entries in a connection's export table literally store the same object that an app would use to make calls. So it sounds like the way Clients currently work in go-capnp for user code is fine, but we need to store something else in the tables for use by the RPC system, which behaves differently. Does that seem like I'm on the right track?

@zenhack
Contributor Author

zenhack commented Mar 31, 2023

(Just looked at the C++ implementation, I see that the exports table stores a ClientHook, not a Client, so yeah I think I get it now)

@kentonv
Member

kentonv commented Mar 31, 2023

Yeah.

In C++, an imported RPC promise capability is actually implemented using two layers: The inner layer is a ClientHook targeting the specific import slot, and the outer layer waits for resolution and then redirects to the resolved destination.

The export table just contains ClientHooks. But the way we implement this rule is, once we've sent out a message indicating that an export should be resolved, we update the export table entry to point directly at the inner ClientHook, whereas previously it pointed to the outer one.

Specifically in resolveExportedPromise:

      // Get the innermost ClientHook backing the resolved client.  This includes traversing
      // PromiseClients that haven't resolved yet to their underlying ImportClient or
      // PipelineClient, so that we get a remote promise that might resolve later.  This is
      // important to make sure that if the peer sends a `Disembargo` back to us, it bounces back
      // correctly instead of going to the result of some future resolution.  See the documentation
      // for `Disembargo` in `rpc.capnp`.
      resolution = getInnermostClient(*resolution);

Of course, other implementations are possible.
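For readers following along in Go, here is a loose sketch of the two-layer idea described above. All of the names (`clientHook`, `importClient`, `localClient`, `promiseClient`, `innermost`) are hypothetical and chosen for this example; this is a simplified model of the behaviour the C++ comment describes, not a port of the C++ code and not go-capnp's API.

```go
package main

import "fmt"

// clientHook is a hypothetical minimal interface for this sketch.
type clientHook interface {
	describe() string
}

// importClient is the inner layer: it targets one specific import slot.
type importClient struct{ importID uint32 }

func (c *importClient) describe() string { return fmt.Sprintf("import %d", c.importID) }

// localClient stands in for a settled, non-promise capability.
type localClient struct{ name string }

func (c *localClient) describe() string { return "local " + c.name }

// promiseClient is the outer layer: it wraps the import slot and redirects
// to the resolved destination once one is known.
type promiseClient struct {
	inner    clientHook // hook for the import slot
	resolved clientHook // nil until the promise resolves
}

func (p *promiseClient) describe() string {
	if p.resolved != nil {
		return p.resolved.describe()
	}
	return p.inner.describe()
}

// innermost peels off promise wrappers. A resolved wrapper is followed to
// its resolution; an unresolved one is followed to its underlying import,
// so the result is never an outer redirecting layer.
func innermost(h clientHook) clientHook {
	for {
		p, ok := h.(*promiseClient)
		if !ok {
			return h
		}
		if p.resolved != nil {
			h = p.resolved
		} else {
			h = p.inner
		}
	}
}

func main() {
	imp := &importClient{importID: 7}
	outer := &promiseClient{inner: imp}

	// Unresolved: the export table entry would point at the import slot.
	fmt.Println(innermost(outer).describe()) // import 7

	// Resolved: the wrapper is followed to its destination.
	outer.resolved = &localClient{name: "S"}
	fmt.Println(innermost(outer).describe()) // local S
}
```

The point of storing the innermost hook in the export table is that a later resolution of the outer layer cannot silently change where that export forwards.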

@zenhack
Contributor Author

zenhack commented Mar 31, 2023

Makes sense. Thanks for your help!
