Change client resolution semantics to match what's needed for 3PH #494

zenhack · 2023-03-29T04:56:15Z

Background from rpc.capnp:

Lines 686 to 709 in 1a829fd

    
    # *The Tribble 4-way Race Condition*
    #
    # Any implementation of promise resolution and embargos must be aware of what we call the
    # "Tribble 4-way race condition", after Dean Tribble, who explained the problem in a lively
    # Friam meeting.
    #
    # Embargos are designed to work in the case where a two-hop path is being shortened to one hop.
    # But sometimes there are more hops. Imagine that Alice has a reference to a remote promise P1
    # that eventually resolves to _another_ remote promise P2 (in a third vat), which _at the same
    # time_ happens to resolve to Bob (in a fourth vat). In this case, we're shortening from a 3-hop
    # path (with four parties) to a 1-hop path (Alice -> Bob).
    #
    # Extending the embargo/disembargo protocol to be able to shorted multiple hops at once seems
    # difficult. Instead, we make a rule that prevents this case from coming up:
    #
    # One a promise P has been resolved to a remote object reference R, then all further messages
    # received addressed to P will be forwarded strictly to R. Even if it turns out later that R is
    # itself a promise, and has resolved to some other object Q, messages sent to P will still be
    # forwarded to R, not directly to Q (R will of course further forward the messages to Q).
    #
    # This rule does not cause a significant performance burden because once P has resolved to R, it
    # is expected that people sending messages to P will shortly start sending them to R instead and
    # drop P. P is at end-of-life anyway, so it doesn't matter if it ignores chances to further
    # optimize its path.

Right now, when a capability is resolved, we overwrite the internal clientHook. This has a couple of implications:

1. The fact that the client was a promise at some point in the past is forgotten.
2. We eventually end up shortening any potentially multi-hop paths that arise.

(2) sounds like a good thing initially, but per the docs above this will likely be problematic in a level 3 implementation. Also, (1) is probably not ok either, since if we are to avoid multi-hop shortening, sendCap() needs to continue to encode the capability as the original promise.

My current thinking is that we need to refactor the internals of Client so that on resolution, the initial clientHook is not dropped, and after resolution we always invoke the first resolution, rather than walking as deep into the chain as possible the way resolveHook() does. I think this will work for basic correctness, but I see two downsides:

1. I can envision a scenario where alice is a promise in vat A that resolves to bob in vat B, which is in turn a promise that resolves to carol in vat C. If the app is calling methods on alice, it seems like calls would needlessly continue to route through bob. I suppose we could change the way Client.Resolve works so that it will actually give you the final client, but it's a shame this can't be transparent to the app. But maybe this is just a limitation of the protocol that we have to live with.
2. If the caller doesn't explicitly drop the promise, we could build up chains of promises routing calls around the network uselessly. If this were done automatically by the library it would not be a concern, but I don't love requiring the app to deal with it.
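The refactor proposed above can be sketched roughly as follows. This is only an illustration under assumptions: `hook`, `client`, `resolve`, `callTarget`, and `wireIdentity` are hypothetical names invented for the sketch, not go-capnp's actual internals. It shows a client that keeps its original hook, pins itself to the first resolution, and still identifies itself as the original promise when encoded.

```go
package main

import "fmt"

// hook is a hypothetical stand-in for a clientHook; only its name matters here.
type hook struct{ name string }

// client keeps the hook it started with instead of overwriting it.
type client struct {
	original *hook // the promise this client began as; never dropped
	first    *hook // the first resolution; nil while still pending
}

// resolve records only the first resolution. Later resolutions further down
// the chain are deliberately ignored by this client.
func (c *client) resolve(to *hook) {
	if c.first == nil {
		c.first = to
	}
}

// callTarget is where calls on this client are delivered.
func (c *client) callTarget() *hook {
	if c.first != nil {
		return c.first
	}
	return c.original
}

// wireIdentity is what a sendCap-like encoder would describe (an assumption
// of this sketch): still the original promise, even after resolution.
func (c *client) wireIdentity() *hook { return c.original }

func main() {
	alice := &client{original: &hook{name: "alice"}}
	alice.resolve(&hook{name: "bob"})
	alice.resolve(&hook{name: "carol"}) // ignored: alice is pinned to bob
	fmt.Println(alice.callTarget().name, alice.wireIdentity().name) // bob alice
}
```

The first downside listed above is visible directly: calls on alice keep landing on bob even though carol is already known.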

I want to think a bit more to see if there's a way we can keep this transparent; I have some ideas to investigate that may not pan out.

@lthibault interested in your thoughts.

zenhack · 2023-03-29T05:04:23Z

One notion I have is that it might help to introduce more asymmetry between SendCall and RecvCall. Initially these were just a question of who allocated the memory, with Recv being more appropriate for calls coming from the network and Send more appropriate for calls going to it. Then, when we added flow control support, we made it so that Send respects the FlowController but Recv does not, per the different use cases. Maybe we can get away with doing a similar thing here: when app code makes calls (using Send) we deep-shorten (using resolveHook internally), but when the RPC system makes calls (using Recv) we don't, and just pass the call down the chain naively. I need to think this through, but this might give us what we want.
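A rough sketch of that asymmetry, under stated assumptions: the `hop` and `client` types and the `sendCall`/`recvCall` methods below are hypothetical and only mimic the idea, not the real SendCall/RecvCall signatures. Each hop records the calls that pass through it, so the difference in routing is observable.

```go
package main

import "fmt"

// hop is a hypothetical link in a promise chain.
type hop struct {
	name     string
	resolved *hop     // nil until this hop resolves to another one
	seen     []string // methods that passed through this hop
}

// deliver records the call here and forwards it exactly one hop further.
func (h *hop) deliver(method string) {
	h.seen = append(h.seen, method)
	if h.resolved != nil {
		h.resolved.deliver(method)
	}
}

// client wraps the first hop of a chain.
type client struct{ first *hop }

// sendCall models an app-originated call: walk to the end of the chain
// first (deep shortening), then deliver there directly.
func (c client) sendCall(method string) {
	h := c.first
	for h.resolved != nil {
		h = h.resolved
	}
	h.deliver(method)
}

// recvCall models a network-originated call: hand it to the first hop and
// let each hop forward it naively.
func (c client) recvCall(method string) { c.first.deliver(method) }

func main() {
	carol := &hop{name: "carol"}
	bob := &hop{name: "bob", resolved: carol}
	alice := &hop{name: "alice", resolved: bob}
	c := client{first: alice}

	c.sendCall("fromApp") // lands on carol only
	c.recvCall("fromNet") // passes through alice, bob, carol in order
	fmt.Println(alice.seen, bob.seen, carol.seen) // [fromNet] [fromNet] [fromApp fromNet]
}
```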

lthibault · 2023-03-29T19:13:02Z

There is something I don't quite understand.

    # This rule does not cause a significant performance burden because once P has resolved to R, it
    # is expected that people sending messages to P will shortly start sending them to R instead and
    # drop P. P is at end-of-life anyway, so it doesn't matter if it ignores chances to further
    # optimize its path.

My understanding is that we are dealing with a chain P -> R -> Q where R is an alias of promise P2, which resolves to Q.

The argument seems to be that performance isn't degraded (much?) because P has resolved to R, leaving us with R -> Q. Is the point that one hop is better than two? Or rather, is it that independently of P, R will also perform path shortening, and that messages will soon be routed to Q directly (i.e. the one-hop path is also short-lived)?

Having typed all this out, I am slightly more confident that it is the latter, but then this passage would contradict that conclusion:

Alice is a promise in vat A resolves to bob in vat B, which is in turn a promise which resolves to carol in vat C. If the app is calling methods on alice, it seems like calls would needlessly continue to route through bob. I suppose we could change the way Client.Resolve works so that it will actually give you the final client, but it's a shame this can't be transparent to the app. But maybe this is just a limitation of the protocol that we have to live with.

lthibault · 2023-03-29T19:15:11Z

P.S.: maybe it would make sense for @kentonv to weigh in here?

zenhack · 2023-03-29T20:28:09Z

Yeah, that's kinda what I'm puzzling over as well; it seems like the docs essentially assume something at a higher level of abstraction is going to do the "deep" shortening, and I'm fuzzy on what that is supposed to be. Would definitely appreciate @kentonv's thoughts.

Also, from talking to ocapn folks, I have learned that E didn't have anything like disembargos -- it just punted and took the latency hit of having to queue calls locally and wait for all prior calls on the promise to return before actually sending them. It is more obvious to me how that would work.

kentonv · 2023-03-31T18:16:39Z

Sorry, I don't follow what is the point of confusion here.

If you have a path: P -> Q -> R -> S

Now you want to resolve Q -> R, to make it: P -> R -> S

The rule here is, once you have decided this, you cannot subsequently decide that you want to resolve Q directly to S instead. Q is now permanently a proxy that relays to R. But this is OK because Q won't be around much longer anyway; as soon as P has received the message that Q should be resolved to R, then P will stop sending messages to Q. Moreover, P can then discover that R should further resolve to S, and then P can begin talking directly to S.
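To make the sequence concrete, here is a small simulation of that rule. It is a sketch under assumptions: the `caller` and `pins` types are invented for illustration, and vats are reduced to string names. P retargets one hop at a time as resolution notices arrive, while Q's forwarding destination is fixed the first time it is set.

```go
package main

import "fmt"

// caller models P: it tracks the one target it currently sends messages to.
type caller struct{ target string }

// onResolve applies a "from has resolved to to" notice. P only retargets
// if it is still sending to from.
func (p *caller) onResolve(from, to string) {
	if p.target == from {
		p.target = to
	}
}

// pins records, per proxy, the single destination it forwards to.
type pins map[string]string

// pin fixes a proxy's destination the first time it resolves and reports
// whether the pin was applied; later attempts to re-point it are refused.
func (t pins) pin(proxy, dest string) bool {
	if _, done := t[proxy]; done {
		return false
	}
	t[proxy] = dest
	return true
}

func main() {
	p := &caller{target: "Q"}
	table := pins{}

	table.pin("Q", "R") // Q resolves to R and tells P
	p.onResolve("Q", "R")

	table.pin("R", "S") // R resolves to S and tells P
	p.onResolve("R", "S")

	fmt.Println(table.pin("Q", "S"))  // false: Q stays a relay to R
	fmt.Println(p.target, table["Q"]) // S R
}
```

P ends up talking to S directly, and Q never needed to shorten more than one hop.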

FWIW, this issue applies even in two-party scenarios, when you have promise chains pointing back and forth between the two parties.

lthibault · 2023-03-31T18:23:53Z

@kentonv I think you've just cleared it up (for me at least). Thank you!

kentonv · 2023-03-31T18:32:08Z

I think a point of confusion here is, @zenhack is imagining that Q has clients other than P. That is not the case here. I am using Q to designate a specific export on a specific connection, where the other end of that connection is P's vat.

kentonv · 2023-03-31T18:36:25Z

Let's label the edges:

P -q-> Q -r-> R -s-> S

Here, capital letters are objects, lower-case letters are specific exports over specific connections connecting those objects.

Q was originally a promise but has resolved to R. It informs P of this replacement.

Now, the rule is: all future messages arriving over the edge q must be forwarded directly to r.

zenhack · 2023-03-31T18:43:07Z

Ok, yeah I think part of what I'm snagging on is that the distinction between objects and specific exports in go-capnp makes this hard, and the implementation will probably need to be tweaked to facilitate it. The entries in a connection's export table literally store the same object that an app would use to make calls. So it sounds like the way Clients currently work in go-capnp for user code is fine, but we need to store something else in the tables for use by the RPC system, something that behaves differently. Does that seem like I'm on the right track?

zenhack · 2023-03-31T18:47:54Z

(Just looked at the C++ implementation, I see that the exports table stores a ClientHook, not a Client, so yeah I think I get it now)

kentonv · 2023-03-31T18:49:07Z

Yeah.

In C++, an imported RPC promise capability is actually implemented using two layers: The inner layer is a ClientHook targeting the specific import slot, and the outer layer waits for resolution and then redirects to the resolved destination.

The export table just contains ClientHooks. But the way we implement this rule is, once we've sent out a message indicating that an export should be resolved, we update the export table entry to point directly at the inner ClientHook, whereas previously it pointed to the outer one.

Specifically in resolveExportedPromise:

      // Get the innermost ClientHook backing the resolved client.  This includes traversing
      // PromiseClients that haven't resolved yet to their underlying ImportClient or
      // PipelineClient, so that we get a remote promise that might resolve later.  This is
      // important to make sure that if the peer sends a `Disembargo` back to us, it bounces back
      // correctly instead of going to the result of some future resolution.  See the documentation
      // for `Disembargo` in `rpc.capnp`.
      resolution = getInnermostClient(*resolution);

Of course, other implementations are possible.
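For readers thinking about this from the Go side, the two-layer arrangement described above might be sketched like this. Everything here is an assumption made for illustration: `clientHook`, `importClient`, `promiseClient`, `innermost`, and `resolveExport` are hypothetical names, and `innermost` is simplified to always unwrap to the inner hook. It is neither the C++ code nor go-capnp's API.

```go
package main

import "fmt"

// clientHook is a hypothetical minimal interface for this sketch.
type clientHook interface{ describe() string }

// importClient is the inner layer: it targets one specific import slot.
type importClient struct{ importID uint32 }

func (c *importClient) describe() string { return fmt.Sprintf("import #%d", c.importID) }

// promiseClient is the outer layer: it redirects to the resolved
// destination once one is known, and to the inner hook until then.
type promiseClient struct {
	inner    clientHook
	resolved clientHook // nil until a resolution arrives
}

func (c *promiseClient) describe() string {
	if c.resolved != nil {
		return c.resolved.describe()
	}
	return c.inner.describe()
}

// innermost strips outer promise layers (simplified: always unwraps to the
// inner hook, regardless of resolution state).
func innermost(h clientHook) clientHook {
	for {
		p, ok := h.(*promiseClient)
		if !ok {
			return h
		}
		h = p.inner
	}
}

// resolveExport re-points an export table entry at the innermost hook of
// its resolution, so later resolutions of the outer layer do not affect it.
func resolveExport(exports map[uint32]clientHook, id uint32, resolution clientHook) {
	exports[id] = innermost(resolution)
}

func main() {
	r := &promiseClient{inner: &importClient{importID: 7}}
	exports := map[uint32]clientHook{1: r}

	resolveExport(exports, 1, r)              // entry now points at the inner layer
	r.resolved = &importClient{importID: 9}   // R later resolves to S
	fmt.Println(exports[1].describe())        // import #7
	fmt.Println(r.describe())                 // import #9
}
```

App-facing code holding `r` follows the later resolution, while the export table entry keeps forwarding to the import slot it was pinned to.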

zenhack · 2023-03-31T18:52:34Z

Makes sense. Thanks for your help!

zenhack mentioned this issue Mar 31, 2023

Handle incoming resolve messages. #480

Closed
