When do CIDs expire if using Unique Path ID? #313

Open
huitema opened this issue Mar 16, 2024 · 16 comments

Comments

@huitema
Contributor

huitema commented Mar 16, 2024

In the Unique Path ID model, the creation of a CID is always linked to the creation of a potential path. CIDs are managed using two frames:

  • MP_NEW_CONNECTION_ID with parameters CID value, path ID, sequence, and retire prior.
  • MP_RETIRE_CID with parameters path ID and sequence.

Path IDs are not supposed to live forever. If a path is abandoned, the path ID will not be reused, because of potential ambiguity: reusing path ID and sequence numbers could break encryption, and when packets arrive for an abandoned path ID after the path state has been deleted, there is no way to distinguish new packets from copies of old packets. Yet, as long as there is a valid CID available, nothing prevents a node from using it and sending packets with the corresponding path_id.

The sender of the MP_NEW_CONNECTION_ID can only force retirement of old path IDs by sending an MP_NEW_CONNECTION_ID frame and setting the "retire prior" value, but doing so creates a new CID for the old path and does not actually "retire all CID for the path".
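To make that limitation concrete, here is a minimal Python sketch of the two frames and of an issuer's per-path CID table. The class names, field names and the `IssuedCids` helper are assumptions made for illustration only; they are not the wire format from the draft, and the table applies "retire prior" immediately instead of waiting for the peer's MP_RETIRE_CID.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class MpNewConnectionId:
    # Hypothetical in-memory view of the frame parameters listed in the issue.
    path_id: int
    sequence: int
    retire_prior_to: int
    cid: bytes


@dataclass(frozen=True)
class MpRetireCid:
    path_id: int
    sequence: int


@dataclass
class IssuedCids:
    """CIDs this endpoint has issued, keyed by (path_id, sequence)."""
    active: dict = field(default_factory=dict)

    def issue(self, frame: MpNewConnectionId) -> None:
        # "Retire prior" only removes older sequence numbers on the same path.
        stale = [key for key in self.active
                 if key[0] == frame.path_id and key[1] < frame.retire_prior_to]
        for key in stale:
            del self.active[key]
        # The CID carried by the frame itself always becomes active.
        self.active[(frame.path_id, frame.sequence)] = frame.cid

    def retire(self, frame: MpRetireCid) -> None:
        self.active.pop((frame.path_id, frame.sequence), None)

    def sequences_for_path(self, path_id: int) -> list:
        return sorted(seq for (pid, seq) in self.active if pid == path_id)
```

Whatever value of `retire_prior_to` the issuer picks, `issue()` leaves at least the newly carried CID active for that path, which is the gap described above.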

In the current specification, these "old path CIDs" can only be retired by the peer, which could send a series of MP_RETIRE_CID frames. The local node knows that the path is abandoned, so it could refrain from providing new CIDs to replace them. But nothing forces the peer to do that.

We are missing a way to say "retire all CIDs for that path". One way would be to state that as a side effect of "PATH_ABANDON": the peer should immediately retire all CIDs tied to that path. We should also specify that a node MUST retain knowledge of all valid CIDs until they have been retired -- otherwise we get issues like a peer sending packets with old CID values, resulting in a stateless reset. Instead, a node that receives a packet with an old CID for an abandoned path should probably just ignore it.

Maybe there are simpler solutions, but the current behavior is under-specified.

@gloinul

gloinul commented Mar 17, 2024

My understanding of the current specification is that each endpoint needs to close the path in the direction towards itself by first sending the path_abandon, then waiting 3*PTO, then sending the retire CID for all the CIDs used by that path. And if the peer doesn't send a path_abandon back to you when you have sent one, this results in a path that is open in only one direction. That is probably not the situation we want.

@gloinul

gloinul commented Mar 17, 2024

I would suggest that, for the path close scenario, one first says stop sending, i.e. path_abandon. Then a path_close retires the path in this direction and all related CIDs. This could actually be a frame called path_closed which references a path index.

Then, when it comes to cycling the CIDs used by a path, one needs an MP_NEW_CONNECTION_ID frame to add new CIDs to a path. This can retire older ones as one moves forward. A question is whether one needs an MP_RETIRE_CONNECTION_ID that can deal with path + CID addressing properly, as the single path sequence number assumption does not hold when one has explicit path IDs.

@Yanmei-Liu
Contributor

Record comments here:

marten-seemann 4 days ago

Should PATH_ABANDON automatically retire the connection IDs, without the need to send any frames? It would be nice if we don't have to deal with a malicious / misbehaving peer that abandons a path, but keeps the CIDs active.

michael-eriksson 4 days ago

Yes, it should! Or at least the complete closing of a path, with dual PATH_ABANDON frames should.

@huitema
Contributor Author

huitema commented Mar 21, 2024

There is a potential issue with "implicit retire". Suppose the client sends "Abandon path 1", and after some delay, maybe 3*RTO, implicitly retires the CIDs that it sent for path 1. And then suppose that the server was somehow slow in receiving and processing the PATH_ABANDON, and has kept sending 1-RTT packets over path 1. Once the CIDs have been implicitly retired, they will elicit a Stateless Reset, which may well cause the server to abandon the whole connection.

I think we could solve that by redefining the "Path Abandon" frame. Currently, it only imposes actions on its recipient: stop sending. Going implicit with that creates a race condition, because the sender will proceed "as if the recipient has done them"... but can only guess whether the recipient has in fact processed the frame.

If we reverse the meaning from "please do not send" to "I will not send anymore", the race condition disappears. The recipient can act on that immediately, because it knows that the sender has already stopped sending. If the recipient replies with its own "Path Abandon", the loop is closed, and there is no ambiguity. The nodes may still want to wait 3*RTO after Path Abandon frames have been sent and received to ensure there are no lingering frames in the network, but the race condition has disappeared.

@MikeBishop
Contributor

I had a similar question reading the draft, and while @huitema's suggestion here simplifies dealing with CID retirement, I'm not certain it fixes the issue with ABANDON in general. I agree that abandoning a path means two things:

  • I will no longer send on this path, and all CIDs you issued for it have been retired
  • I will shortly stop accepting packets on this path, so stop sending them

The sender of ABANDON will not process any Stateless Resets that might arrive, since it already considers the CIDs and tokens from that path to be invalid. However, there still needs to be that loop closure before it can safely start sending Stateless Resets in response to straggler packets, and probably still a draining period.

@huitema
Contributor Author

huitema commented May 25, 2024

In the last interop, we pretty much concluded that the robust solution was to have Abandon sent by both endpoints, either simultaneously or one in response to the other. The abandon means "I won't send on this path, please consider the CIDs that you sent me as retired, and please send an Abandon frame of your own if you have not already done so." If both ends do that, the path is effectively wiped. You probably need to keep some memory for some time before serving stateless resets, e.g., because of out of order deliveries. The usual timer for this kind of thing is 3*PTO.

Another reason to keep memory is ACK_MP frames. It makes sense to send them even for a path that has been abandoned, otherwise the packets in flight for that path will be considered lost.

@huitema
Contributor Author

huitema commented May 25, 2024

We could make a case that Abandon frames must not be sent on the path that they are abandoning.

@mirjak
Collaborator

mirjak commented May 28, 2024

Actually, and that is clear in the draft, abandon only means

"I will shortly stop accepting packets on this path, so stop sending them"

After sending the abandon frame you have to wait for 3 PTOs and then send RETIRE_CID frames, which then means:

"I will no longer send on this path, and all CIDs you issued for it have been retired"

You can still send ACK frames and of course the RETIRE_CID frames on that path during those 3 PTOs. However, you should not send any new data anymore.

@mirjak
Collaborator

mirjak commented May 28, 2024

#365 maybe clarifies this. Hopefully...?

@MikeBishop
Contributor

Actually, and that is clear in the draft, abandon only means

But I think the point of this issue is that we might want to change what it means. Let's have that discussion before clarifying what it currently means.

@mirjak
Collaborator

mirjak commented May 29, 2024

Sure, we can have a discussion to change it. I'm just saying that it is currently clearly defined (because we had some lengthy discussion about this already). However, with the explicit path ID, things might be different now and we can reconsider.

@huitema
Contributor Author

huitema commented May 29, 2024

(Copied from #367:)

The PATH_ABANDON ought to be a commitment to not send data anymore on the specified path. Not sending data on the path implicitly means that the endpoint will not use any of the MP_NEW_CID sent by the peer for the abandoned path. De facto, that's equivalent to retiring these CIDs: they will not be used anymore. Sending explicit MP_RETIRE_CID is just overhead, or maybe worse than overhead.

In monopath QUIC, sending a RETIRE_CID is useful because it can trigger production by the peer of a NEW_CID. But in the PATH_ABANDON scenario, there is no point: if the peer produced more MP_NEW_CID frames, they would just be unused. Since there is no point, some implementations may be tempted to not do it. The endpoint could just silently drop the unused CIDs, and would not suffer any consequence. The peer, on the other hand, will be stuck with a list of "zombie CIDs" that have no utility and just consume memory. So, if we look at the endpoint that receives the path Abandon, its interest is to free resources as soon as possible, whether MP_RETIRE_CID frames are received after the path Abandon or not.

The main synchronization issue is the "stateless reset" risk. When an endpoint sends a PATH_ABANDON, there may be some packets in transit that will be delivered out of order. If the peer has deleted all knowledge of the path-associated CIDs, the CID in the out of order packets will not be recognized, and the peer will send a "stateless reset" packet.

We could specify a complex solution to eliminate the stateless reset risk, such as asking endpoints to retain knowledge of the old CIDs for 3*PTO after receiving the PATH_ABANDON. That will reduce the risk of generating stateless reset packets, but it will not eliminate it. But then, stateless reset packets only have an effect if they can be verified, i.e. if the last 16 bytes match the "Stateless Reset Token" of a valid CID for the connection. If the sender of the path abandon has freed all resources associated with the CIDs for the abandoned path ID, it will also have deleted the corresponding stateless reset tokens. The stateless reset packet sent by the peer thus will not be recognized as a valid stateless reset, and will have no effect on the state of the connection.
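The argument about reset tokens can be sketched as follows. This is only an illustration: `ResetTokenStore` and its methods are hypothetical names, and the only fact taken from QUIC itself is that a stateless reset is recognized by comparing the trailing 16 bytes of a datagram against the tokens the receiver still remembers.

```python
import hmac

RESET_TOKEN_LEN = 16  # Stateless reset tokens are 16 bytes in RFC 9000.


class ResetTokenStore:
    """Tokens received from the peer, grouped by path ID (hypothetical helper)."""

    def __init__(self) -> None:
        self._tokens_by_path = {}  # path_id -> set of 16-byte tokens

    def add(self, path_id: int, token: bytes) -> None:
        if len(token) != RESET_TOKEN_LEN:
            raise ValueError("stateless reset token must be 16 bytes")
        self._tokens_by_path.setdefault(path_id, set()).add(token)

    def drop_path(self, path_id: int) -> None:
        # Called when we send the path abandon and free the peer-issued CIDs.
        self._tokens_by_path.pop(path_id, None)

    def is_stateless_reset(self, datagram: bytes) -> bool:
        if len(datagram) < RESET_TOKEN_LEN:
            return False
        tail = datagram[-RESET_TOKEN_LEN:]
        return any(
            hmac.compare_digest(tail, token)
            for tokens in self._tokens_by_path.values()
            for token in tokens
        )
```

Once `drop_path()` has run for the abandoned path, a reset carrying one of its tokens no longer matches anything and is treated like any other undecryptable datagram.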

Thus, I think that the proper behavior should be:

  • When sending a path abandon frame, immediately free the resources associated with the CIDs received from the peer for that path.

  • When receiving a path abandon frame, assume that it carries an implicit "retire CID" for all the CIDs sent to the peer, and free the corresponding resources after a short delay, maybe 3*PTO.
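A minimal sketch of these two rules, assuming integer millisecond timestamps that the caller passes in explicitly. `PathCidState` and its attribute names are invented for this example, and the 3*PTO delay is the "maybe" value suggested above, not a settled number.

```python
class PathCidState:
    """Per-connection CID bookkeeping for the two rules above (illustrative)."""

    def __init__(self, pto_ms: int) -> None:
        self.pto_ms = pto_ms
        self.peer_issued = {}     # path_id -> CID sequences we may send with
        self.locally_issued = {}  # path_id -> CID sequences the peer may send with
        self.release_at_ms = {}   # path_id -> deadline for freeing locally issued CIDs

    def send_path_abandon(self, path_id: int) -> None:
        # Rule 1: we commit to not sending, so the peer's CIDs are freed at once.
        self.peer_issued.pop(path_id, None)

    def receive_path_abandon(self, path_id: int, now_ms: int) -> None:
        # Rule 2: implicit "retire CID", delayed so reordered packets still match.
        if path_id in self.locally_issued:
            self.release_at_ms.setdefault(path_id, now_ms + 3 * self.pto_ms)

    def on_timer(self, now_ms: int) -> None:
        for path_id, deadline in list(self.release_at_ms.items()):
            if now_ms >= deadline:
                self.locally_issued.pop(path_id, None)
                del self.release_at_ms[path_id]
```

Using `setdefault` means a retransmitted path abandon does not push the deadline back; whether it should is left open here.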

@mirjak
Collaborator

mirjak commented May 31, 2024

A couple more thoughts. Usually for a graceful shut-down you need two things.

  1. Wait until all in flight data has been received
  2. Retain the ability to send control frames, e.g. ACKs

That's what the current design tries to support. You indicate your "interest" in closing a path by telling the other end to stop sending by sending the path_abandon frame, then you wait for 3 PTOs until all data is received and only then you retire the CID which means you are also not able to send on the path anymore.

However, in the multipath case we only need 1) and not 2), because by definition we always still have another path on which to send control frames; otherwise it would be a connection close.

Therefore we could say path_abandon is both 1) a request to the peer to stop sending and 2) a commitment to not send anymore on the path. That means the peer sending the path_abandon frame could immediately release all CIDs it is using for sending. However, it would still need to wait 3 PTOs before it could remove the resources for the CIDs it issued to the other end, in order to receive all in-flight data. An endpoint that receives a path_abandon frame could release all CIDs immediately for both sides but would still need to remember the path state and the path ID until all ACKs for the data sent on the path are received or the data is declared lost.

@Yanmei-Liu
Contributor

I appreciate @huitema's idea of simplifying the mechanism by not requiring extra MP_RETIRE_CID frames after sending Path Abandon. And we'd also be able to have a graceful shutdown for paths if we design it like this:

  1. Endpoint A sends Path Abandon to retire the associated Path ID resources. After it sends out the Path Abandon frame, the endpoint only processes ACK_MP for in-flight packets on this path within 3 PTO. In this period, it can retransmit Path Abandon / ACK_MP on the path for graceful shutdown, and can also tell the peer that it is closing the Path if it receives any other packet on the Path. After that, it frees all the resources associated with the Path ID (including all the CIDs belonging to this Path ID).
  2. When endpoint B receives Path Abandon for a specific Path ID, it sends Path Abandon in response to inform the peer that it has also started a graceful shutdown, and also sends ACK_MP for the Path Abandon (note that these 2 frames could be encoded in the same packet for performance).
  3. After Endpoint A 1) has sent Path Abandon, 2) has received ACK_MP for the Path Abandon it sent, and 3) has received the Path Abandon sent by the peer for the same Path ID, it can release all the Path resources associated with the Path ID immediately.
  4. In the worst case, Endpoint A doesn’t receive 2) and 3) at all; it can still free the path's resources after a short delay (maybe 3*PTO). These cases can happen in the real world due to high packet loss, blackhole paths, or the peer crashing.
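Steps 3 and 4 boil down to one release condition on Endpoint A's side. The sketch below assumes integer millisecond clocks; `AbandonProgress` and `can_release_path` are illustrative names, not anything defined by the draft.

```python
from dataclasses import dataclass


@dataclass
class AbandonProgress:
    """What Endpoint A knows after sending Path Abandon for one Path ID."""
    sent_at_ms: int
    own_abandon_acked: bool = False      # ACK_MP received for our Path Abandon
    peer_abandon_received: bool = False  # peer's Path Abandon for the same Path ID


def can_release_path(progress: AbandonProgress, now_ms: int, pto_ms: int) -> bool:
    # Step 3: both halves of the exchange are complete, release immediately.
    exchange_complete = (progress.own_abandon_acked
                         and progress.peer_abandon_received)
    # Step 4: fallback for loss, blackholed paths or a crashed peer.
    timed_out = now_ms - progress.sent_at_ms >= 3 * pto_ms
    return exchange_complete or timed_out
```

For example, with `pto_ms=100` and a Path Abandon sent at `t=0`, the path is released at `t=300` even if nothing ever arrives from the peer.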

@mirjak
Collaborator

mirjak commented Jun 3, 2024

I'm not sure you can automatically release all resources once you have received an ack for your path abandon frame and a path abandon from the peer for the same path, because packets can be re-ordered. You can implicitly retire all CIDs for the path that were issued by the peer as soon as you have received the path abandon from the peer and have sent your own path abandon (no need to wait for the ack), but you still have to remember the CIDs you issued and the path ID until all packets up to the packet with the path abandon frame are ack'ed or declared lost (after 3 PTOs). Alternatively, if we don't want to consider re-ordering, since re-ordered packets that are no longer processed could be retransmitted on another path, you can retire all CIDs and packet processing state as soon as you have received the path abandon from the peer, and you only have to remember the path ID until your abandon frame is ack'ed.

Also, I think we should always try to retransmit the path abandon, given that we have multiple paths on which to do that. As the path abandon is basically by definition the last frame you send on a path, you will always detect its loss by a time-out. That means a) that if you have to retransmit the path abandon, you always have to retransmit it on another path anyway, and b) that in this case you cannot actually free the path resources fully yet, as you have to remember the path ID to retransmit the path abandon.

There are actually a couple of corner cases here to consider:

  1. If you've sent a path abandon, then didn't receive an ack for it (yet) but you received a path abandon from the peer, that's probably a "new" path abandon request from the peer, so you should "re-transmit"/send a path abandon on another path right away in response. After receipt of the abandon from the peer you can immediately retire all CIDs that were issued by the peer for that path ID. Again, you still have to remember the path ID for retransmission until you receive an ack for your path abandon.

  2. If you received the ack for your path abandon but no path abandon from the other end, you should wait for 3 PTOs and then force path closure and remove all state for this path ID, including all the CIDs. For the ones you issued for this path ID, you could maybe send MP_RETIRE_CID frames on another path in order to enforce that the peer cannot send anymore. In this case any in-flight data with those CIDs is lost (but of course the peer can retransmit it on another path).

  3. If you've sent a path abandon and you didn't receive an ack or a path abandon from the peer, there are two cases. If the path that you've sent the path abandon on is broken, you should retransmit it on another path and wait again for 3 PTOs; if there is no other working path anymore, you should close the connection. If you didn't receive an ack for your path abandon but you received other acks on that path, you retransmit the path abandon based on the normal loss detection rule. In this case it is not clear to me whether you want to restart the 3 PTO timer or not. However, at some point you probably have to decide to simply remove all path related resources, stop retransmitting the path abandon, and stop processing any data for that path, or alternatively close the whole connection with an error, so that a peer refusing to send acks cannot force you to keep path state open forever. Also, as above, if you remove path state without having received a path abandon from the peer, you could probably still send MP_RETIRE_CID frames for all CIDs of that path.
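The corner cases above can be read as a small decision table for an endpoint that has already sent its path abandon. The sketch below is one possible reading, not a proposal: the `Action` names and the `next_action` function are made up for illustration, the first branch reflects the preceding paragraph on re-ordering, and timer handling (including whether to restart the 3 PTO timer) is deliberately left out.

```python
from enum import Enum, auto


class Action(Enum):
    DRAIN_THEN_RELEASE = auto()           # both sides done; keep state until acked or lost
    RESEND_ABANDON_ON_OTHER_PATH = auto()
    WAIT_3_PTO_THEN_FORCE_CLOSE = auto()
    RETRANSMIT_VIA_LOSS_DETECTION = auto()
    CLOSE_CONNECTION = auto()


def next_action(own_abandon_acked: bool,
                peer_abandon_received: bool,
                other_acks_seen_on_path: bool,
                other_path_available: bool) -> Action:
    if own_abandon_acked and peer_abandon_received:
        return Action.DRAIN_THEN_RELEASE
    if peer_abandon_received:
        # Case 1: retire peer-issued CIDs now, answer on another path.
        # Assumes another path exists, as the comment above does.
        return Action.RESEND_ABANDON_ON_OTHER_PATH
    if own_abandon_acked:
        # Case 2: the peer acked but never sent its own path abandon.
        return Action.WAIT_3_PTO_THEN_FORCE_CLOSE
    # Case 3: neither an ack nor a path abandon from the peer.
    if other_acks_seen_on_path:
        return Action.RETRANSMIT_VIA_LOSS_DETECTION
    if other_path_available:
        return Action.RESEND_ABANDON_ON_OTHER_PATH
    return Action.CLOSE_CONNECTION
```

A real implementation would also need the upper bound mentioned in case 3, so that a peer that never acks cannot keep the path state alive indefinitely.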

@LPardue
Member

LPardue commented Jun 3, 2024

Discussed during interim: agreement we need to do something, @huitema to write up a PR

6 participants