Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make sure the grace shutdown of HTTP/2 connections are completed #11046

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

duke8253
Copy link
Contributor

@duke8253 duke8253 commented Feb 7, 2024

This PR addresses a few issues:

  • If a HTTP2_SESSION_EVENT_FINI is received after a HTTP2_SESSION_EVENT_SHUTDOWN_INIT, the GOAWAY frame with the correct last_stream_id isn't sent out.
  • Correctly set and remove the Connection: close header for HTTP/2 (this header is only used as a internal flag to indicate a grace shutdown is needed).
  • Initiate a grace shutdown when status is 429.

@duke8253 duke8253 added the HTTP/2 label Feb 7, 2024
@duke8253 duke8253 added this to the 10.0.0 milestone Feb 7, 2024
@duke8253 duke8253 self-assigned this Feb 7, 2024
// Schedule session shutdown if response header has "Connection: close",
// then remove "Connection: close" which is a protocol violation for HTTP/2.
if (!this->_send_header.is_keep_alive_set()) {
this->_send_header.field_delete(MIME_FIELD_CONNECTION, MIME_LEN_CONNECTION);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we really need to delete the header here? I think it (and other headers that are invalid for H2) will be deleted by VersionConverter later.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it does get deleted, added that mostly for testing purpose. I'll remove that.

case VC_EVENT_INACTIVITY_TIMEOUT:
case VC_EVENT_INACTIVITY_TIMEOUT: {
if (this->connection_state.get_shutdown_state() == HTTP2_SHUTDOWN_NONE) {
this->connection_state.set_shutdown_state(HTTP2_SHUTDOWN_INITIATED, Http2ErrorCode::HTTP2_ERROR_NO_ERROR);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I may not understand the issue, but the state should not be changed directly here. It should be set by HTTP2_SESSION_EVENT_SHUTDOWN_INIT event handler. Changing the state at multiple places is going to be super confusing.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will sending this with the SHUTDOWN_CONT event below trigger SHUTDOWN_INIT? Or are you suggesting that this case should send the SHUTDOWN_INIT event instead and rely on that flow to set the appropriate state?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was suggesting scheduling SHUTDOWN_INIT event instead. But if canceling grace shutdown works we don't need it.

@maskit
Copy link
Member

maskit commented Feb 7, 2024

If a HTTP2_SESSION_EVENT_FINI is received after a HTTP2_SESSION_EVENT_SHUTDOWN_INIT, the GOAWAY frame with the correct last_stream_id isn't sent out.
If a HTTP/2 session timed out, no grace shutdown is triggered.

Aren't these immediate close? Why do we want to initiate grace shutdown on timeout?

@duke8253
Copy link
Contributor Author

duke8253 commented Feb 7, 2024

If a HTTP2_SESSION_EVENT_FINI is received after a HTTP2_SESSION_EVENT_SHUTDOWN_INIT, the GOAWAY frame with the correct last_stream_id isn't sent out.
If a HTTP/2 session timed out, no grace shutdown is triggered.

Aren't these immediate close? Why do we want to initiate grace shutdown on timeout?

We got a complaint from a property saying they don't have a way to determine whether a connection is closed, since the HTTP/2 doesn't have the Connection: close header, sending a GOAWAY when that happens should give them what they want.

The problem with the HTTP2_SESSION_EVENT_FINI is , if we started the grace shutdown, I think we should finish the that process even if HTTP2_SESSION_EVENT_FINI is received, since the first step is sending a GOAWAY with INT32_MAX.

About that timeout part, was doing some testing and wasn't sure whether we need that, left it in there to get some comments.

@maskit
Copy link
Member

maskit commented Feb 7, 2024

Ok, I wonder why the property can't check if the connection is closed on TCP layer, but let's leave it aside.

The RFC says:

A server that is attempting to gracefully shut down a connection SHOULD send an initial GOAWAY frame with the last stream identifier set to 231-1 and a NO_ERROR code. This signals to the client that a shutdown is imminent and that initiating further requests is prohibited. After allowing time for any in-flight stream creation (at least one round-trip time), the server MAY send another GOAWAY frame with an updated last stream identifier. This ensures that a connection can be cleanly shut down without losing requests.

and

If a connection terminates without a GOAWAY frame, the last stream identifier is effectively the highest possible stream identifier.

So, a client cannot expect a GOAWAY. A connection close without GOAWAY can happen. I don't think current behavior of ATS is wrong.

That said, the RFC also says:

An endpoint MAY send multiple GOAWAY frames if circumstances change. For instance, an endpoint that sends GOAWAY with NO_ERROR during graceful shutdown could subsequently encounter a condition that requires immediate termination of the connection. The last stream identifier from the last GOAWAY frame received indicates which streams could have been acted upon.

I'd take this as "A server can cancel graceful shutdown and send a GOAWAY without waiting for the 1 RTT of grace time, to close a connection immediately". This may be just a different way of saying what you do on this PR, but I think this is what we should do, and I'd take a different approach.

In most cases, send_goaway_frame() is called right before we schedule HTTP2_SESSION_EVENT_FINI event. So if ATS receives HTTP2_SESSION_EVENT_FINI, GOAWAY should be sent. I don't think we need to forcibly progress the shutdown process (we can simply cancel it) to send the second GOAWAY. The only exception is Http2ClientSession::do_io_close. It schedules the event without calling send_goaway_frame(). But if the event is scheduled in do_io_close, it's probably too late to send something anyway.

TLDR; Isn't canceling graceful shutdown process enough?

@duke8253 duke8253 marked this pull request as draft February 8, 2024 19:18
shutdown_cont_event->cancel();
}
this->_shutdown_cont();
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So cleaning up the in progress shutdown events, since we just shutdown cleanly here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, since we already started a grace shutdown, I think we should finish the whole process.

@duke8253
Copy link
Contributor Author

@maskit forgot to mention one thing, when doing a curl command, if no GOAWAY frame is sent, it reports connection left intact even when ATS says the H2 session is destroyed. I'm not sure that is the correct behavior, or should we even care about that?

@maskit
Copy link
Member

maskit commented Feb 14, 2024

Closing a connection without sending GOAWAY is not polite, but I don't think it's a violation. Depends on why the connection was closed.

Endpoints SHOULD always send a GOAWAY frame before closing a connection so that the remote peer can know whether a stream has been partially processed or not. For example, if an HTTP client sends a POST at the same time that a server closes a connection, the client cannot know if the server started to process that POST request if the server does not send a GOAWAY frame to indicate what streams it might have acted on.

An endpoint might choose to close a connection without sending a GOAWAY for misbehaving peers.

@duke8253 duke8253 force-pushed the master-h2_shutdown branch 2 times, most recently from 46f497d to a71a773 Compare March 18, 2024 16:17
@duke8253 duke8253 marked this pull request as ready for review March 18, 2024 16:17
@@ -80,6 +80,7 @@ class Http2Stream : public ProxyTransaction
void set_expect_send_trailer() override;
bool expect_receive_trailer() const override;
void set_expect_receive_trailer() override;
void set_close_connection(HTTPHdr &hdr) const override;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unused

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, nvm. I just realized it's override.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this relay Connection header value between a request and a response? I think that should not happen.

Client <-- (H2) -- ATS <-- (H1) -- Origin

When an origin server returns Connection: close, ATS should not close the connection between the client and ATS, right? It's just the origin wants to close the connection between ATS and the origin. The client should be allowed to make another request, which can be served from the cache, on the same connection.

Copy link
Contributor Author

@duke8253 duke8253 Apr 5, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This sets the Connection: close, but it doesn't decide whether the connection will be closed. HttpTransact::handle_response_keep_alive_headers actually decides whether that close will remain in the response or be removed. But before we never set it in the first place, so all the logic there does nothing for h2 connections.

heads->field_delete(MIME_FIELD_CONNECTION, MIME_LEN_CONNECTION);

It removes the connection header, and added it back if needed.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, I think I was confused by this line. It seemed like to modify a client response based on a server response, but it doesn't.

s->state_machine->get_ua_txn()->set_close_connection(*heads);

Http1Transaction and Http2Stream do the same thing in set_close_connection. It's not conditional at all. I assume Http3Transaction should do the same. Now I wonder why we don't modify the heads directly in HttpTransact. What's the reason for letting transactions to set the header? I think we should remove this confusing function.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was added in #8178 , maybe at the time we had some issues with our H2 streams and this header was causing trouble?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So we are going back to the original behavior (do the same thing for H1 and H2). Are you sure you don't have the issue that was resolved by #8178, with this change?

If we don't have to to differentiate the behavior, I prefer the original code that modifies heads directly in HttpTransact.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point, let me dig a bit deeper.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The intent with #8178 was to not drain the session if the HttpSM logic sets the connection close due to detected transaction failures. Shutting down the connection in HTTP/1 makes sense since the TCP connection may be polluted in the case of bad request bodies. However, this is not an issue with HTTP/2. An error with a request body should not require shutting down the session due to cross-contamination.

Explicitly setting the Connection: close header in the HTTP/2 processing logic when w want to close the session seems reasonable if it makes our client's lives easier. But blindly interpreting Connection: Close passed from the state machine processing will cause HTTP/2 sessions to shut down needlessly in some cases.

// Stop creating new streams
SCOPED_MUTEX_LOCK(lock, this->session->get_mutex(), this_ethread());
this->session->set_half_close_local_flag(true);
this->_shutdown_cont();
} break;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like we no longer need a scope here. Let's remove the braces.

@maskit
Copy link
Member

maskit commented Jun 4, 2024

@duke8253 What's the status of this PR? Is it ready for review?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants