Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Collect timings related with TLS handshake and protocol upgrade #3647

Closed
trustin opened this issue Jun 21, 2021 · 2 comments · Fixed by #5647
Closed

Collect timings related with TLS handshake and protocol upgrade #3647

trustin opened this issue Jun 21, 2021 · 2 comments · Fixed by #5647
Labels
new feature sprint Issues for OSS Sprint participants

Comments

@trustin
Copy link
Member

trustin commented Jun 21, 2021

We should collect the timings related with TLS handshake - at least how long it took to finish TLS handshake. If a request was the first in a connection, we could also provide it in a RequestLog to tell a user that the request timing has been affected by TLS handshake.

On top of this, we could also collect timings related with protocol upgrade:

  • If upgraded with a connection preface or ALPN, it could be the latency between sending a connection preface and receiving the first SETTINGS frame from the peer.
  • If upgraded with an HTTP/1 upgrade request, it could be the latency between sending an HTTP/1 upgrade request and receiving the first SETTINGS frame from the peer.

These timings are common in both client and server, so we could take advantage of this to introduce some type hierarchy to ClientConnectionTimings:

  • ClientConnectionTimings extends ConnectionTimings and provides only client-side timings;
  • ConnectionTimings provides TLS handshake and protocol upgrade timings; and
  • we could add RequestLog.connectionTimings() that returns a ConnectionTimings which can be downcasted to a ClientConnectionTimings.
@jrhee17 jrhee17 added the sprint Issues for OSS Sprint participants label Apr 19, 2024
@jrhee17
Copy link
Contributor

jrhee17 commented Apr 21, 2024

I believe as a starting point for this issue, it is enough to

  • Add a field related to TLS handshake at ClientConnectionTimingsBuilder
    • We can worry about generalizing and creating a common interface with server-side separately
    • Although we eventually want to offer more visibility to protocol upgrade timings, TLS timings should provide enough value for this iteration.

I think we can add fields to ClientConnectionTimingsBuilder like the following:

private long tlsHandshakeStartTimeMicros;
private long tlsHandshakeStartNanos;
private long tlsHandshakeEndNanos;
private boolean tlsHandshakeEndSet;

Afterwards, we can mark the above fields whenever a TLS handshake is started or completed.

A simple walkthrough of the process of establishing the connection is as follows:

  1. The internal class which starts a client request execution is the following location:
  1. If we don't have a readily available connection, a new connection is attempted at the following location:
  1. After Bootstrap#connect is called, the call will go through the netty pipeline and eventually HttpClientPipelineConfigurator#connect will be called.
  1. Lastly, we can notice we receive a callback on TLS handshake completion

In order to pass around the ClientConnectionTimingsBuilder, using attributes and attaching them to the channel is probably the easiest approach.

static final AttributeKey<ClientConnectionTimingsBuilder> TIMINGS_BUILDER_KEY =
        AttributeKey.valueOf(HttpChannelPool.class, "TIMINGS_BUILDER_KEY");
...
// set the ClientConnectionTimingsBuilder to the channel
channel.attr(TIMINGS_BUILDER_KEY).set(timingsBuilder);

// later retrieve the builder
final ClientConnectionTimingsBuilder timingsBuilder =
        ctx.channel().attr(TIMINGS_BUILDER_KEY).get();
timingsBuilder.tlsHandshakeStart();
...

Lastly, we will want to expose the new field at ClientConnectionTimings.

Misc
One point that may not be obvious that if a HTTP2 preface upgrade is attempted and failed, armeria retries the request with HTTP1. We probably want to transfer over the ClientConnectionTimingsBuilder in this case. The handling logic is at the following location:

  • assert responseDecoder == null || !responseDecoder.hasUnfinishedResponses();
    if (sessionTimeoutFuture != null) {
    sessionTimeoutFuture.cancel(false);
    }
    if (proxyDestinationAddress != null) {
    channelPool.connect(proxyDestinationAddress, retryProtocol, serializationFormat,
    poolKey, sessionPromise);
    } else {
    channelPool.connect(remoteAddress, retryProtocol, serializationFormat, poolKey, sessionPromise);
    }

Also, this means that TLS may be attempted two times. I think it's fine to just record the last attempt for now (So if two attempts are made, we just overwrite the first attempt).

In terms of starting point, one can probably write a simple test which spins up a server, and a client makes a request via tls.

@RegisterExtension
static ServerExtension server = new ServerExtension() {
    @Override
    protected void configure(ServerBuilder sb) throws Exception {
        sb.https(0);
        sb.tlsSelfSigned();
        sb.service("/", (ctx, req) -> HttpResponse.of(200));
    }
};

@Test
void testTlsSelfSigned() {
    try (ClientFactory clientFactory =
                 ClientFactory.builder()
                              .tlsNoVerify()
                              .build()) {
        final AggregatedHttpResponse res =
                server.blockingWebClient(cb -> cb.factory(clientFactory))
                      .get("/");
        assertThat(res.status().code()).isEqualTo(200);
    }
}

reserved for @Leewongi0731

@Leewongi0731
Copy link
Contributor

Thanks. I'll try to resolve this issue!!

minwoox pushed a commit that referenced this issue May 20, 2024
Motivation:

Collect the timings related with TLS handshake. If a request was the first in a connection, armeria could also provide it in a RequestLog to tell a user that the request timing has been affected by TLS handshake.

Modifications:

- Add TLS handshake related field in `ClientConnectionTimings`
- Add TLS handshake duration metric field at MetricCollectingClient
- Start collecting the TLS handshake timer in the case below.
  - ~~If the client is enabled as `TCP fast open` in the first request, start the timer before the TCP connection.~~
  - start the timer when netty calls `SslHandler.channelActive()`

Result:

- Closes #3647
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
new feature sprint Issues for OSS Sprint participants
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants