Envoy's peerCertificateValidated() appears to behave inconsistently #4396

Open
kenjenkins opened this issue Jul 28, 2023 · 5 comments
@kenjenkins
Contributor

What happened?

For #4352, I thought I had configured Envoy to perform client certificate validation (as a prerequisite to #4353). I believed we could rely on the result of Envoy's peerCertificateValidated() Lua method to determine whether a request was made over a connection with a trusted client certificate. However, the behavior of this method appears to be inconsistent. For initial requests with a valid client certificate, this method returns true, but for some subsequent requests it may return false. Consequently, Pomerium may return a 495 error page even for requests made with a valid client certificate.

What did you expect to happen?

Pomerium should not return a 495 error page for requests with a valid client certificate.

How'd it happen?

  1. I configured a Pomerium route with the tls_downstream_client_ca setting, to enable client certificate validation for that particular route.
  2. I navigated to this route in the browser and selected a trusted client certificate when prompted.
  3. I refreshed the browser after waiting several minutes. Occasionally I observed a 495 error page from Pomerium stating that a valid client certificate was required.

What's your environment like?

  • Pomerium version (retrieve with pomerium --version):
    pomerium: v0.20.0-430-g6c1416fc+6c1416fc
    envoy: 1.25.5+ecf50a958e5c053e5016b994943d8e77710b8c7ddeef5bc6ca32b8ca09e7bcbc
  • Server Operating System/Architecture/Cloud: macOS

What's your config.yaml?

routes:
  - from: https://verify.localhost.pomerium.io
    to: http://localhost:8000
    pass_identity_headers: true
    allow_any_authenticated_user: true
    tls_downstream_client_ca_file: /path/to/client/CA.pem

What did you see in the logs?

{
  "level": "error",
  "error": "a valid client certificate is required to access this page",
  "status": 495,
  "status-text": "a valid client certificate is required to access this page",
  "request-id": "27e51b9d-79b8-4b7a-aa09-7a95228c97e8",
  "time": "2023-07-28T13:44:05-07:00",
  "message": "httputil: error"
}
{
  "level": "info",
  "service": "authorize",
  "request-id": "27e51b9d-79b8-4b7a-aa09-7a95228c97e8",
  "check-request-id": "27e51b9d-79b8-4b7a-aa09-7a95228c97e8",
  "method": "GET",
  "path": "/",
  "host": "verify.localhost.pomerium.io",
  "ip": "127.0.0.1",
  "session-id": "62a6eabf-2420-48b3-a8e6-3394fc77f118",
  "user": "[redacted]",
  "email": "[redacted]",
  "allow": true,
  "allow-why-true": [
    "user-ok"
  ],
  "deny": true,
  "deny-why-true": [
    "invalid-client-certificate"
  ],
  "time": "2023-07-28T13:44:05-07:00",
  "message": "authorize check"
}
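
Both entries share the same request-id, and the second one records allow and deny as true at the same time. As a reading aid, here is a minimal Go sketch that decodes such an "authorize check" entry and summarizes the outcome. The struct and function names are hypothetical, and the assumption that a matching deny rule takes precedence over allow is inferred from the 495 response rather than taken from Pomerium's source.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// authorizeLog holds only the fields of an "authorize check" log entry
// that are needed here. The type name is hypothetical.
type authorizeLog struct {
	RequestID string   `json:"request-id"`
	Allow     bool     `json:"allow"`
	AllowWhy  []string `json:"allow-why-true"`
	Deny      bool     `json:"deny"`
	DenyWhy   []string `json:"deny-why-true"`
}

// summarize decodes one JSON log entry and describes its outcome.
// Assumption: a matching deny rule overrides any allow rule.
func summarize(line []byte) (string, error) {
	var e authorizeLog
	if err := json.Unmarshal(line, &e); err != nil {
		return "", err
	}
	switch {
	case e.Deny:
		return fmt.Sprintf("request %s: denied by %s (allow=%v)",
			e.RequestID, strings.Join(e.DenyWhy, ","), e.Allow), nil
	case e.Allow:
		return fmt.Sprintf("request %s: allowed by %s",
			e.RequestID, strings.Join(e.AllowWhy, ",")), nil
	default:
		return fmt.Sprintf("request %s: no allow rule matched", e.RequestID), nil
	}
}

func main() {
	// Abbreviated copy of the second log entry above.
	entry := []byte(`{
	  "request-id": "27e51b9d-79b8-4b7a-aa09-7a95228c97e8",
	  "allow": true,
	  "allow-why-true": ["user-ok"],
	  "deny": true,
	  "deny-why-true": ["invalid-client-certificate"]
	}`)
	msg, err := summarize(entry)
	if err != nil {
		fmt.Println("decode error:", err)
		return
	}
	fmt.Println(msg)
}
```

Read this way, the user check passed ("user-ok") and the request was rejected only because of the "invalid-client-certificate" deny reason.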

Additional context

n/a

@kenjenkins kenjenkins self-assigned this Jul 28, 2023
@kenjenkins
Contributor Author

I think this may be related to TLS session resumption.

I'm unable to reproduce this issue when connecting to Pomerium via openssl s_client and issuing multiple HTTP/1.1 requests over several minutes (I consistently receive a 302 login redirect rather than a 495 error page).

$ openssl s_client -connect localhost:443 -servername verify.localhost.pomerium.io -alpn http/1.1 \
    -cert path/to/client-cert.pem -key path/to/client-cert-key.pem
[... lots of openssl output ...]
GET / HTTP/1.1
Host: verify.localhost.pomerium.io

HTTP/1.1 302 Found
content-type: text/html; charset=UTF-8
x-pomerium-intercepted-response: true
location: https://authenticate.localhost.pomerium.io/.pomerium/sign_in?[...]

[... some time later ...]

GET / HTTP/1.1
Host: verify.localhost.pomerium.io

HTTP/1.1 302 Found
content-type: text/html; charset=UTF-8
x-pomerium-intercepted-response: true
location: https://authenticate.localhost.pomerium.io/.pomerium/sign_in?[...]

However, if I add the option -reconnect, then I receive a 495 response right away:

$ openssl s_client -connect localhost:443 -servername verify.localhost.pomerium.io -alpn http/1.1 \
    -cert path/to/client-cert.pem -key path/to/client-cert-key.pem -reconnect
[... lots of openssl output ...]
GET / HTTP/1.1
Host: verify.localhost.pomerium.io

HTTP/1.1 495 Unknown
content-type: text/html; charset=UTF-8
x-pomerium-intercepted-response: true
[...]

@kenjenkins
Contributor Author

There's an open upstream issue for this problem: envoyproxy/envoy#21235. (That report involves a route match based on the client certificate validation status rather than a Lua filter.)

Disabling ticket-based session resumption by setting disable_stateless_session_resumption does not appear to fix this problem.

Some options:

  1. We could (partially) revert "authorize: incorporate mTLS validation from Envoy" (#4374), and plan to instead configure validation within Envoy only when our pending "stricter client certificate enforcement" option is enabled. We would also need to re-implement CRL validation in our ExtAuthz service if we cannot rely on Envoy's client certificate validation.

    This would open the possibility of inconsistent client certificate validation between the two enforcement options.

  2. We could investigate the feasibility of fixing this upstream bug within Envoy. Envoy appears to still have access to the full certificate chain presented by the client, even after resuming a TLS session. I don't see why Envoy couldn't re-validate the client certificate at the time of session resumption.

@desimone desimone added the NeedsMoreData Waiting for additional user feedback or case studies label Jul 31, 2023
@kenjenkins
Contributor Author

I've begun to discuss an upstream fix on envoyproxy/envoy#21235, but I'll plan to proceed with option (1) for now.

@kenjenkins
Contributor Author

@desimone and I discussed another option today, which I previously hadn't given much consideration:

My understanding is that this should completely disable TLS session resumption (and so avoid the Envoy issue). The downside is that Pomerium would no longer support TLS 1.2 whenever downstream mTLS is enabled.

It's unclear to me whether any Pomerium users have a hard requirement on TLS 1.2 support.

I've prototyped this approach here: defe0fd. Initial testing appears promising: I haven't observed a spurious 495 error page so far, and I've been unable to resume a session with openssl s_client (although testing TLS 1.3 session resumption with s_client is non-trivial, so I may not be testing this correctly).

@kenjenkins
Contributor Author

Envoy has accepted a new configuration option disable_stateful_session_resumption. I believe this will be included in the upcoming 1.28 release, scheduled for 2023-10-16 [1].

Together with the existing disable_stateless_session_resumption, this option should allow us to disable TLS session resumption entirely, for both TLS 1.2 and 1.3.

Once this new version of Envoy is released, I propose we:

  • update Pomerium's Envoy configuration to disable TLS session resumption whenever downstream_mtls is configured (or possibly only when the enforcement mode is not set to reject_connection)
  • revert e91600c (restoring the Validated boolean field in the ClientCertificateInfo struct)
  • remove our Go implementation of the certificate chain validation, CRL check, chain depth check, and SAN check, relying solely upon the validation status provided by Envoy

(Note that fully removing these checks requires that we also complete the removal of the deprecated tls_downstream_client_ca option, tracked as #4385.)
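
As a rough illustration of the first step, the sketch below prints the relevant fragment of an Envoy DownstreamTlsContext as JSON with both resumption options set. It uses only the Go standard library and a generic map; the function name is hypothetical, the real Pomerium code would presumably build this from Envoy's typed configuration structs, and every other field of the TLS context (certificates, validation context, and so on) is omitted.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// downstreamTLSContext returns a partial Envoy DownstreamTlsContext as a
// generic map. When disableResumption is true, both ticket-based (stateless)
// and session-cache-based (stateful) resumption are turned off.
// The function name and shape are illustrative only.
func downstreamTLSContext(disableResumption bool) map[string]interface{} {
	ctx := map[string]interface{}{
		"@type": "type.googleapis.com/envoy.extensions.transport_sockets.tls.v3.DownstreamTlsContext",
	}
	if disableResumption {
		ctx["disable_stateless_session_resumption"] = true
		// Expected to require the Envoy release that adds this option.
		ctx["disable_stateful_session_resumption"] = true
	}
	return ctx
}

func main() {
	// Assumption: resumption is disabled whenever downstream mTLS is configured.
	downstreamMTLSConfigured := true

	out, err := json.MarshalIndent(downstreamTLSContext(downstreamMTLSConfigured), "", "  ")
	if err != nil {
		fmt.Println("marshal error:", err)
		return
	}
	fmt.Println(string(out))
}
```

Gating both flags on a single condition keeps the two resumption mechanisms from drifting apart, which matters here because leaving either one enabled would presumably let the spurious 495 responses return.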

Footnotes

  1. see https://github.com/envoyproxy/envoy/blob/main/RELEASES.md#major-release-schedule
