Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Removed unused sessions from SSL_CTX internal cache #101684

Merged
merged 4 commits into from May 2, 2024

Conversation

rzikm
Copy link
Member

@rzikm rzikm commented Apr 29, 2024

Discovered when playing with #101626 and #101552.

OpenSSL keeps an internal cache for TLS sessions so that it can automatically resume previous sessions created via same SSL_CTX objects. We do our own management of TLS session tickets on clients so that we don't try to resume sessions to different hostnames, and we keep exactly one SSL_SESSION object per hostname.

However, the internal OpenSSL cache still keeps track of all the SSL_SESSION objects. A single SSL_CTX object by default will be willing to cache 20 * 1024 objects by default, in my repro, this manifested as up to about 70 MB of "leaked" memory.

The fix is to simply synchronize both caches and remove the SSL_SESSION from SSL_CTX cache when we discard it.

Copy link
Contributor

Tagging subscribers to this area: @dotnet/ncl, @bartonjs, @vcsjones
See info in area-owners.md if you want to be subscribed.

@wfurt
Copy link
Member

wfurt commented May 1, 2024

and do we see positive improvements in memory use? I worry the tests we have don't quite reproesent realistic scenario.

@rzikm
Copy link
Member Author

rzikm commented May 1, 2024

Do we know if that respects the 1x for each ticket? The report given to me claimed different memory use between 1.2 and 1.3 - all going to single server. But there were parallel requests and I'm not sure we have good test case for it.

While the change looks reasonable to me I feel we should make sure we handle all the TLS 1.3 cases as best as we can. I'm wondering if we have multiple TLS session to the same destination if we can get multiple tickets - each valid for one more resume. The current logic really maps to TLS 1.2 where we can simply keep the last session and use it later as many times as we want to.

few things:

  • the "leak" presents only when there are parallel SslStream handshakes to the same host, what I observed and tried to fix is that we get multiple new session tickets from OpenSSL, but we keep only one. But even though we FreeSession the previous one, it still has a ref in the internal SSL_CTX cache, which should be now fixed by tha call to SslCtxRemoveSession
  • even without this change, the TLS session is used multiple times if you start multiple TLS 1.3 handshakes in parallel, I checked and all these sessions get successfully resumed (at least running against Linux server), so I thought it better to preserve this behavior. I can reasonably enforce the single use constraint, I think we have TLS resume only for TLS 1.3 so it could be easy.

@rzikm
Copy link
Member Author

rzikm commented May 1, 2024

and do we see positive improvements in memory use? I worry the tests we have don't quite reproesent realistic scenario.

Yes, with the change, the repro from #101552 presents stable OpenSSL memory utilization between iterations, as measured by #101626

@rzikm rzikm changed the title Disable OpenSSL internal SSL_SESSION cache for clients Removed unused sessions from SSL_CTX internal cache May 1, 2024
Copy link
Member

@wfurt wfurt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@rzikm
Copy link
Member Author

rzikm commented May 2, 2024

CI Failures are unrelated

@rzikm rzikm merged commit ee93104 into dotnet:main May 2, 2024
94 of 98 checks passed
michaelgsharp pushed a commit to michaelgsharp/runtime that referenced this pull request May 9, 2024
* Disable OpenSSL internal SSL_SESSION cache for clients

* Attempt no. 2

* Revert "Disable OpenSSL internal SSL_SESSION cache for clients"

This reverts commit 56a308e.
@wfurt
Copy link
Member

wfurt commented May 10, 2024

/backport to release/8.0-staging

Copy link
Contributor

Started backporting to release/8.0-staging: https://github.com/dotnet/runtime/actions/runs/9036427971

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants