Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Cancel DispatchSource before closing socket (#4791) #4859

Open
wants to merge 2 commits into
base: main
Choose a base branch
from

Conversation

lxbndr
Copy link
Contributor

@lxbndr lxbndr commented Dec 26, 2023

To avoid issues described in #4791 we have to follow DispatchSource cancel procedures in the first place. I.e. we have to prevent underlying socket close until DispatchSource calls a cancel handler.

Everything becomes complicated a bit, because we have multiple cases for socket and DispatchSource lifecycle. Cancelling Dispatch Source is asynchronous operation (with control points at the start and at the end of the process), and other actions, despite being simple and synchronous, become more complicated due to their dependency on cancel operation. Some of basic cases:

  • Socket could (but not necessary would) outlive its owning easy_handle, as CURL caches connections in multi_handle
  • 99% of time DispatchSource cancels quickly, but sometimes some other work on the queue squeezes in between the start and the end of cancelling. CURL could easily send a series of register-unregister-register requests, and we have to carefully handle possible overlapping.

This change aims to extend socket life by tying its lifecycle with boxing object (_SocketReference). Socket is not closed until _SocketReference is alive. While DispatchSource cancel process is ongoing, we're keeping such object, sharing it through a storage with the close socket function. Close socket function implementation marks _SocketReference as eligible for closing. If there is no ongoing Dispatch Source cancel, the reference is deinited immediately, effectively closing socket.

Also, this change is trying to be as less invasive as possible. Mostly additive, without structural changes. Major work is done by manipulating the state of _SocketReference. I believe there is better way to handle this, but this would probably require more extended rework of how Dispatch Sources are managed and stored.

And the fly in the ointment. This fixes most of crash scenarios on Windows, but not all. I can now say confidently that Dispatch has some flaws related to the socket processing on that platform. During stress testing I observed numerous Dispatch crashes on adding socket handle after it being reused by the system - and that is after graceful and complete cancel of corresponding DispatchSource. Luckily, it appears only on heavy load (test test_repeatedRequestsStress - marked as expected to fail), but it definitely affects final product and our users.

Also, this changeset includes tests from #4854 and timer fixes from #4858 (as it makes everything work more predictable). If this whole PR would be unreliable for some reason, I'd like to merge aforementioned changes separately.

CURL documentation (https://curl.se/libcurl/c/CURLMOPT_TIMERFUNCTION.html)
explicitly says that the timer should be one-time. We basically have to
follow CURL requests for setting, resetting and disarming such timers.

Current logic eventually leaves a 1ms repeating timer forever, because
CURL assumes it fires once, and may not ask us to remove it explicitly.

Also, being used as request timeout trigger, this timer also has no sense
to be repeated.
@lxbndr lxbndr changed the title Cancel DispatchSource before closing socket (#4791) Cancel DispatchSource before closing socket (#4791) Dec 27, 2023
@lxbndr lxbndr force-pushed the readdle/urlsession-dispatchsource-close branch from 7e9e89d to bab881e Compare December 27, 2023 12:41
self?.timeoutTimerFired()
}
timeoutSource = _TimeoutSource(queue: queue, milliseconds: milliseconds, handler: block)
//TODO: Could simply change the existing timer by using DispatchSourceTimer again.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does it has to be done or one issue will be created ?

Suggested change
//TODO: Could simply change the existing timer by using DispatchSourceTimer again.
// TODO: Could simply change the existing timer by using DispatchSourceTimer again.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can't really say on behalf on this comment author😞 This was here from the beginning. Guess it is better to leave as is to keep the diff less noisy.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, let the Todo, I thought you've added it

}
if action.needsWriteSource {
createWriteSource(socket: socket, queue: queue, handler: handler)
if (action.needsReadSource || action.needsWriteSource) {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is () is really needed ?

Suggested change
if (action.needsReadSource || action.needsWriteSource) {
if action.needsReadSource || action.needsWriteSource {

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, how did it get here? Should've cleaned this. Thanks!

@@ -99,7 +99,7 @@ class _TCPSocket: CustomStringConvertible {
listening = false
}

init(port: UInt16?) throws {
init(port: UInt16?, backlog: Int32) throws {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why did you choose Int32 ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I propagated backlog parameter from listen function signature, and it is Int32 there. I guess it is imported from C int, which is CInt in Swift, which, in turn, is the alias for Int32. 🤔 Do you think it is better to use CInt here instead?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the explanation 🙌

No for me Int32 is the good choice, I asked only for know how did you make your choice.

Copy link

@Harry-KNIGHT Harry-KNIGHT left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Simple review with questions, not blocking for merge.

Extends socket lifetime enough to let DispatchSource cancel properly.
Also prevents from creating new DispatchSources while other are in the
middle of cancelling.

Also includes tests (see apple#4854 for test details).
@lxbndr lxbndr force-pushed the readdle/urlsession-dispatchsource-close branch from bab881e to f9a54f3 Compare December 29, 2023 23:05
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants