QUIC Concurrency API Implementation #24257

Status: Open
Wants to merge 25 commits into base: feature/quic-server
59c8791
QUIC APL: Add QUIC Domain SSL Object: Basic Definitions
hlandau Apr 24, 2024
52da59a
QUIC APL: Add QUIC Domain SSL Object: Implementation
hlandau Apr 24, 2024
a550fc8
QUIC APL: Add support for configuring domain flags
hlandau Apr 24, 2024
133d14e
QUIC APL: Use domain flag to determine thread assisted mode
hlandau Apr 24, 2024
a8077c3
RIO: Add OS notifier
hlandau Apr 24, 2024
e3520f0
QUIC REACTOR: Integrate RIO NOTIFIER
hlandau Apr 24, 2024
940c3f1
QUIC APL: Default domain flags
hlandau Apr 24, 2024
80bf4a7
QUIC REACTOR: Inter-thread notification
hlandau Apr 24, 2024
f3d81e5
QUIC ENGINE: Notify when ticking
hlandau Apr 24, 2024
0f48f9d
QUIC REACTOR: Allow ticks to schedule notifications of other threads
hlandau Apr 24, 2024
7468e2f
QUIC CHANNEL: Notify other threads when needed
hlandau Apr 24, 2024
1cf3069
QUIC APL: Refine domain flag handling
hlandau Apr 24, 2024
8ca024a
QUIC: Document SSL_new_domain, etc.
hlandau Apr 24, 2024
ec9c2be
QUIC: Add documentation on concurrency model
hlandau Apr 24, 2024
36fad54
QUIC: Update listener documentation
hlandau Apr 24, 2024
aa15ea4
make update
hlandau Apr 24, 2024
e6e8abf
QUIC OBJ: Require blocking support in the domain flags to use blockin…
hlandau Apr 24, 2024
4e53b28
RIO NOTIFIER: Fix symbol usage
hlandau Apr 29, 2024
cf5a935
Minor fixes
hlandau Apr 29, 2024
8555f1f
Allow use of socketpair, WSASocketA
hlandau Apr 29, 2024
d7ed50e
Doc fixes
hlandau Apr 29, 2024
01303d0
Assorted bugfixes
hlandau Apr 29, 2024
17c6c70
QUIC: Add basic domain flags test
hlandau Apr 29, 2024
b25425b
QUIC RADIX: Test domain functions as well
hlandau Apr 29, 2024
d98dfb2
Minor fix for Windows
hlandau May 13, 2024
49 changes: 43 additions & 6 deletions include/internal/quic_reactor.h
@@ -100,9 +100,21 @@ struct quic_reactor_st {
void (*tick_cb)(QUIC_TICK_RESULT *res, void *arg, uint32_t flags);
void *tick_cb_arg;

/* Used to notify other threads. */
/* Used to notify other threads. Valid only if have_notifier is set. */
RIO_NOTIFIER notifier;

/*
* Condvar to assist synchronising use of the notifier. Valid only if
* have_notifier is set.
*/
CRYPTO_CONDVAR *notifier_cv;

/*
* Count of the current number of blocking waiters. Like everything else,
* this is protected the caller's mutex (i.e., the engine mutex).
[Review comment, Member] typo: protected by the caller's
*/
size_t cur_blocking_waiters;

/*
* These are true if we would like to know when we can read or write from
* the network respectively.
@@ -119,6 +131,9 @@ struct quic_reactor_st {

/* 1 if notifier is present and initialised. */
unsigned int have_notifier : 1;

/* 1 if a block_until_pred call has put the notifier in the signalled state. */
unsigned int signalled_notifier : 1;
};

/* Create an OS notifier? */
@@ -173,12 +188,17 @@ RIO_NOTIFIER *ossl_quic_reactor_get0_notifier(QUIC_REACTOR *rtor);
*
* The blocking I/O adaptation layer implements blocking I/O on top of our
* asynchronous core.
*/

/*
* ossl_quic_reactor_block_until_pred
* ----------------------------------
*
* The core mechanism is block_until_pred(), which does not return until pred()
* returns a value other than 0. The blocker uses OS I/O synchronisation
* primitives (e.g. poll(2)) and ticks the reactor until the predicate is
* satisfied. The blocker is not required to call pred() more than once between
* tick calls.
 * The core mechanism of the Blocking I/O Adaptation Layer is block_until_pred(),
* which does not return until pred() returns a value other than 0. The blocker
* uses OS I/O synchronisation primitives (e.g. poll(2)) and ticks the reactor
* until the predicate is satisfied. The blocker is not required to call pred()
* more than once between tick calls.
*
* When pred returns a non-zero value, that value is returned by this function.
* This can be used to allow pred() to indicate error conditions and short
@@ -209,6 +229,23 @@ int ossl_quic_reactor_block_until_pred(QUIC_REACTOR *rtor,
uint32_t flags,
CRYPTO_MUTEX *mutex);

/*
* ossl_quic_reactor_notify_other_threads
* --------------------------------------
*
* Notify other threads currently blocking in
* ossl_quic_reactor_block_until_pred() calls that a predicate they are using
* might now be met due to state changes.
*
* This function must be called after state changes which might cause a
* predicate in another thread to now be met (i.e., ticking). It is a no-op if
* inter-thread notification is not being used.
*
* mutex is required and must be held.
*/
void ossl_quic_reactor_notify_other_threads(QUIC_REACTOR *rtor,
CRYPTO_MUTEX *mutex);

# endif

#endif
156 changes: 145 additions & 11 deletions ssl/quic/quic_reactor.c
@@ -9,6 +9,7 @@
#include "internal/quic_reactor.h"
#include "internal/common.h"
#include "internal/thread_arch.h"
#include <assert.h>

/*
* Core I/O Reactor Framework
@@ -32,10 +33,17 @@ int ossl_quic_reactor_init(QUIC_REACTOR *rtor,
rtor->tick_cb = tick_cb;
rtor->tick_cb_arg = tick_cb_arg;

rtor->cur_blocking_waiters = 0;

if ((flags & QUIC_REACTOR_FLAG_USE_NOTIFIER) != 0) {
if (!ossl_rio_notifier_init(&rtor->notifier))
return 0;

if ((rtor->notifier_cv = ossl_crypto_condvar_new()) == NULL) {
ossl_rio_notifier_cleanup(&rtor->notifier);
return 0;
}

rtor->have_notifier = 1;
} else {
rtor->have_notifier = 0;
@@ -52,6 +60,8 @@ void ossl_quic_reactor_cleanup(QUIC_REACTOR *rtor)
if (rtor->have_notifier) {
ossl_rio_notifier_cleanup(&rtor->notifier);
rtor->have_notifier = 0;

ossl_crypto_condvar_free(&rtor->notifier_cv);
}
}

@@ -191,6 +201,7 @@ RIO_NOTIFIER *ossl_quic_reactor_get0_notifier(QUIC_REACTOR *rtor)
*/
static int poll_two_fds(int rfd, int rfd_want_read,
int wfd, int wfd_want_write,
int notify_rfd,
OSSL_TIME deadline,
CRYPTO_MUTEX *mutex)
{
@@ -224,9 +235,17 @@ if (wfd != -1)
if (wfd != -1)
openssl_fdset(wfd, &efd_set);

/* Check for notifier FD readability. */
if (notify_rfd != -1) {
[Review comment, Member] INVALID_SOCKET
openssl_fdset(notify_rfd, &rfd_set);
openssl_fdset(notify_rfd, &efd_set);
}

maxfd = rfd;
if (wfd > maxfd)
maxfd = wfd;
if (notify_rfd > maxfd)
maxfd = notify_rfd;

if (!ossl_assert(rfd != -1 || wfd != -1
|| !ossl_time_is_infinite(deadline)))
@@ -269,7 +288,7 @@ static int poll_two_fds(int rfd, int rfd_want_read,
#else
int pres, timeout_ms;
OSSL_TIME now, timeout;
struct pollfd pfds[2] = {0};
struct pollfd pfds[3] = {0};
size_t npfd = 0;

if (rfd == wfd) {
@@ -290,6 +309,12 @@
++npfd;
}

if (notify_rfd >= 0) {
pfds[npfd].fd = notify_rfd;
pfds[npfd].events = POLLIN;
++npfd;
}

if (!ossl_assert(npfd != 0 || !ossl_time_is_infinite(deadline)))
/* Do not block forever; should not happen. */
return 0;
@@ -336,8 +361,8 @@ static int poll_descriptor_to_fd(const BIO_POLL_DESCRIPTOR *d, int *fd)
}

/*
* Poll up to two abstract poll descriptors. Currently we only support
* poll descriptors which represent FDs.
* Poll up to two abstract poll descriptors, as well as an optional notify FD.
* Currently we only support poll descriptors which represent FDs.
*
 * If mutex is non-NULL, it is assumed to be a lock currently held for write and is
* unlocked for the duration of any wait.
@@ -348,6 +373,7 @@ static int poll_descriptor_to_fd(const BIO_POLL_DESCRIPTOR *d, int *fd)
*/
static int poll_two_descriptors(const BIO_POLL_DESCRIPTOR *r, int r_want_read,
const BIO_POLL_DESCRIPTOR *w, int w_want_write,
int notify_rfd,
OSSL_TIME deadline,
CRYPTO_MUTEX *mutex)
{
@@ -357,7 +383,45 @@ static int poll_two_descriptors(const BIO_POLL_DESCRIPTOR *r, int r_want_read,
|| !poll_descriptor_to_fd(w, &wfd))
return 0;

return poll_two_fds(rfd, r_want_read, wfd, w_want_write, deadline, mutex);
return poll_two_fds(rfd, r_want_read, wfd, w_want_write,
notify_rfd, deadline, mutex);
}

void ossl_quic_reactor_notify_other_threads(QUIC_REACTOR *rtor,
CRYPTO_MUTEX *mutex)
{
if (!rtor->have_notifier)
return;

/*
* This function is called when we have done anything on this thread which
* might allow a predicate for a block_until_pred call on another thread to
* now be met.
*
* When this happens, we need to wake those threads using the notifier.
* However, we do not want to wake *this* thread (if/when it subsequently
* enters block_until_pred) due to the notifier FD becoming readable.
* Therefore, signal the notifier, and use a CV to detect when all other
* threads have woken.
*/

if (rtor->cur_blocking_waiters == 0)
/* Nothing to do in this case. */
return;

/* Signal the notifier to wake up all threads. */
if (!rtor->signalled_notifier) {
ossl_rio_notifier_signal(&rtor->notifier);
rtor->signalled_notifier = 1;
}

/*
* Wait on the CV until all threads have finished the first phase of the
* wakeup process and the last thread out has taken responsibility for
* unsignalling the notifier.
*/
while (rtor->signalled_notifier)
ossl_crypto_condvar_wait(rtor->notifier_cv, mutex);
[Review comment, Contributor] You need to lock this mutex prior to reading/writing rtor->signalled_notifier, and unlock it again after the thread you are notifying has updated signalled_notifier. I believe if the mutex isn't owned by the calling thread at the time of the underlying pthread_cond_wait call, EINVAL is returned here:
https://pubs.opengroup.org/onlinepubs/7908799/xsh/pthread_cond_wait.html

[Member, Author] Yes, this code is always called with the mutex held. Incidentally, poll_two_descriptors unlocks the mutex before calling poll(2) and then relocks it.

[Contributor] I'm sorry, I see where poll_two_descriptors unlocks/relocks the mutex several times, but I never see where that mutex is first acquired. Is the reactor mutex assigned in ossl_quic_reactor_init aliased from the qctx mutex that is also used in qctx_lock? That would explain why I can't find where rtor->mutex is explicitly locked.
Incidentally, while looking at this I noticed that all your usages of mutexes in QUIC are conditional on OPENSSL_THREADS being defined, which is good, but the above call to ossl_crypto_condvar_wait is not, implying that in a no-threads build this code will pass a NULL pointer to ossl_crypto_condvar_wait, which I believe will cause EINVAL to be returned. Not sure if that's a problem in an unthreaded build, but I thought I would point it out.

[Member, Author] Yes. The reactor does not allocate or own its own mutex, but uses the mutex donated from the quic_impl.c code. This is necessary so that it can unlock the mutex during the poll(2) call and then reacquire it.

[Member, Author] Good point. We need to have better CI coverage here...
}

/*
@@ -376,9 +440,12 @@ int ossl_quic_reactor_block_until_pred(QUIC_REACTOR *rtor,
uint32_t flags,
CRYPTO_MUTEX *mutex)
{
int res, net_read_desired, net_write_desired;
int res, net_read_desired, net_write_desired, notifier_fd;
OSSL_TIME tick_deadline;

notifier_fd
= (rtor->have_notifier ? ossl_rio_notifier_as_fd(&rtor->notifier) : -1);

for (;;) {
if ((flags & SKIP_FIRST_TICK) != 0)
flags &= ~SKIP_FIRST_TICK;
@@ -397,12 +464,77 @@ int ossl_quic_reactor_block_until_pred(QUIC_REACTOR *rtor,
/* Can't wait if there is nothing to wait for. */
return 0;

if (!poll_two_descriptors(ossl_quic_reactor_get_poll_r(rtor),
net_read_desired,
ossl_quic_reactor_get_poll_w(rtor),
net_write_desired,
tick_deadline,
mutex))
++rtor->cur_blocking_waiters;

res = poll_two_descriptors(ossl_quic_reactor_get_poll_r(rtor),
net_read_desired,
ossl_quic_reactor_get_poll_w(rtor),
net_write_desired,
notifier_fd,
tick_deadline,
mutex);

assert(rtor->cur_blocking_waiters > 0);
--rtor->cur_blocking_waiters;

/*
* We have now exited the OS poller call. We may have signalled the
* notifier (rtor->signalled_notifier), and other threads may still be
* blocking.
* This means that cur_blocking_waiters may still be non-zero. As such,
* we cannot unsignal the notifier until all threads have had an
* opportunity to wake up.
*
* At the same time, we cannot unsignal in the case where
* cur_blocking_waiters is now zero because this condition may not occur
* reliably. Consider the following scenario:
*
* T1 enters block_until_pred, cur_blocking_waiters -> 1
* T2 enters block_until_pred, cur_blocking_waiters -> 2
* T3 enters block_until_pred, cur_blocking_waiters -> 3
*
* T4 enters block_until_pred, does not block, ticks,
* sees that cur_blocking_waiters > 0 and signals the notifier
*
* T3 wakes, cur_blocking_waiters -> 2
* T3 predicate is not satisfied, cur_blocking_waiters -> 3, block again
*
* Notifier is still signalled, so T3 immediately wakes again
* and is stuck repeating the above steps.
*
* T1, T2 are also woken by the notifier but never see
* cur_blocking_waiters drop to 0, so never unsignal the notifier.
*
* As such, a two-phase approach is chosen when unsignalling the
* notifier:
*
* First, all of the poll_two_descriptors calls on all threads are
* allowed to exit due to the notifier being signalled.
*
* Second, the thread which happened to be the one which decremented
* cur_blocking_waiters to 0 unsignals the notifier and is then
* responsible for broadcasting to a CV to indicate to the other
* threads that the synchronised wakeup has been completed. Other
* threads wait for this CV to be signalled.
*/
if (rtor->have_notifier && rtor->signalled_notifier) {
if (rtor->cur_blocking_waiters == 0) {
ossl_rio_notifier_unsignal(&rtor->notifier);
rtor->signalled_notifier = 0;

/*
* Release the other threads which have woken up (and possibly
* rtor_notify_other_threads as well).
*/
ossl_crypto_condvar_broadcast(rtor->notifier_cv);
} else {
/* We are not the last waiter out - so wait for that one. */
while (rtor->signalled_notifier)
ossl_crypto_condvar_wait(rtor->notifier_cv, mutex);
}
}

if (!res)
/*
* We don't actually care why the call succeeded (timeout, FD
* readiness), we just call reactor_tick and start trying to do I/O
@@ -420,4 +552,6 @@
*/
return 0;
}

return res;
}