Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Switch to crossbeam-queue for events #1091

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

ahomescu
Copy link
Contributor

@ahomescu ahomescu commented May 6, 2024

Use a lock-free queue to store the event list because the event callbacks might be called from inside a signal handler. Most system calls are not safe in that context, so we just spin if the queue is full.

@ahomescu ahomescu self-assigned this May 6, 2024
@@ -91,7 +103,7 @@ impl Backend {
self.flush();
}

pub fn run(&mut self, events: impl IntoIterator<Item = Event>) {
pub fn run(&mut self, events: Arc<ArrayQueue<Event>>) {
let (lock, cvar) = &*FINISHED;
let mut finished = lock.lock().unwrap();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What about this lock?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's also MainThreadRuntime::send_event, which calls Mutex::lock, too, on a Mutex<Backend>. Also, are we sure allocations can't happen in any of these calls?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

are we sure allocations can't happen in any of these calls?

ArrayQueue docs say:

This queue allocates a fixed-capacity buffer on construction, which is used to store pushed elements.... Having a buffer allocated upfront makes this queue a bit faster than SegQueue.

Which sounds like it doesn't allocate on push.

There's also MainThreadRuntime::send_event, which calls Mutex::lock

@ahomescu, can you clarify specifically which functions should support being called from a signal handler context?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Which sounds like it doesn't allocate on push.

There's a bunch of other functions here that look like they could potentially called inside a signal handler. I understand this one is specifically designed not to allocate, but I'm not sure about the others. Like bincode serialization or formatting output, etc. They might not, but we need to check, right?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My vague understanding of the design here is that the instrumentation calls we insert push events onto a queue, and a background thread processes those events. Since we potentially could be instrumenting signal handlers, the instrumentation/push code needs to be lock-free and allocation-free. But the background thread's processing (including bincode/serialization stuff) doesn't run in a signal handler and so isn't subject to those constraints.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That makes sense. So then only ExistingRuntime::send_event needs to be async-signal-safe, meaning fn run and fn write_all here don't need to be, right?

Copy link
Contributor Author

@ahomescu ahomescu May 7, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, it's exactly what @spernsteiner said: the push code needs to be signal-safe, but the receiver doesn't because it runs in its own thread.

Edit: Should I go audit all the functions in the runtime and check exactly which ones need to be signal-safe? Shouldn't be too hard.

Copy link
Contributor

@kkysen kkysen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did we run into this causing problems? Signal safety is really annoying.

analysis/runtime/src/runtime/scoped_runtime.rs Outdated Show resolved Hide resolved
@@ -91,7 +103,7 @@ impl Backend {
self.flush();
}

pub fn run(&mut self, events: impl IntoIterator<Item = Event>) {
pub fn run(&mut self, events: Arc<ArrayQueue<Event>>) {
let (lock, cvar) = &*FINISHED;
let mut finished = lock.lock().unwrap();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That makes sense. So then only ExistingRuntime::send_event needs to be async-signal-safe, meaning fn run and fn write_all here don't need to be, right?

Use a lock-free queue to store the event list
because the event callbacks might be called from
inside a signal handler. Most system calls are not
safe in that context, so we just spin if the queue
is full.
@ahomescu ahomescu force-pushed the ahomescu/lock_free_event_queue branch from 64faf2f to d83f983 Compare May 9, 2024 04:59
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants