Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Handle possible race condition when using ExecutorEventEmitter #76

Closed
forslund opened this issue Oct 6, 2020 · 7 comments
Closed

Handle possible race condition when using ExecutorEventEmitter #76

forslund opened this issue Oct 6, 2020 · 7 comments

Comments

@forslund
Copy link
Contributor

forslund commented Oct 6, 2020

We've had an error report in Mycroft with an exception when removing a once handler.

Been trying to reproduce and I think it's a race condition. We have a sequence which sets up a once listener waits for 5 seconds for the listener to be called. If the handler is called it returns immediately if the 5 seconds passes without the handler being triggered the listener is removed.

Sometimes (quite rarely) the handler gets called, 5 seconds times out and removes the listener. THEN the once wrapper tries to remove the listener.

A try/except block may be a simple solution to remove the exception log

def _wrapper(f):
            def g(*args, **kwargs):
                try:
                    self.remove_listener(event, f)
                except KeyError:
                    # log warning or something
                    pass  # or possibly return

                # f may return a coroutine, so we need to return that
                # result here so that emit can schedule it
                return f(*args, **kwargs)

            self._add_event_handler(event, f, g)

A proper solution would be to have a lock around the critical section(s)

@jfhbrook
Copy link
Owner

jfhbrook commented Oct 6, 2020

Thanks for the report! I'll try to take a closer look this weekend.

@jfhbrook
Copy link
Owner

jfhbrook commented Oct 6, 2020

Which class is this? Is this the BaseEventEmitter or the ExecutorEventEmitter? read the topic

@jfhbrook
Copy link
Owner

jfhbrook commented Oct 6, 2020

(it might not matter, it sounds like the move is to make the base EE thread-safe anyhow)

@forslund
Copy link
Contributor Author

forslund commented Oct 6, 2020

Yeah might be just as well to handle it in the BaseEmitter.

I'll see if I can get some spare time and clobber together a suggestion...

@jfhbrook
Copy link
Owner

jfhbrook commented Oct 6, 2020

mulling this over, I think it might not be a locking issue - like it sounds like what's happening is:

  • you set the once, which adds a handler to remove the listener not after it's fired but after the async handler completes
  • the event fires at t=0, kicking off an async handler
  • the async action is incomplete at t=1 and a timeout occurs, triggering a manual handler removal
  • the async action completes at t=2, triggering an automated handler removal of a handler that's already been removed

so while I could probably add a lock to the BaseEventEmitter around some of the fiddly bits (that's a good idea regardless), I think the answer here is to make the once handlers robust against the handler being removed out-of-band prior to the once's cleanup, and/or have the once remove its own listener prior to the async handler completing. something like that.

@forslund
Copy link
Contributor Author

forslund commented Oct 6, 2020

Yes that's pretty much what I think is happening as well.

I see two ways of making it more robust,

One: the simple try / except above
Two: Making explicit checks that event and f still exist in the event before removing the listener. This will require locking since check + pop cannot be atomic (afaik)

Edit: Since you've got much more insight than me in the architecture here, maybe you have a better idea?

@jfhbrook
Copy link
Owner

jfhbrook commented Oct 6, 2020

Yeah no, we're of similar minds here. I mulled over making sure the remove runs on the same tick as the trigger, rather than waiting for the callback to complete - I think this is more correct anyway and might ninja it in there, as a digression - but you should still be able to trigger this for a synchronous call.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants