Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Async event handlers #324

Merged
merged 10 commits into from Jan 23, 2024
Merged

Async event handlers #324

merged 10 commits into from Jan 23, 2024

Conversation

mnmr
Copy link
Contributor

@mnmr mnmr commented Jan 11, 2024

Changed events to use AsyncEventHandler. Modified all event subscribers internally. Changed event publishing in NatsConnection to instead write to a channel and used a background task to do the actual publishing.

All the tests (except the memory tests, which I didn't run) run successfully, however, it should be noted that events now fire outside locks. Which is probably a good thing as NatsConnection was formerly relying on event handlers to be well-behaved.

Write events to channel and publish from a background task. This allows handlers to be async and also prevents event listeners from blocking the NatsConnection code.
Copy link
Collaborator

@mtmk mtmk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Look good 👍 Just a quick question about channel reader.

src/NATS.Client.Core/NatsConnection.cs Outdated Show resolved Hide resolved
@mtmk
Copy link
Collaborator

mtmk commented Jan 11, 2024

there were a few test failures due to the way we pull in nats-server binaries (nothing to do with actual tests). Rerunning some of them now... edit: done

Copy link
Contributor

@to11mtm to11mtm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is a good change overall.

Two thoughts/concerns:

First, Task rather than ValueTask from the standpoint of allowing high perf scenarios without breaking APIs in future. We may want to think about this. (although....)

Secondly, whether we should consider instead doing something along the lines of (genuinely apologize for late night napkin concept but wanted to have the thought out there, both for the sake of avoiding breaking things for current users as well as making it potentially easier for consumers of all types to wire up):

//again, forgive (probably some) incorrectness

//TODO: Need to allow for ordering of execution.
//NOTE: Due to ordering, may need an ordered dictionary with locks 
//           rather than ConcurrentDictionary
private readonly ConcurrentDictionary<object, Func<object, string,ValueTask> _disconnectHandlers = new();

    //Assumes AsyncEventHandler is now ValueTask delegate type.
    public event AsyncEventHandler<string>? ConnectionDisconnectedAsync
    {
        add { _disconnectHandlers.TryAdd(value, value); }
        remove { _disconnectHandlers.TryRemove(value, out _); }
    }
    public event EventHandler<string>? ConnectionDisconnected
    {
        add { _disconnectHandlers.TryAdd(value, 
                 v=> {
                            return  (send,arg) => 
                                    { 
                                        v(send,arg); 
                                        return ValueTask.CompletedTask}; 
                                    }
                         }
        remove 
        { 
            _disconnectHandlers.TryRemove(value, out _); 
        }    
    }
    //Could also have Task variant that converts to ValueTask

Again, I'm not certain whether this would have a perf benefit [0] but at least it would help avoid breaking changes of various sorts...

Not sure if doing such would break things like outside callers trying to raise but in general that is a questionable pattern.

[0] - It very well might, between avoiding the casts as well as the cost of .GetInvocationList()'s internal cloning of the list, as well as the ability to get rid of casts in the main event dispatch loop.

src/NATS.Client.Core/NatsConnection.cs Outdated Show resolved Hide resolved
/// <param name="sender">The sender of the event.</param>
/// <param name="args">Event arguments.</param>
/// <returns>A task whose completion signals handling is finished.</returns>
public delegate Task AsyncEventHandler(object? sender, EventArgs args);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How hard would it be for us to have the delegate be ValueTask instead?

Two main reasons:

  1. Enable users who want to put in the work to minimize allocations on invocation.
  2. (Potentially?) allow users who understand risk but would still prefer a 'fast path' for whatever reason (avoiding Task.Run pressure, etc)

We don't necessarily have to do the work for that outright (i.e. we can optimize plumbing internally later) but if we expose Task as the delegate return type going to ValueTask later would be a breaking change, so not sure if worth considering.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see a downside personally. We might as well start with ValueTask to avoid breaking it later.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think using ValueTask is a mistake. This code is not on a hot path (if we had a MessageDelivered event, this would be a different story), and the awaiting code is outside the library - you must never await a ValueTask more than once, and this provides a gotcha for consumers of the library.

I'm not opposed to making it ValueTask per se (and will happily adjust the PR), but don't see this as an area that needs any optimization. There's only 4 events and none of them are expected to fire very often; if they do, you have more serious concerns to deal with.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point about not being on hot path. In general we do use ValueTask pretty much everywhere though, hot path or not. If you don't see any technical problems using it (apart from double await) do you think we should consider using it for consistency as well?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would use ValueTask sparingly, unless it was internally or on a hot path. Stephen Toub has a great guide on the topic.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wanted to revive this thread - I just scanned our public API and I think we use ValueTask everywhere. It might be best to stick to that for consistency's sake. External code wouldn't be awaiting events anyways, we control when the events run and are awaited.

Copy link
Collaborator

@mtmk mtmk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think @to11mtm's point about supporting both event handler types is well worth considering. We won't break and invocation list performance would be a nice bonus.

edit: we can leave this discussion for another issue/PR

/// <param name="sender">The sender of the event.</param>
/// <param name="args">Event arguments.</param>
/// <returns>A task whose completion signals handling is finished.</returns>
public delegate Task AsyncEventHandler(object? sender, EventArgs args);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see a downside personally. We might as well start with ValueTask to avoid breaking it later.

src/NATS.Client.Core/NatsConnection.cs Outdated Show resolved Hide resolved
@mnmr
Copy link
Contributor Author

mnmr commented Jan 12, 2024

I think @to11mtm's point about supporting both event handler types is well worth considering. We won't break and invocation list performance would be a nice bonus.

It's a clever idea, but not without drawbacks (e.g. how to provide ordering). ConcurrentDictionary is slow to iterate and using a dictionary for something that is likely 1-3 entries might also not be ideal. If we really want to go down this route, I think it should be possible to use Interlocked and a fixed sized array to store the delegates (i.e. there would be a max subscriber limit).

That said, as I understand it, these events will be fired quire rarely (perhaps with the exception of OnError?), and since it's now being done using a separate task, it's not performance critical at all. Also, since this code calls out to user-provided code, it can be as clever as it want and still not provide any guarantees about timeliness (since an earlier subscriber could be taking a long time to complete; this would of course be bad, but is also not something the library can protect against).

I could of course be mistaken about this, or you could have plans for events that could be more frequent than the current set, but if not then I don't think we need cleverness here at all.

It might still be worthwhile to support both event paradigms, but forcing sync subscribers to return a Task/ValueTask is not a big ask and likely something they already do in many other places, so it would not be a new paradigm. So personally I don't see much value in keeping the sync events, except that doing so would be non-breaking.

@mnmr
Copy link
Contributor Author

mnmr commented Jan 12, 2024

As an aside, I think we should consider introducing custom EventArgs for all events, and in particular for the 3 events that currently just pass a string.

Motivation: If we at some point decide to add a timestamp (since the handlers are invoked some time after the occurrence, this might be useful) or a concurrency version/token (to detect if an event relates to an earlier state of the connection), then these fields can simply be added to the EventArgs - no breaking change for existing subscribers as the signature would still be the same.

@mtmk
Copy link
Collaborator

mtmk commented Jan 13, 2024

I think we should consider introducing custom EventArgs for all events, and in particular for the 3 events that currently just pass a string.

This sounds good to me. passing strings here is a bit odd.

@mnmr
Copy link
Contributor Author

mnmr commented Jan 13, 2024

This sounds good to me. passing strings here is a bit odd.

If there are no objections from anyone else, I will apply this change to the PR.

Let me know what the decision ends up being for ValueTask/Task and "optimized storage with dual sync/async support". For now, I will not be applying any of these changes.

…ppedEventArgs. Changed events to use EventArgs and adjusted internal subscribers. Renamed OnError to MessageDropped, as that better reflects what error is signaled and also is naming-wise more consistent with the other events.
@mnmr
Copy link
Contributor Author

mnmr commented Jan 13, 2024

I've pulled the latest code from main into the branch, and applied some changes:

  • Added NatsEventArgs and renamed MessageDroppedError to NatsMessageDroppedEventArgs
  • Changed events to use EventArgs and adjusted internal subscribers
  • Renamed OnError to MessageDropped, as that better reflects what error is signaled and also is naming-wise more consistent with the other events

The error event was only signaled in one place, so it makes sense to rename it I think, but this was not discussed earlier so feel free to shout and have me revert if you dislike it :)

@mtmk
Copy link
Collaborator

mtmk commented Jan 14, 2024

The error event was only signaled in one place, so it makes sense to rename it I think, but this was not discussed earlier so feel free to shout and have me revert if you dislike it :)

nah it makes sense but I think we need to keep it as it is not because of matter of taste :) but I think we should keep it inline with other NATS clients: https://docs.nats.io/using-nats/developer/connecting/events/slow#detect-a-slow-consumer-and-check-for-dropped-messages

@mnmr
Copy link
Contributor Author

mnmr commented Jan 14, 2024

The error event was only signaled in one place, so it makes sense to rename it I think, but this was not discussed earlier so feel free to shout and have me revert if you dislike it :)

nah it makes sense but I think we need to keep it as it is not because of matter of taste :) but I think we should keep it inline with other NATS clients: https://docs.nats.io/using-nats/developer/connecting/events/slow#detect-a-slow-consumer-and-check-for-dropped-messages

But there's not really much consensus between the SDKs when it comes to error handling. The Java version supplies a "slowConsumerDetected" handler to the errorListener, but that's both backwards and unsafe (a special-purpose method is used to handle a generic event, rather than vice versa). And correcting a mistake has to start somewhere, right? ;)

Alternatively, we could keep both. That is, if we re-introduce Error but with a generic NatsErrorEventArgs argument (with an enum to specify the kind of error), we have an event in place for any future errors. Handlers for the generic Error event would need to inspect the args and cast it to the specific args type if they want more information. Since event publishing happens on a separate task, it should be no problem with publish error events to multiple delegate lists.

I will of course change it back to Error (even OnError, although the OnXxx methods are usually reserved for the methods triggering the event) if you insist, but I think a bit of push-back here is in order ;)

@mnmr
Copy link
Contributor Author

mnmr commented Jan 14, 2024

I see that some tests are failing in CI, but they run fine locally. Is this something to be worried about, or is this a known issue with how the tests are executed?

@mtmk
Copy link
Collaborator

mtmk commented Jan 14, 2024

I see that some tests are failing in CI, but they run fine locally. Is this something to be worried about, or is this a known issue with how the tests are executed?

There are a few flappers. I need to fix them at some stage. Just re-running them now. edit: all green now!

@mtmk
Copy link
Collaborator

mtmk commented Jan 14, 2024

I will of course change it back to Error (even OnError, although the OnXxx methods are usually reserved for the methods triggering the event) if you insist, but I think a bit of push-back here is in order ;)

Nothing like a good debate :) If you look at the examples all languages are accepting some form of error callback thorough a generic error handler then deducing (which would be an is type check for us) the nature of the error in that callback (Java is a bit odd but same idea applies).

Good point about the name though? Just Error or ErrorReceived?

@mnmr
Copy link
Contributor Author

mnmr commented Jan 14, 2024

Nothing like a good debate :) If you look at the examples all languages are accepting some form of error callback thorough a generic error handler then deducing (which would be an is type check for us) the nature of the error in that callback (Java is a bit odd but same idea applies).

Don't most of these exist in their present form because they originate from designs in languages that are less expressive? If you can only pass one error handler, it needs to be unspecific. This is not consumer friendly, as you need to know what types you'll encounter at runtime and then do appropriate casting. It doesn't feel like a good fit for a language where each event and handler type can be declared separately.

Good point about the name though? Just Error or ErrorReceived?

If we do go back to this we should probably use the verbified edition, so ErrorReceived/ErrorOccurred/etc

@mtmk
Copy link
Collaborator

mtmk commented Jan 14, 2024

Nothing like a good debate :) If you look at the examples all languages are accepting some form of error callback thorough a generic error handler then deducing (which would be an is type check for us) the nature of the error in that callback (Java is a bit odd but same idea applies).

Don't most of these exist in their present form because they originate from designs in languages that are less expressive? If you can only pass one error handler, it needs to be unspecific. This is not consumer friendly, as you need to know what types you'll encounter at runtime and then do appropriate casting. It doesn't feel like a good fit for a language where each event and handler type can be declared separately.

That's a good point. This one is kind of a gray area maybe, but our general approach is that we should be using what's natural to C# over general NATS client parity.

@mtmk
Copy link
Collaborator

mtmk commented Jan 14, 2024

LGTM thanks @mnmr 💯

I shall merge in a day or so to give everyone a chance to have one last look.

cc @sspates @to11mtm @oising @caleblloyd

PS I'll create a discussion for keeping the old handlers (@to11mtm's suggestion) once merged.

@mtmk mtmk added the breaking label Jan 14, 2024
@to11mtm
Copy link
Contributor

to11mtm commented Jan 14, 2024

PS I'll create a discussion for keeping the old handlers (@to11mtm's suggestion) once merged.

TYVVM @mtmk ! Am happy to revisit later. My main concern, while it may not be a real thing yet, is future scenarios where we are dealing with Mobile or other scenarios where the connection state may have more frequent drops/reconnects (and often GC is preferred to be minimized.)

My biggest concern is in the case of OnError, I can say that unfortunately this does lead to the risk of having a lot of additional allocations at what is likely an inopportune time; i.e. it happens when a consumer is already hitting a constraint and we are possibly adding more pressure with the invocation calls.

I have thoughts for how we can solve but will leave for the follow-up issue.

@caleblloyd
Copy link
Contributor

I think it's almost ready - I did revive the ValueTask comment thread - #324 (comment) - but that's the only outstanding comment I have

@mnmr
Copy link
Contributor Author

mnmr commented Jan 22, 2024

@caleblloyd I would like to get this PR wrapped up and so need a resolution on the Task/ValueTask issue.

If major rework is required or expected, perhaps we should create a feature branch to accept the PR into. Then we can spawn the follow-up discussion and apply the changes agreed there before merging into main.

@caleblloyd
Copy link
Contributor

I think that we should stick with ValueTask, since the public API is currently consistent in using ValueTask. @mtmk what do you think?

@mtmk
Copy link
Collaborator

mtmk commented Jan 22, 2024

Let's change that to ValueTask. That was my opinion too. Also, I think that's the consensus as well.

@mnmr I think your main point against using ValueTask is that care needs to be taken because of its pitfalls like not being able to double await; which is valid but for our library. I think we're are already making extensive use of ValueTask; so from our library's design perspective ValueTask fits in better because it's similar to other API calls as @caleblloyd mentioned, also it might open up opportunities for finer optimizations in the future as @to11mtm suggested.

…from AsyncEventHandler as it now different from the original delegate. Adjusted code to match new signature. Note that "default" for ValueTask is a completed task.
@mnmr
Copy link
Contributor Author

mnmr commented Jan 22, 2024

I have changed the delegate to return ValueTask and adjusted the rest of the code.

I also investigated if we could get rid of the GetInvocationList() copy (other than by caching the result), and came across this interesting issue - so it looks like this is something that can be improved in the future.

@mtmk
Copy link
Collaborator

mtmk commented Jan 23, 2024

Added open comments to #348 (as far as I could gather, feel free to add/correct bits I missed)

Copy link
Collaborator

@mtmk mtmk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM thanks @mnmr

@mtmk mtmk merged commit 7b930ea into nats-io:main Jan 23, 2024
10 checks passed
@mtmk
Copy link
Collaborator

mtmk commented Jan 23, 2024

Apologies, I forgot to check and remind you for this PR @mnmr please sign your commits next time.

mtmk added a commit that referenced this pull request Jan 23, 2024
* Async event handlers (#324)
* pipe reader: don't mark commands as consumed until pending=0 (#347)
* Log connection failure as warning (#343)
@mtmk mtmk mentioned this pull request Jan 23, 2024
mtmk added a commit that referenced this pull request Jan 23, 2024
* Async event handlers (#324)
* pipe reader: don't mark commands as consumed until pending=0 (#347)
* Log connection failure as warning (#343)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Avoid multicast delegates (event) in favor of Func<args, Task> to allow for async event handlers
4 participants