New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Avoid allocations in RaiseEvent when there's no listeners #15580
base: master
Are you sure you want to change the base?
Conversation
You can test this PR using the following package version. |
|
@cla-avalonia agree |
return typedInputElement.RaiseEvent(routedEvent); | ||
} | ||
|
||
var eventArgs = new RoutedEventArgs(routedEvent); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Won't it make sense to pass a factory here too?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This would require adding all the new RaiseEvent
overloads to the IInputElement
interface, which I wasn't comfortable with, as I don't think it should be that specific.
Since IInputElement
isn't client implementable, we should never get to that fallback anyways.
/// The created <see cref="RoutedEventArgs"/> used to raise the event, | ||
/// or null if the event wasn't raised because it didn't have any listeners. | ||
/// </returns> | ||
public RoutedEventArgs? RaiseEvent<TContext>( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if passing the context by value combined with a delegate call is less expensive than dealing with a single allocation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's optimizing for the case where there are no listeners, where it's a clear win.
For the case where there are listeners, passing the state and calling the delegate is a little slower than a direct call, as you'd expect.
Note that the delegate+state pattern appears in .NET itself, for example in ILogger.Log<TState>()
or String.Create<TState>()
.
Benchmarks
SmallEvent
Method | Mean | Error | StdDev | Gen0 | Allocated |
---|---|---|---|---|---|
Standard_No_Listeners | 22.091 ns | 0.1672 ns | 0.1564 ns | 0.0024 | 40 B |
Factory_No_Listeners | 9.617 ns | 0.0545 ns | 0.0510 ns | - | - |
Standard_With_Listeners | 100.925 ns | 0.8585 ns | 0.8031 ns | 0.0057 | 96 B |
Factory_With_Listeners | 111.128 ns | 0.6346 ns | 0.5936 ns | 0.0057 | 96 B |
LargeEvent:
Method | Mean | Error | StdDev | Gen0 | Allocated |
---|---|---|---|---|---|
Standard_No_Listeners | 39.63 ns | 0.768 ns | 0.854 ns | 0.0076 | 128 B |
Factory_No_Listeners | 21.99 ns | 0.115 ns | 0.108 ns | - | - |
Standard_With_Listeners | 118.76 ns | 1.438 ns | 1.345 ns | 0.0110 | 184 B |
Factory_With_Listeners | 126.16 ns | 0.411 ns | 0.343 ns | 0.0110 | 184 B |
Benchmark code
#nullable enable
using Avalonia.Controls;
using Avalonia.Input;
using Avalonia.Interactivity;
using BenchmarkDotNet.Attributes;
namespace Avalonia.Benchmarks;
[MemoryDiagnoser]
public class RaiseEvent_SmallEvent_Benchmarks
{
private static readonly RoutedEvent<RoutedEventArgs> _smallEvent =
RoutedEvent.Register<Button, RoutedEventArgs>("SomeEvent", RoutingStrategies.Bubble);
private readonly Button _button = CreateButton(10);
internal static Button CreateButton(int depth)
{
var button = new Button();
for (int i = 0; i < depth; i++)
{
var child = new Button();
button.Content = child;
button = child;
}
return button;
}
[GlobalSetup(Targets = [nameof(Standard_With_Listeners), nameof(Factory_With_Listeners)])]
public void SetupListeners()
=> _button.AddHandler(_smallEvent, (_, _) => { });
[Benchmark]
public void Standard_No_Listeners()
=> Standard();
[Benchmark]
public void Factory_No_Listeners()
=> Factory();
[Benchmark]
public void Standard_With_Listeners()
=> Standard();
[Benchmark]
public void Factory_With_Listeners()
=> Factory();
private void Standard()
=> _button.RaiseEvent(new RoutedEventArgs(_smallEvent));
private void Factory()
=> _button.RaiseEvent(_smallEvent);
}
[MemoryDiagnoser]
public class RaiseEvent_LargeEvent_Benchmarks
{
private static readonly RoutedEvent<PointerEventArgs> _largeEvent =
RoutedEvent.Register<Button, PointerEventArgs>("LargeEvent", RoutingStrategies.Bubble);
private readonly Button _button = RaiseEvent_SmallEvent_Benchmarks.CreateButton(10);
private readonly IPointer _pointer = new Pointer(1, PointerType.Mouse, true);
[GlobalSetup(Targets = [nameof(Standard_With_Listeners), nameof(Factory_With_Listeners)])]
public void SetupListeners()
=> _button.AddHandler(_largeEvent, (_, _) => { });
[Benchmark]
public void Standard_No_Listeners()
=> Standard();
[Benchmark]
public void Factory_No_Listeners()
=> Factory();
[Benchmark]
public void Standard_With_Listeners()
=> Standard();
[Benchmark]
public void Factory_With_Listeners()
=> Factory();
private void Standard()
=> _button.RaiseEvent(new PointerEventArgs(
_largeEvent,
_button,
_pointer,
_button,
new Point(123.0, 456.0),
9876543210UL,
new PointerPointProperties(RawInputModifiers.Control, PointerUpdateKind.LeftButtonPressed),
KeyModifiers.Control));
private void Factory()
=> _button.RaiseEvent(
_largeEvent,
static (e, ctx) => new PointerEventArgs(e, ctx.source, ctx.pointer, ctx.rootVisual, ctx.rootVisualPosition, ctx.timestamp, ctx.properties, ctx.modifiers),
(source: _button,
pointer: _pointer,
rootVisual: _button,
rootVisualPosition: new Point(123.0, 456.0),
timestamp: 9876543210UL,
properties: new PointerPointProperties(RawInputModifiers.Control, PointerUpdateKind.LeftButtonPressed),
modifiers: KeyModifiers.Control));
}
So it's about 6-10% slower when there's listeners, and twice as fast when there's none.
Note that the handler does nothing in these benchmarks - real code will probably have heavier handlers.
If the overhead for listeners isn't acceptable, we can avoid the delegate call and context and use the LightweightEventRoute
directly in the solution - it's simply less convenient to write than then RaiseEvent()
handlers.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Aside from the small performance drop identified above, the static delegate syntax is hard to use correctly.
Allocations could also be avoided with a null check, using syntax similar to this:
myControl.RaiseEvent(InputElement.TextInputEvent)?.With(new TextInputEventArgs() { Text = "foo" });
The return value of this RaiseEvent
overload should be a nullable struct. If it's null, the event args object won't be constructed.
Perhaps a new name is needed for this design, since "RaiseEvent" won't actually raise the event.
It reminds me But I don't have specific preferences here, since old (current) syntax is still available and likely to be the most common to use. |
Aside from that, simple dev analyzer can be added to help with finding usages of slower RaiseEvent version. |
While I very much like the simpler syntax you're proposing, it comes with new caveats. The struct could implement using var route = myControl.TryBuildEventRoute(Event); // final method name to be defined
route?.RaiseEvent(new RoutedEventArgs(...)); Or we can dispose the list on raise: myControl.TryBuildEventRoute(Event)?.Raise(new RoutedEventArgs(...)); Forgetting to call Personally, I prefer the first one, with the |
Both are OK, but I would prefer option B because I think it will be unusual for anyone to try to store the returned object and reuse it, especially if we choose a good name for the methods. "EventRouteBuilder", then "Complete"? If someone does try to re-use the same struct, an exception can be thrown. |
For the sake of discussion, I want to add that option A is a similar to what we have today with the existing using var route = BuildEventRoute();
route.RaiseEvent(args); |
That works, but is a bit surprising. That being said, I agree it should be unusual to store the returned object, and proper documentation comments should hopefully prevent incorrect usage. Outside of the Avalonia solution, only people caring about these extra allocations will use the new method anyway. |
So unless this is a major bottleneck we might be going too far on the optimizations now. It's always a balance between optimization <-> maintainability and this seems error prone and not intuitive compared to what we've been doing. |
Yes, it's always a balance, and a difficult one. That's why .NET, especially since Core, has multiple levels of API. For example you can go as high as While it's not a major bottleneck, allocating is a death by a thousand paper cuts, as I like to remind each time. I don't mind allocations when they are useful, and kept to a minimum if possible (while still having maintainable code). In the case of this PR, launch any Avalonia application, and observe that 95% of raised events on startup have no listeners. Yes, that's a real number. For example, should a In my opinion, everything should be "pay to use" as much as possible. You're not using that feature? It's free, no extra allocations and minimum CPU time wasted. If possible, it should even be completely trimmed away when using AOT compilation (not applicable to this PR). As someone looking at memory snapshots very frequently, it's quite hard to analyze and improve things when the majority of the allocations are noise coming from libraries out of your control. I ideally want a "clean state" here as much as possible. Avalonia should be efficient, and never be a bottleneck. I'm very committed to improve things regarding memory allocations and CPU usage in the coming months. My rant aside, if the proposed API isn't ideal - which I agree it isn't - we can go back to simply exposing |
Option B could allow reusing the route with |
@MrJul I agree with basically everything you said. My only concern is the API is a bit ugly and non-standard (unwieldy?) for this type of thing 😄 I would prioritize a cleaner API in this area to avoid errors rather than micro optimizations. But like I said, I agree with you that it makes sense to have different levels of API. If we don't expose this API to app developers (keep it internal) I am even more OK with this. It allows us to improve it in the future if a better idea comes along without committing to it as a public API. I can live with the PR as is but if I was a gatekeeper I would still want an improved API. I won't hold up the discussion any more on this unless I think of a better API myself. |
What does the pull request do?
This PR adds new
Interactive.RaiseEvent()
overloads which avoid creating an event route and the corresponding event arguments if there are no listeners for a givenRoutedEvent
.Note that while there's already an existing
BuildEventRoute()
method to avoid allocating the arguments when there isn't any listener, it doesn't prevent creating the route itself.In addition,
EventRoute.HasHandlers
now returns true if there are no handlers inside the route butRoutedFinished
handlers exist.What is the current behavior?
Each time a
RoutedEvent
is raised, anEventRoute
and aRoutedEventArgs
(or derived) objects are always created.What is the updated/expected behavior with this PR?
RoutedEventArgs
are created only if necessary (when there's at least one listener).EventRoute
are never created anymore.How was the solution implemented (if it's not obvious)?
A new
LightweightEventRoute
is introduced, which has the same behavior as anEventRoute
, but is a struct, to avoid heap allocations.This new type has
RaiseEvent()
methods taking an arguments factory alongside a context/state. The factory is only called if there are listeners for the event.The expected usage is to have a static factory to avoid allocating an extra closure on each call (which would defeat the purpose of the method).
Since mutable structures are prone to errors (they must always be passed by reference to work correctly, which is hard to enforce in public API),
LightweightEventRoute
is internal and isn't exposed directly.Instead, new public
Interactive.RaiseEvent()
overloads use theLightweightEventRoute
.The public
EventRoute
implementation now delegates its logic toLightweightEventRoute
.Notes
The first commit is the most interesting one as it implements the new
RaiseEvent()
overloads and theLightweightEventRoute
type.The other commits are using the new
RaiseEvent()
overloads where it's possible.We have several places where it isn't possible to remove the arguments allocation currently without breaking changes.
For example, there's
Control.OnLoaded/OnUnloaded/OnSizeChanged
which takes the arguments directly, added in #11828. While there are good arguments exposed in this PR for this change, I think we should also optimize for the 99% case. There are usually hundreds of controls in a given visual tree, and only a few (if any) with aLoaded
handler.That said, we have tons of inconsistency here in the codebase. Most routed events don't have a matching virtual
OnXxx()
method. Some have. Some of these methods have aRoutedEventArgs
parameter. Some don't. We should really unify the behavior here in v12, document it, and stick to it (while still avoiding unnecessary allocations when possible). I plan to open a proposal to discuss that.