Understanding Worker and BackgroundService instances #96

Open
Matheos96 opened this issue Feb 13, 2024 · 6 comments
Labels
question Further information is requested

Comments

@Matheos96
Contributor

I have a few questions regarding the use of IWorker instances and IWorkerBackgroundService instances.

  1. Having created a background service on a Worker instance, is there any point in keeping the worker instance around? Can I, for example, spawn multiple BackgroundServices from the same Worker instance? If so, I suppose only one can run at a time, since they are on the same Worker?
  2. Is disposing of the IWorker instance sufficient, or do I separately need to dispose of the IWorkerBackgroundServices? (BTW, is there an actual need for disposing workers in practice?)
  3. If Workers and BackgroundServices are always a 1-1 relationship, why not allow creating a BackgroundService through a factory method which also implicitly creates a Worker? Just a thought with a lot of "ifs"...

Had high hopes to improve performance in our application using workers, but I am starting to see some of the limitations really becoming problems. For example, the cost of serialization + deserialization to pass data may at times outweigh the benefit of doing calculations in parallel, probably depending on the complexity of the task. I got some PoC code working with our app but managed to decrease performance by quite a lot... I guess due to massive amounts of serialization work, as well as the inability to utilize our cache well enough (as it would be global, and workers cannot access global things).

@Tewr
Owner

Tewr commented Feb 14, 2024

Hello, first of all, thank you for your interest in the project and for taking the time to ask these kinds of questions.

  1. Having created a background service on a Worker instance, is there any point in keeping the worker instance around? Can I, for example, spawn multiple BackgroundServices from the same Worker instance? If so, I suppose only one can run at a time, since they are on the same Worker?

You may run several services on the same worker, but indeed, a worker is a single thread, so only one function call can do work at a given time. The exception would be yielding, which may occur at any async/await junction. You can force yields for tests or complex scenarios using await Task.Delay(1). I don't have any particular scenario in mind where it would make sense to run several services rather than one on a worker, but I imagine that when reusing components that work together, or when using libraries (e.g. IndexedDB), it could eventually be useful.
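For illustration, a minimal sketch of two services hosted on the same worker, based on the factory API from the README (MathsService and IoService are hypothetical user-defined classes):

```csharp
// Hypothetical example: two services on a single worker.
var worker = await workerFactory.CreateAsync();

var mathsService = await worker.CreateBackgroundServiceAsync<MathsService>();
var ioService = await worker.CreateBackgroundServiceAsync<IoService>();

// Both services share the worker's single thread: the two calls below are
// queued, and interleave only where the service methods await (yield).
var piTask = mathsService.RunAsync(s => s.EstimatePi(1_000_000));
var statsTask = ioService.RunAsync(s => s.ComputeStats());
await Task.WhenAll(piTask, statsTask);
```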

  2. Is disposing of the IWorker instance sufficient, or do I separately need to dispose of the IWorkerBackgroundServices? (BTW, is there an actual need for disposing workers in practice?)

Disposing the worker kills the background services, so disposing them separately should only be necessary if you plan on keeping the worker. Worker disposal is crucial for applications that load dynamic code (an Assembly.Load call cannot be undone): scenarios like compilers, plugins, or libraries that leak memory by nature. Also, when maintaining a worker pool, it's probably a common operation when changing the pool size.
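In code, the rule could look like this (a sketch, assuming the worker and service both expose async disposal):

```csharp
// Keeping the worker around: dispose only the service you no longer need.
await mathsService.DisposeAsync();

// Done with everything: disposing the worker tears down the underlying
// web worker and every background service hosted on it.
await worker.DisposeAsync();
```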

  3. If Workers and BackgroundServices are always a 1-1 relationship, why not allow creating a BackgroundService through a factory method which also implicitly creates a Worker? Just a thought with a lot of "ifs"...

As stated in the first answer, it's not a 1-1 relationship, and I think it would be unwise to change that. Even if it's a niche use case, this entire project is very niche. That said, most of the start-up code is already stowed away in extension methods, and creating a new one that creates a worker implicitly would not be very difficult, even in user code. But it may require some small intermediate object, I guess.
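A user-code sketch of that idea, with a hypothetical OwningBackgroundService&lt;T&gt; playing the role of the small intermediate object:

```csharp
// Hypothetical wrapper: owns the implicitly created worker, so disposing
// it also tears down the service hosted on that worker.
public sealed class OwningBackgroundService<T> : IAsyncDisposable where T : class
{
    private readonly IWorker worker;

    public OwningBackgroundService(IWorker worker, IWorkerBackgroundService<T> service)
    {
        this.worker = worker;
        Service = service;
    }

    public IWorkerBackgroundService<T> Service { get; }

    public ValueTask DisposeAsync() => worker.DisposeAsync();
}

public static class WorkerFactoryExtensions
{
    // One call that creates the worker implicitly and hosts the service on it.
    public static async Task<OwningBackgroundService<T>> CreateBackgroundServiceAsync<T>(
        this IWorkerFactory workerFactory) where T : class
    {
        var worker = await workerFactory.CreateAsync();
        var service = await worker.CreateBackgroundServiceAsync<T>();
        return new OwningBackgroundService<T>(worker, service);
    }
}
```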

Had high hopes to improve performance in our application using workers, but I am starting to see some of the limitations really becoming problems. For example, the cost of serialization + deserialization to pass data may at times outweigh the benefit of doing calculations in parallel, probably depending on the complexity of the task. I got some PoC code working with our app but managed to decrease performance by quite a lot... I guess due to massive amounts of serialization work, as well as the inability to utilize our cache well enough (as it would be global, and workers cannot access global things).

Indeed, serializing is very expensive. There is always the option of using the core module and optimizing serialization for your application. But the general rule, I think, should be that you must avoid freezing the UI at all cost, so any task that has a cycle slower than 200ms could be a candidate; like you said, though, you must be able to serialize in less time than that. Also, the 2x serialize/deserialize adds to the overall execution time, of course.

The top layer (BackgroundService) is developed with usability rather than speed in mind. Serialization of expressions is nice to use, but it is slow and the messages have a lot of overhead. There are plenty of alternatives: I'm planning to create a second, alternative layer that uses Roslyn at compile time instead, and uses MessagePack for serialization.

Regarding caching, you would have to use something universally available, like IndexedDB or the Cache API. Both of these methods require serializing your data, though.

Tewr added the question label Feb 14, 2024
@Matheos96
Contributor Author

Hi! Thanks for the detailed response.

You answered my questions quite clearly and I don't really have any follow-ups on that.

Regarding the practical problems, indeed, serializing is very expensive. I very much like what this library brings and its ease of use, but I may have underestimated the limitations that even the underlying JavaScript Web Worker API enforces. Such limitations include the need for serialization + deserialization every time, and no access to the DOM or to the global window/document objects of the JavaScript environment. These things vastly limit what use we can get out of this library, unfortunately.

Let me briefly explain our scenario.
We have a Blazor WASM (standalone, purely client-side) web app which gets data from our API. The data is instantly deserialized (from BSON) when received, and this needs to happen (at least at some point) on the main thread, as we need access to the data in the GUI. Now, the plan was to then send parts of the data to workers and have them process and calculate 3D coordinates for us, to eventually be used for rendering a 3D model using Three.js.
The problems already arise when we realize we need to serialize parts of the data, which we previously already deserialized, send it to the workers, deserialize it again in the worker, process it, compute the result, serialize the result, send it back to the main thread, and deserialize it there. It would be nice if we did not have to return the result to the main thread at all, but unfortunately it seems like only the main thread is able to "talk" to the underlying DOM and the JS objects that we set up for creating our Three.js scene etc. This means we cannot interop straight from a worker, as the worker will not be able to access our existing JavaScript objects :(
In the end, it seems like this back-and-forth serialization hell really brings way more performance hits than benefits, unfortunately.

To summarize:

  • Everything needs to be deserialized on the main thread (we cannot avoid this)
  • All JS interop needs to happen from the main thread
  • Which leads to: all worker communication must be handled with separate serialization back and forth

Even if using the core library may reduce the overhead somewhat, we still cannot avoid the issues mentioned above. I'm not looking for a magical solution, but if you have any suggestions on how I could mitigate any of the issues, please, I am all ears.

@Tewr
Owner

Tewr commented Feb 19, 2024

I'm curious as to why you say that data must be deserialized on the main thread, and why it cannot be avoided.

You can try optimizing by downloading data directly to something that can be shared by everyone (IndexedDB or the Cache API). That way, you could theoretically do the deserialization in parallel (on worker(s) and the main thread). But I cannot tell if it's too much work, or even if it's going to pay off. Also, IndexedDB is not super easy to work with.
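A rough sketch of the main-thread side of that idea (blobCache.put is a hypothetical user-written JS helper wrapping the browser Cache API; nothing like it ships with this library):

```csharp
// Download the raw payload once and park it in a store that both the
// main thread and the workers can reach.
var rawBytes = await httpClient.GetByteArrayAsync("api/model/geometry");

// "blobCache.put" is an assumed JS helper around the Cache API
// (caches.open / cache.put). Workers would read the same entry and
// deserialize their slice in parallel with the main thread.
await jsRuntime.InvokeVoidAsync("blobCache.put", "geometry-v1", rawBytes);
```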

@Matheos96
Contributor Author

I'm curious as to why you say that data must be deserialized on the main thread, and why it cannot be avoided.

Sorry, I may have left out some details. It is essential for the functionality of our web app that we have this data in memory, as it may need to be accessed depending on user interactions. This data of ours contains both metadata and geometrical primitives. These primitives are in 2D in our data, but with a simple offset we may calculate the extruded 3D shapes, which is what we visualize in our web app and what takes quite long to calculate.
It is true that these primitives themselves may not need to be deserialized at all on the main thread, but they are kind of baked in with the rest of the data that we do want to access at runtime. This may be a limitation of our own deserialization code and/or object model, though...
Also, the result of the 3D calculations must be returned to the main thread so we can use JS interop to pass the data on to our Three.js methods. The methods in question create object instances which are attached to the global window object, which is why the interop must happen from the main thread, with access to the DOM.
If you see any flaws in my thinking, please correct me; this is all just according to my understanding of web workers, which may be slightly faulty.

You can try optimizing by downloading data directly to something that can be shared by everyone (IndexedDB or the Cache API). That way, you could theoretically do the deserialization in parallel (on worker(s) and the main thread). But I cannot tell if it's too much work, or even if it's going to pay off. Also, IndexedDB is not super easy to work with.

I am not familiar with either of these at all, so with our release coming up rather quickly, I unfortunately don't think we have the time to investigate them further at the moment.

Ultimately, I believe the best solution for us, if we would like to get our solution working with workers, would be to split our model, either physically or "virtually" (for example, geometrical data delivered separately from metadata by the API), in order to avoid some of the deserialization + serialization overhead. That way we could perhaps, as you say, only deserialize the heavy stuff on the workers where the data is needed, and never deserialize the complex shapes on the main thread. The main thread would just need to receive the calculated 3D geometry data in the end, in order to pass it on to JS. A lot of hoops to jump through in the end; not impossible, but not trivial.

@Tewr
Owner

Tewr commented Feb 19, 2024

must be returned to the main thread so we can use JS interop to pass the data on to our Three.js methods.

You could probably serialize from the worker and send the result directly to the JS main thread, without doing yet another deserialize/serialize round trip in the .NET code running on main. Not sure how much it will save you, though. Have you tried to measure the time spent in (de)serialization?
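For example (a sketch: geometryService, ComputeGeometryAsJson and threeJsInterop.renderGeometry are hypothetical names for your service and JS function):

```csharp
// The worker-side service serializes its own result (e.g. a JSON string of
// float arrays), so .NET on the main thread only relays an opaque string.
var json = await geometryService.RunAsync(s => s.ComputeGeometryAsJson(layerId));

// Hand the string straight to JS: no .NET-side deserialize/serialize
// in between, JS parses the float arrays itself.
await jsRuntime.InvokeVoidAsync("threeJsInterop.renderGeometry", json);
```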

There might be some flag I could provide for debugging in a future version; I'm not sure how easy it is to set up performance counters.

@Matheos96
Contributor Author

Hmm, yeah, that is true; I probably would not need to deserialize it in between on the .NET main thread, as long as I send it in a simple enough form to be easily deserializable in JavaScript (float arrays).

I have tried measuring the serialization and deserialization times using Stopwatch. It depends quite a lot on the data, as I grouped it by "layer" (PCB CAD layers); one layer may contain anywhere from a few shapes to thousands and thousands. The general trend, though, was more than 200ms, sometimes up to 700ms, for deserialization in the worker. And that is of course without the transfer overhead (if such exists?). In the end, I have not yet managed to group the data in any way that would actually improve the loading speed; probably the overheads outweigh the benefit from parallelization.
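Roughly, the measurement in the worker looks like this (a sketch; DeserializeLayer and the Shapes property stand in for our own deserialization code and object model):

```csharp
// Worker-side timing sketch: isolates the BSON deserialization step.
var sw = System.Diagnostics.Stopwatch.StartNew();
var layer = DeserializeLayer(payload); // placeholder for our own BSON deserializer
sw.Stop();
Console.WriteLine($"Deserialized {layer.Shapes.Count} shapes in {sw.ElapsedMilliseconds} ms");
```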
