Persistent context #47

Open
darky opened this issue Aug 26, 2019 · 7 comments
Comments

@darky
Contributor

darky commented Aug 26, 2019

I need a way to pass some context once so that it is then always available in the worker pool.
For example, take a CPU-intensive geo task: checking whether a point lies inside any of a set of polygons.
The polygons are large, and serializing and deserializing them on every call is expensive.
It would be better to pass them once up front:

await job(() => {
}, {persistentCtx: {polygons: [/* many-many polygons */]}});

After that, the context is accessible on every job execution:

await job(() => {
  polygons // still accessible here, without being passed again
  
}, {data: {point: [12.3434, 56.3434]}});
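
For comparison, a similar effect can be approximated with Node's built-in worker_threads module, without any change to microjob. The sketch below is not microjob's API and not the approach taken in the PR: `createGeoWorker`, `check`, `close` and `pointInPolygon` are hypothetical names, and it assumes a single worker that handles one call at a time. The polygons go through `workerData` once, when the worker starts, and each later call sends only the point.

```javascript
const { Worker } = require('worker_threads');

// Ray-casting point-in-polygon test; a polygon is an array of [x, y] vertices.
function pointInPolygon([x, y], polygon) {
  let inside = false;
  for (let i = 0, j = polygon.length - 1; i < polygon.length; j = i++) {
    const [xi, yi] = polygon[i];
    const [xj, yj] = polygon[j];
    const crosses = (yi > y) !== (yj > y) &&
      x < ((xj - xi) * (y - yi)) / (yj - yi) + xi;
    if (crosses) inside = !inside;
  }
  return inside;
}

// Worker source: the polygons arrive once through workerData and stay in the worker's memory.
const workerSource = `
  const { parentPort, workerData } = require('worker_threads');
  const pointInPolygon = ${pointInPolygon.toString()};
  const { polygons } = workerData; // cloned once, at worker start-up
  parentPort.on('message', (point) => {
    parentPort.postMessage(polygons.some((p) => pointInPolygon(point, p)));
  });
`;

// Hypothetical helper: starts one worker that keeps `polygons` for its whole lifetime.
function createGeoWorker(polygons) {
  const worker = new Worker(workerSource, { eval: true, workerData: { polygons } });
  return {
    // Only the small point is serialised per call (one call in flight at a time).
    check: (point) => new Promise((resolve, reject) => {
      const onMessage = (result) => { worker.off('error', onError); resolve(result); };
      const onError = (err) => { worker.off('message', onMessage); reject(err); };
      worker.once('message', onMessage);
      worker.once('error', onError);
      worker.postMessage(point);
    }),
    close: () => worker.terminate(),
  };
}
```

Usage would be along the lines of `const geo = createGeoWorker(polygons); await geo.check([12.3434, 56.3434]); await geo.close();`. A real pool would need one copy of the context per worker, which is the cost the proposal above tries to pay only once.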
@manuel-di-iorio

manuel-di-iorio commented Aug 27, 2019

Related issue: #42

darky pushed a commit to darky/microjob that referenced this issue Aug 28, 2019
@darky
Contributor Author

darky commented Aug 28, 2019

#48 PR

@wilk
Owner

wilk commented Sep 3, 2019

@darky Thanks for this issue!

Well, let me check if I got it right: you need a global bucket shared between worker threads to avoid multiple massive serialisations/deserialisations, correct?
You could do this yourself with SharedArrayBuffer (shared memory).
However, yes, it could be a useful feature to embed in microjob.

Anyway, your PR is moving the serialisation/deserialisation problem from the user to the core: darky@67c21ae#diff-c9253097723f89dd4716748fab2e00cdR108
Every time the user invokes job, the whole persistentCtx gets serialised and sent via postMessage and then deserialised from the worker thread.
I think a good solution could be to pass a global shared context from an external facade, convert it to a SharedArrayBuffer and then convert it back with a proper getter from the worker.
I wouldn't use the job interface to define a global context: it's ambiguous.
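
The SharedArrayBuffer idea could look roughly like the sketch below. The flat memory layout and the names `packPolygons`, `getPolygon` and `startWorker` are assumptions made for illustration; none of them are part of microjob. The polygons are written once into shared memory, the buffer is handed to the worker through `workerData` (a SharedArrayBuffer is shared rather than copied), and a getter rebuilds a single polygon on demand.

```javascript
const { Worker } = require('worker_threads');

// Assumed layout for this sketch: [count, n0, x, y, ..., n1, x, y, ...]
function packPolygons(polygons) {
  const length = 1 + polygons.reduce((sum, p) => sum + 1 + p.length * 2, 0);
  const shared = new SharedArrayBuffer(length * Float64Array.BYTES_PER_ELEMENT);
  const view = new Float64Array(shared);
  let offset = 0;
  view[offset++] = polygons.length;
  for (const polygon of polygons) {
    view[offset++] = polygon.length;
    for (const [x, y] of polygon) {
      view[offset++] = x;
      view[offset++] = y;
    }
  }
  return shared;
}

// Getter: rebuilds polygon `index` from the shared memory on demand.
function getPolygon(shared, index) {
  const view = new Float64Array(shared);
  if (index < 0 || index >= view[0]) {
    throw new RangeError('no polygon at index ' + index);
  }
  let offset = 1;
  // Skip over the polygons stored before the requested one.
  for (let i = 0; i < index; i++) offset += 1 + view[offset] * 2;
  const vertexCount = view[offset++];
  const polygon = [];
  for (let v = 0; v < vertexCount; v++) {
    polygon.push([view[offset + v * 2], view[offset + v * 2 + 1]]);
  }
  return polygon;
}

// Worker source: reads from the same memory as the main thread, no per-job copy of the data.
const workerSource = `
  const { parentPort, workerData } = require('worker_threads');
  const getPolygon = ${getPolygon.toString()};
  parentPort.on('message', (index) => {
    parentPort.postMessage(getPolygon(workerData.shared, index));
  });
`;

function startWorker(shared) {
  return new Worker(workerSource, { eval: true, workerData: { shared } });
}
```

Note that the getter still copies the vertices of one polygon out of the buffer; a tighter version would run the geometry test directly on the Float64Array view. Concurrent writes to the buffer would also need coordination, for example with Atomics on an integer view.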

@darky
Contributor Author

darky commented Sep 3, 2019

> Every time the user invokes job, the whole persistentCtx gets serialised and sent via postMessage and then deserialised from the worker thread.

That happens only once, the first time; after that it is always available via darky@67c21ae#diff-5bfbc2def8d97c3939b537c3f6f31b3eR3

> I think a good solution could be to pass a global shared context from an external facade, convert it to a SharedArrayBuffer and then convert it back with a proper getter from the worker.

Could you please provide a small example? You could also close #42 with that example :)

@darky
Contributor Author

darky commented Sep 3, 2019

> I wouldn't use the job interface to define a global context: it's ambiguous.

Yep, agree. Maybe better to use start function for this purpose?

@r3wt

r3wt commented Sep 26, 2019

> Yep, agree. Maybe better to use start function for this purpose?

In this scenario, would persistentCtx be mutable (from within a job for example)?

I have a bit of a weird use case:

  • in one job that runs every N minutes, some data is passed in via context, and the synchronous algorithm builds a sharded index based on the data, then returns it from the job to the main thread.
  • this index is stored in memory along with the data, where a synchronous search algorithm uses the index and data to compute search results.

Ideally i'd like to be able to do the following:

  • keep both index and data in persistent state of the job (mutable)
  • run the search algorithm inside of jobs, instead of in main thread as it is now

Unfortunately, the serialization cost is too high without persistent state. Ideally the state would also be mutable; otherwise I'd have to stop and start a new worker pool every time I need to update the dataset.
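
One way to get mutable persistent state with plain worker_threads is to keep the dataset and the index as variables inside a long-lived worker and change them by message. The sketch below illustrates that pattern only; it is not what PR #48 does. `createSearchWorker`, the `update`/`search` message types and the toy word index are all hypothetical stand-ins for the real sharded index and search algorithm.

```javascript
const { Worker } = require('worker_threads');

const workerSource = `
  const { parentPort } = require('worker_threads');
  // Mutable state that lives in the worker between messages.
  let data = [];
  let index = new Map();

  // Stand-in for the real index builder: maps each word to the ids of the docs containing it.
  function buildIndex(docs) {
    const result = new Map();
    docs.forEach((text, id) => {
      for (const word of text.toLowerCase().split(/\\s+/)) {
        if (!result.has(word)) result.set(word, new Set());
        result.get(word).add(id);
      }
    });
    return result;
  }

  parentPort.on('message', ({ id, type, payload }) => {
    let result = null;
    if (type === 'update') {
      // Replace the dataset and rebuild the index inside the worker.
      data = payload;
      index = buildIndex(data);
      result = data.length;
    } else if (type === 'search') {
      // Only the query and the matches cross the thread boundary.
      result = [...(index.get(payload.toLowerCase()) || [])].map((i) => data[i]);
    }
    parentPort.postMessage({ id, result });
  });
`;

function createSearchWorker() {
  const worker = new Worker(workerSource, { eval: true });
  const pending = new Map(); // request id -> resolve callback
  let nextId = 0;
  worker.on('message', ({ id, result }) => {
    pending.get(id)(result);
    pending.delete(id);
  });
  const send = (type, payload) => new Promise((resolve) => {
    const id = nextId++;
    pending.set(id, resolve);
    worker.postMessage({ id, type, payload });
  });
  return {
    update: (docs) => send('update', docs),
    search: (word) => send('search', word),
    close: () => worker.terminate(),
  };
}
```

With this shape the dataset is serialised only when it changes, and the pool does not have to be restarted for an update. With several workers, each one would need its own `update` message, since every worker holds a separate copy of the state.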

@darky
Contributor Author

darky commented Sep 27, 2019

@r3wt PR #48 can satisfy your need for mutable state.
