Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Added a worker property to access the underlying worker #31

Open
aelgasser opened this issue Oct 4, 2020 · 9 comments
Open

Added a worker property to access the underlying worker #31

aelgasser opened this issue Oct 4, 2020 · 9 comments

Comments

@aelgasser
Copy link

Hi, I'm trying to have someone look at and merge the PR I send a couple of weeks ago.

This PR: #27

Is this project still being maintained?

@seedy
Copy link

seedy commented Oct 8, 2020

@aelgasser I use this workaround until your PR gets merged. I hope it might help!

My need is to instanciate a worker for a heavy computation, then terminate it.

// loader.js

// eslint-disable-next-line no-restricted-globals
export function terminate() { close(); }

export const getSomething = () => {/* my specific code */}

// worker.js
// eslint-disable-next-line
import MyWorker from 'comlink-loader!./loader';
import * as Comlink from 'comlink';

export default async () => {
  const worker = new MyWorker();
  const something = await worker.getSomething();
  worker.terminate();
  worker[Comlink.releaseProxy]();
  return something;
}

// main.js
import workerGetSomething from './worker';


const something = await workerGetSomething();

@aelgasser
Copy link
Author

@seedy That's what I've used in my first attempt as well, but actually worker[Comlink.releaseProxy](); doesn't terminate the thread, it only removes the message port/proxy being used to communicate between the worker and the main page. If you go into the dev tools and look into the Source tab for threads you would see, after your worker terminates, its thread is still hanging:

image

Here I have 5 + 1 threads because I ran 1 of my workers multiple times.

When you really terminate a worker, its thread is removed as well, which is not the case here (at least for me; could you confirm?)

Actually I tried to provide an implementation for "terminate" on comlink (here GoogleChromeLabs/comlink#487), but it hasn't been accepted, that's why I putting a change for comlink-loader here instea

@developit
Copy link
Collaborator

Are either of you able to give an example.of where implicit or explicit termination is necessary? I can think of a few, just want to confirm that this won't lead folks to think it's advisable in the general case (which it likely isn't).

@aelgasser
Copy link
Author

@developit Not sure I understand your question, but in my case, threads related to the workers are not released and stay showing up in the source tab of Chrome and thus I guess their resources are there too. That looks like to me as a potential for memory leaks.

My use case is as this (you can play around with it here https://dataset-manager.netlify.app/ with the attached CSV file [in the zip as Github doesn't support CSV files], and the source code is here https://github.com/aelgasser/dataset-manager) :

test_smiles.zip

  • The user uploads a dataset comprised of a list of molecules and some measurements and maybe some invalid or unwanted columns (in my example those are columns origin and availability).
  • I spawn multiple workers each running one validation step (make sure a column has data, make sure a column only contains numbers, make sure there is a column containing SMILES [a string representation of a molecule], ...). This allows that all checks happen in parallel on as many threads as the user have on his machine (real-world datasets can be pretty huge)
  • Present all the errors to the user and let him fix
  • Save the dataset to our backend
  • Repeat for the next dataset

In the actual version, you'll see that there are 2 workers, transposers and empty-column-checker, and they both stay alive when done, which I don't want because I'd have as many of them as there are columns + other checkers for other kinds of errors and they accumulate for every dataset you upload (you can try uploading the same file again)

Once I introduced the changes to terminate the worker in comlink and later in comlink-loader, I could simply terminate them at the end of the process and they're gone.

Here is a view of Chrome's task manager showing the increase in memory consumption :

First load:

image

After one dataset::

image

After a second dataset:

image

I could even think about it as a vector to DOS the user's machine but that would take me some time to come up with a POC so I'll leave it to someone else

@aelgasser
Copy link
Author

aelgasser commented Oct 10, 2020

I just pushed a WIP branch with the changes to terminate the worker (I was in the middle of refactoring when I realized that the workers were not killed, hence the mess in the code :-) ) : https://github.com/aelgasser/dataset-manager/commit/c4fb0776854f834a6ac500bf28b4da3f06574c72

The main pieces are in :

If you want to test this branch, you'd have to manually patch comlink-loader inside node_modules with my changes and then run npm start.

@developit
Copy link
Collaborator

developit commented Oct 10, 2020

@aelgasser the reason I ask is because it seems like terminating the workers in this case is actually hurting performance.

Unlike OS threads, Workers do not have any form of code sharing. This means there is a nontrivial startup cost associated with spawning a new Worker. Because of that, if you are building a flow that passes data through multiple workers, it's almost always better to reuse workers rather than spawning them for a single task and terminating after.

That said there's still lots of cases where termination is important. One that hasn't yet been mentioned here is terminating workers in response to HMR module disposal. I see you're using create-react-app, maybe that's part of the reason you end up with all the dormant workers?

@aelgasser
Copy link
Author

How would I reuse the worker created/returned by comlink and comlink-loader to perform multiple tasks (ie. multiple validators in my case)? The only solution I see would to 'bundle' all the validators inside one big worker.

Re react-create-app, I honestly didn't test without it so I can't tell. I did work with workers without react and never seemed to have the issue. For this project I'm using comlink because it's the only solution I found that is stable enough to run my use case and have typescript code I could compile and be able to import with create-react-app without having to eject and update webpack's config.

@seedy
Copy link

seedy commented Oct 13, 2020

@aelgasser Sorry for the delayed answer...

In my code extract, I use the global close method available inside a worker context.

// eslint-disable-next-line no-restricted-globals
export function terminate() { close(); }

I call worker.terminate() before releasing the proxy.

I tested in a few browsers, web workers are killed once my computations are done.

Sadly it seems the global close method is deprecated


@developit Thanks for the info about startup cost of a worker.

I am using some workers to encrypt and decrypt large image files.

For now, I spawn one worker per file and kill it once computations are done.

I tried both singleton and factory modes.
In case I use the factory mode, I think I definitely need to kill my workers once work is done. I don't have a clue about a user's future reuse of the feature.

Having a singleton worker would mean encrypting my files in sequence, which is against my initial expectations.

I still need to benchmark both modes to see which one gives the best perfs, though.

@mlampedx
Copy link

@developit I’m interested in this as well. My use case is a one-time user flow where there is no benefit to maintaining the web worker over the duration of the user’s entire session — just a segment.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants