Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement async parsers #1352

Merged
merged 9 commits into from Feb 14, 2024
Merged

Implement async parsers #1352

merged 9 commits into from Feb 14, 2024

Conversation

msujew
Copy link
Contributor

@msujew msujew commented Jan 23, 2024

A follow up on #1306.

Implements the basics to create an async parser based on node worker threads. While in theory this should be faster than the sync parser implementation (due to parallelisation), it might be slower on startup due to the time it takes for all the workers to start up. Once the threads are all up and running, it is indeed faster. The break even point is fairly late though, roughly after parsing ~2 million LoC.

The main benefit of this change is that it allows parser cancellation by simply killing workers. The cancellation happens after a timeout, since it needs to take the time it takes to create a new worker into account. The default is 200ms, but can be arbitrarily changed by adopters.

Most of the change is actually just the new Hydrator service which tries to dehydrate/hydrate AST nodes to be processed by the structured cloning algorithm of workers.

@msujew msujew added the parser Parser related issue label Jan 23, 2024
Copy link
Contributor

@sailingKieler sailingKieler left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great stuff @msujew.

I just have some minor questions & remarks.
Btw. Would it work the same way (in principle) in the browser with webworkers?

async parse<T extends AstNode>(text: string, cancelToken: CancellationToken): Promise<ParseResult<T>> {
const worker = await this.acquireParserWorker(cancelToken);
const deferred = new Deferred<ParseResult<T>>();
deferred.disposables.push(cancelToken.onCancellationRequested(() => {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This code is very dense. Could you please separate this and motivate the attachment of the Disposable returned by the listener registration to deferred?
deferred is supposed to have a finite lifetime, right?
And onCanceled-Listeners won't be called multiple times, right?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've refactored this a bit, but it still works the same way. This is basically a safeguard: If the cancellation is requested after parsing has finished, we don't want to kill the worker anymore. The cancellation has a way longer lifetime than the deferred promise here. So we have to remove the listener again once were finished with it.

packages/langium/src/parser/async-parser.ts Outdated Show resolved Hide resolved
...result,
value: dehydrated
});
});
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can this considered a blue print for worker implementations?
Or should everybody built one on it's own? A remark on that including some dos and dont's would be nice, here.

Copy link
Contributor Author

@msujew msujew Feb 14, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, everyone should build their own.
Do's: Parse the text and send the dehydrated AST back
Dont's: Anything else.

packages/langium/src/serializer/hydrator.ts Show resolved Hide resolved
packages/langium/src/serializer/hydrator.ts Outdated Show resolved Hide resolved
packages/langium/src/node/worker-thread-async-parser.ts Outdated Show resolved Hide resolved
@sailingKieler
Copy link
Contributor

sailingKieler commented Jan 30, 2024

One more thing:
I prefer to block merging this PR until #1258 is merged, and to fix the conflicts as part of this PR.
So maybe I was a bit to quick with my approval 😅

@msujew msujew force-pushed the msujew/async-parser-impl branch 2 times, most recently from b626ca3 to 9d8d6ad Compare February 13, 2024 15:57
@msujew msujew merged commit e78aeba into main Feb 14, 2024
5 checks passed
@msujew msujew deleted the msujew/async-parser-impl branch February 14, 2024 13:28
@msujew msujew added this to the v3.0.0 milestone Feb 14, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
parser Parser related issue
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants