
Queue/limits for pre- and post-processing #349

Open
natefoo opened this issue Nov 30, 2023 · 0 comments


Currently, Pulsar takes every job off the setup queue as soon as it is received and immediately begins preprocessing it. If there is a large backlog of jobs (e.g. because some earlier problem kept jobs from processing for a while), this causes heavy IO contention during stage-in, and possibly hits open file limits if you haven't increased them. As a result, jobs with even moderately small inputs can queue for hours because writing is so slow.

Unfortunately I can't really quantify the penalty. It's possible that overall job throughput would be no better with a limited queue in place, since the same amount of data still has to be transferred either way. But I do suspect the data would move quicker if Pulsar weren't trying to stage all of it at once.
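
To make the idea concrete, here is a rough sketch of what a limit could look like. This is not Pulsar's actual code or API: `consume_with_limit`, `setup_queue`, `preprocess`, and `max_concurrent` are hypothetical names, and a plain `queue.Queue` stands in for the real message queue. The point is only that a slot is acquired before the next job is taken off the queue, so a backlog stays queued instead of all staging in at once.

```python
import queue
import threading


def consume_with_limit(setup_queue, preprocess, max_concurrent=4):
    """Drain setup_queue, staging at most max_concurrent jobs at a time.

    All names here are illustrative assumptions, not Pulsar internals.
    """
    slots = threading.BoundedSemaphore(max_concurrent)
    workers = []

    def run(job):
        try:
            preprocess(job)
        finally:
            # Free the slot even if staging fails.
            slots.release()

    while True:
        # Wait for a free slot BEFORE taking a job off the queue,
        # so jobs beyond the limit stay queued.
        slots.acquire()
        try:
            job = setup_queue.get_nowait()
        except queue.Empty:
            slots.release()
            break
        worker = threading.Thread(target=run, args=(job,))
        worker.start()
        workers.append(worker)

    # Let in-flight staging finish before returning.
    for worker in workers:
        worker.join()
```

A sensible value for `max_concurrent` would presumably depend on the storage backend and the open file limit, so it would need to be configurable. The same pattern could in principle apply to postprocessing.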
