Enable more flexible killing of jobs #15979

Open
hammerhead opened this issue May 9, 2024 · 4 comments

Comments

@hammerhead
Member

Problem Statement

There is currently no easy way to kill all running jobs of a certain user, or all jobs that match a certain pattern (such as all SELECT statements). KILL only supports ALL or a single job ID, so one has to gather the job IDs manually and send a sequence of KILL statements.
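For illustration, the manual workaround can be scripted on the client side. This is only a minimal sketch, assuming the rows have already been fetched with something like `SELECT id, username, stmt FROM sys.jobs` using any client. `build_kill_statements` and its parameters are hypothetical names made up for this example and are not part of CrateDB.

```python
from typing import Iterable, List, Mapping, Optional


def build_kill_statements(
    jobs: Iterable[Mapping[str, str]],
    username: Optional[str] = None,
    stmt_contains: Optional[str] = None,
) -> List[str]:
    """Return one KILL statement per job that matches all given filters.

    `jobs` is assumed to hold rows from sys.jobs as mappings with the
    keys "id", "username" and "stmt". A filter set to None is ignored.
    """
    statements = []
    for job in jobs:
        if username is not None and job["username"] != username:
            continue
        # Case-insensitive substring match on the statement text.
        if stmt_contains is not None and stmt_contains.lower() not in job["stmt"].lower():
            continue
        # KILL takes the job ID as a string literal; double any single
        # quotes defensively even though job IDs are normally UUIDs.
        job_id = job["id"].replace("'", "''")
        statements.append(f"KILL '{job_id}'")
    return statements


if __name__ == "__main__":
    # Made-up sample rows standing in for the result of a sys.jobs query.
    sample_jobs = [
        {"id": "j1", "username": "bench", "stmt": "SELECT * FROM my_table"},
        {"id": "j2", "username": "ingest", "stmt": "INSERT INTO my_table VALUES (1)"},
    ]
    for statement in build_kill_statements(sample_jobs, username="bench"):
        print(statement)
```

Each returned string still has to be sent to the cluster as its own statement, which is exactly the round-trip overhead this issue would like to avoid.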

Possible Solutions

Support a function (like PostgreSQL's pg_cancel_backend) so you can do something like:

SELECT kill(id)
FROM sys.jobs
WHERE stmt LIKE '%my_table%'
  AND username = 'u';

Considered Alternatives

No response

@hlcianfagna
Contributor

Another alternative could be to support a generic function that runs arbitrary commands against a foreign server. We could then configure a foreign server pointing to the same CrateDB instance and use it to run the dynamically generated KILL statements.

@mfussenegger
Member

Could you elaborate a bit more on why there's a need to kill jobs like that?

@hammerhead
Member Author

One example is aborting a benchmark run that consists of many queries, where you want to kill those queries before starting another run. The benchmark tool didn't manage to kill them itself. KILL ALL cannot be used because ingestion is running in parallel and should keep running. As the benchmark uses a particular user, it would be handy to kill all queries of that user.

@matriv
Contributor

matriv commented May 15, 2024

I'm only guessing here, but I could see it being handy for a CrateDB user who connects to the DB with different users (one doing inserts, one doing dashboard queries, one doing heavy analytic queries, etc.) and wants to kill, for example, all heavy analytic queries that are saturating the cluster during a heavy ingest load peak.

It could also be that you want to do something similar by filtering on SQL statements, for example if someone is experimenting (on a prod cluster) with new analytic queries that are long-running or resource-consuming and you want to kill them. (If you have hundreds or thousands of jobs in the sys.jobs table, it won't be easy to pick them out by scrolling.)
