-
Notifications
You must be signed in to change notification settings - Fork 49
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
WIP: Experimental: Add JobList constraint parser and support constraint query strings in flux jobs -f
#5711
base: master
Are you sure you want to change the base?
Conversation
Problem: Several lines were > 80 chars in length. Fix violating lines.
Problem: Several timestamps were initialized to 0 instead of 0.0, which is what the rest of the code does. For consistency, initialize to 0.0.
Problem: Not all constraint operators are covered in state_match tests. Add a new test to cover the basics of every operator.
Problem: Several arguments to the JobList constructor are missing in documentation. Update class documentation to list the missing arguments.
Problem: In the near future, some initialization will have to be done for constraint matching. However, no information can currently be passed to the list_constraint_create() function. Add infrastructure to allow a match context to be initialized on job-list load and can be passed to list_constraint_create(). While not presently used, it will allow easier adjustments to constraint matching in the future.
Problem: In the near future it'd be convenient to do calculations on the job nodelist, but it often needs to be in a hostlist data structure for processing. We'd like to avoid converting the nodelist to hostlist struct over and over again. Add a field into the job_data struct to hold a cached version of the nodelist in a hostlist struct.
Problem: It would be convenient to filter jobs based on the nodes they ran on. Add a constraint operator "hostlist" to filter on nodes within the job nodelist. Multiple nodes can be specified. Hostlists represented in RFC29 format are acceptable for input to the constraint.
Problem: There are no unit tests for the new 'hostlist' constraint operator. Add tests in match and state_match tests.
Problem: In t2260-job-list.t, some test jobs are submitted using "flux job submit". This makes it inconvenient to set job requirements on those jobs, such as specific nodes the jobs should run on. Solution: Convert all uses of "flux job submit" to use "flux submit".
Problem: In the near future we would like to filter jobs based on nodes in a job's nodelist. This would be an issue in the current test setup because test brokers all run under the same host and it is unknown which nodes jobs actually ran on. Solution: Setup fake hostnames for the test brokers in t2260-job-list.t and when necessary, run test jobs on specific hosts.
Problem: There is no testing / coverage for the new hostlist constraint in the job-list module. Add some tests to t2260-job-list.t.
Problem: The parse_datetime() utility function is statically configured to assume future dates when a term like "Friday" is given. That is, instead of giving the date for the previous Friday, the function will return then next Friday instead. This behavior should be configurable. Add an assumeFuture parameter to the function which defaults to True. If set to False, then parse_datetime() will assume dates in the past instead.
Problem: Values are always returned as strings by the ConstraintParser parser, but there are times when another type may be more suitable. Additionally, there is currently no way to reduce a set of values to a single element if this is required or beneficial in the result. Add a convert_values mapping to the ConstraintParser class. If an operator is in this dictionary, then the values of a term are passed to the provided callable, which should return a new list of values as a result.
Problem: flux_job_strtostate(3) and flux_job_strtoresult(3) are not exposed in the Python API. Add strtostate() and strtoresult() to flux.job.info.
Problem: There is no user friendly way to create constraint objects that can be used with the job-list service. Add a JobListConstraintParser class which can be used to create RFC 31 constraing objects suitable for sending to the job-list service.
Problem: There is no way to pass a constraint to flux.job.job_list() Add an optional constraint parameter to job_list() and job_list_inactive()
Problem: The flux-jobs(1) -f, --filter option is being updated to take a query string, but there is no "filter" based argparse action to handle this case. Add flux.util.FilterActionConcatenate which concatenates multiple --filter strings together, separated by space.
Problem: There is no way to pass extra constraints via the JobList interface. Add an optional constraint parameter to the JobList initializer. This paramter takes a string in constraint query syntax which is parsed by an instance of JobListConstraintParser. The result is then joined with ``and`` to any other constraints before being passed to the job-list module.
Problem: There is no way to provide an arbitrary constraint to flux-jobs(1). Update the `-f, --filter` option of flux-jobs(1) to take a query string that will be passed to the JobList constraint parameter instead of the filter paramter.
We should decide if adding the constraint query support to |
IMO I think the the advanced query via I think it's more of an open debate if likewise, with the advanced filter/query in place, dunno if we could view some current command line options worthy of retirement (will I guess hidden)? I think |
Agreed. However, we could provide this feature on another command (e.g. |
The issue with supporting the query via an option is that quoting can get annoying. Note that in this WIP PR, I've allowed multiple $ flux jobs -f host:pi4 -f completed
JOBID USER NAME ST NTASKS NNODES TIME INFO
ƒ5hsApJ31 grondo hostname CD 1 1 0.205s pi4
$ flux jobs -f host:pi4 -f active
JOBID USER NAME ST NTASKS NNODES TIME INFO
ƒ5YsGD86K grondo sleep R 1 1 56.27s pi4 |
Ahhh I see your point now. Hmmm. I think there is value in One long term thought, what if inactive jobs are stored in a db long term. |
Yes, my hope is hat we can create a constraint->sql converter, but I haven't looked into it so that is a real concern. That might be an argument for keeping this kind of generic query syntax out of flux-jobs.
Yes, I definitely see your point (though I wouldn't use |
This is an experimental PR that adds a special
JobListConstraintParser
class to support a specialized constraint parser syntax forJobList
meant to be used withflux jobs -f, --filter
. This is built on top of @chu11's PR #5656.At this point, this is just a proof-of-concept of how this could work to extend the
-f, --filter
option offlux jobs
to support generalized constraint queries while being backwards compatible.The
JobListConstraintParser
class is a two pass parser. The first pass converts tokens without an operator to states and/or results bitmasks, usingor
to join these together. The first pass can also be used to do other convenience conversions (likehost:
orhosts:
tohostlist:
, or as is done in this prototype@dt
converted to(not (t_cleanup:<dt or t_run:>dt))
to find jobs that were running at a given time). The second pass converts the now modified query string to a JSON constraint object suitable for thejob-list
service.Finally, a
constraint
parameter is added to the Python job listing interfaces, andflux job list -f, --filter
is converted to pass the option argument as a constraint.e.g.:
after running some sleep jobs: