Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

WIP: Experimental: Add JobList constraint parser and support constraint query strings in flux jobs -f #5711

Open
wants to merge 19 commits into
base: master
Choose a base branch
from

Conversation

grondo
Copy link
Contributor

@grondo grondo commented Jan 29, 2024

This is an experimental PR that adds a special JobListConstraintParser class to support a specialized constraint parser syntax for JobList meant to be used with flux jobs -f, --filter. This is built on top of @chu11's PR #5656.

At this point, this is just a proof-of-concept of how this could work to extend the -f, --filter option of flux jobs to support generalized constraint queries while being backwards compatible.

The JobListConstraintParser class is a two pass parser. The first pass converts tokens without an operator to states and/or results bitmasks, using or to join these together. The first pass can also be used to do other convenience conversions (like host: or hosts: to hostlist:, or as is done in this prototype @dt converted to (not (t_cleanup:<dt or t_run:>dt)) to find jobs that were running at a given time). The second pass converts the now modified query string to a JSON constraint object suitable for the job-list service.

Finally, a constraint parameter is added to the Python job listing interfaces, and flux job list -f, --filter is converted to pass the option argument as a constraint.

e.g.:

ƒ(s=5,d=1,builddir) grondo@pi3:~/git/flux-core.git$ flux jobs -f host:pi0
       JOBID USER     NAME       ST NTASKS NNODES     TIME INFO
   ƒ4RtqYfo5 grondo   hostname   CD      1      1   0.259s pi0
   ƒ4Rtgek71 grondo   hostname   CD      1      1   0.274s pi0
ƒ(s=5,d=1,builddir) grondo@pi3:~/git/flux-core.git$ flux jobs -f host:pi[2-3]
       JOBID USER     NAME       ST NTASKS NNODES     TIME INFO
   ƒ4Rts2f5R grondo   hostname   CD      1      1   0.303s pi2
   ƒ4Rtm6hx3 grondo   hostname   CD      1      1   0.304s pi2
   ƒ4RtnahEQ grondo   hostname   CD      1      1   0.240s pi3
   ƒ4RtfAkpf grondo   hostname   CD      1      1   0.235s pi3

after running some sleep jobs:

ƒ(s=5,d=1,builddir) grondo@pi3:~/git/flux-core.git$ flux jobs -f 'host:pi0 running'
       JOBID USER     NAME       ST NTASKS NNODES     TIME INFO
   ƒ6gveP5Us grondo   sleep       R      1      1   3.658s pi0
   ƒ6gvYT8MV grondo   sleep       R      1      1   3.673s pi0
ƒ(s=5,d=1,builddir) grondo@pi3:~/git/flux-core.git$ flux jobs -f 'host:pi0 completed'
       JOBID USER     NAME       ST NTASKS NNODES     TIME INFO
   ƒ4RtqYfo5 grondo   hostname   CD      1      1   0.259s pi0
   ƒ4Rtgek71 grondo   hostname   CD      1      1   0.274s pi0

chu11 and others added 19 commits January 15, 2024 23:13
Problem: Several lines were > 80 chars in length.

Fix violating lines.
Problem: Several timestamps were initialized to 0 instead of 0.0,
which is what the rest of the code does.

For consistency, initialize to 0.0.
Problem: Not all constraint operators are covered in state_match
tests.

Add a new test to cover the basics of every operator.
Problem: Several arguments to the JobList constructor are
missing in documentation.

Update class documentation to list the missing arguments.
Problem: In the near future, some initialization will have to be
done for constraint matching.  However, no information can currently
be passed to the list_constraint_create() function.

Add infrastructure to allow a match context to be initialized on
job-list load and can be passed to list_constraint_create().  While
not presently used, it will allow easier adjustments to constraint
matching in the future.
Problem: In the near future it'd be convenient to do calculations
on the job nodelist, but it often needs to be in a hostlist
data structure for processing.  We'd like to avoid converting the
nodelist to hostlist struct over and over again.

Add a field into the job_data struct to hold a cached version of the
nodelist in a hostlist struct.
Problem: It would be convenient to filter jobs based on the nodes
they ran on.

Add a constraint operator "hostlist" to filter on nodes within the job
nodelist.  Multiple nodes can be specified.  Hostlists represented in
RFC29 format are acceptable for input to the constraint.
Problem: There are no unit tests for the new 'hostlist' constraint
operator.

Add tests in match and state_match tests.
Problem: In t2260-job-list.t, some test jobs are submitted using
"flux job submit".  This makes it inconvenient to
set job requirements on those jobs, such as specific nodes
the jobs should run on.

Solution: Convert all uses of "flux job submit" to use "flux submit".
Problem: In the near future we would like to filter jobs based
on nodes in a job's nodelist.  This would be an issue in the current
test setup because test brokers all run under the same host and
it is unknown which nodes jobs actually ran on.

Solution: Setup fake hostnames for the test brokers in
t2260-job-list.t and when necessary, run test jobs on specific
hosts.
Problem: There is no testing / coverage for the new hostlist constraint
in the job-list module.

Add some tests to t2260-job-list.t.
Problem: The parse_datetime() utility function is statically configured
to assume future dates when a term like "Friday" is given. That is,
instead of giving the date for the previous Friday, the function will
return then next Friday instead. This behavior should be configurable.

Add an assumeFuture parameter to the function which defaults to True.
If set to False, then parse_datetime() will assume dates in the past
instead.
Problem: Values are always returned as strings by the ConstraintParser
parser, but there are times when another type may be more
suitable. Additionally, there is currently no way to reduce a set
of values to a single element if this is required or beneficial in
the result.

Add a convert_values mapping to the ConstraintParser class. If an
operator is in this dictionary, then the values of a term are passed
to the provided callable, which should return a new list of values
as a result.
Problem: flux_job_strtostate(3) and flux_job_strtoresult(3) are not
exposed in the Python API.

Add strtostate() and strtoresult() to flux.job.info.
Problem: There is no user friendly way to create constraint objects
that can be used with the job-list service.

Add a JobListConstraintParser class which can be used to create
RFC 31 constraing objects suitable for sending to the job-list service.
Problem: There is no way to pass a constraint to flux.job.job_list()

Add an optional constraint parameter to job_list() and
job_list_inactive()
Problem: The flux-jobs(1) -f, --filter option is being updated to
take a query string, but there is no "filter" based argparse action
to handle this case.

Add flux.util.FilterActionConcatenate which concatenates multiple
--filter strings together, separated by space.
Problem: There is no way to pass extra constraints via the JobList
interface.

Add an optional constraint parameter to the JobList initializer. This
paramter takes a string in constraint query syntax which is parsed
by an instance of JobListConstraintParser. The result is then joined
with ``and`` to any other constraints before being passed to the
job-list module.
Problem: There is no way to provide an arbitrary constraint to
flux-jobs(1).

Update the `-f, --filter` option of flux-jobs(1) to take a query string
that will be passed to the JobList constraint parameter instead of
the filter paramter.
@grondo
Copy link
Contributor Author

grondo commented Jan 30, 2024

We should decide if adding the constraint query support to -f, --filter is the right approach. It is more powerful, and flux-jobs(1) automatically gets support for the match operators supported by job-list, but on the other hand it is slightly less convenient in the common case, and the command already has option-based constraints like --since, --user, --name and --queue - so maybe the more consistent thing is to add --hosts and/or --ranks as @chu11 had initially proposed. For a more flexible and powerful interface, users could use flux-pgrep(1) which can parse the constraint query as free arguments which is slightly more convenient.

@chu11
Copy link
Member

chu11 commented Jan 30, 2024

We should decide if adding the constraint query support to -f, --filter is the right approach. It is more powerful, and flux-jobs(1) automatically gets support for the match operators supported by job-list, but on the other hand it is slightly less convenient in the common case, and the command already has option-based constraints like --since, --user, --name and --queue - so maybe the more consistent thing is to add --hosts and/or --ranks as @chu11 had initially proposed.

IMO I think the the advanced query via --filter is definitely a go no matter what. It provides the advanced query that the command line options can't provide. Like hypothetically user:42 and states:active OR user:43 and states:inactive, etc.

I think it's more of an open debate if --hosts and/or --ranks should be done. Will those be common / popular enough to warrant am option on the command line? I'm not sure.

likewise, with the advanced filter/query in place, dunno if we could view some current command line options worthy of retirement (will I guess hidden)? I think --name, --user, and --queue are popular enough keepers. --since is more debatable?

@grondo
Copy link
Contributor Author

grondo commented Jan 30, 2024

IMO I think the the advanced query via --filter is definitely a go no matter what. It provides the advanced query that the command line options can't provide. Like hypothetically user:42 and states:active OR user:43 and states:inactive, etc.

Agreed. However, we could provide this feature on another command (e.g. flux pgrep) and keep the flux jobs interface simple. However, we could also do both, that's fine with me.

@grondo
Copy link
Contributor Author

grondo commented Jan 30, 2024

The issue with supporting the query via an option is that quoting can get annoying. Note that in this WIP PR, I've allowed multiple -f, --filter options to flux-jobs(1) supported with multiple use combined with and, e.g.

$ flux jobs -f host:pi4 -f completed
       JOBID USER     NAME       ST NTASKS NNODES     TIME INFO
   ƒ5hsApJ31 grondo   hostname   CD      1      1   0.205s pi4
$ flux jobs -f host:pi4 -f active
       JOBID USER     NAME       ST NTASKS NNODES     TIME INFO
   ƒ5YsGD86K grondo   sleep       R      1      1   56.27s pi4

@chu11
Copy link
Member

chu11 commented Jan 30, 2024

Agreed. However, we could provide this feature on another command (e.g. flux pgrep) and keep the flux jobs interface simple. However, we could also do both, that's fine with me.

Ahhh I see your point now. Hmmm. I think there is value in flux pgrep (or another tool) without all of those extra filter options possibly adding complexity / corner cases as a result.

One long term thought, what if inactive jobs are stored in a db long term. flux pgrep certainly could be used, although it feels not like the right tool (only going by name and nothing else, like sacct vs flux pgrep ... maybe it would just take time to sink in).

@grondo
Copy link
Contributor Author

grondo commented Jan 30, 2024

what if inactive jobs are stored in a db long term.

Yes, my hope is hat we can create a constraint->sql converter, but I haven't looked into it so that is a real concern. That might be an argument for keeping this kind of generic query syntax out of flux-jobs.

flux pgrep certainly could be used, although it feels not like the right tool (only going by name and nothing else, like sacct vs flux pgrep

Yes, I definitely see your point (though I wouldn't use sacct as a good example since it doesn't really have anything to do with querying jobs either). On the other hand, nothing is more annoying to users than having a proliferation of commands that do almost the same thing. flux-pgrep would at least mirror pgrep in that it would "look up jobs based on name and other attributes" (literally the description of pgrep)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants