Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Clarify evaluation policy of Workflow level requirements #157

Open
Tracked by #54
GlassOfWhiskey opened this issue Apr 6, 2022 · 3 comments
Open
Tracked by #54

Clarify evaluation policy of Workflow level requirements #157

GlassOfWhiskey opened this issue Apr 6, 2022 · 3 comments
Milestone

Comments

@GlassOfWhiskey
Copy link
Collaborator

The CWL standard says

Requirements specified in a parent Workflow are inherited by step processes if they are valid for that step

Such sentence is a bit ambiguous, in that it does not specify if the evaluation is eager (at the Workflow level) or lazy (at the inner step level). For ResourceRequirements, cwlltool (and StreamFlow, too) adopts a lazy evaluation policy, so that expressions are evaluated at the CommandLineTool level. This can generate unexpected behaviours with unconnected inputs, as reported in #1330.
The sentence should then be clarified, establishing the desired evaluation policy for this kind of requirements.

@kinow
Copy link
Member

kinow commented Jun 2, 2022

I think your are correct about cwltool doing the lazy evaluation, skipping the top level workflow - if present - https://github.com/common-workflow-language/cwltool/blob/2be19f3291f07653013b0a91662f518c33cf9fe4/cwltool/process.py#L973-L975

But I am not sure if there won't be cases where an implementation may prefer a lazy evaluation (e.g. maybe a case where an implementation wants to call a service to dynamically compute resources? like a batch scheduler?).

Perhaps the specification should say instead that it is up to the implementers to decide whether to do lazy or eager evaluation of resources, but suggest for implementers to clarify in their documentation what was decided. So in this case we could document in cwltool whether we want lazy or eager, and why.

For cwltool I think the issue linked could be fixed both by doing a eager evaluation, or by trying to evaluate the resource expressions lazily with a fallback to fetching inputs from parent (more complicated solution, and prone to errors/bugs I guess).

@tom-tan
Copy link
Member

tom-tan commented Jun 2, 2022

I understand there are use cases for both of eager and lazy evaluations in the process requirements.

However, as a long term solution, I against to add "eager" and "lazy" evaluations to the spec because we already have the specification for similar use cases: WorkflowStep#requirements to pass the parent process requirements to a child.

Unfortunately the current spec does not support the use cases because:

  • it is not clear whether the inherited process requirements are evaluated in the parent context or the child context, and
  • the spec lacks the way to distinguish the parent input object and child input object in WorkflowStep#requirements.

How about extending the spec of WorkflowStep#requirements to fix this issue?
For example, how about introducing the way to refer the parent input object and child input object as follows?

  • use inputs to refer the parent input object, and
  • use self to refer the child input object in WorkflowStep#requirements.

In this proposal, we only need eager evaluations and therefore it is easier to understand the spec and also easier to implement the platforms.

class: Workflow

inputs:
  ncores: int

steps:
  step1:
    run: tool.cwl
    in:
      mem:
        default: 1024 # 1GB
    requirements:
      ResourceRequirement:
        coresMin: $(inputs.ncores) # refer the parent input object with `inputs`
        ramMin: $(self.mem)        # refer the child input object with `self`
   ...

Note:

  • We can safely use self for this purpose because no existing process requirements in the current spec use self in their expressions.
  • There is another proposal to refer the child input object: use self.inputs. It makes easier to extend the spec to refer the child runtime information with self.runtime.cores but I am not sure there is a use case for that.

@mr-c
Copy link
Member

mr-c commented Apr 17, 2023

@tetron proposes deferring this to CWL v1.3, and adding an upgrader to copy down any dynamic requirements/hints to each workflow step, for evaluation in that context.

For CWL v1.2.1, he proposes documenting the existing behavior.

We should see if there are existing conformance test that touch on the "old" way

@mr-c mr-c added this to the 1.2.1 milestone Apr 17, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants