How to specify lower and upper bound for type array in WorkflowInputParameter #907

Open
emmanuelmathot opened this issue Jun 16, 2020 · 1 comment

Comments

@emmanuelmathot

We would like to know whether it is possible to specify a lower and an upper bound on the number of items in an array input parameter. Such a specification would of course imply validating the number of items supplied.

So the input definition in CWL would be similar to:

cwlVersion: v1.0
class: CommandLineTool
inputs:
  main_input:
    type: string
    label: Main input

  correlated_input:
    type:
      type: array
      items: string
      min_items: 2
      max_items: 10
    label: 2 to 10 correlated inputs

baseCommand: echo
outputs: []

Thank you for your help.

@mr-c
Member

mr-c commented Jul 4, 2020

Hello @emmanuelmathot. There is a proposal to add this to CWL, but it hasn't been implemented yet.

Assistance with implementing the proposal is welcome, just leave a comment in that issue (linked above) to indicate your interest and to ask us for the next steps.

Here are options for a workaround that will get the job done on all CWL v1.x compliant systems today:

  1. For each input, add a valueFrom under inputBinding that is a CWL Expression. Here you write JavaScript (really ECMAScript 5.1) that examines the self variable to see whether the provided value meets your restrictions. If it doesn't, throw a JavaScript exception, which will halt execution; if it passes inspection, return self. For your example it could be:
cwlVersion: v1.0
class: CommandLineTool
requirements:
  InlineJavascriptRequirement: {}
inputs:
  main_input:
    type: string
    label: Main input

  correlated_input:
    type:
      type: array
      items: string
    inputBinding:
      valueFrom: |
        ${
          // min_items: 2 and max_items: 10 from the question, both inclusive
          if (self.length >= 2 && self.length <= 10) {
            return self;
          } else {
            throw "value is out of bounds!";
          }
        }
baseCommand: echo
outputs: []
  2. If the goal is to halt the execution of a workflow for inputs that don't meet the restrictions, and you want that to happen sooner rather than later, consider copying the expression into the valueFrom inside the in portion of the Workflow step for the input in question; the expression shouldn't need any changes. Note that using valueFrom on a step input also requires StepInputExpressionRequirement. For your example, that might look like:
cwlVersion: v1.0
class: Workflow
requirements:
  InlineJavascriptRequirement: {}
  StepInputExpressionRequirement: {}  # needed for valueFrom on a step input
inputs:
  corr: string[]
steps:
  correlation:
    run: correlate.cwl
    in:
      correlated_input:
        source: corr
        valueFrom: |
          ${
            if (self.length >= 2 && self.length <= 10) {
              return self;
            } else {
              throw "value is out of bounds!";
            }
          }
    out: []
outputs: []
  3. If multiple steps use the same user-provided or intermediate value and you want the same checks run on it before it is used, you can use an ExpressionTool, which is basically a standalone CWL Expression. Most workflow engines that understand CWL run ExpressionTools much faster than the equivalent CommandLineTools. In its basic form it has one input (the input or intermediate value that you want to validate) and one output (which will be the same value). The expression field works the same way as the valueFroms above: it throws an exception if the input doesn't meet the restrictions and returns it if it does. The differences are that you don't use self (you refer to the input by name through inputs) and that you return an object keyed by output name. Let's say the input is named value and the output is named validated_value; then the expression looks like ${ if (inputs.value.length >= 2 && inputs.value.length <= 10) { return { "validated_value": inputs.value }; } else { throw "value is out of bounds!"; } }.
    Once the ExpressionTool is written, insert it in your workflow so that it receives the particular input, and connect all other steps that need that input validated to its sole output (validated_value in the example). This option can be used instead of options 1 and 2, or in addition to them.
  4. (Optional but nice to have) Explain the restrictions in English (or the human language of your choice) in the doc or label field to reduce surprises for people who use your CommandLineTool without looking at the source.
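
As a rough sketch of the ExpressionTool option, the standalone validator might look something like the following. The file name validate_count.cwl is hypothetical, and the input and output names value and validated_value are the ones assumed in the description above. The doc field illustrates the "explain the restrictions" suggestion. Treat this as an untested starting point rather than a reference implementation.

```yaml
# validate_count.cwl (hypothetical file name)
cwlVersion: v1.0
class: ExpressionTool
requirements:
  InlineJavascriptRequirement: {}
inputs:
  value:
    type: string[]
    doc: Must contain between 2 and 10 items (inclusive).
outputs:
  validated_value:
    type: string[]
expression: |
  ${
    // Return an object keyed by output name, or throw to halt execution.
    if (inputs.value.length >= 2 && inputs.value.length <= 10) {
      return { "validated_value": inputs.value };
    } else {
      throw "value is out of bounds!";
    }
  }
```

Wiring it into a workflow could then look roughly like this. The step name validate is made up for illustration, and correlate.cwl is the tool from the earlier example. Every step that needs the checked value reads validate/validated_value instead of the raw corr input.

```yaml
cwlVersion: v1.0
class: Workflow
inputs:
  corr: string[]
steps:
  validate:
    run: validate_count.cwl      # hypothetical file from the sketch above
    in:
      value: corr
    out: [validated_value]
  correlation:
    run: correlate.cwl
    in:
      correlated_input: validate/validated_value
    out: []
outputs: []
```

Because neither step uses valueFrom here, this workflow should not need StepInputExpressionRequirement; the only JavaScript lives inside the ExpressionTool.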

Once you get comfortable doing this you are partially on your way to implementing the proposed extension, as we can help you write a tool that takes CWL documents that use the proposed syntax and inserts the valueFroms and/or ExpressionTools automatically.
