How to specify lower and upper bound for type `array` in WorkflowInputParameter #907

emmanuelmathot · 2020-06-16T13:34:49Z

We would like to know whether it is possible to specify a lower and upper bound on the number of items in an array input parameter. Such a specification would of course imply validating the number of items supplied.

So the input definition in CWL would be similar to:

cwlVersion: v1.0
class: CommandLineTool
inputs:
  main_input:
    type: string
    label: Main input

  correlated_input:
    type:
      type: array
      items: string
      min_items: 2
      max_items: 10
    label: 2 to 10 correlated inputs

baseCommand: echo
outputs: []

Thank you for your help.

mr-c · 2020-07-04T09:19:01Z

Hello @emmanuelmathot. There is a proposal to add this to CWL (issue #764, "input value restrictions / validations"), but it hasn't been implemented yet.

Assistance with implementing the proposal is welcome; just leave a comment in that issue to indicate your interest and to ask us for the next steps.

Here are options for a workaround that will get the job done on all CWL v1.x compliant systems today:

For each input, add a valueFrom under inputBinding that is a CWL Expression. Here you write JavaScript (really ECMAScript 5.1) to examine the self variable and check whether the provided value meets your restrictions. If it doesn't, throw a JavaScript exception, which will halt execution; if it passes inspection, return self. For your example it could be:

cwlVersion: v1.0
class: CommandLineTool
requirements:
  InlineJavascriptRequirement: {}
inputs:
  main_input:
    type: string
    label: Main input

  correlated_input:
    type:
      type: array
      items: string
    inputBinding:
      valueFrom: |
        ${
          // Accept only arrays holding 2 to 10 items, inclusive.
          if (self.length >= 2 && self.length <= 10) {
            return self;
          } else {
            throw "value is out of bounds!";
          }
        }

If the goal is to halt a workflow whose inputs don't meet the restrictions, and you want that to happen sooner rather than later, consider copying the expression into the valueFrom inside the in portion of the Workflow step for the input in question; the expression shouldn't need any changes. For your example, that might look like:

cwlVersion: v1.0
class: Workflow
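requirements:
  InlineJavascriptRequirement: {}
  # Required by the CWL spec for valueFrom on a step input.
  StepInputExpressionRequirement: {}
inputs:
  corr: string[]
# A Workflow must declare outputs, even if there are none.
outputs: []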
steps:
  correlation:
    run: correlate.cwl
    in:
      correlated_input:
        source: corr
        valueFrom: |
          ${
            // Accept only arrays holding 2 to 10 items, inclusive.
            if (self.length >= 2 && self.length <= 10) {
              return self;
            } else {
              throw "value is out of bounds!";
            }
          }
    out: []

If multiple steps use the same user-provided or intermediate value and you want the same checks run on it before it is used, you can use an ExpressionTool, which is basically a standalone CWL Expression. Most workflow engines that understand CWL run ExpressionTools much faster than the equivalent CommandLineTools. In its basic form it has one input (the input or intermediate value that you want to validate) and one output (which will be the same value). The expression field works the same way as the valueFroms above: it throws an exception if the input doesn't meet the restrictions and returns it if it does. The difference is that you don't use self; you refer to the input by name through inputs and return an object keyed by output name. Let's say the input is named value and the output is named validated_value; then the expression looks like `${ if (inputs.value.length >= 2 && inputs.value.length <= 10) { return { "validated_value": inputs.value }; } else { throw "value is out of bounds!"; } }`.
Once the ExpressionTool is written, insert it in your workflow so that it receives the particular input, and connect every other step that needs the validated input to its sole output (`validated_value` in the example). This option can be used instead of the two valueFrom approaches above, or in addition to them.
(Optional but nice to have) Explain the restrictions in English (or the human languages of your choice) in the doc or label to reduce surprises for those using your CommandLineTool without looking at the source.
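
The ExpressionTool option and the doc suggestion can be sketched together as follows. This is a minimal, untested sketch rather than a definitive implementation: the step name validate and the doc wording are made up for illustration, the ExpressionTool is embedded inline under run (it could equally live in its own file), and correlate.cwl is assumed to be the CommandLineTool from the example above.

```yaml
cwlVersion: v1.0
class: Workflow
requirements:
  InlineJavascriptRequirement: {}
inputs:
  corr:
    type: string[]
    # Hypothetical wording; state the restriction where users will see it.
    doc: Between 2 and 10 correlated inputs (inclusive).
outputs: []
steps:
  validate:
    # Inline ExpressionTool: one input, one output carrying the same value.
    run:
      class: ExpressionTool
      requirements:
        InlineJavascriptRequirement: {}
      inputs:
        value: string[]
      outputs:
        validated_value: string[]
      expression: |
        ${
          // Inputs are read by name via `inputs`; `self` is not used here.
          if (inputs.value.length >= 2 && inputs.value.length <= 10) {
            return { "validated_value": inputs.value };
          } else {
            throw "value is out of bounds!";
          }
        }
    in:
      value: corr
    out: [validated_value]
  correlation:
    run: correlate.cwl
    in:
      # Consume the validated copy instead of the raw workflow input.
      correlated_input: validate/validated_value
    out: []
```

Because every downstream step reads validate/validated_value rather than corr directly, none of them can start until the check has passed, and the bound is written in only one place.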

Once you get comfortable doing this you are partially on your way to implementing the proposed extension, as we can help you write a tool that takes CWL documents that use the proposed syntax and inserts the valueFroms and/or ExpressionTools automatically.

mr-c mentioned this issue Jul 4, 2020

input value restrictions / validations #764

Open

mr-c mentioned this issue Feb 16, 2022

Enforce that Array[Type]+ are non-empty common-workflow-lab/wdl-cwl-translator#187

Merged

fmigneault mentioned this issue Sep 9, 2023

fix invalid literal data numerics conversion and validation between WPS/OAS/CWL representations crim-ca/weaver#558

Merged
