Schema salad error validating CWL enum definitions #1908

Open
fmigneault opened this issue Sep 15, 2023 · 2 comments

fmigneault commented Sep 15, 2023

Description

I have a CWL definition (shown below) that reuses type: enum in several places so that some of my inputs accept null (optional), a single value, or an array of those values.

Expected Behavior

The CWL definition should be valid.

Actual Behavior

For some reason, schema salad raises an error indicating that 'name': 'scenario1b4d8bed-8ce2-4a0b-a0ad-194203993920' (a schema name auto-generated from the scenario input definition?) already exists. I wonder if this is a false positive caused by the definition repeating the exact same enum for the single-value and the multi-value array types?

type: 
  - "null"
  - {"type": "enum", "symbols": [...]}
  - type: array
    items: {"type": "enum", "symbols": [...]}
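
The collision can be pictured with a toy name registry. This is only a sketch under assumptions, not schema-salad's actual code: `NameRegistry`, `NameInUseError` and `register_enum` are hypothetical names, and the only behaviour taken from the traceback is that adding an already-registered name raises "The name ... is already in use."

```python
class NameInUseError(Exception):
    """Raised when a schema name is registered twice (hypothetical class)."""


class NameRegistry:
    """Toy stand-in for a registry of named schemas."""

    def __init__(self):
        self.names = {}

    def add_name(self, name, schema):
        # Message copied from the traceback; the check itself is an assumption.
        if name in self.names:
            raise NameInUseError(f"The name '{name}' is already in use.")
        self.names[name] = schema


def register_enum(registry, name, symbols):
    """Register an enum schema under the given name and return it."""
    schema = {"type": "enum", "symbols": list(symbols), "name": name}
    registry.add_name(name, schema)
    return schema


registry = NameRegistry()
symbols = ["ssp126", "rcp85", "rcp45", "rcp26", "ssp585", "ssp245"]

# The first occurrence (single value) registers without trouble.
register_enum(registry, "scenario-single", symbols)

# Registering the same name a second time, as would happen if one enum
# were processed twice, is rejected.
try:
    register_enum(registry, "scenario-single", symbols)
    collided = False
except NameInUseError:
    collided = True

# A distinct name for the array items does not collide.
register_enum(registry, "scenario-items", symbols)
```

In this model the symbols being identical does not matter; only the reuse of a name does.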

Workflow Code

cwlVersion: v1.0
class: CommandLineTool
hints:
  WPS1Requirement:
    provider: https://finch.crim.ca/wps
    process: ensemble_grid_point_wetdays
requirements:
  InlineJavascriptRequirement: {}
inputs:
- id: lat
  type:
  - string
  - type: array
    items: string
- id: lon
  type:
  - string
  - type: array
    items: string
- id: start_date
  type:
  - 'null'
  - string
- id: end_date
  type:
  - 'null'
  - string
- id: ensemble_percentiles
  type:
  - 'null'
  - string
  default: 10,50,90
- id: average
  type:
  - 'null'
  - boolean
  default: false
- id: dataset
  type:
  - 'null'
  - type: enum
    symbols:
    - humidex-daily
    - candcs-u5
    - candcs-u6
    - bccaqv2
  default: candcs-u5
- id: scenario
  type:
  - 'null'
  - type: enum
    symbols:
    - ssp126
    - rcp85
    - rcp45
    - rcp26
    - ssp585
    - ssp245
  - type: array
    items:
      type: enum
      symbols:
      - ssp126
      - rcp85
      - rcp45
      - rcp26
      - ssp585
      - ssp245
- id: models
  type:
  - 'null'
  - type: enum
    symbols:
    - KACE-1-0-G
    - CCSM4
    - MIROC5
    - EC-Earth3-Veg
    - TaiESM1
    - GFDL-ESM4
    - GFDL-CM3
    - CanESM5
    - HadGEM3-GC31-LL
    - INM-CM4-8
    - IPSL-CM5A-MR
    - EC-Earth3
    - GFDL-ESM2G
    - humidex_models
    - GFDL-ESM2M
    - MIROC-ESM
    - CSIRO-Mk3-6-0
    - MPI-ESM-LR
    - NorESM1-M
    - CNRM-CM5
    - all
    - GISS-E2-1-G
    - 24models
    - MPI-ESM1-2-HR
    - CNRM-ESM2-1
    - CNRM-CM6-1
    - CanESM2
    - FGOALS-g3
    - NorESM1-ME
    - IPSL-CM6A-LR
    - CMCC-ESM2
    - pcic12
    - EC-Earth3-Veg-LR
    - ACCESS-ESM1-5
    - MRI-CGCM3
    - MIROC-ESM-CHEM
    - NorESM2-MM
    - bcc-csm1-1-m
    - BNU-ESM
    - UKESM1-0-LL
    - CESM1-CAM5
    - MIROC-ES2L
    - MRI-ESM2-0
    - HadGEM2-ES
    - MIROC6
    - MPI-ESM-MR
    - INM-CM5-0
    - bcc-csm1-1
    - BCC-CSM2-MR
    - ACCESS-CM2
    - NorESM2-LM
    - IPSL-CM5A-LR
    - FGOALS-g2
    - HadGEM2-AO
    - 26models
    - MPI-ESM1-2-LR
    - KIOST-ESM
  - type: array
    items:
      type: enum
      symbols:
      - KACE-1-0-G
      - CCSM4
      - MIROC5
      - EC-Earth3-Veg
      - TaiESM1
      - GFDL-ESM4
      - GFDL-CM3
      - CanESM5
      - HadGEM3-GC31-LL
      - INM-CM4-8
      - IPSL-CM5A-MR
      - EC-Earth3
      - GFDL-ESM2G
      - humidex_models
      - GFDL-ESM2M
      - MIROC-ESM
      - CSIRO-Mk3-6-0
      - MPI-ESM-LR
      - NorESM1-M
      - CNRM-CM5
      - all
      - GISS-E2-1-G
      - 24models
      - MPI-ESM1-2-HR
      - CNRM-ESM2-1
      - CNRM-CM6-1
      - CanESM2
      - FGOALS-g3
      - NorESM1-ME
      - IPSL-CM6A-LR
      - CMCC-ESM2
      - pcic12
      - EC-Earth3-Veg-LR
      - ACCESS-ESM1-5
      - MRI-CGCM3
      - MIROC-ESM-CHEM
      - NorESM2-MM
      - bcc-csm1-1-m
      - BNU-ESM
      - UKESM1-0-LL
      - CESM1-CAM5
      - MIROC-ES2L
      - MRI-ESM2-0
      - HadGEM2-ES
      - MIROC6
      - MPI-ESM-MR
      - INM-CM5-0
      - bcc-csm1-1
      - BCC-CSM2-MR
      - ACCESS-CM2
      - NorESM2-LM
      - IPSL-CM5A-LR
      - FGOALS-g2
      - HadGEM2-AO
      - 26models
      - MPI-ESM1-2-LR
      - KIOST-ESM
  default: all
- id: thresh
  type:
  - 'null'
  - string
  default: 1.0 mm/day
- id: freq
  type:
  - 'null'
  - type: enum
    symbols:
    - YS
    - QS-DEC
    - AS-JUL
    - MS
  default: YS
- id: op
  type:
  - 'null'
  - type: enum
    symbols:
    - '>='
    - '>'
    - gt
    - ge
  default: '>='
- id: month
  type:
  - 'null'
  - int
  - type: array
    items: int
  inputBinding:
    valueFrom: "\n            ${\n                const values = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12];\n                if (Array.isArray(self)) {\n                    \n        if (self.every(item => values.includes(item))) {\n            return self;\n        }\n        else {\n            throw \"invalid value(s) in [\" + self + \"] are not all allowed values from [\" + values + \"]\";\n        }\n    \n                }\n                else {\n                    \n        if (values.includes(self)) {\n            return self;\n        }\n        else {\n            throw \"invalid value \" + self + \" is not an allowed value from [\" + values + \"]\";\n        }\n    \n                }\n            }\n        "
- id: season
  type:
  - 'null'
  - type: enum
    symbols:
    - SON
    - MAM
    - JJA
    - DJF
- id: check_missing
  type:
  - 'null'
  - type: enum
    symbols:
    - pct
    - at_least_n
    - wmo
    - skip
    - from_context
    - any
  default: any
- id: missing_options
  type:
  - 'null'
  - File
  format: iana:application/json
- id: cf_compliance
  type:
  - 'null'
  - type: enum
    symbols:
    - raise
    - log
    - warn
  default: warn
- id: data_validation
  type:
  - 'null'
  - type: enum
    symbols:
    - raise
    - log
    - warn
  default: raise
- id: output_name
  type:
  - 'null'
  - string
- id: output_format
  type:
  - 'null'
  - type: enum
    symbols:
    - csv
    - netcdf
  default: netcdf
- id: csv_precision
  type:
  - 'null'
  - int
outputs:
- id: output
  type: File
  format: iana:application/zip
  outputBinding:
    glob: output/*.zip
- id: output_log
  type: File
  format: edam:format_1964
  outputBinding:
    glob: output_log/*.*
$namespaces:
  iana: https://www.iana.org/assignments/media-types/
  edam: http://edamontology.org/

Full Traceback

[2023-09-15 17:58:32,863] ERROR    [MainThread][cwltool] Got workflow error
Traceback (most recent call last):
  File "schema_salad/avro/schema.py", line 405, in __init__
  File "schema_salad/avro/schema.py", line 595, in make_avsc_object
  File "schema_salad/avro/schema.py", line 373, in __init__
  File "schema_salad/avro/schema.py", line 253, in __init__
  File "schema_salad/avro/schema.py", line 219, in add_name
schema_salad.avro.schema.SchemaParseException: The name 'scenario1b4d8bed-8ce2-4a0b-a0ad-194203993920' is already in use.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/francis/dev/miniconda/envs/weaver-py3/lib/python3.10/site-packages/cwltool/executors.py", line 218, in run_jobs
    for job in jobiter:
  File "/home/francis/dev/weaver/weaver/processes/wps_workflow.py", line 133, in job
    builder = self._init_job(job_order, runtime_context)
  File "/home/francis/dev/miniconda/envs/weaver-py3/lib/python3.10/site-packages/cwltool/process.py", line 888, in _init_job
    builder.bind_input(
  File "/home/francis/dev/miniconda/envs/weaver-py3/lib/python3.10/site-packages/cwltool/builder.py", line 330, in bind_input
    self.bind_input(
  File "/home/francis/dev/miniconda/envs/weaver-py3/lib/python3.10/site-packages/cwltool/builder.py", line 249, in bind_input
    avsc = make_avsc_object(convert_to_dict(t), self.names)
  File "schema_salad/avro/schema.py", line 608, in make_avsc_object
  File "schema_salad/avro/schema.py", line 407, in __init__
schema_salad.avro.schema.SchemaParseException: Items schema ({'type': 'enum', 'symbols': ['rcp85', 'ssp585', 'rcp45', 'rcp26', 'ssp126', 'ssp245'], 'name': 'scenario1b4d8bed-8ce2-4a0b-a0ad-194203993920'}) not a valid Avro schema: The name 'scenario1b4d8bed-8ce2-4a0b-a0ad-194203993920' is already in use.. Known names: ['org.w3id.cwl.cwl.File', 'org.w3id.cwl.cwl.File.class.File_class', 'org.w3id.cwl.cwl.Directory', 'org.w3id.cwl.cwl.Directory.class.Directory_class', 'org.w3id.cwl.salad.Any', 'input_record_schema', 'dataset90820b54-cca1-4b50-a6b8-558586135e1d', 'scenario6ece3ca2-f45b-42a7-bf85-95a22443a85e', 'scenario1b4d8bed-8ce2-4a0b-a0ad-194203993920', 'models525a2682-565c-4dde-ace5-76ccfe6254f3', 'models6861d088-8f09-43fa-89d1-b18b587e1d22', 'freqc7b201ea-ef19-4831-a3ca-18b0b7381e71', 'op8846f2fd-6b5f-4e85-9722-dd220a0df24d', 'seasonde3dd417-c9e6-412a-bf73-e4e1b9e7382f', 'check_missing75883605-bd85-47fc-8e0d-c5ed227b7ee8', 'cf_compliance66d87015-cd60-4e20-b43c-f5edb890d758', 'data_validation0143577c-6989-487c-b5b5-3362b677d2ac', 'output_format0a80ac59-d5ff-414a-9b63-9c34ad11b6c7', 'outputs_record_schema']).
[2023-09-15 17:58:32,869] ERROR    [MainThread][weaver.processes.wps_package|ensemble_grid_point_wetdays]  10% failed     Failed package execution.
weaver.exceptions.PackageExecutionError: [Items schema ({'type': 'enum', 'symbols': ['rcp85', 'ssp585', 'rcp45', 'rcp26', 'ssp126', 'ssp245'], 'name': 'scenario1b4d8bed-8ce2-4a0b-a0ad-194203993920'}) not a valid Avro schema: The name 'scenario1b4d8bed-8ce2-4a0b-a0ad-194203993920' is already in use.. Known names: ['org.w3id.cwl.cwl.File', 'org.w3id.cwl.cwl.File.class.File_class', 'org.w3id.cwl.cwl.Directory', 'org.w3id.cwl.cwl.Directory.class.Directory_class', 'org.w3id.cwl.salad.Any', 'input_record_schema', 'dataset90820b54-cca1-4b50-a6b8-558586135e1d', 'scenario6ece3ca2-f45b-42a7-bf85-95a22443a85e', 'scenario1b4d8bed-8ce2-4a0b-a0ad-194203993920', 'models525a2682-565c-4dde-ace5-76ccfe6254f3', 'models6861d088-8f09-43fa-89d1-b18b587e1d22', 'freqc7b201ea-ef19-4831-a3ca-18b0b7381e71', 'op8846f2fd-6b5f-4e85-9722-dd220a0df24d', 'seasonde3dd417-c9e6-412a-bf73-e4e1b9e7382f', 'check_missing75883605-bd85-47fc-8e0d-c5ed227b7ee8', 'cf_compliance66d87015-cd60-4e20-b43c-f5edb890d758', 'data_validation0143577c-6989-487c-b5b5-3362b677d2ac', 'output_format0a80ac59-d5ff-414a-9b63-9c34ad11b6c7', 'outputs_record_schema']).]
Traceback (most recent call last):
  File "schema_salad/avro/schema.py", line 405, in __init__
  File "schema_salad/avro/schema.py", line 595, in make_avsc_object
  File "schema_salad/avro/schema.py", line 373, in __init__
  File "schema_salad/avro/schema.py", line 253, in __init__
  File "schema_salad/avro/schema.py", line 219, in add_name
schema_salad.avro.schema.SchemaParseException: The name 'scenario1b4d8bed-8ce2-4a0b-a0ad-194203993920' is already in use.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/francis/dev/miniconda/envs/weaver-py3/lib/python3.10/site-packages/cwltool/executors.py", line 218, in run_jobs
    for job in jobiter:
  File "/home/francis/dev/weaver/weaver/processes/wps_workflow.py", line 133, in job
    builder = self._init_job(job_order, runtime_context)
  File "/home/francis/dev/miniconda/envs/weaver-py3/lib/python3.10/site-packages/cwltool/process.py", line 888, in _init_job
    builder.bind_input(
  File "/home/francis/dev/miniconda/envs/weaver-py3/lib/python3.10/site-packages/cwltool/builder.py", line 330, in bind_input
    self.bind_input(
  File "/home/francis/dev/miniconda/envs/weaver-py3/lib/python3.10/site-packages/cwltool/builder.py", line 249, in bind_input
    avsc = make_avsc_object(convert_to_dict(t), self.names)
  File "schema_salad/avro/schema.py", line 608, in make_avsc_object
  File "schema_salad/avro/schema.py", line 407, in __init__
schema_salad.avro.schema.SchemaParseException: Items schema ({'type': 'enum', 'symbols': ['rcp85', 'ssp585', 'rcp45', 'rcp26', 'ssp126', 'ssp245'], 'name': 'scenario1b4d8bed-8ce2-4a0b-a0ad-194203993920'}) not a valid Avro schema: The name 'scenario1b4d8bed-8ce2-4a0b-a0ad-194203993920' is already in use.. Known names: ['org.w3id.cwl.cwl.File', 'org.w3id.cwl.cwl.File.class.File_class', 'org.w3id.cwl.cwl.Directory', 'org.w3id.cwl.cwl.Directory.class.Directory_class', 'org.w3id.cwl.salad.Any', 'input_record_schema', 'dataset90820b54-cca1-4b50-a6b8-558586135e1d', 'scenario6ece3ca2-f45b-42a7-bf85-95a22443a85e', 'scenario1b4d8bed-8ce2-4a0b-a0ad-194203993920', 'models525a2682-565c-4dde-ace5-76ccfe6254f3', 'models6861d088-8f09-43fa-89d1-b18b587e1d22', 'freqc7b201ea-ef19-4831-a3ca-18b0b7381e71', 'op8846f2fd-6b5f-4e85-9722-dd220a0df24d', 'seasonde3dd417-c9e6-412a-bf73-e4e1b9e7382f', 'check_missing75883605-bd85-47fc-8e0d-c5ed227b7ee8', 'cf_compliance66d87015-cd60-4e20-b43c-f5edb890d758', 'data_validation0143577c-6989-487c-b5b5-3362b677d2ac', 'output_format0a80ac59-d5ff-414a-9b63-9c34ad11b6c7', 'outputs_record_schema']).

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/francis/dev/weaver/weaver/processes/wps_package.py", line 1791, in _handler
    result = package_inst(**cwl_inputs)  # type: CWL_Results
  File "/home/francis/dev/miniconda/envs/weaver-py3/lib/python3.10/site-packages/cwltool/factory.py", line 32, in __call__
    out, status = self.factory.executor(self.t, kwargs, runtime_context)
  File "/home/francis/dev/miniconda/envs/weaver-py3/lib/python3.10/site-packages/cwltool/executors.py", line 62, in __call__
    return self.execute(process, job_order_object, runtime_context, logger)
  File "/home/francis/dev/miniconda/envs/weaver-py3/lib/python3.10/site-packages/cwltool/executors.py", line 145, in execute
    self.run_jobs(process, job_order_object, logger, runtime_context)
  File "/home/francis/dev/miniconda/envs/weaver-py3/lib/python3.10/site-packages/cwltool/executors.py", line 252, in run_jobs
    raise WorkflowException(str(err)) from err
cwltool.errors.WorkflowException: Items schema ({'type': 'enum', 'symbols': ['rcp85', 'ssp585', 'rcp45', 'rcp26', 'ssp126', 'ssp245'], 'name': 'scenario1b4d8bed-8ce2-4a0b-a0ad-194203993920'}) not a valid Avro schema: The name 'scenario1b4d8bed-8ce2-4a0b-a0ad-194203993920' is already in use.. Known names: ['org.w3id.cwl.cwl.File', 'org.w3id.cwl.cwl.File.class.File_class', 'org.w3id.cwl.cwl.Directory', 'org.w3id.cwl.cwl.Directory.class.Directory_class', 'org.w3id.cwl.salad.Any', 'input_record_schema', 'dataset90820b54-cca1-4b50-a6b8-558586135e1d', 'scenario6ece3ca2-f45b-42a7-bf85-95a22443a85e', 'scenario1b4d8bed-8ce2-4a0b-a0ad-194203993920', 'models525a2682-565c-4dde-ace5-76ccfe6254f3', 'models6861d088-8f09-43fa-89d1-b18b587e1d22', 'freqc7b201ea-ef19-4831-a3ca-18b0b7381e71', 'op8846f2fd-6b5f-4e85-9722-dd220a0df24d', 'seasonde3dd417-c9e6-412a-bf73-e4e1b9e7382f', 'check_missing75883605-bd85-47fc-8e0d-c5ed227b7ee8', 'cf_compliance66d87015-cd60-4e20-b43c-f5edb890d758', 'data_validation0143577c-6989-487c-b5b5-3362b677d2ac', 'output_format0a80ac59-d5ff-414a-9b63-9c34ad11b6c7', 'outputs_record_schema']).
[2023-09-15 17:58:32,877] INFO     [MainThread][PYWPS] Removing temporary working directory: /tmp/weaver-hybrid/workdir/pywps_process_xb524u2x
[2023-09-15 17:58:32,877] INFO     [MainThread][weaver.processes.wps_package|ensemble_grid_point_wetdays]  10% failed     Package completed with errors. Server logs: [/tmp/weaver-hybrid/outputs/public/fa7d748f-bff0-4729-9a57-331d9402c232.log], Available at: [http://127.0.0.1:4003/fa7d748f-bff0-4729-9a57-331d9402c232.log]
Traceback (most recent call last):
  File "schema_salad/avro/schema.py", line 405, in __init__
  File "schema_salad/avro/schema.py", line 595, in make_avsc_object
  File "schema_salad/avro/schema.py", line 373, in __init__
  File "schema_salad/avro/schema.py", line 253, in __init__
  File "schema_salad/avro/schema.py", line 219, in add_name
schema_salad.avro.schema.SchemaParseException: The name 'scenario1b4d8bed-8ce2-4a0b-a0ad-194203993920' is already in use.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/francis/dev/miniconda/envs/weaver-py3/lib/python3.10/site-packages/cwltool/executors.py", line 218, in run_jobs
    for job in jobiter:
  File "/home/francis/dev/weaver/weaver/processes/wps_workflow.py", line 133, in job
    builder = self._init_job(job_order, runtime_context)
  File "/home/francis/dev/miniconda/envs/weaver-py3/lib/python3.10/site-packages/cwltool/process.py", line 888, in _init_job
    builder.bind_input(
  File "/home/francis/dev/miniconda/envs/weaver-py3/lib/python3.10/site-packages/cwltool/builder.py", line 330, in bind_input
    self.bind_input(
  File "/home/francis/dev/miniconda/envs/weaver-py3/lib/python3.10/site-packages/cwltool/builder.py", line 249, in bind_input
    avsc = make_avsc_object(convert_to_dict(t), self.names)
  File "schema_salad/avro/schema.py", line 608, in make_avsc_object
  File "schema_salad/avro/schema.py", line 407, in __init__
schema_salad.avro.schema.SchemaParseException: Items schema ({'type': 'enum', 'symbols': ['rcp85', 'ssp585', 'rcp45', 'rcp26', 'ssp126', 'ssp245'], 'name': 'scenario1b4d8bed-8ce2-4a0b-a0ad-194203993920'}) not a valid Avro schema: The name 'scenario1b4d8bed-8ce2-4a0b-a0ad-194203993920' is already in use.. Known names: ['org.w3id.cwl.cwl.File', 'org.w3id.cwl.cwl.File.class.File_class', 'org.w3id.cwl.cwl.Directory', 'org.w3id.cwl.cwl.Directory.class.Directory_class', 'org.w3id.cwl.salad.Any', 'input_record_schema', 'dataset90820b54-cca1-4b50-a6b8-558586135e1d', 'scenario6ece3ca2-f45b-42a7-bf85-95a22443a85e', 'scenario1b4d8bed-8ce2-4a0b-a0ad-194203993920', 'models525a2682-565c-4dde-ace5-76ccfe6254f3', 'models6861d088-8f09-43fa-89d1-b18b587e1d22', 'freqc7b201ea-ef19-4831-a3ca-18b0b7381e71', 'op8846f2fd-6b5f-4e85-9722-dd220a0df24d', 'seasonde3dd417-c9e6-412a-bf73-e4e1b9e7382f', 'check_missing75883605-bd85-47fc-8e0d-c5ed227b7ee8', 'cf_compliance66d87015-cd60-4e20-b43c-f5edb890d758', 'data_validation0143577c-6989-487c-b5b5-3362b677d2ac', 'output_format0a80ac59-d5ff-414a-9b63-9c34ad11b6c7', 'outputs_record_schema']).

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/francis/dev/weaver/weaver/processes/wps_package.py", line 1791, in _handler
    result = package_inst(**cwl_inputs)  # type: CWL_Results
  File "/home/francis/dev/miniconda/envs/weaver-py3/lib/python3.10/site-packages/cwltool/factory.py", line 32, in __call__
    out, status = self.factory.executor(self.t, kwargs, runtime_context)
  File "/home/francis/dev/miniconda/envs/weaver-py3/lib/python3.10/site-packages/cwltool/executors.py", line 62, in __call__
    return self.execute(process, job_order_object, runtime_context, logger)
  File "/home/francis/dev/miniconda/envs/weaver-py3/lib/python3.10/site-packages/cwltool/executors.py", line 145, in execute
    self.run_jobs(process, job_order_object, logger, runtime_context)
  File "/home/francis/dev/miniconda/envs/weaver-py3/lib/python3.10/site-packages/cwltool/executors.py", line 252, in run_jobs
    raise WorkflowException(str(err)) from err
cwltool.errors.WorkflowException: Items schema ({'type': 'enum', 'symbols': ['rcp85', 'ssp585', 'rcp45', 'rcp26', 'ssp126', 'ssp245'], 'name': 'scenario1b4d8bed-8ce2-4a0b-a0ad-194203993920'}) not a valid Avro schema: The name 'scenario1b4d8bed-8ce2-4a0b-a0ad-194203993920' is already in use.. Known names: ['org.w3id.cwl.cwl.File', 'org.w3id.cwl.cwl.File.class.File_class', 'org.w3id.cwl.cwl.Directory', 'org.w3id.cwl.cwl.Directory.class.Directory_class', 'org.w3id.cwl.salad.Any', 'input_record_schema', 'dataset90820b54-cca1-4b50-a6b8-558586135e1d', 'scenario6ece3ca2-f45b-42a7-bf85-95a22443a85e', 'scenario1b4d8bed-8ce2-4a0b-a0ad-194203993920', 'models525a2682-565c-4dde-ace5-76ccfe6254f3', 'models6861d088-8f09-43fa-89d1-b18b587e1d22', 'freqc7b201ea-ef19-4831-a3ca-18b0b7381e71', 'op8846f2fd-6b5f-4e85-9722-dd220a0df24d', 'seasonde3dd417-c9e6-412a-bf73-e4e1b9e7382f', 'check_missing75883605-bd85-47fc-8e0d-c5ed227b7ee8', 'cf_compliance66d87015-cd60-4e20-b43c-f5edb890d758', 'data_validation0143577c-6989-487c-b5b5-3362b677d2ac', 'output_format0a80ac59-d5ff-414a-9b63-9c34ad11b6c7', 'outputs_record_schema']).

Your Environment

  • cwltool version: 3.1.20230906142556
    Check using cwltool --version

mr-c commented Sep 15, 2023

As a workaround, I suggest giving your enums a name when reusing them. Also consider using SchemaDefRequirement.
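
A rough sketch of what this suggestion could look like, written as a Python dict that serializes to a JSON-form CWL document. The name `scenario_enum` is hypothetical, the tool is trimmed down to the single `scenario` input, and the `#scenario_enum` reference form is an assumption that should be checked against the CWL specification for SchemaDefRequirement.

```python
import json

SCENARIOS = ["ssp126", "rcp85", "rcp45", "rcp26", "ssp585", "ssp245"]


def build_tool():
    """Return a CommandLineTool dict where the enum is defined only once."""
    return {
        "cwlVersion": "v1.0",
        "class": "CommandLineTool",
        "baseCommand": "echo",
        "requirements": {
            "SchemaDefRequirement": {
                "types": [
                    # "scenario_enum" is a hypothetical name for this sketch.
                    {"name": "scenario_enum", "type": "enum", "symbols": SCENARIOS}
                ]
            }
        },
        "inputs": [
            {
                "id": "scenario",
                "type": [
                    "null",
                    # Assumed reference syntax; verify against the CWL spec.
                    "#scenario_enum",
                    {"type": "array", "items": "#scenario_enum"},
                ],
            }
        ],
        "outputs": [],
    }


tool = build_tool()
print(json.dumps(tool, indent=2))
```

The point of the layout is that the symbols appear once, and both the single-value and array variants refer to the same named type.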

hints:
  WPS1Requirement:
    provider: https://finch.crim.ca/wps
    process: ensemble_grid_point_wetdays

Please use a namespace for any extensions, like WPS1Requirement
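
For illustration, a namespaced version of that hint might be built as follows. The prefix `weaver` and the URI `https://example.org/weaver#` are placeholders invented for this sketch; the real namespace would be whatever URI identifies the extension.

```python
import json

# Hypothetical namespace URI; replace with the one that identifies the extension.
WEAVER_NS = "https://example.org/weaver#"


def namespaced_hints():
    """Return a CWL fragment whose extension hint carries a namespace prefix."""
    return {
        "$namespaces": {"weaver": WEAVER_NS},
        "hints": {
            "weaver:WPS1Requirement": {
                "provider": "https://finch.crim.ca/wps",
                "process": "ensemble_grid_point_wetdays",
            }
        },
    }


fragment = namespaced_hints()
print(json.dumps(fragment, indent=2))
```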


fmigneault commented Sep 18, 2023

@mr-c
I tried setting a name for the enum. While this "works", cwltool replaces every definition it finds with a plain reference to the tmp schema by name, resulting in something like:

type:
 - "null"
 - 'tmp.tmp2utae6vd.ensemble_grid_point_wetdays.scenario.scenariobd96176e-aafc-4fff-b32d-261c2bc4d858'
 - type: array
   items: 'tmp.tmp2utae6vd.ensemble_grid_point_wetdays.scenario.scenariobd96176e-aafc-4fff-b32d-261c2bc4d858'

Because of my current implementation, I'm having issues resolving this type (other steps down the line have errors with this "unknown" string).
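
One possible way to deal with such references downstream is to expand them back into full definitions before further processing. The helper below is a hypothetical sketch: `resolve_type_refs` is not a cwltool function, and it assumes a mapping from reference string to enum definition is available from somewhere else.

```python
import copy


def resolve_type_refs(cwl_type, named_types):
    """Replace string references to named types by a copy of their definition.

    cwl_type may be a string, a dict (array or enum) or a list (union).
    named_types maps a reference string to its full definition.
    """
    if isinstance(cwl_type, str):
        if cwl_type in named_types:
            return copy.deepcopy(named_types[cwl_type])
        return cwl_type  # primitive such as "null", "string" or "int"
    if isinstance(cwl_type, list):
        return [resolve_type_refs(item, named_types) for item in cwl_type]
    if isinstance(cwl_type, dict):
        resolved = dict(cwl_type)
        if resolved.get("type") == "array":
            resolved["items"] = resolve_type_refs(resolved["items"], named_types)
        return resolved
    return cwl_type


# Hypothetical, shortened reference string standing in for the tmp.* name.
ref = "tmp.example.scenario"
named = {ref: {"type": "enum", "symbols": ["ssp126", "rcp85"], "name": ref}}
packed = ["null", ref, {"type": "array", "items": ref}]
expanded = resolve_type_refs(packed, named)
```

Each reference is replaced by its own deep copy, so later steps can modify one occurrence without affecting the other.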

Now, to my question...
When I run cwltool from the command line with definitions that do NOT have a name, it executes the operation without any error (see below).

What could cause the schema validation to fail when using Factory.make, but not when using the cwltool CLI?

The following adapted tool and inputs reproduce the test without our custom definitions:

# /tmp/tool.cwl
cwlVersion: v1.0
class: CommandLineTool
baseCommand: echo
requirements:
  InlineJavascriptRequirement: {}
inputs:
- id: lat
  type:
  - string
  - type: array
    items: string
- id: lon
  type:
  - string
  - type: array
    items: string
- id: start_date
  type:
  - 'null'
  - string
- id: end_date
  type:
  - 'null'
  - string
- id: ensemble_percentiles
  type:
  - 'null'
  - string
  default: 10,50,90
- id: average
  type:
  - 'null'
  - boolean
  default: false
- id: dataset
  type:
  - 'null'
  - type: enum
    symbols:
    - humidex-daily
    - candcs-u5
    - candcs-u6
    - bccaqv2
  default: candcs-u5
- id: scenario
  type:
  - 'null'
  - type: enum
    symbols:
    - ssp126
    - rcp85
    - rcp45
    - rcp26
    - ssp585
    - ssp245
  - type: array
    items:
      type: enum
      symbols:
      - ssp126
      - rcp85
      - rcp45
      - rcp26
      - ssp585
      - ssp245
- id: models
  type:
  - 'null'
  - type: enum
    symbols:
    - KACE-1-0-G
    - CCSM4
    - MIROC5
    - EC-Earth3-Veg
    - TaiESM1
    - GFDL-ESM4
    - GFDL-CM3
    - CanESM5
    - HadGEM3-GC31-LL
    - INM-CM4-8
    - IPSL-CM5A-MR
    - EC-Earth3
    - GFDL-ESM2G
    - humidex_models
    - GFDL-ESM2M
    - MIROC-ESM
    - CSIRO-Mk3-6-0
    - MPI-ESM-LR
    - NorESM1-M
    - CNRM-CM5
    - all
    - GISS-E2-1-G
    - 24models
    - MPI-ESM1-2-HR
    - CNRM-ESM2-1
    - CNRM-CM6-1
    - CanESM2
    - FGOALS-g3
    - NorESM1-ME
    - IPSL-CM6A-LR
    - CMCC-ESM2
    - pcic12
    - EC-Earth3-Veg-LR
    - ACCESS-ESM1-5
    - MRI-CGCM3
    - MIROC-ESM-CHEM
    - NorESM2-MM
    - bcc-csm1-1-m
    - BNU-ESM
    - UKESM1-0-LL
    - CESM1-CAM5
    - MIROC-ES2L
    - MRI-ESM2-0
    - HadGEM2-ES
    - MIROC6
    - MPI-ESM-MR
    - INM-CM5-0
    - bcc-csm1-1
    - BCC-CSM2-MR
    - ACCESS-CM2
    - NorESM2-LM
    - IPSL-CM5A-LR
    - FGOALS-g2
    - HadGEM2-AO
    - 26models
    - MPI-ESM1-2-LR
    - KIOST-ESM
  - type: array
    items:
      type: enum
      symbols:
      - KACE-1-0-G
      - CCSM4
      - MIROC5
      - EC-Earth3-Veg
      - TaiESM1
      - GFDL-ESM4
      - GFDL-CM3
      - CanESM5
      - HadGEM3-GC31-LL
      - INM-CM4-8
      - IPSL-CM5A-MR
      - EC-Earth3
      - GFDL-ESM2G
      - humidex_models
      - GFDL-ESM2M
      - MIROC-ESM
      - CSIRO-Mk3-6-0
      - MPI-ESM-LR
      - NorESM1-M
      - CNRM-CM5
      - all
      - GISS-E2-1-G
      - 24models
      - MPI-ESM1-2-HR
      - CNRM-ESM2-1
      - CNRM-CM6-1
      - CanESM2
      - FGOALS-g3
      - NorESM1-ME
      - IPSL-CM6A-LR
      - CMCC-ESM2
      - pcic12
      - EC-Earth3-Veg-LR
      - ACCESS-ESM1-5
      - MRI-CGCM3
      - MIROC-ESM-CHEM
      - NorESM2-MM
      - bcc-csm1-1-m
      - BNU-ESM
      - UKESM1-0-LL
      - CESM1-CAM5
      - MIROC-ES2L
      - MRI-ESM2-0
      - HadGEM2-ES
      - MIROC6
      - MPI-ESM-MR
      - INM-CM5-0
      - bcc-csm1-1
      - BCC-CSM2-MR
      - ACCESS-CM2
      - NorESM2-LM
      - IPSL-CM5A-LR
      - FGOALS-g2
      - HadGEM2-AO
      - 26models
      - MPI-ESM1-2-LR
      - KIOST-ESM
  default: all
- id: thresh
  type:
  - 'null'
  - string
  default: 1.0 mm/day
- id: freq
  type:
  - 'null'
  - type: enum
    symbols:
    - YS
    - QS-DEC
    - AS-JUL
    - MS
  default: YS
- id: op
  type:
  - 'null'
  - type: enum
    symbols:
    - '>='
    - '>'
    - gt
    - ge
  default: '>='
- id: month
  type:
  - 'null'
  - int
  - type: array
    items: int
  inputBinding:
    valueFrom: "\n            ${\n                const values = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12];\n                if (Array.isArray(self)) {\n                    \n        if (self.every(item => values.includes(item))) {\n            return self;\n        }\n        else {\n            throw \"invalid value(s) in [\" + self + \"] are not all allowed values from [\" + values + \"]\";\n        }\n    \n                }\n                else {\n                    \n        if (values.includes(self)) {\n            return self;\n        }\n        else {\n            throw \"invalid value \" + self + \" is not an allowed value from [\" + values + \"]\";\n        }\n    \n                }\n            }\n        "
- id: season
  type:
  - 'null'
  - type: enum
    symbols:
    - SON
    - MAM
    - JJA
    - DJF
- id: check_missing
  type:
  - 'null'
  - type: enum
    symbols:
    - pct
    - at_least_n
    - wmo
    - skip
    - from_context
    - any
  default: any
- id: missing_options
  type:
  - 'null'
  - File
  format: iana:application/json
- id: cf_compliance
  type:
  - 'null'
  - type: enum
    symbols:
    - raise
    - log
    - warn
  default: warn
- id: data_validation
  type:
  - 'null'
  - type: enum
    symbols:
    - raise
    - log
    - warn
  default: raise
- id: output_name
  type:
  - 'null'
  - string
- id: output_format
  type:
  - 'null'
  - type: enum
    symbols:
    - csv
    - netcdf
  default: netcdf
- id: csv_precision
  type:
  - 'null'
  - int
outputs:
- id: output
  type: stdout
$namespaces:
  iana: https://www.iana.org/assignments/media-types/
  edam: http://edamontology.org/

# /tmp/data.yml
lat: "45.35629610945964"
lon: "-73.98748912005094"
average: "False"
start_date: "1950"
end_date: "1960"
ensemble_percentiles: ""
dataset: "candcs-u6"
scenario: "ssp126"
models: "26models"
freq: "YS"
data_validation: "warn"
output_format: "csv"
csv_precision: 0
thresh: "15 mm\/day"

cwltool /tmp/tool.cwl /tmp/data.yml

INFO /home/francis/dev/conda/envs/weaver/bin/cwltool 3.1.20230906142556
INFO Resolved '/tmp/tool.cwl' to 'file:///tmp/tool.cwl'
WARNING ../../tmp/tool.cwl:222:5: JSHINT:                 const values = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12];
../../tmp/tool.cwl:222:5: JSHINT:                 ^
../../tmp/tool.cwl:222:5: JSHINT: W104: 'const' is available in ES. CWL only supports ES5.1
WARNING ../../tmp/tool.cwl:222:5: JSHINT:         if (self.every(item => values.includes(item))) {
../../tmp/tool.cwl:222:5: JSHINT:                             ^
../../tmp/tool.cwl:222:5: JSHINT: W119: 'arrow function syntax (=>)' is only available in ES6. CWL only supports ES5.1
INFO [job tool.cwl] /tmp/ctjoy0z1$ echo > /tmp/ctjoy0z1/263e318e7095de94ae5eaecd5f725701aee65c2d
INFO [job tool.cwl] completed success
{
    "output": {
        "location": "file:///home/francis/263e318e7095de94ae5eaecd5f725701aee65c2d",
        "basename": "263e318e7095de94ae5eaecd5f725701aee65c2d",
        "class": "File",
        "checksum": "sha1$adc83b19e793491b1c6ea0fd8b46cd9f32e592fc",
        "size": 1,
        "path": "/home/francis/263e318e7095de94ae5eaecd5f725701aee65c2d"
    }
}INFO Final process status is success
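
For comparison with the CLI run above, the same tool and inputs could be run through cwltool's Python API along these lines. This is a minimal sketch assuming cwltool is installed and that /tmp/tool.cwl exists; `run_with_factory` is a hypothetical helper, and `JOB_ORDER` simply mirrors the values from /tmp/data.yml.

```python
# Inputs copied from /tmp/data.yml.
JOB_ORDER = {
    "lat": "45.35629610945964",
    "lon": "-73.98748912005094",
    "average": "False",
    "start_date": "1950",
    "end_date": "1960",
    "ensemble_percentiles": "",
    "dataset": "candcs-u6",
    "scenario": "ssp126",
    "models": "26models",
    "freq": "YS",
    "data_validation": "warn",
    "output_format": "csv",
    "csv_precision": 0,
    "thresh": "15 mm/day",
}


def run_with_factory(tool_path, job_order):
    """Load and run a tool through cwltool's Python API instead of the CLI."""
    # Imported lazily so the module can be loaded without cwltool installed.
    from cwltool.factory import Factory

    factory = Factory()
    tool = factory.make(tool_path)  # load the CWL document as a callable
    return tool(**job_order)        # execute with keyword inputs


# Example call (requires cwltool and the files above):
# run_with_factory("/tmp/tool.cwl", JOB_ORDER)
```

If this plain Factory call succeeds while the embedded execution fails, the difference is more likely to come from how the surrounding application drives cwltool than from the CWL document itself, although that remains to be confirmed.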

If possible, I would like to replicate this behaviour, where the combined type: ["null", <enum>, <enum>[]] passes validation as above.
Implementing a proper SchemaDefRequirement would be a follow-up step.
Is there an option I can set to achieve this?
Thanks
