Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Using json-schema/cwl.yaml to validate CWL files inside IDE is not working correctly. #278

Open
alexiswl opened this issue Feb 8, 2024 · 16 comments

Comments

@alexiswl
Copy link

alexiswl commented Feb 8, 2024

Hello,

I was wondering if there is a way to use the json schema specified in cwl.yaml to auto-complete / lint my code before running cwltool validate.

I performed the following commands to get my json schema:

wget \
  --quiet \
  --output-document /dev/stdout \
  "https://raw.githubusercontent.com/common-workflow-language/cwl-v1.2/main/json-schema/cwl.yaml" | \
yq --output-format=json > cwl_mappings.json

In my JetBrains IDE,
Under Languages & Frameworks -> Schemas and DTDs -> _JSON Schema Mappings, I configured the following:

  • Schema file or URL to match the downloaded path above
  • File path pattern set to *.cwl

image

After restarting my IDE, nothing changes.

Having a look at some other schema definitions (that do work in my IDE) such as the [GitHub Actions Workflow Json](https://json.schemastore.org/github-workflow.json, I noticed that this had many more higher level fields.
Notably the additionalProperties field was set to false.

I tried changing the additionalProperties and allOf field in the top level to the following:

allOf:
  - $ref: '#/$defs/CWL'
additionalProperties: false

so that the properties in the top level of the CWL file would need to match one of CWLWorkflow, CWLGraph or CWLAtomic

This worked! ...sort of.

It now thinks we're in the $graph version deployment.

image

This is likely caused because CWLGraphBase has additionalProperties: {}.

After updating all items of $defs that match type object to have additionalProperties: false and all items of $defs that match type array to have additionalItems: false.

Presto!

image

image

However, workflows still have issues for the class definition as it thinks it should be of the type 'CommandLineTool'

image

I'll keep tinkering around but would be great to see if this json schema could also be auto-generated.

@mr-c
Copy link
Member

mr-c commented Feb 8, 2024

Tagging @fmigneault who hand wrote (!!) the existing json schema

Yes, it would be great if schema-salad or another tool could produce the JSON Schema version automatically!

@fmigneault
Copy link
Contributor

I'm using an equivalent config in PyCharm with the reference to the remote YAML schema.
I have not found a reliable way to validate all schemas at the same time. It seems JetBrain's IDEs are not handling additionalProperties: {} or additionalProperties: true properly, as if it is not able to rewind up in the schema hierarchy when a lower level requirement invalidates a specific definition in a oneOf. Disallowing additional properties seems to help, but I did not see anything in the spec that explicitly disallowed unknown fields.

@alexiswl
Copy link
Author

alexiswl commented Feb 10, 2024

Thanks @fmigneault I'm having the same issue using a conditional on the CWL attribute, I'll keep tinkering. I don't think it's a JetBrains issue though, I'm using an online json schema validator to compare and getting the same issue - https://www.jsonschemavalidator.net/

@alexiswl
Copy link
Author

Three days in a rabbit hole! Very excited to see the daylight again!!

I've got the following working for me now!

common-workflow-lab/cwl-ts-auto#27

@fmigneault
Copy link
Contributor

fmigneault commented Feb 16, 2024

@mr-c
If I understand correctly, https://github.com/common-workflow-language/cwl-v1.2/blob/main/Workflow.yml is used with https://github.com/common-workflow-language/schema_salad to generate the definitions of https://github.com/common-workflow-lab/cwl-ts-auto?

Therefore, should the features proposed in common-workflow-lab/cwl-ts-auto#27 be instead triggered by the CI of this repo, to invoke schema_salad and obtain the TS definitions, convert them to JSON-schema, validate the resulting JSON-schema against https://github.com/common-workflow-language/cwl-v1.2/blob/main/.github/workflows/ci.yml, and commit the updated version back to this repo? Not sure it makes much sense to have the JSON-schema live under https://github.com/common-workflow-lab/cwl-ts-auto.

@alexiswl
Copy link
Author

alexiswl commented Feb 18, 2024

I'm not sure how the cwl-ts-auto creation works, the repository's main branch hasn't been updated in a couple of years.

But happy to move it back to this repo instead. Would prefer to keep it as JSON as some of the IDE importers expect a url to a JSON file but can also convert back to YAML in addition for readability.

Would also be great to be able to register it with JSON Schema store as well

@mr-c
Copy link
Member

mr-c commented Feb 18, 2024

@mr-c If I understand correctly, https://github.com/common-workflow-language/cwl-v1.2/blob/main/Workflow.yml is used with https://github.com/common-workflow-language/schema_salad to generate the definitions of https://github.com/common-workflow-lab/cwl-ts-auto?

Yes, but it is manually updated; we don't automatically regenerate when schema-salad and/or the CWL schema is updated. And we use https://github.com/common-workflow-language/cwl-v1.2/blob/main/CommonWorkflowLanguage.yml as the root schema

Therefore, should the features proposed in common-workflow-lab/cwl-ts-auto#27 be instead triggered by the CI of this repo, to invoke schema_salad and obtain the TS definitions, convert them to JSON-schema, validate the resulting JSON-schema against https://github.com/common-workflow-language/cwl-v1.2/blob/main/.github/workflows/ci.yml, and commit the updated version back to this repo? Not sure it makes much sense to have the JSON-schema live under https://github.com/common-workflow-lab/cwl-ts-auto.

I'm happy for the JSON-schema to live here, so it gets published to the website under https://w3id.org/cwl/ a.k.a. https://www.commonwl.org/v1.2/

However I would be uncomfortable about automatically updating it without some automated tests of the result.

@mr-c
Copy link
Member

mr-c commented Feb 18, 2024

Would also be great to be able to register it with JSON Schema store as well

Cool, I didn't know about the JSON Schema Store. I would request that a URL like https://w3id.org/cwl/v1.2/cwl-schema.json (which would redirect to https://www.commonwl.org/v1.2/cwl-schema.json) or similar be registered once we are publishing the JSON Schema along with the existing v1.2 artifacts.

@mr-c
Copy link
Member

mr-c commented Feb 18, 2024

The other option would be to teach schema-salad how to produce JSON schema directly, without going through the typescript generation→conversion→cleanup process. This will now be a lot easier as you would be able to compare the direct JSON schema serialization to the result of @alexiswl 's converter.

@alexiswl
Copy link
Author

I'm happy for the JSON-schema to live here, so it gets published to the website under w3id.org/cwl a.k.a. commonwl.org/v1.2

Should that be a manual PR for now? Can copy over the outputs from the cwl-ts-auto GH actions workflow

@fmigneault
Copy link
Contributor

The other option would be to teach schema-salad how to produce JSON schema directly

I agree this would be the best option. I didn't go through that path originally because I add a pre-made CWL JSON-schema equivalent, and it was faster to tweak it than implement schema-salad ⇾ JSON-schema from scratch, but it is what I would have done otherwise.

However I would be uncomfortable about automatically updating it without some automated tests of the result.

Once converted, it would be easy to validate the resulting JSON by passing the path here:

CWL_JSON_SCHEMA_FILE = os.path.join(os.path.dirname(__file__), "../../json-schema/cwl.yaml")

This will run it against all https://github.com/common-workflow-language/cwl-v1.2/blob/main/conformance_tests.yaml

@alexiswl
Copy link
Author

Once converted, it would be easy to validate the resulting JSON by passing the path here:

Ah I missed this, built my own integration in GH actions instead here

@fmigneault
Copy link
Contributor

@mr-c @alexiswl
Has there been any update about publishing the JSON schema?
Since the 1.2.1_proposed branch was merged, the $id and repos using this raw reference are now invalid.

$id: "https://raw.githubusercontent.com/common-workflow-language/cwl-v1.2/1.2.1_proposed/json-schema/cwl.yaml"

I would love to have an official: https://www.commonwl.org/v1.2/cwl-schema.json

@alexiswl
Copy link
Author

I think an automatic PR to this repo whenever the json schema is updated in cwl-ts-auto could be feasible?

@fmigneault
Copy link
Contributor

Maybe something can be configured with a push to gh-pages branch using https://github.com/peaceiris/actions-gh-pages?
Usually, with a https://github.com/common-workflow-language/common-workflow-language.github.io, the https://github.com/common-workflow-language/cwl-v1.2 should be available as https://common-workflow-language.github.io/cwl-v1.2. However, https://github.com/common-workflow-language/cwl-website is also defined, and the site's structure is somewhat unusual.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants