Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

further specify the format of iri-references throughout #1085

Draft
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

karenetheridge
Copy link
Member

@karenetheridge karenetheridge commented Mar 30, 2021

  • $id cannot have a fragment at all, so separate it from the common definiiion
  • $ref, $dynamicRef, $recursiveRef must be a iri-reference to a schema
    location: when the fragment is non-empty, it must refer to an $anchor or a
    json-pointer
  • absoluteKeywordLocation in result outputs must use canonical
    IRIs/iri-references: either there is no fragment, or it is non-empty and
    encodes a json pointer

Also fixed the output schema which erroneously still identifies absoluteSchemaLocation as a uri, not an iri.

Copy link
Member

@jdesrosiers jdesrosiers left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure this change is necessary, but I'm ok with it. I assume we aren't talking about a change to the already released meta-schema, but rather that this would be part of the next draft?

schema.json Outdated
@@ -51,7 +51,7 @@
},
"$recursiveRef": {
"$comment": "\"$recursiveRef\" has been replaced by \"$dynamicRef\".",
"$ref": "meta/core#/$defs/uriReferenceString",
"$ref": "meta/core#/$defs/uriReferenceToSchemaString",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The only defined value for $recursiveRef is #, so this could be "const": "#". On the other hand, if there is no $recursiveAnchor, then it's supposed to behave like a $ref which allows any URI, so ¯\_(ツ)_/¯.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The only defined value for $recursiveRef is #

Not really. The 201909 spec leaves the behaviour for other uri-references somewhat open -- it doesn't actually restrict (i.e. with a MUST) to the # value. I've gotten other $recursiveRef values to work just fine in my implementation, with no extra code.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point. It's not forbidden, it's just undefined.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@karenetheridge
Copy link
Member Author

karenetheridge commented Mar 31, 2021

I'm not sure this change is necessary, but I'm ok with it. I assume we aren't talking about a change to the already released meta-schema, but rather that this would be part of the next draft?

There's no reason why this can't go into the current draft's metaschema, because it is not adding an additional restriction that wasn't already there, but rather making the metaschema more conformant to existing language.

That is -- it lets implementations catch problems earlier by applying a schema against the metaschema, rather than having to catch problems with custom code during evaluation.

@jdesrosiers
Copy link
Member

There's no reason why this can't go into the current draft's metaschema

We generally don't consider meta-schemas living documents. Once they are released, we don't change them any more than we would change the spec. We have made exceptions for fixing bugs in meta-schemas, but this isn't fixing anything, it's just making it more strict. I don't think we have a rule written down anywhere about how to deal with this kind of thing (we probably should), but historical precedent seems to indicate that this should go in the next draft.

@karenetheridge
Copy link
Member Author

We generally don't consider meta-schemas living documents.

We have in the past, however, fixed inconsistencies between the written spec and the metaschema, where the written spec prevails (because it very definitely is versioned and cannot be revised before the next release).

@jdesrosiers
Copy link
Member

We have in the past, however, fixed inconsistencies between the written spec and the metaschema, where the written spec prevails

Right, which is essentially bug fixes. This is not a bug fix. I don't feel strongly enough about this to try to block it, but I do feel that it's important for more voices to express opinions on whether we should be making this kind of change for meta-schemas before this gets approved. Even better would be if we decide on guidelines for when we want to allow updating meta-schemas and document that decision.

@handrews
Copy link
Contributor

@karenetheridge @jdesrosiers it would probably be a good idea to start a discussion in the community repo on the meta-schema process. It can complement my spec release process discussion (json-schema-org/community#7) but please start a new discussion! The meta-schema process should be separate. I do not have an opinion on it (or rather, we tried my opinion, but no one liked it- which I say with humor, it's just time to try someone else's ideas).

In the meantime let's put this on hold while we sort out the process concerns.

@handrews handrews changed the base branch from master to draft-next May 19, 2021 19:07
@handrews handrews added this to the draft-next milestone May 19, 2021
@jdesrosiers jdesrosiers changed the base branch from draft-next to main July 8, 2022 15:33
@jdesrosiers
Copy link
Member

The draft-next branch has been merged and is now closed. The merge target for this PR has been changed to main. Here are the recommended steps to get your branch reabsed properly.

  1. Make sure your remote for the json-schema-org/json-schema-spec repo is up-to-date. (Example: git fetch upstream).
  2. Rebase your commits onto main. (Example: git rebase --onto upstream/main abcd123~1 (replace abcd123 with the commit hash of the first commit in your PR)).
  3. Force push the rebased branch to your fork. (Example: git push --force origin my-branch).

@handrews
Copy link
Contributor

@karenetheridge I think this would need to be reworked now that JSON Schema uses IRIs rather than URIs, and I'm not quite sure how the regexes would work in that case. Along with the branch changes, perhaps this is best closed and a new PR submitted?

@karenetheridge
Copy link
Member Author

I have rebased against the updates to main.

@karenetheridge karenetheridge changed the title further specify the format of uri-references throughout further specify the format of iri-references throughout Sep 5, 2022
- $id cannot have a fragment at all, so separate it from the common definiiion
- $ref, $dynamicRef, $recursiveRef must be a iri-reference to a schema
location: when the fragment is non-empty, it must refer to an $anchor or a
json-pointer
- absoluteKeywordLocation in result outputs must use canonical
IRIs/iri-references: either there is no fragment, or it is non-empty and
encodes a json pointer

Also fixed the output schema which erroneously still identifies
absoluteSchemaLocation as a uri, not an iri.
@gregsdennis
Copy link
Member

Good reminder that I need to update the output schema to the new structure. 👍

This should go through first, then I'll do that update.

"format": "iri-reference"
"format": "iri-reference",
"$comment": "any fragment must be empty, or match anchor or json-pointer syntax",
"pattern": "^[^#]*(#([A-Za-z_][-A-Za-z0-9.:_]*|/([^~]|~[01])*))?$"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
"pattern": "^[^#]*(#([A-Za-z_][-A-Za-z0-9.:_]*|/([^~]|~[01])*))?$"
"pattern": "^[^#]*(#([A-Za-z_][-A-Za-z0-9.:_]*|(/([^~/]|~[01])*)*))?$"

The second half of the fragment match, for JSON Pointer, won't match the empty pointer ($ref: "#", $ref: "foo#"). The suggested change follows the JSON Pointer ABNF.

This is still incomplete though. Fragments may contain %xx encoded characters. In a pointer their use may be necessary, but in either a pointer or anchor they are expected to be treated like the characters they represent, even when those characters don't actually need to be %-encoded.

Suggested change
"pattern": "^[^#]*(#([A-Za-z_][-A-Za-z0-9.:_]*|/([^~]|~[01])*))?$"
"pattern": "^[^#]*(#(([A-Za-z_]|%[A-Fa-f0-9]{2})([-A-Za-z0-9.:_]|%[A-Fa-f0-9]{2})*|((/|%2[fF])([^~/%]|~[01]|%[A-Fa-f0-9]{2})*)*))?$"

Not too pretty, but I think it allows anything that's valid. It won't reject some invalid anchors or pointers if they're %-encoded - it would be possible to express the same character ranges as allowed combinations of %-encoded digits, but I'm not masochistic enough to do it.

I'm a little uncertain describing a uri fragment with a regex isn't more trouble than it's worth.

"format": "uri"
"format": "iri",
"$comment": "any fragment must be non-empty, and use json-pointer syntax",
"pattern": "^[^#]*(#(/([^~]|~[01])*)*)?$"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The comment says must be non-empty, but this does allow an empty fragment.

Suggested change
"pattern": "^[^#]*(#(/([^~]|~[01])*)*)?$"
"pattern": "^[^#]*(#(/([^~/]|~[01])*)+)?$"

The change to + requires a non-empty pointer; disallowing the / from the reference-token character range is better consistent with its spec's ABNF (and iriReferenceToSchemaString above), though it doesn't ultimately change what the regex matches.

And of course %-encoding needs fixing here too.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

actually I think we need to allow empty fragments, so it's the comment that is wrong here.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@karenetheridge Agreed, you need to be able to represent the root with a JSON Pointer fragment, which would be an empty fragment.

@gregsdennis
Copy link
Member

@karenetheridge please address @notEthan's regex concerns, but I think this is good.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants