Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Case sensitivity in EXPRESS schema #54

Open
termoshtt opened this issue May 5, 2021 · 4 comments
Open

Case sensitivity in EXPRESS schema #54

termoshtt opened this issue May 5, 2021 · 4 comments
Labels
RFC Request for comments

Comments

@termoshtt
Copy link
Contributor

As far as I read ISO-10303-11, user defined identifiers should be snake_case, but some identifier copied into schema/ does contains non snake_case identifier like degree_Celsius

@termoshtt termoshtt added the RFC Request for comments label May 5, 2021
@g-rauhoeft
Copy link
Contributor

ISO-10303-11 actually says the following about case sensitivity:

EXPRESS may be written using upper, lower or mixed case letters

and

The case of letters is significant only within explicit string literals

As such, restricting any identifiers to lower case goes against the specification. I attempted to change it within the letter parser, but it causes a plethora of unit test failures. I'm still attempting to find out why:

pub fn letter(input: &str) -> RawParseResult<char> { satisfy(|c| matches!(c, 'A'..='Z' | 'a'..='z')).parse(input) }

This might be the wrong approach though, as the specification clearly says that we can disregard the case of anything except for string literals. Perhaps we should transform any inputs except string literals to lower case by default.

@g-rauhoeft
Copy link
Contributor

I believe I've found the cause of the test failures if mixed case is allowed:

For blocks without an END_ tag, such as UNIQUE, the parser swallows the END_ tag of the parent block, because it doesn't know how to differentiate between the end of the UNIQUE block and its contents. For this reason I suggest creating either an until_tag combinator, that consumes the input until an END_ tag is reached, or an is_not_reserved combinator, that wraps a nom::combinator::cond combinator, checking whether an identifier is actually a keyword reserved by EXPRESS.

I'm currently working on the latter and hope to make a pull request early next week.

@g-rauhoeft
Copy link
Contributor

I've started implementing the changes under: https://github.com/g-rauhoeft/ruststep/tree/case-sensitivity-fix

Please let me know if you feel this is going in the right direction. I haven't created a pull request yet, because some test cases still fail, but a vast majority are successful.

@termoshtt
Copy link
Contributor Author

#66 (comment)

6.1 The syntax of the specification refers to keywords being in uppercase, but that is only for the syntax of the specification in WSN, not for EXPRESS as such.

This looks true.

As such, restricting any identifiers to lower case goes against the specification.

I agree. We should accept non-lower case identifiers.

or an is_not_reserved combinator,

Detecting reserved keyword in simple_id has been implemented in #87, and improved in #117

For this reason I suggest creating either an until_tag combinator, that consumes the input until an END_ tag is reached,

The many_till parser in #66 has been merged though #114, but not used in esprc.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
RFC Request for comments
Projects
None yet
Development

No branches or pull requests

2 participants