Ability to specify repetition count (like in regexps) #30
Comments
I will not implement this feature. The main reason is that there is no room in the PEG.js grammar for the In my experience this kind of limited repetition occurs mainly on the "lexical" parts of the grammar (rules like |
I've reconsidered and I am reopening this issue. It seems that the ability to specify an arbitrary number of repetitions is wanted by many users. I'd like to avoid regexp-like
The biggest question is what the separating character(s) should be and how to mark up ranges. As for the separating character,
As for the range markup, I took inspiration from Ruby. I was also thinking about I am not sure about half-open ranges. Maybe it would be better to mark them up using
Any ideas or comments? |
Really cool that you plan to support this feature! I like your (default) suggestion: I don't like the +/- syntax for half-open ranges, the double-dot syntax is much more intuitive and readable IMO. The only thing I had second thoughts about was using "#" vs "@", because IMO "#" naturally implies numbers/counting, whereas "@" naturally implies a reference, so "#" may be a bit more intuitive and readable (and perhaps you could use the "@" in the future for something?). But that's really a minor issue, and I would be happy with the "@" syntax. Cheers! |
Just a quick comment: I think that |
How about we special case for They aren't likely to be meaningful even if the action is of other languages. |
I like these variants among suggested (but this is up to you of course to choose, since you're the author :) ):
or
but the second is less preferable the
|
Thinking about it again. Will these work?
since |
👍 from me, that looks good. of course, |
I like the |
I agree, the |
If I'm not mistaken, there's no need to add any delimiter, unless you want to allow variable names in the ranges.
are all unambiguous. |
@pygy, the problem with not using a delimiter is that it potentially stifles evolution of the syntax of the language. For example, if we wanted to use comma for something else later on down the road, we would now have issues with syntax collisions all over the place. Constraining it to within Plus, people are used to using the |
I don't feel strongly about the syntax, but I do want this feature, and it'd be great if an expression could be used as a range value. My use case: parsing literals in IMAP server responses, which look like |
How about variables in restrictions? This is very useful for messages with a header that contains the message length. For example, the grammar
must parse
This is useful for many protocols. |
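To make the use case concrete, here is a minimal hand-written sketch of length-prefixed parsing in plain JavaScript. The wire format (`<decimal length>:<payload>`) and the function name `parseLengthPrefixed` are assumptions made for illustration; they are not taken from the grammar in this comment, and this is not PEG.js output.

```javascript
// Hypothetical format (an assumption): "<decimal length>:<payload>".
// The header value decides how many characters the payload consumes,
// which is what a variable repetition count would express in a grammar.
function parseLengthPrefixed(input) {
  const header = /^(\d+):/.exec(input);
  if (header === null) {
    throw new SyntaxError("expected a decimal length followed by ':'");
  }
  const length = Number(header[1]);
  const start = header[0].length;
  const payload = input.slice(start, start + length);
  if (payload.length < length) {
    throw new SyntaxError(
      `expected ${length} characters, found ${payload.length}`
    );
  }
  // Return the payload and whatever input remains after it.
  return { payload, rest: input.slice(start + length) };
}
```

The point of the sketch is only that the repetition count is not known until the header has been parsed, so a constant boundary is not enough for this kind of protocol.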
Maybe use this syntax: |
… and handling in passes.
…, optimized for speed.
…ries and regenerate parser
…aries and regenerate parser
…s and regenerate parser and add documentation
```
expression| exact |
expression| .. |
expression|min.. |
expression| ..max|
expression|min..max|
```

Introduce two new opcodes:

* IF_LT <min>, <then part length>, <else part length>
* IF_GE <max>, <then part length>, <else part length>

Introduce a new AST node, `repeated`, that contains the expression and the minimum and maximum number of its repetitions. If `node.min.value` is `null` or isn't positive, the minimum-length check isn't made. If `node.max.value` is `null`, the maximum-length check isn't made. If `node.min` is `null`, it is equal to `node.max` (the exact-repetitions case).
Added two new opcodes:

- IF_LT_DYNAMIC: same as IF_LT, but the argument is a reference to a stack variable instead of a constant
- IF_GE_DYNAMIC: same as IF_GE, but the argument is a reference to a stack variable instead of a constant
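The boundary rules described in these commit messages can be sketched as a small interpreter-style function. This is an illustration only, not the generated bytecode: `matchRepeated`, `resolveBoundary`, `literal`, the `{ ref: "name" }` shape for dynamic boundaries, and the `env` object standing in for stack variables are all hypothetical names chosen for the example.

```javascript
// A boundary is null, { value: number | null }, or { ref: "name" }
// (the last form is a hypothetical stand-in for the *_DYNAMIC case).
function resolveBoundary(boundary, env) {
  if (boundary === null) return null;
  if ("ref" in boundary) return env[boundary.ref];
  return boundary.value;
}

// Hypothetical single-expression matcher: returns { value, pos } or null.
function literal(text) {
  return (input, pos) =>
    input.startsWith(text, pos) ? { value: text, pos: pos + text.length } : null;
}

function matchRepeated(node, matchOne, input, pos, env = {}) {
  const max = resolveBoundary(node.max, env);
  // node.min === null means "exact": the minimum equals the maximum.
  const min = node.min === null ? max : resolveBoundary(node.min, env);
  const results = [];
  let current = pos;
  // Counterpart of the IF_GE check: stop once the count reaches max.
  // Note: a matchOne that succeeds without consuming input would loop
  // forever when max is null; this sketch does not guard against that.
  while (max === null || results.length < max) {
    const next = matchOne(input, current);
    if (next === null) break;
    results.push(next.value);
    current = next.pos;
  }
  // Counterpart of the IF_LT check: skipped when min is null or not positive.
  if (min !== null && min > 0 && results.length < min) return null;
  return { value: results, pos: current };
}
```

For instance, under these assumptions `matchRepeated({ min: { value: 2 }, max: { value: 3 } }, literal("a"), "aaaa", 0)` consumes three characters and leaves the fourth for whatever follows.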
* main: (104 commits)
  Audit CHANGELOG.md
  Release prep
  Update dependencies
  Ranges (pegjs/pegjs#30): Add documentation, examples and changelog entry
  Ranges (pegjs/pegjs#30): Add testcases for delimiter support in ranges and regenerate parser
  Ranges (pegjs/pegjs#30): Add support for delimiters in ranges
  Ranges (pegjs/pegjs#30): Add testcases for ranges with function boundaries and regenerate parser
  Ranges (pegjs/pegjs#30): Add ability to use code blocks as range boundaries
  Ranges (pegjs/pegjs#30): Add testcases for ranges with dynamic boundaries and regenerate parser
  Ranges (pegjs/pegjs#30): Add ability for use labels as range boundaries
  Ranges (pegjs/pegjs#30): Add testcases for ranges and regenerate parser
  Ranges (pegjs/pegjs#30): Implement ranges support. Range syntax:
  ```
  expression| exact |
  expression| .. |
  expression|min.. |
  expression| ..max|
  expression|min..max|
  ```
  Typo
  Update the testTimeout, so Windows doesn't fail on slow-ass CI hardware
  Fix rollup issues with web tests
  Update deps in dependent projects as well
  Add changelog entry for updating node version
  BREAKING: update min node version to 14, because of jest.
  Update package-lock, using npm install --legacy-peer-deps to get around @rollup/plugin-node-resolve issue
  Update dependencies, make small changes to accomodate, re-build.
  ...
It would be helpful if the PEG.js grammar allowed something like range expressions of POSIX basic regular expressions to be used. E.g.:
matches `a`, `aa`, ..., `aaaaaaa`

matches the empty string and `a`

matches a string with up to (and including) six `a`'s

matches a string of six or more `a`'s

matches only `aaa`, being equivalent to `"a"\{3,3\}`
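For comparison, the five behaviours described above can be checked against JavaScript regular expression quantifiers. This is only an analogy in a different notation (JavaScript `RegExp`, not POSIX basic regular expressions and not PEG.js syntax), and the property names are made up for the example.

```javascript
// JavaScript quantifiers with the same behaviour as the five cases described.
// Anchors (^ and $) force the whole string to match.
const ranges = {
  oneToSeven: /^a{1,7}$/,   // a, aa, ..., aaaaaaa
  zeroOrOne: /^a{0,1}$/,    // the empty string and a
  upToSix: /^a{0,6}$/,      // JavaScript needs the explicit lower bound 0 here
  sixOrMore: /^a{6,}$/,     // six or more a's
  exactlyThree: /^a{3}$/,   // only aaa, same as a{3,3}
};

// Small helper to try a pattern against a run of n "a" characters.
function matchesRun(pattern, n) {
  return pattern.test("a".repeat(n));
}
```

A call such as `matchesRun(ranges.upToSix, 6)` can be used to probe the boundaries of each range.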