New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add spec text for RegExp Modifiers #3221
base: main
Are you sure you want to change the base?
Conversation
<emu-grammar>Atom :: `(?` RegularExpressionFlags `:` Disjunction `)`</emu-grammar> | ||
<ul> | ||
<li> | ||
It is a Syntax Error if the source text matched by |RegularExpressionFlags| contains any code point other than `i`, `m`, or `s`, or if it contains the same code point more than once. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is a Syntax Error if the source text matched by |RegularExpressionFlags| contains any code point other than `i`, `m`, or `s`, or if it contains the same code point more than once. | |
It is a Syntax Error if the source text matched by |RegularExpressionFlags| contains any code point other than `i`, `m`, or `s`, or contains the same code point more than once. |
(for consistency with the others)
1. If _add_ contains *"i"*, set _ignoreCase_ to *true*. | ||
1. If _add_ contains *"m"*, set _multiline_ to *true*. | ||
1. If _add_ contains *"s"*, set _dotAll_ to *true*. | ||
1. If _remove_ contains *"i"*, set _ignoreCase_ to *false*. | ||
1. If _remove_ contains *"m"*, set _multiline_ to *false*. | ||
1. If _remove_ contains *"s"*, set _dotAll_ to *false*. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
1. If _add_ contains *"i"*, set _ignoreCase_ to *true*. | |
1. If _add_ contains *"m"*, set _multiline_ to *true*. | |
1. If _add_ contains *"s"*, set _dotAll_ to *true*. | |
1. If _remove_ contains *"i"*, set _ignoreCase_ to *false*. | |
1. If _remove_ contains *"m"*, set _multiline_ to *false*. | |
1. If _remove_ contains *"s"*, set _dotAll_ to *false*. | |
1. If _add_ contains *"i"*, set _ignoreCase_ to *true*. | |
1. Else if _remove_ contains *"i"*, set _ignoreCase_ to *false*. | |
1. If _add_ contains *"m"*, set _multiline_ to *true*. | |
1. Else if _remove_ contains *"m"*, set _multiline_ to *false*. | |
1. If _add_ contains *"s"*, set _dotAll_ to *true*. | |
1. Else if _remove_ contains *"s"*, set _dotAll_ to *false*. |
Use else-if to further emphasise that add
/remove
are disjoint.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IMO, this doesn't indicate disjointedness any more than the independent Ifs do. Also, this suggestion prioritizes modifiers in add, while the existing text prioritizes modifiers in remove. Existing implementations in other languages that don't error on (?i-i:)
, such as C#/.NET, prioritize removal (i.e., set and then unset). While the consensus was to be more restrictive and issue an error, I'd still prefer we still maintain remove priority in the specification text for the sake of consistency.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you would still prefer the "Else if", I would prefer to rewrite it as follows:
1. If _add_ contains *"i"*, set _ignoreCase_ to *true*. | |
1. If _add_ contains *"m"*, set _multiline_ to *true*. | |
1. If _add_ contains *"s"*, set _dotAll_ to *true*. | |
1. If _remove_ contains *"i"*, set _ignoreCase_ to *false*. | |
1. If _remove_ contains *"m"*, set _multiline_ to *false*. | |
1. If _remove_ contains *"s"*, set _dotAll_ to *false*. | |
1. If _remove_ contains *"i"*, set _ignoreCase_ to *false*. | |
1. Else if _add_ contains *"i"*, set _ignoreCase_ to *true*. | |
1. If _remove_ contains *"m"*, set _multiline_ to *false*. | |
1. Else if _add_ contains *"m"*, set _multiline_ to *true*. | |
1. If _remove_ contains *"s"*, set _dotAll_ to *false*. | |
1. Else if _add_ contains *"s"*, set _dotAll_ to *true*. |
Such that remove continues to take precedence.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's fine too.
<dl class="header"> | ||
</dl> | ||
<emu-alg> | ||
1. Let _ignoreCase_ be _rer_.[[IgnoreCase]]. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
1. Let _ignoreCase_ be _rer_.[[IgnoreCase]]. | |
1. Assert: _add_ and _remove_ have no elements in common. | |
1. Let _ignoreCase_ be _rer_.[[IgnoreCase]]. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a reason we use the RegularExpressionFlags
production and restrict it with an early error instead of introducing a new, more restricted production?
Is there a reason not to? Modifiers are a strict subset of |
I believe |
How would you propose it be written so as not to require early errors? Modifiers can appear in any order, cannot be duplicated within one or both of the modifier locations, and I do plan to extend the set of allowed modifiers over time with things like
The more modifiers we add, the more production parameters we need, and the more complex the production becomes. Plus, this doesn't avoid the need for an early error to forbid |
@rbuckton I'm not saying we can't use any early errors at all on the production, just that we don't need to use early errors to enforce the allowed flag characters when an alternative grammar could do the job. I'm happy to continue using early errors to check for duplicates. |
This adds the specification text for the Stage 3 RegExp Modifiers proposal.
Test262 tests can be found at tc39/test262#3960.