Add syntax for subpatterns as subroutines #190

Open
slevithan opened this issue Apr 26, 2017 · 1 comment
Comments

@slevithan
Owner

slevithan commented Apr 26, 2017

The pseudo-AST structure described in #179 could also serve as the foundation for a useful advanced feature from PCRE, Perl, etc.: the ability to reference the entire contents of a named or numbered group (including nested parens) from later in the pattern, enabling subpattern reuse via (?&name) and (?n).

This would simply require generic syntax tokens for ) and for any ( that isn't part of a self-contained token like (?#...). An opening token would mark subsequent tokens as its children until the matching ) arrives. The generated pattern contents of each named group could then be derived when needed.

Perhaps this would look like:

[
  {
    type: 'named-capture-start',
    name: 'name',
    output: '(',
    children: [
      {type: 'x-ignored', output: ''},
      {type: 'native-token', output: '.'},
    ],
  },
  {type: 'native-token', output: ')'},
]
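
A minimal sketch of how such a tree might be built and queried, assuming a flat token list as input. The helper names (nestTokens, serialize, groupContents) and the 'group-end' token type are hypothetical and not part of XRegExp; any token whose type ends in '-start' is assumed to open a group.

```javascript
// Hypothetical helpers for illustration; none of these exist in XRegExp.

// Nest a flat token list: an opener collects tokens as children until its ')' arrives.
function nestTokens(flat) {
  const root = [];
  const stack = [root]; // the innermost open child list is last
  for (const token of flat) {
    if (token.type.endsWith('-start')) {
      const node = {...token, children: []};
      stack[stack.length - 1].push(node);
      stack.push(node.children);
    } else if (token.type === 'group-end') {
      if (stack.length === 1) throw new SyntaxError('Unmatched )');
      stack.pop();
      // The closing token becomes a sibling of the group, as in the structure above.
      stack[stack.length - 1].push({...token});
    } else {
      stack[stack.length - 1].push({...token});
    }
  }
  if (stack.length > 1) throw new SyntaxError('Unclosed group');
  return root;
}

// Concatenate the output of every token, descending into children.
function serialize(tokens) {
  return tokens
    .map((t) => t.output + (t.children ? serialize(t.children) : ''))
    .join('');
}

// Depth-first search for a named group's opening token.
function findGroup(tokens, name) {
  for (const t of tokens) {
    if (t.type === 'named-capture-start' && t.name === name) return t;
    if (t.children) {
      const nested = findGroup(t.children, name);
      if (nested) return nested;
    }
  }
  return null;
}

// Derive the generated pattern contents of a named group on demand.
function groupContents(tokens, name) {
  const group = findGroup(tokens, name);
  if (!group) throw new SyntaxError(`Undefined group: ${name}`);
  return serialize(group.children);
}

const tree = nestTokens([
  {type: 'named-capture-start', name: 'name', output: '('},
  {type: 'x-ignored', output: ''},
  {type: 'native-token', output: '.'},
  {type: 'group-end', output: ')'},
]);
const whole = serialize(tree);             // full generated pattern
const inner = groupContents(tree, 'name'); // contents of the group only
```

Keeping the closing ) as a sibling rather than a child means a group's contents are exactly its serialized children, with no trimming needed.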

Notes:

  • An error would need to be thrown if the group referenced by name with (?&name) or by number with (?n) is not yet closed.
  • Make sure to handle things like (?<$1>.)(?<$2>(?&$1))(?&$2).
  • Some of the use cases are already handled by XRegExp.build and XRegExp.tag, but this would still be cleaner and/or more robust in some cases, and the foundation created for it would make potential future XRegExp syntax addons more powerful.
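
The first two notes can be illustrated with a rough string-level sketch. The function expandSubroutines is hypothetical and not part of XRegExp; it assumes that a reference is expanded by inlining the group's contents in a noncapturing group, handles only named references, and does not address duplicate names that arise when the inlined contents themselves contain named groups.

```javascript
// Hypothetical function for illustration; not part of XRegExp.
// Expands (?&name) by inlining the contents of an already closed named group.
function expandSubroutines(pattern) {
  const closed = new Map(); // name -> expanded contents of each closed group
  const open = [];          // stack of {name, start}; start indexes into `out`
  let out = '';
  let inClass = false;
  let i = 0;
  while (i < pattern.length) {
    const rest = pattern.slice(i);
    let m;
    if (pattern[i] === '\\') {
      // Copy escapes verbatim so that \( and \) are not treated as groups.
      out += pattern.slice(i, i + 2);
      i += 2;
    } else if (inClass) {
      if (pattern[i] === ']') inClass = false;
      out += pattern[i++];
    } else if (pattern[i] === '[') {
      inClass = true;
      out += pattern[i++];
    } else if ((m = /^\(\?&([^)]+)\)/.exec(rest))) {
      // A reference to a group that is still open or not yet seen is an error.
      if (!closed.has(m[1])) {
        throw new SyntaxError(`Group ${m[1]} is not yet closed`);
      }
      out += `(?:${closed.get(m[1])})`;
      i += m[0].length;
    } else if ((m = /^\(\?<([^>=!][^>]*)>/.exec(rest))) {
      // Named group opener; lookbehinds (?<= and (?<! are excluded.
      out += m[0];
      open.push({name: m[1], start: out.length});
      i += m[0].length;
    } else if (pattern[i] === '(') {
      out += '(';
      open.push({name: null, start: out.length});
      i++;
    } else if (pattern[i] === ')') {
      const group = open.pop();
      if (!group) throw new SyntaxError('Unmatched )');
      // Record contents after expansion, so nested references are already resolved.
      if (group.name !== null) closed.set(group.name, out.slice(group.start));
      out += ')';
      i++;
    } else {
      out += pattern[i++];
    }
  }
  if (open.length > 0) throw new SyntaxError('Unclosed group');
  return out;
}

const expanded = expandSubroutines('(?<$1>.)(?<$2>(?&$1))(?&$2)');
// expanded === '(?<$1>.)(?<$2>(?:.))(?:(?:.))'
```

Because each group's contents are recorded only when its ) is reached, a self-reference such as (?<a>(?&a)) and a forward reference such as (?&a)(?<a>.) both throw.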
@slevithan
Owner Author

This would also enable (?<DEFINE>(?<name1>...)(?<name2>...)) blocks that make subpattern reuse via (?&name) and (?n) more robust.
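
A rough sketch of how a DEFINE block might be handled on the nested token structure. The comment above does not say whether the block itself should take part in matching; this sketch assumes, by analogy with PCRE's (?(DEFINE)...), that it only registers its named groups and emits nothing. The helper names are hypothetical.

```javascript
// Hypothetical helpers for illustration; none of these exist in XRegExp.

// Concatenate the output of every token, descending into children.
function serialize(tokens) {
  return tokens
    .map((t) => t.output + (t.children ? serialize(t.children) : ''))
    .join('');
}

// Collect the named groups inside top-level DEFINE blocks and drop the blocks
// (ASSUMPTION: a DEFINE block contributes nothing to the generated pattern).
function extractDefinitions(tree) {
  const definitions = new Map();
  const remaining = [];
  for (let i = 0; i < tree.length; i++) {
    const token = tree[i];
    if (token.type === 'named-capture-start' && token.name === 'DEFINE') {
      for (const child of token.children) {
        if (child.type === 'named-capture-start') {
          definitions.set(child.name, serialize(child.children));
        }
      }
      // Skip the block's closing ')' token, which follows it as a sibling.
      if (tree[i + 1] && tree[i + 1].output === ')') i++;
    } else {
      remaining.push(token);
    }
  }
  return {definitions, remaining};
}

// Tree for the equivalent of (?<DEFINE>(?<digit>\d))x
const defineTree = [
  {
    type: 'named-capture-start',
    name: 'DEFINE',
    output: '(',
    children: [
      {
        type: 'named-capture-start',
        name: 'digit',
        output: '(',
        children: [{type: 'native-token', output: '\\d'}],
      },
      {type: 'native-token', output: ')'},
    ],
  },
  {type: 'native-token', output: ')'},
  {type: 'native-token', output: 'x'},
];

const {definitions, remaining} = extractDefinitions(defineTree);
```

With definitions collected up front, a later (?&digit) could be resolved from the map without the defining group having to match anything itself.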
