Non-greedy operators for * , + , and ? #57

Closed
richb-hanover opened this issue Oct 7, 2011 · 7 comments
@richb-hanover

I have a language with repeated instances of the same pattern, and I only care about the first symbol of each instance. For example:

          system       OBJECT IDENTIFIER ::= { mib-2 1 }
          interfaces   OBJECT IDENTIFIER ::= { mib-2 2 }
          at           OBJECT IDENTIFIER ::= { mib-2 3 }
          ip           OBJECT IDENTIFIER ::= { mib-2 4 }
          icmp         OBJECT IDENTIFIER ::= { mib-2 5 }
          tcp          OBJECT IDENTIFIER ::= { mib-2 6 }
          udp          OBJECT IDENTIFIER ::= { mib-2 7 }
          egp          OBJECT IDENTIFIER ::= { mib-2 8 }

This simple example could be matched by this pattern (where _ is whitespace):

identifier _ "OBJECT IDENTIFIER" _ "::=" _ "{" _ identifier _ number _ "}"

This isn't a big deal in this case (I have already typed the pattern :-). But the language has other big, hairy constructs that don't warrant full parsing; I only want the initial identifier on each line to do the job I have in mind.

I would like to type something like this pattern:

identifier _ "OBJECT IDENTIFIER" .*? "}"

where the ".*?" is non-greedy: it consumes only up to the first occurrence of the terminal that follows it. Could this be on the list for PEG.js? Many thanks.
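
For readers unfamiliar with the regex feature being requested, here is a minimal sketch of how greedy and non-greedy repetition differ in a plain JavaScript regular expression (this is not PEG.js). The sample string is made up for illustration: two definitions are placed on one line so the difference is visible.

```javascript
// Illustrative input: two definitions on a single line (an assumption, not from the MIB file).
const text =
  "system OBJECT IDENTIFIER ::= { mib-2 1 } interfaces OBJECT IDENTIFIER ::= { mib-2 2 }";

// Greedy: .* runs as far as it can, so the match ends at the LAST "}".
const greedy = text.match(/OBJECT IDENTIFIER.*\}/)[0];

// Non-greedy: .*? stops as early as it can, so the match ends at the FIRST "}".
const lazy = text.match(/OBJECT IDENTIFIER.*?\}/)[0];

console.log(greedy);
console.log(lazy);
```

The non-greedy match covers only the first definition, which is the behaviour the request describes.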

@richb-hanover
Author

Update: This could be satisfied by a repetition count (which is a generalization of my initial thought) as suggested in Google Groups at: http://groups.google.com/group/pegjs/browse_thread/thread/2bea15581be45187

@dmajda
Contributor

dmajda commented Oct 7, 2011

In PEG formalism, you can easily match until a terminator by using a predicate together with the . metacharacter. Something like:

"OBJECT IDENTIFIER" (!"}" .)* "}"

Is that sufficient for you?
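
To make the mechanics of (!"}" .)* "}" concrete, here is a rough hand-written JavaScript equivalent. It is only a sketch of the semantics, not code generated by PEG.js; matchUntil and the sample line are hypothetical names and data introduced for illustration.

```javascript
// Hand-written model of the PEG expression  (!T .)* T
// matchUntil is a hypothetical helper, not part of PEG.js.
function matchUntil(input, pos, terminator) {
  let end = pos;
  // (!T .)* : as long as the terminator does NOT start here, consume one character.
  while (end < input.length && !input.startsWith(terminator, end)) {
    end += 1;
  }
  // T : the terminator itself must follow, otherwise the whole expression fails.
  if (!input.startsWith(terminator, end)) {
    return null;
  }
  return { skipped: input.slice(pos, end), next: end + terminator.length };
}

// Illustrative input with a second "}" later on the line.
const line = "OBJECT IDENTIFIER ::= { mib-2 1 } trailing }";
const result = matchUntil(line, "OBJECT IDENTIFIER".length, "}");
console.log(result);
```

Because the negative predicate is checked before every single character, the loop cannot run past the first "}", which gives the same effect as a non-greedy repetition without any backtracking.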

@richb-hanover
Author

Yes, that works perfectly. Thanks!

@rymohr

rymohr commented Jan 9, 2013

@dmajda What's the recommended practice for stripping out the empty char returned by the !"}" expression?

For example:

rule
   = chars:(!"-suffix" .)+ "-suffix"

"foo-suffix" => [[ '', 'f' ], ['', 'o' ], ['', 'o' ]]  // result
"foo-suffix" => ['f', 'o', 'o' ] // desired result

I was able to achieve this by breaking !"-suffix" . into its own rule that just returns the . result, but I'm curious if there's a better way.

@curvedmark

#66 will fix this.

In the meantime, I think you can use:

rule
    = chars:(!"-suffix" c:. {return c})+ "-suffix"
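
The difference between the two result shapes can be modelled in plain JavaScript. This is a sketch under assumptions, not PEG.js output: matchBeforeSuffix is a hypothetical helper, and the empty string used for the predicate's result simply follows what is reported earlier in this thread (other PEG.js versions may return a different value).

```javascript
// Hypothetical hand-written model of  chars:(!"-suffix" .)+ "-suffix"
// `action` plays the role of the optional { ... } block on the inner sequence.
function matchBeforeSuffix(input, suffix, action) {
  const chars = [];
  let pos = 0;
  while (pos < input.length && !input.startsWith(suffix, pos)) {
    // The sequence  !suffix .  has two elements, so it yields two results:
    // one for the predicate ("" as reported in the thread) and one for the character.
    const pair = ["", input[pos]];
    // With an action, only the action's return value is kept for this iteration.
    chars.push(action ? action(pair[1]) : pair);
    pos += 1;
  }
  // "+" needs at least one iteration, and the suffix must follow.
  if (chars.length === 0 || !input.startsWith(suffix, pos)) {
    return null;
  }
  return chars;
}

const withoutAction = matchBeforeSuffix("foo-suffix", "-suffix");
const withAction = matchBeforeSuffix("foo-suffix", "-suffix", (c) => c);
console.log(withoutAction, withAction);
```

Without an action each iteration contributes a two-element array; with the labelled c and {return c}, each iteration contributes just the character, so withAction.join("") gives the matched text as a string.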

@dmajda
Contributor

dmajda commented Jan 10, 2013

@islandr Please don't use issues as a place to ask questions about PEG.js usage, especially when they are closed and especially when you are asking something that other people besides me can help you with. The proper channel is the Google Group.

@rymohr

rymohr commented Jan 10, 2013

Sorry David. Thought this would have been a good place since it was directly related to the example you'd given.

