Allow returning match result of a specific expression in a rule without an action #427

Closed
alanmimms opened this issue May 23, 2016 · 10 comments

@alanmimms
Contributor

It's very common to need to return a value from one of the non-terminals in a rule or inside of a parenthesized sub-rule. For example:

varDecl = type:type id:ID init:( EQ e:expr {return e} )?
                { return scopedAST('VARDECL', {type, id, init}) }

In this case, I needed the expr labeled as e inside the init parenthesis level for an optional phrase in the language. I didn't need the "noise word" EQ as part of the returned value.

If the PEGjs language had a symbol to mark expressions like the expr above, so that they, and only they, are the value returned from a grammar rule or sub-rule, this case would be simpler.

To rewrite my example above:

varDecl = type:type id:ID init:( EQ ^expr )?
                { return scopedAST('VARDECL', {type, id, init}) }

Note the use of the ^ to mark the expr value inside the init parenthesized optional phrase sub-rule to designate what is bound to init. This simplifies many situations both with and without the parenthesized sub-rule shown in this example.
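The intended semantics can be sketched outside PEG.js with a few parser combinators in plain JavaScript. This is only an illustration under stated assumptions: the `^` marker does not exist in PEG.js, the helpers `lit`, `regex`, `opt`, `pick`, and `seq` are hypothetical names invented here, and `EQ` and `expr` are assumed to match `=` and a run of digits.

```javascript
// Each parser takes (input, pos) and returns { value, pos } on success or null on failure.
const lit = (s) => (input, pos) =>
  input.startsWith(s, pos) ? { value: s, pos: pos + s.length } : null;

const regex = (re) => (input, pos) => {
  const sticky = new RegExp(re.source, "y"); // sticky flag: match exactly at pos
  sticky.lastIndex = pos;
  const m = sticky.exec(input);
  return m ? { value: m[0], pos: pos + m[0].length } : null;
};

// Optional: yields null without consuming input when the inner parser fails.
const opt = (p) => (input, pos) => p(input, pos) || { value: null, pos };

// Hypothetical stand-in for the proposed `^` prefix.
const pick = (p) => Object.assign((input, pos) => p(input, pos), { picked: true });

// Sequence: with no picked element, return every value as an array;
// with exactly one picked element, return just that value.
const seq = (...parsers) => (input, pos) => {
  const all = [];
  const picked = [];
  for (const p of parsers) {
    const r = p(input, pos);
    if (r === null) return null;
    all.push(r.value);
    if (p.picked) picked.push(r.value);
    pos = r.pos;
  }
  const value = picked.length === 0 ? all : picked.length === 1 ? picked[0] : picked;
  return { value, pos };
};

// init:( EQ ^expr )?
const EQ = lit("=");
const expr = regex(/[0-9]+/);
const init = opt(seq(EQ, pick(expr)));

console.log(init("=42", 0).value); // "42"
console.log(init("", 0).value);    // null
```

The value bound to `init` is the picked `expr` result alone, with no inner action needed to drop `EQ`.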

Thanks for making such a wonderfully simple, elegant, and powerful tool. I love PEGjs! 😄

@opatut

opatut commented May 25, 2016

Love that idea! ^ is very intuitive, too.

This could work on non-nested rules too:

WhiteSpacedIdentifier = WhiteSpace? identifier:Identifier WhiteSpace {return identifier;}
// becomes
WhiteSpacedIdentifier = WhiteSpace? ^Identifier WhiteSpace?

@grrrwaaa

grrrwaaa commented Jul 5, 2016

Very readable! Presumably, use of multiple ^ would also work, such that:

a = ^b  c  ^d  e

Would return [b, d]? Seems to make sense.

Likewise, seems to make sense that if mixed with named captures, the ^ rules are ignored, so

x = a ^b foo:c { return foo; }

Would return only c.

@alanmimms
Contributor Author

Oh, that multiples idea is excellent. Mixing with named captures should be an error.
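A rough sketch of the two rules discussed so far (several `^` markers produce an array; mixing `^` with labels is rejected) could look like the following check over a sequence description. The element shape (`rule`, `label`, `picked`) and the function name `resultPlan` are assumptions made for illustration and are not part of PEG.js.

```javascript
// Each element describes one item of a sequence: the rule it references,
// an optional label, and whether it carries the proposed `^` marker.
function resultPlan(elements) {
  const picked = elements.filter((e) => e.picked);
  const labeled = elements.filter((e) => e.label !== undefined);

  // Mixing `^` with named captures is treated as a grammar error.
  if (picked.length > 0 && labeled.length > 0) {
    throw new Error("sequence mixes ^ markers with labels");
  }
  if (picked.length === 0) return { kind: "all" };
  if (picked.length === 1) return { kind: "single", rule: picked[0].rule };
  return { kind: "array", rules: picked.map((e) => e.rule) };
}

// a = ^b c ^d e
const planA = resultPlan([
  { rule: "b", picked: true },
  { rule: "c" },
  { rule: "d", picked: true },
  { rule: "e" },
]);
console.log(planA); // { kind: 'array', rules: [ 'b', 'd' ] }

// x = a ^b foo:c
let mixError = null;
try {
  resultPlan([
    { rule: "a" },
    { rule: "b", picked: true },
    { rule: "c", label: "foo" },
  ]);
} catch (err) {
  mixError = err.message;
}
console.log(mixError); // "sequence mixes ^ markers with labels"
```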


@dmajda changed the title from "Simpler mechanism to return value from inside ( ) in grammar rule actions" to "Allow returning match result of a specific expression in a rule without an action" on Jul 31, 2016
@dmajda added the feature label on Jul 31, 2016
@dmajda added this to the post-1.0.0 milestone on Jul 31, 2016
@dmajda
Contributor

dmajda commented Jul 31, 2016

I totally agree the described pattern is quite common. Having a way to express it without an action makes sense.

What I’m not so sure about is the proposed solution (the ^ operator). Using a special character whose meaning is not immediately obvious is always problematic and adds to the learning curve. It’s also possible the character would be better used for some other purpose. Last but not least, I don’t much like the idea of putting things that don’t directly influence parsing into expressions. One can argue there is already one instance of this, the $ operator, and I’d agree. But I’m not sure whether the addition of $ wasn’t a (small) mistake. If so, I’d like to avoid making it again.

I’ll think about this more deeply after 1.0.0.

@opatut

opatut commented Aug 1, 2016

Some more food for thought: since ^ and labeled expressions kind of collide (@grrrwaaa suggests ignoring the ^), how about marking the ignored expressions instead of marking the result, for example (syntax suggestion!) by providing an empty label:

WhiteSpacedIdentifier = WhiteSpace? identifier:Identifier WhiteSpace {return identifier;}
// becomes
WhiteSpacedIdentifier = :WhiteSpace? Identifier :WhiteSpace?

There, no new syntax (we have : already), only a bit of extension on the semantics:

  • allow empty labels (call these "anonymous" expressions?)
  • if only one non-anonymous capture exists, do not generate an array of expression matches, instead return the only match
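The two rules in the list above can be sketched as a small function over the values matched by a sequence. The record shape and the `anonymous` flag are assumptions for illustration; they stand in for an element written with an empty label such as `:WhiteSpace?`.

```javascript
// Drop values from anonymous (empty-label) elements; if exactly one value
// remains, return it directly instead of wrapping it in an array.
function sequenceValue(matches) {
  const kept = matches.filter((m) => !m.anonymous).map((m) => m.value);
  return kept.length === 1 ? kept[0] : kept;
}

// WhiteSpacedIdentifier = :WhiteSpace? Identifier :WhiteSpace?   matched against " foo "
const identifierValue = sequenceValue([
  { value: " ", anonymous: true },
  { value: "foo" },
  { value: " ", anonymous: true },
]);
console.log(identifierValue); // "foo"

// Without any empty labels the sequence still yields every value.
const pairValue = sequenceValue([{ value: "a" }, { value: "b" }]);
console.log(pairValue); // [ 'a', 'b' ]
```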

@Mingun
Contributor

Mingun commented Aug 1, 2016

In that case, it would be more consistent to use empty "labels" to mark the expressions that need to be returned as the result. This, by the way, does not break the existing semantics: the label exists, but it is unnamed; since labels are introduced to give access to a result, it is quite logical that unnamed labels automatically become the result. Mixing automatic and named labels in the same sequence should be forbidden. If only one automatic label exists, the single result should be returned rather than an array with one element, since that behavior is what is usually wanted.

@nedzadarek

@Mingun

Why not just return any label?
start = "{" :expr "}" // return expr
start = "{" label:expr "}" // return label
I think it makes sense that if you "label" something then you want to do something with it (e.g. return it).

On the other hand, why should rules like start = ex:expr :expr raise an error?
Maybe it should do something similar to the arguments object in JavaScript functions? For example, start = ex:expr :expr should return [ex, expr]. When you have an action, both the labeled values and an arguments variable should be available (start = ex:expr :expr { return [ex, arguments[0], ex] } )

@alanmimms I like this idea. We don't have to create a name (a variable/label) just to return a simple value.
I think an unnamed label (:expr) would be better than ^expr

@Mingun
Contributor

Mingun commented Jan 31, 2017

Why not just return any label?

@nedzadarek because if you give a name to an expression, it is more likely that you want to use it in some non-trivial expression. At least, the name is important to you, otherwise you wouldn't give it, right? Also, mixing named and unnamed labels is more likely a mistake than a conscious decision, so it is safer to forbid it. If you give one name, why not provide another?

Unfortunately, automatic labels in the form proposed by @opatut cannot be implemented, because they make the grammar ambiguous. A basic example:

start = a :b;// is `a` a rule reference or a label?
a = .;
b = .;
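To make the ambiguity concrete, here is a small sketch that enumerates every way a token list can be split into sequence elements once empty labels are allowed. The three element forms and the `readings` function are assumptions made for this illustration; this is not how the PEG.js grammar parser is written.

```javascript
// Tokens of a sequence body: identifiers are plain strings, ":" is the label separator.
// Element forms assumed for this sketch:
//   name ":" name   labeled expression
//   ":" name        expression with an empty label (the proposal)
//   name            plain rule reference
const isName = (t) => t !== undefined && t !== ":";

// Return every way to read tokens[i..] as a list of elements.
function readings(tokens, i = 0) {
  if (i === tokens.length) return [[]];
  const out = [];
  const rest = (elem, next) => {
    for (const tail of readings(tokens, next)) out.push([elem, ...tail]);
  };
  if (isName(tokens[i]) && tokens[i + 1] === ":" && isName(tokens[i + 2])) {
    rest(`label ${tokens[i]} on ${tokens[i + 2]}`, i + 3);
  }
  if (tokens[i] === ":" && isName(tokens[i + 1])) {
    rest(`empty label on ${tokens[i + 1]}`, i + 2);
  }
  if (isName(tokens[i])) {
    rest(`reference ${tokens[i]}`, i + 1);
  }
  return out;
}

// start = a :b
const found = readings(["a", ":", "b"]);
console.log(found);
// [ [ 'label a on b' ], [ 'reference a', 'empty label on b' ] ]
```

Two readings come back for `a :b`, which is the conflict described above: whitespace is not significant, so `a :b` and `a:b` look the same to the grammar parser.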

So we need to select another character for this purpose. At the moment the choice is among: ~, (backslash), @, #, %, ^, -, |, \ and ,.


Another solution is to introduce some pseudo-actions: shortcuts for creating simple return functions. For example, {=>[]} could mean "collect the labeled results from the sequence and return them in an array", and {=>{}} the same, but returning an object whose keys are the label names. Implementing this behavior doesn't require extending the grammar and can be done entirely by plug-ins. I would even say that a plug-in implementation is preferable:

start1 = a:'a' b c d:. {=>[]};// returns ['a', <d value>]
start2 = a:'a' b c d:. {=>{}};// returns { a: 'a', d: <d value> }
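What the two pseudo-actions would compute can be sketched in plain JavaScript. The record shape (`label`, `value`) and the names `collectArray` and `collectObject` are hypothetical; the sample input assumes rules `b = 'b'` and `c = 'c'`, which the original example does not define.

```javascript
// {=>[]}: collect the values of labeled elements, in order, into an array.
const collectArray = (matches) =>
  matches.filter((m) => m.label !== undefined).map((m) => m.value);

// {=>{}}: collect the values of labeled elements into an object keyed by label.
const collectObject = (matches) =>
  Object.fromEntries(
    matches.filter((m) => m.label !== undefined).map((m) => [m.label, m.value])
  );

// start = a:'a' b c d:.   matched against "abcx"
const matched = [
  { label: "a", value: "a" },
  { value: "b" },
  { value: "c" },
  { label: "d", value: "x" },
];

console.log(collectArray(matched));  // [ 'a', 'x' ]
console.log(collectObject(matched)); // { a: 'a', d: 'x' }
```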

@nedzadarek

nedzadarek commented Feb 1, 2017

@Mingun

because if you give a name to an expression, it is more likely that you want to use it in some non-trivial expression. At least, the name is important to you, otherwise you wouldn't give it, right?

Yes, the name is important => I want to use it => I want to return it.
What's the problem with non-trivial expressions?

Unfortunately, automatic labels in the form proposed by @opatut cannot be implemented, because they make the grammar ambiguous. A basic example:

Yes.
I guess ::expression is confusing too? @dmajda

@futagoza
Member

futagoza commented Jan 22, 2018

Closed as duplicate of #235

Edit: Added note to OP's comment on #235 that references this issue
