Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for picking fields with @ #38

Closed
hildjj opened this issue Apr 15, 2021 · 27 comments
Closed

Add support for picking fields with @ #38

hildjj opened this issue Apr 15, 2021 · 27 comments
Labels
enhancement New feature or request

Comments

@hildjj
Copy link
Contributor

hildjj commented Apr 15, 2021

as discussed at pegjs/pegjs#11,

foo = @bar _ @baz
bar = $"bar"i
baz = $"baz"i
_ = " "*
parse('barbaz') // returns [ 'bar', 'baz' ]

I use this syntax frequently when it's available.

@hildjj hildjj added the enhancement New feature or request label Apr 15, 2021
@phpnode
Copy link
Contributor

phpnode commented Apr 15, 2021

Like the idea in principle. Will this syntax clash if we decide to use @decorators or @annotations for rules?

@hildjj
Copy link
Contributor Author

hildjj commented Apr 15, 2021

Oh, great question. Let's mock those in when we first implement this, and make sure the grammar doesn't clash. In peg.js, it's pretty straightforward, and applies only on the RHS of rules: https://github.com/pegjs/pegjs/blob/b7b87ea8aeeaa1caf096e2da99fd95a971890ca1/src/parser.pegjs#L148

@Mingun
Copy link
Member

Mingun commented Apr 16, 2021

I personally prefer to reserve @ sign for the picking operator, because it is very short and take attraction to itself, and for annotations/decorators/attributes use another syntax, for example, Rust-like: #[attribute]. Symbol # now is unused in the grammar so no clash.

Having @ both as picking operator and as annotation operator definitely will create clash on the grammar until semicolons for delimiting rules are optional

@hildjj
Copy link
Contributor Author

hildjj commented Apr 16, 2021

Are there any other ASCII-7 characters available? I'd like to make sure we're not misaligned with https://github.com/tc39/proposal-decorators as it goes to stage 3, if possible.

@phpnode
Copy link
Contributor

phpnode commented Apr 16, 2021

# is used to indicate private in JS, maybe that's a better fit? it would imply inverting the syntax though:

foo = bar #_ baz
bar = $"bar"i
baz = $"baz"i
_ = " "*

@TymurGubayev
Copy link

How about "any rule starting with an underscore".

Since this could potentially break existing code, hide this behind an opt-in.

Next step is auto-generating such rules, i.e.

foo = bar _ws baz
bar = $"bar"i
baz = $"baz"i
ws = " "*

where _ws isn't explicitly defined.

@phpnode
Copy link
Contributor

phpnode commented Apr 16, 2021

@TymurGubayev syntax changes should be purely additive imo. It's really important that we don't break the grammars that exist in the wild. _ is a valid identifier character so we can't use it for this. If we do decide that we want to introduce breaking changes then we should first add a syntax directive (rather than a compiler flag) in the same way protobuf does it.

@Mingun
Copy link
Member

Mingun commented Apr 16, 2021

Another possible way of evolution -- change how the rule rule itself is defined and thus make it impossible to create clash.

At least two options can be suggested:

  • change the rule definition to something like this (the https://github.com/kevinmehall/rust-peg way):
    rule rule_name() = ...
    The empty brackets are not allowed in the current grammar, so no clash.
  • the other way is just make the optional semicolon in the end of the rule definition required. Personally I think that is easest and right way to solve this problem, because modern JS rules also requires semicolons to make syntax consistent.

@hildjj
Copy link
Contributor Author

hildjj commented Apr 16, 2021

* the other way is just make the optional semicolon in the end of the rule definition required. 
Personally I think that is easest and right way to solve this problem, because 
[modern JS rules](https://eslint.org/docs/rules/semi#require-or-disallow-semicolons-instead-of-asi-semi) 
also requires semicolons to make syntax consistent.

I'm -1 on this. A more nuanced reading of the link above might be "There are problems with ASI no matter which way you choose. Make sure to use lint rules to protect yourself either way." I really don't want us to pick a side on this, because it will alienate a bunch of people either way we pick for not enough benefit.

@StoneCypher
Copy link
Contributor

  • the other way is just make the optional semicolon in the end of the rule definition required.

This has been repeatedly suggested throughout the years. This breaks javascript and must not occur.

@StoneCypher
Copy link
Contributor

because modern JS rules also requires semicolons to make syntax consistent.

No they don't. Your preferred eslint config is not "modern JS rules."

@StoneCypher
Copy link
Contributor

@hildjj - In the linked issue, you suggested that this could be used to ignore productions. At this time I do not yet understand how.

The ability to ignore productions would be huge for me. I maintain a compiler behind peg mostly just for this.

Would you show me how this syntax enables that, or did I misunderstand?

@StoneCypher
Copy link
Contributor

Another possible way of evolution -- change how the rule rule itself is defined and thus make it impossible to create clash.

If there are breaking changes, this is a meaningless anti-fork, and I will just resurrect my fork.

No, we're not going to throw away the entire community on a whim to get a minor feature that can be had other ways. Don't be irresponsible.

@hildjj
Copy link
Contributor Author

hildjj commented Apr 18, 2021

@hildjj - In the linked issue, you suggested that this could be used to ignore productions. At this time I do not yet understand how.

It would allow it by solving a slightly different problem: marking the productions you want to save. If the syntax is small enough, it seems like an approach you might like just as well as marking the ones you want to ignore.

@hildjj
Copy link
Contributor Author

hildjj commented Apr 18, 2021

No, we're not going to throw away the entire community on a whim to get a minor feature that can be had other ways. Don't be irresponsible.

Let's all assume good intent from everyone involved, please. We'll make decisions together, but ideas that we don't take may kick off new ideas that we do like.

@StoneCypher
Copy link
Contributor

We'll make decisions together, but ideas that we don't take may kick off new ideas that we do like.

Any significant breaking change means this library is useless to me, and the original must be re-saved. This will almost certainly be true of every other user of the old library.

I'm letting you know, right now, that people who clung to the old library for a year and a half after its old maintainer murdered it through negligence are doing so because they don't have a choice.

If you make breaking changes, you're creating the exact same extermination choice.

It's important to get this through everyone's head, early, or else you'll lose the whole thing over whimsy.

The way to fork a dying library to save the old community is well understood. Don't re-murder it.

@StoneCypher
Copy link
Contributor

It would allow it by solving a slightly different problem: marking the productions you want to save. If the syntax is small enough, it seems like an approach you might like just as well as marking the ones you want to ignore.

Well, okay, that's fine, maybe even useful

But please understand that it's critically different than the issue you tagged it from, and doesn't do anything at all for people who need to ignore productions

@hildjj
Copy link
Contributor Author

hildjj commented Apr 18, 2021

I'm committed to backward compatibility. I'm committed to doing good semantic versioning to ensure that people can make good decisions about when to upgrade. I'm committed to being a steward for the community's consensus (see my work at the IETF for examples of how much I value consensus over my own opinions).

Over time, hopefully we can build enough trust in those commitments.

@StoneCypher
Copy link
Contributor

I'm committed to backward compatibility

❤️

 

I'm committed to doing good semantic versioning to ensure that people can make good decisions about when to upgrade.

👍

 

my work at the IETF

🦾

@Mingun
Copy link
Member

Mingun commented Apr 18, 2021

This has been repeatedly suggested throughout the years. This breaks javascript and must not occur.

I'm confusing: how is slight change of the PEG syntax can break (generated) js? Furthermore, requiring semicolons doesn't really introduce new syntax, but reuses existing one. I can't imagine any situation where you won't be able to switch from the optional semicolons to the required ones. That is why I think that this is the most safe way to resolve ambiguity (except using the other symbol, of course).

Your preferred eslint config is not "modern JS rules."

That is not my preferred config, honestly, I'm even don't use JS regularly. It is just ESLint default. And I think that people had some good reasons to make defaulted exactly that variant. So I think, that choice could be called "modern rules".

@hildjj
Copy link
Contributor Author

hildjj commented Apr 18, 2021

Let's not get bogged down in any of the syntax wars here, please. Any valid JS should be able to go into an action block.

@Mingun
Copy link
Member

Mingun commented Apr 18, 2021

Any valid JS should be able to go into an action block.

Of course, of course, that is holy cow... My suggestion doesn't touch any code blocks at all, it is about semicolons in the grammar, not in the code blocks. Code blocks are always were opaque things and that remains unchanged.

@hildjj
Copy link
Contributor Author

hildjj commented Apr 18, 2021

Nod, that's a helpful distinction.

@hildjj
Copy link
Contributor Author

hildjj commented Apr 21, 2021

PR #89 is pretty close at this point. Now it's time to decide if we like it enough to land it.

I do like it; I use @ enough that i've gone out of my way in a couple of non-production projects to rely on odd versions of pegjs in the past. The implementation is adequate (if a little confusing because of the way the stack manipulation has to happen). Tests gave adequate coverage. I've prototyped ahead the few other places I might like us to use @ and am not concerned about overlaps from a grammar perspective.

Note that the example I added to the end of the documentation is:

list = head:word tail:(_ "," _ @word)* { return [head].concat(tail); }
word = $[a-z]i+
_ = [ \t]*

which shows off how @ can pluck from parens, which is its killer feature, imo.

@Mingun
Copy link
Member

Mingun commented Apr 22, 2021

I think, that feature is ready to merging, clash with the proposed annotations syntax actually not so big and I think is easy solvable (if not by requiring semicolons, then by using, for example @@annotation, or rust-style annotations)

@hildjj hildjj closed this as completed Apr 22, 2021
@StoneCypher
Copy link
Contributor

Needs a version bump for a release

@hildjj
Copy link
Contributor Author

hildjj commented Apr 22, 2021

Yep, I'm going to do that on #101 since this isn't the only feature.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

No branches or pull requests

5 participants