Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

potentially incorrect conflict; suggest inline? #728

Open
nikomatsakis opened this issue Mar 25, 2023 · 2 comments
Open

potentially incorrect conflict; suggest inline? #728

nikomatsakis opened this issue Mar 25, 2023 · 2 comments
Milestone

Comments

@nikomatsakis
Copy link
Collaborator

This grammar that I was playing with gets a ton of errors. They all go away if you uncomment the #[inline] on Generics; but I am not convinced that should be required for the grammar to be LR(1). We should investigate if this is a bug in the lane-table algorithm or legit; if legit, there really ought to be a way to suggest #[inline] as a way to resolve the problem.

Also, the error messages here are a mix of "good and bad", it'd be nice to investigate why the nice errors aren't triggering.

@nikomatsakis nikomatsakis added this to the 1.0 milestone Mar 25, 2023
@dburgener
Copy link
Contributor

I think this conflict is correct, with inlining being a good fix. Digging through the error messages, I found good ones like:

in.txt:31:5: 31:16: Local ambiguity detected

  The problem arises after having observed the following symbols in the input:
    Flags
  At that point, if the next token is a `"short"`, then the parser can proceed
  in two different ways.

  First, the parser could execute the production at in.txt:31:5: 31:16, which
  would consume the top 0 token(s) from the stack and produce a `Generics`. This
  might then yield a parse tree like
    Flags ╷          ╷ Type Id "(" Comma<Type> ")" Throws ";" Descriptor
    │     └─Generics─┘                                                 │
    └─Method───────────────────────────────────────────────────────────┘

  Alternatively, the parser could shift the `"short"` token and later use it to
  construct a `ScalarType`. This might then yield a parse tree like
    Flags "short"      ╷ Id ";"
    │     ├─ScalarType─┤      │
    │     └─Type───────┘      │
    └─Field───────────────────┘

  See the LALRPOP manual for advice on making your grammar LR(1).

That points at the problem being that Method and Field are ambiguous. After a Flags, if we see a Type, we don't know whether or not to insert a () Generics in the stack, since we won't resolve everything else until several tokens later. Inlining addresses this by allowing the parser to put off the decision and track each inlined rule as separate options.

This is essentially exactly the situation described in this test.

So I think the real issue here is if there is error message improvement possible.

there really ought to be a way to suggest #[inline] as a way to resolve the problem.

I'd love to look into this and submit a PR if someone could point me in the right direction to get started.

it'd be nice to investigate why the nice errors aren't triggering.

Is it that they aren't triggering, or that they're lost in the noise?

@yannham
Copy link
Contributor

yannham commented May 15, 2024

I have experienced multiple conflicts recently that seemed to be artificial. Adding an inline, as suggested, did the trick each time, but I'm still clueless as to why there was conflict to begin with. One of the most recent one is that I split a rule Ident used throughout the Nickel grammar in two different disjoint rules: RestrictedIdent and ContextualKeyword. LALRPOP reported a conflict and the example didn't make sense. I'll try to show a repro once I get the version with the #[inline] merged.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants