Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Indentation error lost in alternative #527

Open
Lev135 opened this issue May 29, 2023 · 3 comments
Open

Indentation error lost in alternative #527

Lev135 opened this issue May 29, 2023 · 3 comments

Comments

@Lev135
Copy link
Contributor

Lev135 commented May 29, 2023

Suppose we have the following parsers:

pInd, pB :: Parsec Void String String
pInd = indentGuard (hidden space) EQ (pos1 <> pos1) *> string "a"
pB   = string "b"                                   *> string "a"

If we try to parse the string "a" with them we'll get the following errors (which seems to be good enough for me):

ghci> parseTest (pInd <* hidden eof) "a"
1:1:
  |
1 | a
  | ^
incorrect indentation (got 1, should be equal to 2)
ghci> parseTest (pB <* hidden eof) "a"  
1:1:
  |
1 | a
  | ^
unexpected 'a'
expecting 'b'

However, if we'll try them as optional, the second one preserve a useful error message:

ghci> parseTest (optional (try pB) <* hidden eof) "a"  
1:1:
  |
1 | a
  | ^
unexpected 'a'
expecting 'b'

while the first loses it at all:

ghci> parseTest (optional (try pInd) <* hidden eof) "a"  
1:1:
  |
1 | a
  | ^
unexpected 'a'

Why megaparsec behaves so? Are there places where preserving indentation error is undesirable or maybe I'm using wrong combinators here?

@mrkkrp
Copy link
Owner

mrkkrp commented May 30, 2023

optional (try pInd) in the second example succeeds because it is the same as:

(Just <$> pInd) <|> pure Nothing

Since hints are only for non-fancy errors (it is a collection of ErrorItems) there is no way the error about indentation could be persisted in this case.

@Lev135
Copy link
Contributor Author

Lev135 commented May 30, 2023

Since hints are only for non-fancy errors (it is a collection of ErrorItems) there is no way the error about indentation could be persisted in this case.

However, why do we need such behavior? Maybe for really user-defined errors there is no visible way to translate them into hints, but for indentation errors in particular it doesn't seems to be complicated to add special hints. These are just some extra Int values in the parser state, so I hope it shouldn't affect performance in any visible way. Maybe the example with optional doesn't worth to add them, but in case of multiline indented blocks (which is much more common case) the same problem exists.

Maybe it's not so trivial to compose indentation hints. For example, if we have nested indented blocks:

1A
    1B
        1C
        2C
         X

what's the correct indentation of X? Of course, we can't say this confidently, but we could suppose, that it should belong to Cs column. However, with the current behavior (assuming that all Bs and Cs are arbitrary many and we are using some/many to parse them) we get "Unexpected X, expecting end of input" or (if we're checking the next element two be nonIndented) "Incorrect indentation, got 10, should be 1" while I think that much more sensible would be "Incorrect indentation, got 10, should be 9 (or 5, or 1)" or something like this.

@mrkkrp
Copy link
Owner

mrkkrp commented Jun 2, 2023

There is no strong conceptual reason why this is not supported. I think it is just not many people use the indentation features much and there was not enough demand for the library to evolve in that direction. There is also the question of balance between features vs complexity of the implementation.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants