Indentation error lost in alternative #527

Lev135 · 2023-05-29T20:19:07Z

Suppose we have the following parsers:

pInd, pB :: Parsec Void String String
pInd = indentGuard (hidden space) EQ (pos1 <> pos1) *> string "a"
pB   = string "b"                                   *> string "a"

If we try to parse the string "a" with them, we get the following errors (which seem good enough to me):

ghci> parseTest (pInd <* hidden eof) "a"
1:1:
  |
1 | a
  | ^
incorrect indentation (got 1, should be equal to 2)
ghci> parseTest (pB <* hidden eof) "a"  
1:1:
  |
1 | a
  | ^
unexpected 'a'
expecting 'b'

However, if we make them optional, the second one preserves a useful error message:

ghci> parseTest (optional (try pB) <* hidden eof) "a"  
1:1:
  |
1 | a
  | ^
unexpected 'a'
expecting 'b'

while the first loses it entirely:

ghci> parseTest (optional (try pInd) <* hidden eof) "a"  
1:1:
  |
1 | a
  | ^
unexpected 'a'

Why does megaparsec behave this way? Are there cases where preserving the indentation error is undesirable, or am I using the wrong combinators here?


mrkkrp · 2023-05-30T15:17:35Z

optional (try pInd) in the second example succeeds because it is the same as:

(Just <$> try pInd) <|> pure Nothing

Since hints are only for non-fancy errors (it is a collection of ErrorItems) there is no way the error about indentation could be persisted in this case.
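
To make the mechanism concrete, here is a deliberately simplified model of how hints flow through an alternative. It is a sketch, not megaparsec's actual code: the types `Err`, the list-based hints, and the helper `afterOptional` are assumptions made for illustration, and `toHints`/`withHints` only imitate the role that the similarly named internal functions play as I understand them.

```haskell
import Data.List (intercalate, nub)

-- Simplified stand-in for a parse error (an assumption, not the real type):
-- a "trivial" error carries unexpected/expected items, a "fancy" one
-- carries only a custom message.
data Err
  = Trivial Int (Maybe String) [String] -- offset, unexpected item, expected items
  | Fancy Int String                    -- offset, custom message
  deriving (Eq, Show)

-- Hints are the expected items remembered from a branch that failed
-- at the current offset. A fancy error contributes no hints at all.
toHints :: Int -> Err -> [String]
toHints pos (Trivial off _ expected)
  | pos == off = expected
toHints _ _ = []

-- Merge remembered hints into a later trivial error.
withHints :: [String] -> Err -> Err
withHints hs (Trivial off unexpected expected) =
  Trivial off unexpected (nub (expected ++ hs))
withHints _ err = err

-- Hypothetical model of `optional (try p) <* hidden eof` when p fails at
-- offset 0: the alternative succeeds with Nothing, keeps the hints from
-- p's error, and the later eof failure is decorated with those hints.
afterOptional :: Err -> Err -> Err
afterOptional branchErr laterErr = withHints (toHints 0 branchErr) laterErr

render :: Err -> String
render (Fancy _ msg) = msg
render (Trivial _ unexpected expected) =
  intercalate "\n" $
    maybe [] (\u -> ["unexpected " ++ u]) unexpected
      ++ ["expecting " ++ intercalate ", " expected | not (null expected)]

-- The three errors from the thread, written in the simplified model.
errB, errInd, errEof :: Err
errB   = Trivial 0 (Just "'a'") ["'b'"]
errInd = Fancy 0 "incorrect indentation (got 1, should be equal to 2)"
errEof = Trivial 0 (Just "'a'") [] -- `hidden eof` failing on 'a'

main :: IO ()
main = do
  putStrLn (render (afterOptional errB errEof))
  putStrLn "--"
  putStrLn (render (afterOptional errInd errEof))
```

In this model the pB branch leaves the hint 'b' behind, so the final message still says what was expected, while the pInd branch leaves nothing behind because its error is fancy, which matches the two outputs reported above.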

Lev135 · 2023-05-30T16:59:26Z

> Since hints are only for non-fancy errors (it is a collection of ErrorItems) there is no way the error about indentation could be persisted in this case.

However, why do we need such behavior? For truly user-defined errors there may be no obvious way to translate them into hints, but for indentation errors in particular it doesn't seem complicated to add special hints. These would be just a few extra Int values in the parser state, so I hope it wouldn't affect performance in any visible way. Maybe the example with optional isn't worth adding them for, but the same problem exists for multiline indented blocks, which are a much more common case.

Maybe it's not so trivial to compose indentation hints. For example, if we have nested indented blocks:

what's the correct indentation of X? Of course, we can't say this with confidence, but we could suppose that it should belong to the Cs' column. However, with the current behavior (assuming that there can be arbitrarily many Bs and Cs and that we are using some/many to parse them) we get "Unexpected X, expecting end of input" or (if we check that the next element is nonIndented) "Incorrect indentation, got 10, should be 1", while I think that "Incorrect indentation, got 10, should be 9 (or 5, or 1)" or something like it would be much more sensible.
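
A rough sketch of how such composable indentation hints might look, independent of megaparsec. Everything here is hypothetical: `IndentHints`, its `Semigroup` instance, and `renderIndentError` are names invented for illustration, and the levels 1, 5 and 9 are the ones from the message suggested above.

```haskell
import Data.List (nub, sortBy)
import Data.Ord (Down (..), comparing)

-- Hypothetical hint type: the indentation levels (columns) that the
-- enclosing blocks would have accepted at the point of failure.
newtype IndentHints = IndentHints [Int]
  deriving (Eq, Show)

-- Hints from nested blocks compose by collecting all candidate levels.
instance Semigroup IndentHints where
  IndentHints a <> IndentHints b = IndentHints (nub (a ++ b))

instance Monoid IndentHints where
  mempty = IndentHints []

-- Render an error that lists the innermost (deepest) level first,
-- since that is the most likely intended column.
renderIndentError :: Int -> IndentHints -> String
renderIndentError actual (IndentHints levels) =
  case sortBy (comparing Down) levels of
    [] -> "incorrect indentation (got " ++ show actual ++ ")"
    (best : rest) ->
      "incorrect indentation (got " ++ show actual
        ++ ", should be " ++ show best
        ++ concatMap (\l -> ", or " ++ show l) rest
        ++ ")"

main :: IO ()
main =
  -- One hint per enclosing block: top level, the Bs, the Cs.
  putStrLn (renderIndentError 10 (IndentHints [1] <> IndentHints [5] <> IndentHints [9]))
```

Running this prints "incorrect indentation (got 10, should be 9, or 5, or 1)". The open design question is where such hints would live in the parser state and when they should be discarded, which this sketch does not address.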

mrkkrp · 2023-06-02T14:23:24Z

There is no strong conceptual reason why this is not supported. I think it is just that not many people use the indentation features much, so there was not enough demand for the library to evolve in that direction. There is also the question of balancing features against the complexity of the implementation.

mrkkrp added the question label May 30, 2023

mrkkrp added the feature-request label Jun 2, 2023
