Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Don't require lexer and parser errors to be the same type #707

Open
marcospb19 opened this issue Feb 18, 2023 · 5 comments
Open

Don't require lexer and parser errors to be the same type #707

marcospb19 opened this issue Feb 18, 2023 · 5 comments

Comments

@marcospb19
Copy link

marcospb19 commented Feb 18, 2023

Fallible actions allow for returning a custom error.

Num: i32 = {
    r"[0-9]+" =>? i32::from_str(<>)
        .map_err(|_| ParseError::User {
            error: YOUR_CUSTOM_ERROR_HERE
        })
};

However, if you use a custom lexer, LALRPOP requires that the error returned by your Lexer is the same as the one returned by your fallible actions.

I believe this requirement is unnecessary.

Fixing this would probably require changing https://docs.rs/lalrpop-util/0.19.8/lalrpop_util/enum.ParseError.html.

@Pat-Lafon
Copy link
Contributor

I'm kinda curious about this, do you have a full example? I'm wondering why this wouldn't be supported with something like impl From<LexError> for CustomParseError?

@marcospb19
Copy link
Author

marcospb19 commented Apr 27, 2023

I don't have a simple example but this is the code I was messing with.

My Parser crate depends on my Lexer crate, so the Lexer crate can't access CustomParseError to implement impl From<LexError> for CustomParseError and call it inside of its impl Iterator.

Sooo, to make this work, my parser error is actually declared in my lexer's lib.rs.

But even with this, impl From forces you to wrap your error type in some enum variant, making it a little more verbose in your .lalrpop file.

So yeah, ignoring the multi-crate issue, impl From in user code should work but still that's a small inconvenience.


I also think it would be cool if the error enum had a variant to tell us where the error was found by the parser.

pub enum ParseError<L, T, E> {
    ...
    UserParser {
        error: E,
    },
    UserLexer {
        error: E,
    },
}

Perhaps:

pub enum ParseError<L, T, PE, LE> {
    ...
    UserParser {
        error: PE,
    },
    UserLexer {
        error: LE,
    },
}

@Pat-Lafon
Copy link
Contributor

Maybe your code you linked is private? I couldn't follow the link.

@marcospb19
Copy link
Author

@Pat-Lafon :O oops, can you try again?

(this comment will self-destruct soon)

@Pat-Lafon
Copy link
Contributor

Ah, thanks for sharing. I'm a bit shaky on this usage of lalrpop but it's interesting to poke around.

I see what your saying, I wonder if there is a more "principled" approach here in that I think ParseError::InvalidToken can only come from Lalrpop's Lexer(As in if you provide a custom lexer, this is not a reachable variant). If so, then maybe that should instead be a field that is parameterized as a generic lexer error where people who just use the builtin lexer get an error that is just InvalidToken like http://lalrpop.github.io/lalrpop/lexer_tutorial/004_external_lib.html#implement-the-lexer .

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants