Handle parsing errors in moo.states() #91

Open
moranje opened this issue Aug 16, 2018 · 8 comments

@moranje
Contributor

moranje commented Aug 16, 2018

Hi,

I'm looking for a way to get the offending token in a moo states object. The following doesn't seem to work; a regular error is still thrown:

  moo.states({
    main: {
      // throws the error instead of tokenizing it
      myError: moo.error
    },
    // This throws a moo configuration error
    myError: moo.error,
  });

What would be the correct way to get the error or offending token in a stateful lexer?

@moranje moranje changed the title Handle parsing errors in a moo.states() Handle parsing errors in moo.states() Aug 16, 2018
@tjvr tjvr added the question label Aug 18, 2018
@tjvr
Collaborator

tjvr commented Aug 18, 2018

The following works fine for me:

  moo.states({
    main: {
      // throws the error instead of tokenizing it
      myError: moo.error
    },
  });

I don't think it would make sense to allow configuring an error at the toplevel? Tokens must always be defined inside a state.

@nathan
Collaborator

nathan commented Aug 18, 2018

> The following works fine for me:

That only works for me if I have another token type in the list; otherwise it generates the regex /(?:)/my and then fails when it can't find the group that matched. If there are no tokens that match anything, instead of generating /(?:)/ (an irrefutable match), we should generate /(?!)/ (an impossible match).
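
To illustrate the difference between the two patterns, here is a minimal sketch in plain JavaScript. It does not use moo, and the variable names are only for illustration. An empty non-capturing group matches the empty string at any position, whereas an empty negative lookahead can never succeed.

```javascript
// An empty non-capturing group matches the empty string at any position.
const irrefutable = /(?:)/y
// An empty negative lookahead can never succeed, so this regex never matches.
const impossible = /(?!)/y

irrefutable.lastIndex = 0
const hit = irrefutable.exec('!')   // zero-length match at index 0

impossible.lastIndex = 0
const miss = impossible.exec('!')   // null: nothing matched

console.log(hit[0].length, miss)
```

A zero-length match with no capturing group to report would plausibly lead to the failure described above, while a `null` result is the case an error rule is meant to handle.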

> I don't think it would make sense to allow configuring an error at the toplevel?

I think it might. Usually lexer states are opaque to the parser and it just sees a stream of tokens, so you very rarely want a) only certain states to have error tokens or b) different states to have different names for the error token. But I don't think the syntax @moranje provided makes sense—if we go this route, we should probably have a more general notion of state inheritance and/or a special state from which other states automatically inherit; then a global error token would be as simple as a { myError: moo.error } prototype state.
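
As a rough sketch of the "prototype state" idea, the shared rules could be copied into every state before the states object is handed to the lexer. The helper `withSharedRules` and the `errorRule` stand-in below are hypothetical and not part of moo; they only show the shape of the idea in plain JavaScript.

```javascript
// Hypothetical helper, not part of moo: copy a set of shared rules into every state.
function withSharedRules(states, shared) {
  const result = {}
  for (const name of Object.keys(states)) {
    // The state's own rules are listed first; shared rules are appended after them.
    result[name] = Object.assign({}, states[name], shared)
  }
  return result
}

// Stand-in for moo.error so the sketch runs without the library installed.
const errorRule = { error: true }

const states = withSharedRules(
  {
    main: { id: /\w+/ },
    string: { chars: /[^"]+/ },
  },
  { myError: errorRule }
)

console.log(Object.keys(states.main), Object.keys(states.string))
```

With moo available, the merged object would presumably be passed to `moo.states(...)`, with `moo.error` used in place of the stand-in.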

@moranje
Contributor Author

moranje commented Aug 19, 2018

> I think it might. Usually lexer states are opaque to the parser and it just sees a stream of tokens, so you very rarely want a) only certain states to have error tokens or b) different states to have different names for the error token. But I don't think the syntax @moranje provided makes sense—if we go this route, we should probably have a more general notion of state inheritance and/or a special state from which other states automatically inherit; then a global error token would be as simple as a { myError: moo.error } prototype state.

I agree on both counts. Since a parsing error is a 'global' failure, it would make more sense to handle it in a single location rather than redoing it over and over again. Preferably there would be a way to access the offset, col and line parameters of the offending token. And I agree that the syntax above is nonsensical.

@nathan
Collaborator

nathan commented Aug 19, 2018

@moranje

> Preferably there would be a way to access the offset, col and line parameters of the offending token.

The moo.error notation already gives you that information:

const moo = require('moo')

const lexer = moo.states({
  main: {
    id: /\w+/,
    err: moo.error,
  },
})

lexer.reset('hello!')
lexer.next() // { type: 'id', value: 'hello', text: 'hello', offset: 0, lineBreaks: 0, line: 1, col: 1 }
lexer.next() // { type: 'err', value: '!', text: '!', offset: 5, lineBreaks: 0, line: 1, col: 6 }
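
Since the error token carries its own position, a caller could turn it into a readable message. The sketch below assumes only the token fields shown in the output above; `describeLexError` is a hypothetical helper, not a moo function, and the token is hard-coded so the snippet runs without the library.

```javascript
// Hypothetical helper: build a readable message from an error token.
// It relies only on the text, line, col and offset fields shown above.
function describeLexError(token) {
  return `Unexpected input ${JSON.stringify(token.text)} at line ${token.line}, col ${token.col} (offset ${token.offset})`
}

// Token copied from the second lexer.next() result above.
const errToken = { type: 'err', value: '!', text: '!', offset: 5, lineBreaks: 0, line: 1, col: 6 }
const message = describeLexError(errToken)

console.log(message)
```

In a real lexer loop, one would check `token.type === 'err'` (or whatever name the error rule was given) before calling such a helper.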

@moranje
Contributor Author

moranje commented Aug 20, 2018

> The moo.error notation already gives you that information

Thanks! I've opened #95 to update the README to reflect that.

@tjvr
Collaborator

tjvr commented Sep 19, 2018

Nathan added support for including states in other states, and support for $all, in #93.

It still needs documentation and some tests 🙂

@tjvr
Collaborator

tjvr commented Sep 20, 2018

@moranje If you're interested in trying out the latest master and seeing how it works for you, that would be really useful feedback! 😊

@moranje
Contributor Author

moranje commented Sep 20, 2018

Great! I have limited time to spare at the moment, but am excited to try out these additions. I'll try to implement the changes sometime this week and get back to you. Great work!
