Parse(specific rule) makes no sense! #469

Closed
valtih1978 opened this issue Nov 2, 2016 · 2 comments

valtih1978 commented Nov 2, 2016

I tried to start parsing from a rule other than start, but it fails with "Can't start parsing from rule myrule" because of this check:

        var startRule = options.startRule || "start";
        if (["start"].indexOf(startRule) < 0) {
          throw new Error("Can't start parsing from rule " + quote(startRule) + ".");
        }

Why let me define a whole grammar but test only one of its rules? Do you write programs in which only the main method can be run? What about other procedures? Don't you understand that a grammar is a library, and that every parser in it may be used separately, outside the scope of main? I have defined an integer parser somewhere at the bottom of my grammar. Why can't I use or test it separately, outside of my main program? You say that I am allowed to parse with any start rule, but you limit the name to 'start'. What is this craziness? Why indexOf('start') and not startRule == 'start'? That would give the user more freedom and understanding!

valtih1978 changed the title from "Can't start parsing from rule" to "Parse(specific rule) makes no sense!" on Nov 3, 2016

valtih1978 commented Nov 3, 2016

OK, I have found that you limit the set of start rules for performance reasons:

The list of allowed start rules of a generated parser now has to be specified explicitly using the allowedStartRules option of the PEG.buildParser method or the --allowed-start-rule option on the command-line. This will make certain optimizations like rule inlining easier in the future.

I suggest that you enable full functionality by default and limit the start rules only when performance demands it.

In the meantime, I have found a workaround:

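// Yield every match of a global regex against the text.
function* execAllGen(re, text) {
  for (let match; (match = re.exec(text)) !== null;) {
    yield match;
  }
}
const execAll = (re, text) => [...execAllGen(re, text)];

// Heuristic: treat the first word on each line of the grammar as a rule name.
const allRules = execAll(/^\s*(\w+)/mg, grammar).map(m => m[1]);
const parser = peg.generate(grammar, { allowedStartRules: allRules });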

I argue that generate could produce a number of parsers, one per start rule.

Edit: Wait, do you advise that I generate a parser for every rule separately?

dmajda commented Nov 29, 2016

@valtih1978 Please tone down your aggressive voice. Just because something isn’t done the way you would do it doesn’t mean there aren’t good reasons for doing it the way it is.

The view PEG.js takes is that the whole grammar (or rather the parser generated from it) is the atomic unit, not a rule. The entry point to a parser is its start rule (the first one by default). If you want to have multiple entry points, fine, you just have to explicitly designate them.

Think of the parser as an object where each rule is a separate method. Sound advice in OOP is to keep all methods private except those which form the interface to the object. And this is exactly what PEG.js does for parsers.
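To make the analogy concrete, here is a minimal hand-written sketch of a parser object whose rules are private functions and whose only public entry points are the explicitly allowed start rules. Everything in it (`makeParser`, the toy `start` and `integer` rules, the result shape) is hypothetical and written for illustration; it is not PEG.js output, it only mirrors the shape of the check quoted at the top of this thread.

```javascript
// Hypothetical hand-written parser; not generated by PEG.js.
function makeParser(allowedStartRules) {
  // "Private" rule functions: each returns { value, pos } or null on failure.
  const rules = {
    // start = integer ("+" integer)*
    start(input, pos) {
      const left = rules.integer(input, pos);
      if (left === null) return null;
      let sum = left.value;
      pos = left.pos;
      while (input[pos] === "+") {
        const right = rules.integer(input, pos + 1);
        if (right === null) return null;
        sum += right.value;
        pos = right.pos;
      }
      return { value: sum, pos };
    },
    // integer = [0-9]+
    integer(input, pos) {
      const m = /^[0-9]+/.exec(input.slice(pos));
      return m ? { value: parseInt(m[0], 10), pos: pos + m[0].length } : null;
    },
  };

  return {
    // The only public method: entry is limited to the allowed start rules.
    parse(input, options = {}) {
      const startRule = options.startRule || allowedStartRules[0];
      if (allowedStartRules.indexOf(startRule) < 0) {
        throw new Error(`Can't start parsing from rule "${startRule}".`);
      }
      const result = rules[startRule](input, 0);
      if (result === null || result.pos !== input.length) {
        throw new Error("Syntax error.");
      }
      return result.value;
    },
  };
}

const strict = makeParser(["start"]);           // only the default entry point
const open = makeParser(["start", "integer"]);  // "integer" is exported too
```

With `strict`, asking for `{ startRule: "integer" }` throws, while `open` accepts it. This also suggests why the check uses `indexOf` on a list rather than a comparison with a single name: the list holds the allowed entry points, and it merely happens to contain one element when nothing else is exported.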

This part of PEG.js philosophy is justified by two things:

  1. Parsers can maintain state. Allowing the parsing to start from any rule by default could easily lead to users violating some invariants the parser relies on, unless the parser author was careful to explicitly specify allowed start rules.

  2. It makes sense to have freedom to inline rules. This is currently done only in very specific circumstances (proxy rules), but the architecture doesn’t prevent expanding this and improving parser performance without parser authors doing anything.

Both reasons have their equivalent in the OOP analogy I mentioned.
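The inlining point can be sketched on a toy grammar model. The representation below (`lit`, `ref`, `seq`) and the function `inlineProxyRules` are assumptions made for illustration and are not PEG.js's actual AST or compiler pass. The sketch only shows why a rule that merely forwards to another rule can be dropped when it is not an allowed start rule, and must be kept when it is.

```javascript
// Toy grammar model (hypothetical): an expression is a literal, a reference
// to another rule, or a sequence of expressions.
const lit = (value) => ({ type: "lit", value });
const ref = (name) => ({ type: "ref", name });
const seq = (...parts) => ({ type: "seq", parts });

function inlineProxyRules(grammar, allowedStartRules) {
  // Follow chains of proxies (a = b, b = c) to the final target rule.
  const resolve = (name, seen = new Set()) => {
    const expr = grammar[name];
    if (expr && expr.type === "ref" && !seen.has(name)) {
      seen.add(name);
      return resolve(expr.name, seen);
    }
    return name;
  };

  // Point every reference directly at the resolved target.
  const rewrite = (expr) => {
    if (expr.type === "ref") return ref(resolve(expr.name));
    if (expr.type === "seq") return seq(...expr.parts.map(rewrite));
    return expr;
  };

  const result = {};
  for (const [name, expr] of Object.entries(grammar)) {
    const isProxy = expr.type === "ref";
    // A proxy can be dropped only if nobody may enter the parser through it.
    if (isProxy && !allowedStartRules.includes(name)) continue;
    result[name] = rewrite(expr);
  }
  return result;
}

const grammar = {
  start: seq(ref("number"), lit("+"), ref("number")),
  number: ref("integer"), // proxy rule: only forwards to "integer"
  integer: lit("1"),
};

const optimized = inlineProxyRules(grammar, ["start"]);
const kept = inlineProxyRules(grammar, ["start", "number"]);
```

In `optimized`, the `number` rule is gone and `start` refers to `integer` directly. In `kept`, `number` survives because it was declared as an entry point. If every rule were a start rule by default, no rule could be removed this way.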

Now, you are right that PEG.js’s approach makes testing slightly harder. In my experience, this is not a big problem. You just need to use test cases that are valid input of some start rule. I was able to test the PEG.js grammar and some other grammars just fine using this approach.

As for libraries, I don’t see how the current approach prevents building them. One just needs to be explicit about the “exported” rules. But if building rule libraries is your main use case, you may be better off using a parser combinator-based library, which is naturally better suited to this task than a grammar-based parser generator.

In any case, I’m not changing the default. As for having a way to easily allow all rules as start rules, see #234.

dmajda closed this as completed Nov 29, 2016